Monday, September 19, 2016

Brain mechanisms and minimalism

I just read a very interesting shortish paper by Dehaene and associates (Dehaene, Meyniel, Wacongne, Wang and Pallier (DMWWP)) that appeared in Neuron. I did not find an open source link, but you can use this one if you are university affiliated. I recommend it highly, not the least reason being that Neuron is a very fancy journal and GG gets very good press there. There is a rumor running around that Cog Neuro types have dismissed the findings of GG as of little interest or consequence to brain research. DMWWP puts paid to this and notes, quite rightly, that the problem lies less with GG than with the current state of brain science. This is a decidedly Gallistel-inspired theme: the cog part of cog-neuro is in many domains (e.g. language) healthier and more compelling than the neuro part, and it is time for the neuro types to pay attention and try to find mechanisms adequate for dealing with the well-grounded cog stuff that has been discovered, rather than think it must be false because the inadequate and primitive neuro models (i.e. neural net/connectionist) don’t have ways of dealing with it. The more places this gets said, the greater the likelihood that CN types will pay attention. So, this is a very good piece for the likes of us (or at least me).

The goal of the paper is to get Cog-Neuro Science (CNS) people to start taking the integration of the behavioral, the computational and the neural as CNS’s central concern. Here is the abstract:

A sequence of images, sounds, or words can be stored at several levels of detail, from specific items and their timing to abstract structure. We propose a taxonomy of five distinct cerebral mechanisms for sequence coding: transitions and timing knowledge, chunking, ordinal knowledge, algebraic patterns, and nested tree structures. In each case, we review the available experimental paradigms and list the behavioral and neural signatures of the systems involved. Tree structures require a specific recursive neural code, as yet unidentified by electrophysiology, possibly unique to humans, and which may explain the singularity of human language and cognition.

I found the paper interesting in at least three ways.

First, it focuses on mechanisms, not phenomena. So, the paper identifies five kinds of basic operations that reasonably underlie a variety of mental phenomena and takes the aim of CNS to be to (i) find where in the brain these operations are executed, (ii) provide descriptions of circuits/computational operations that could execute such operations and (iii) investigate how these circuits might be/are neurally realized.

Second, it shows how phenomena can be and have been used to probe the structure of these mechanisms. This is very well done for the first three kinds of mechanisms: (i) approximate timing of one item relative to the preceding one, (ii) chunking items into larger units, and (iii) the ordinal ranking of items. Things get more speculative (in a good way, I might add) for the more “abstract” operations: the coding of “algebraic” patterns and nested generated structures.

Third, it gives you a good sense of the kinds of things that CNS types want from linguistics and why minimalism is such a good fit for these desires.

Let me say a word about each.

The review of the literature on coding time relations is a useful pedagogical case. DMWWP reviews the kind of evidence used to show that organisms “maintain internal representations of elapsed time” (3). It then looks for “a characteristic signature” of this representation and the “killer” data that supports the representational claim. It then reviews the various brain locations that respond to these signature properties and the kind of circuit that could code this kind of representation, arguing that “predictive coding” (i.e. circuits that “form an internal model of input sequences”) is the right one in that it alone accommodates the basic behavioral facts (4) (basically mismatch negativity effects without an overt mismatch). Next, it discusses a specific “spiking neuron model” of predictive coding (4) that “requires a neurophysiological mechanism of “time stamp” neurons that are tuned to specific temporal intervals,” which have, in fact, been found in various parts of the brain. So, in this case we get the full Monty: a task that implicates signature properties of the mechanism, that demands certain kinds of computational circuits, realized by specific neuronal models, realized in neurons of a particular kind, found in different parts of the brain. It is not quite the Barn Owl (see here), but it is very very good.
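To make the "mismatch without an overt mismatch" idea concrete, here is a toy Python sketch. It is mine, not DMWWP's, and it is emphatically not their spiking neuron model: the function name `omission_times`, the fixed interval, and the tolerance window are all assumptions made for illustration. The point is only that a system that predicts when the next item is due can register an error when nothing arrives, something a purely stimulus-driven system cannot do.

```python
def omission_times(onsets_ms, interval_ms, tolerance_ms=20):
    """Return predicted onset times (in ms) at which no stimulus arrived.

    Assumption: the expected interval is handed to the function rather than
    learned, and predictions sit on a fixed grid starting at the first onset.
    """
    missing = []
    expected = onsets_ms[0] + interval_ms
    while expected <= onsets_ms[-1]:
        # A prediction is met if some stimulus falls inside the tolerance window.
        met = any(abs(onset - expected) <= tolerance_ms for onset in onsets_ms)
        if not met:
            missing.append(expected)  # prediction error without any input
        expected += interval_ms
    return missing


# A regular 500 ms tone train with the tone at 1500 ms left out.
print(omission_times([0, 500, 1000, 2000, 2500], 500))  # [1500]
```

The "error" at 1500 ms is generated entirely by the internal model, which is the behavioral signature that the predictive coding story is meant to capture.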

DMWWP does this more or less again for chunking, though in this case “the precise neural mechanisms of chunk formulation remain unknown” (6). And then again for ordinal representations. Here there are models for how this kind of information might be neurally coded in terms of “conjunctive cells jointly sensitive to ordinal information and stimulus identity” (8). These kinds of conjunctive neurons seem to be all over the place, with potential application, DMWWP suggests, as neuronal mechanisms for thematic saturation.
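A conjunctive code of this sort is easy to sketch. The following Python toy is my own illustration under simplifying assumptions (binary units, one unit per position/item pair, hypothetical names `conjunctive_code` and `item_at`); it is not a model taken from the paper. Each "cell" fires only when a particular item occupies a particular ordinal position.

```python
def conjunctive_code(sequence, vocabulary):
    """Build a position-by-item grid of units.

    Unit [p][i] is 1 only when vocabulary[i] occupies ordinal position p,
    so every unit is jointly sensitive to rank and identity.
    """
    return [[1 if item == word else 0 for word in vocabulary]
            for item in sequence]


def item_at(code, position, vocabulary):
    """Read back which item was stored at a given ordinal position."""
    return vocabulary[code[position].index(1)]


vocabulary = ["A", "B", "C"]
code = conjunctive_code(["B", "A", "C"], vocabulary)
print(code)                          # [[0, 1, 0], [1, 0, 0], [0, 0, 1]]
print(item_at(code, 0, vocabulary))  # B
```

Note that the code stores "B in first position" as a single unit, so the same item in a different position activates a different unit. That is what makes it conjunctive, and also why it does not by itself give you variables that abstract over items.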

The last two kinds of mechanisms, those that would be required to represent algebraic patterns and hierarchical tree-like structures, are behaviorally very well established but currently pose very serious challenges on the neuro side. DMWWP observes that humans, even very young ones, demonstrate amazing facility in tracking such patterns. Monkeys also appear able to exploit similar abstract structures, though DMWWP suggests that their algebraic representations are not quite like ours (9). DMWWP further correctly notes that these sorts of patterns and the neural mechanisms underlying them are of “great interest” as “language, music and mathematics” are replete with such. So, it is clear that humans can deploy algebraic patterns which “abstract away from the specific identity and timing of the sequence patterns and to grasp their underlying pattern,” and maybe other animals can too. However, to date there is “no accepted neural network mechanism” to accomplish this, and it looks like “all current neural network models seem too limited to account for abstract rule-extraction abilities” (9). So, the problem for CNS is that it is absolutely clear that human (and maybe monkey) brains have algebraic competence, though it is completely unclear how to model this in wetware. Now, that is the right way to put matters!

This last reiterates conclusions that Gallistel and Marcus have made in great detail elsewhere. Algebraic knowledge requires the capacity to distinguish variables from values of variables. This is easy to do in standard computer architectures but is not at all trivial in connectionist/neural net frameworks (as Gallistel has argued at length (e.g. see here)). Indeed, one of Gallistel’s main arguments with such neural architectures is their inability to distinguish variables from their values, and to store them separately and call them as needed. Neural nets don’t do this well (e.g. they cannot store a value and later retrieve it), and that is the problem because we do and we do it a lot and easily. DMWWP basically endorses this position.
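The variable/value distinction is trivial to state in a conventional programming language, which is rather the point. Here is a minimal Python sketch (mine, not DMWWP's or Gallistel's; the function name `abstract_pattern` and the syllables are made up for illustration). It binds each new item to a fresh variable, so that sequences built from entirely different items come out with the same abstract description.

```python
def abstract_pattern(sequence):
    """Replace each distinct item with a variable (A, B, C, ...),
    assigned in order of first appearance, discarding item identity."""
    bindings = {}  # value -> variable name, stored separately from the pattern
    pattern = []
    for item in sequence:
        if item not in bindings:
            bindings[item] = chr(ord("A") + len(bindings))
        pattern.append(bindings[item])
    return "".join(pattern), bindings


pattern_1, _ = abstract_pattern(["ga", "ti", "ti"])
pattern_2, _ = abstract_pattern(["wo", "fe", "fe"])
pattern_3, _ = abstract_pattern(["ga", "ti", "ga"])
print(pattern_1, pattern_2, pattern_3)  # ABB ABB ABA
```

The dictionary is doing the work that is hard to get from a standard neural net: it stores values apart from the variables they fill, and can be consulted later.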

The last mechanism required is one sufficient to code the dependencies in a nested tree.[1] One of the nice things about DMWWP is that it recognizes that linguistics has demonstrated that the brain codes for these kinds of data structures. This is obvious to us, but the position is not common in the CNS community and the fact that DMWWP is making this case in Neuron is a big deal. As in the case of algebraic patterns, there are no good models of how these kinds of (unbounded) hierarchical dependencies might be neurally coded. The DMWWP conclusion? The CNS community should start working on the problem. To repeat, this is very different from the standard CNS reaction to these facts, which is to dismiss the linguistic data because there are no known mechanisms for dealing with it.

Before ending I want to make a couple of observations.

First, this kind of approach, looking for basic computational mechanisms that are implicated in a variety of behaviors, fits well with the aims of the minimalist program (MP). How so? Well, IMO, MP has two immediate theoretical goals: to show that the standard kinds of dependencies characteristic of linguistic competence are all different manifestations of the same underlying mechanism (e.g. are all instances of Merge). Were it possible to unify the various modules (binding, movement, control, selection, case, theta, etc) as different faces of the same Merge relation and were we able to find the neural “merge” circuit then we would have found the neural basis for linguistic competence. So if all grammatical relations are really just ones built out of merges, then CNSers of language could look for these and thereby discover the neural basis for syntax. In this sense, MP is the kind of theory that CNSers of language should hope is correct. Find one circuit and you’ve solved the basic problem. DMWWP clearly has bought into this hope.
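For readers who want to see how little machinery this takes computationally, here is a toy Python rendering of Merge as binary set formation. It is a sketch under my own simplifying assumptions (no labels, no features, lexical items as plain strings, hypothetical helper `depth`), not anything DMWWP proposes. What matters is that the output of `merge` can be fed back in as an input, which is all that is needed for unbounded nesting.

```python
def merge(a, b):
    """Combine two syntactic objects into the unordered set {a, b}."""
    return frozenset([a, b])


def depth(obj):
    """Depth of embedding: 0 for a lexical item, else 1 + the deepest part."""
    if not isinstance(obj, frozenset):
        return 0
    return 1 + max(depth(part) for part in obj)


# Each call takes earlier outputs as inputs, giving a nested structure.
dp = merge("the", "dog")
vp = merge("saw", dp)
clause = merge("Mary", vp)
print(depth(clause))  # 3
```

One operation, applied to its own outputs, yields the hierarchy. Finding a neural circuit that does the equivalent is the open problem.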

Second, it suggests what GGers with cognitive ambitions should be looking for theoretically. We should be trying to extract basic operations from our grammatical analyses as these will be what CNSers will be interested in trying to find. In other words, the interesting result from a CNS perspective is not a specification of how a complicated set of interactions works, but the isolation of the core mechanisms that are doing the interacting. And this implies, I believe, trying to unify the various kinds of operations and modules and entities we find (e.g. in a theory like GB) to a very small number of core operations (in the best case just one). DMWWP’s program aims at this level of grain, as does MP, and that is why they look like a good fit.

Third, as any MPer knows, FL is not just Merge. There are other operations. It is useful to consider how we might analyze linguistic phenomena that are Merge recalcitrant in these terms. Feature checking and algebraic structures seem made for each other. Maybe memory limitations could undergird something like phases (see DMWWP discussion of a Marcus suggestion on p. 11 that something like phases chunk large trees into “overlapping but incompletely bound subtrees”). At any rate, getting comfortable with the kinds of mental mechanisms extant in other parts of cognition and perception might help linguists focus on the central MP question: what basic operations are linguistically proprietary? One answer is: those operations required in addition to those that other animals have (e.g. time interval determination, ordinal sequencing, chunking, etc.).

This is a good paper, especially so because of where it appears (a leading brain journal) and because it treats linguistic work as obviously relevant to the CNS of language. The project is basically Marr’s, and unlike so much CNS work, it does not try to shoehorn cognition (including language) into some predetermined conception of neural mechanism which effectively pretends that what we have discovered over the last 60 years does not exist.

[1] DMWWP notes that the real problem is dependencies in an unbounded nested tree. It is not merely the hierarchy, but the unboundedness (i.e. recursion) as well.


  1. I like the way Dehaene et al. pose the questions of representation and computation, derived from linguistics and Minimalism, as challenges for neuroscience. Super job on that point. I do have some concerns about their characterization of the language network in the brain.

    It's overwhelmingly clear that Broca's area / IFG cannot be critical or fundamental for sentence processing. If you destroy that area, people are pretty much fine with respect to basic sentence comprehension and acceptability judgments (Mohr et al., 1978; Linebarger et al., 1983). They do not cite any of this (old) literature, even though it's obviously a massive red flag for their account. There may be ways to address this literature and preserve what they're saying, but they don't even raise the issue. This is a problematic oversight.
