Monday, March 14, 2016

The deep difference between acceptable and grammatical

Up comes a linguist-in-the-street interviewer and asks: “So NH, what would grad students at UMD find to be one of your more annoying habits?” I would answer: my unrelenting obsession with forever banishing from the linguistics lexicon the phrase “grammaticality judgment.” As I never tire of making clear, usually in a flurry of red ball-point scribbles and exclamation marks, the correct term is “acceptability judgment,” at least when used, as it almost invariably is, to describe how speakers rate some bit of data. “Acceptability” is the name of the scale along which such speaker judgments array. “Grammaticality” is how linguists explain (or partly explain) these acceptability judgments. Linguists make grammaticality judgments when advancing one or another analysis of some bit of acceptability data. I doubt that there is an interesting scale for such theoretical assessments.

Why the disregard for this crucial difference among practicing linguists? Here’s a benign proposal. A sentence’s acceptability is prima facie evidence that it is grammatical and that a descriptively adequate G should generate it. A sentence’s unacceptability is prima facie evidence that a descriptively adequate G should not generate it. Given this, using the terms interchangeably is no big deal. Of course, not all facies are prima and we recognize that there are unacceptable sentences that an adequate G should generate and that some things that are judged acceptable nonetheless should not be generated. We thus both recognize the difference between the two notions, despite their intimate intercourse, and interchange them guilt free.

On this benign view, my OCD behavior is simple pedantry, a sign of my inexorable aging and decline. However, I recently read a paper by Katz and Bever (K&B) that vindicates my sensitivities (see here), which, of course, I like very much and am writing to recommend to you (I would nominate it for classic status).[1] Of relevance here, K&B argues that the distinction between grammaticality and acceptability is an important one and that blurring it often reflects the baleful influence of that most pernicious intellectual habit of mind, EMPIRICISM! I have come to believe that K&B is right about this (as well, I should add, about many other things, though not all). So before getting into the argument regarding acceptability and Empiricism, let me recommend it to you again. Like an earlier paper by Bever that I posted about recently (here), this is a Whig History of an interesting period of GG research. There is a sustained critical discussion of early Generative Semantics that is worth looking at, especially given the recent rise of interest in these kinds of ideas. But this is not what I want to discuss here. For the remainder, let me zero in on one or two particular points in K&B that got me thinking.

Let’s start with the acceptability vs grammaticality distinction. K&B spend a lot of time contrasting Chomsky’s understanding of Gs and Transformations with Zellig Harris’s. For Harris, Gs were seen as compact ways of cataloguing linguistic corpora. Here is K&B (15):

 …grammars came to be viewed as efficient data catalogues of linguistic corpora, and linguistic theory took the form of a mechanical discovery procedure for cataloguing linguistic data.

Bloomfieldian structuralism concentrated on analyzing phonology and morphology in these terms. Harris’s contribution was to propose a way of extending these methods to syntax (16):

Harris’s particular achievement was to find a way of setting up substitution frames for sentences so that sentences could be grouped according to the environments they share, similar to the way that phonemes or morphemes were grouped by shared environments…Discourse analysis was…the product of this attempt to extend the range of taxonomic analysis beyond the level of immediate constituents.

Harris proposed two important conceptual innovations to extend Structuralist taxonomic techniques to sentences: kernel sentences and transformations. Kernels are a small “well-defined set of forms” and transformations, when applied to kernels, “yields all the sentence constructions of the language” (17). The cooccurrence restrictions that lie at the heart of the taxonomy are stated at the level of kernel sentences. Transformations of a given kernel define an equivalence class of sentences that share the same discourse “constituency.” K&B put this nicely, albeit in a footnote (16:#3):

In discourse analysis, transformations serve as the means of normalizing texts, that is of converting the sentences of the text into a standard form so that they can be compared and intersentence properties [viz. their cooccurrences, NH] discovered.

So, for Harris, kernel sentences and transformations are ways of compressing a text’s distributional regularities (i.e. “cataloguing the data of a corpus” (12)).[2]

This is entirely unlike the modern GG conception due to Chomsky, as you all know. But in case you need a refresher, for modern GG, Gs are mental objects internalized in the brains of native speakers, which underlie their ability to produce and understand an effectively unbounded number of sentences, most of which have never before been encountered (aka linguistic creativity). Transformations are a species of rule these mental Gs contain that map meaning relevant levels of G information to sound relevant (or articulator relevant) levels of G information. Importantly, on this view, Gs are not ways of characterizing the distributional properties of texts or speech. They are (intended) descriptions of mental structures.

As K&B note, so understood, much of the structure of Gs is not surface visible. The consequence?

The input to the language acquisition process no longer seems rich enough and the output no longer simple enough for the child to obtain its knowledge of the latter by inductive inferences that generalize the distributional regularities found in speech. For now the important properties of the language lie hidden beneath the surface form of sentences and the grammatical structure to be acquired is seen as an extremely complex system of highly intricate rules relating the underlying levels of sentences to their surface phonetic form. (12)

In other words, once one treats Gs as mental constructs the possibility of an Empiricist understanding of what lies behind human linguistic facility disappears as a reasonable prospect and is replaced by a Rationalist conception of mind. This is what made Chomsky’s early writings on language so important. They served to discredit empiricism in the behavioral sciences (though ‘discredit’ is too weak a word for what happened). Or as K&B nicely summarize matters (12):

From the general intellectual viewpoint, the most significant aspect of the transformationalist revolution is that it is a decisive defeat of empiricism in an influential social science. The natural position for an empiricist to adopt on the question of the nature of grammars is the structuralist theory of taxonomic grammar, since on this theory every property essential to a language is characterizable on the basis of observable features of the surface form of its sentences. Hence, everything that must be acquired in gaining mastery of a language is “out in the open”; moreover, it can be learned on the basis of procedures for segmenting and classifying speech that presupposes only inductive generalizations from observable distributional regularities. On the structuralist theory of taxonomic grammar, the environmental input to language acquisition is rich enough, relative to the presumed richness of the grammatical structure of the language, for this acquisition process to take place without the help of innate principles about the universal structure of language…

Give up the idea that Gs are just generalizations of the surface properties of speech, and the plausibility of Empiricism rapidly fades. Thus, enter Chomsky and Rationalism, exit taxonomy and Empiricism.

The shift from the Harris Structuralist, to the Chomsky mentalist, conception of Gs naturally shifts interest to the kinds of rules that Gs contain and to the generative properties of these rules. And importantly, from a rule-based perspective it is possible to define a notion of ‘grammaticality’ that is purely formal: a sentence is grammatical iff it is generated by the grammar. This, K&B note, is not dependent on the distribution of forms in a corpus. It is a purely formal notion, which, given the Chomsky understanding of Gs, is central to understanding human linguistic facility. Moreover, it allows for several conceptions of well-formedness (phonological, syntactic, semantic etc.) that together contribute along with other factors to a notion of acceptability, but are not reducible to it. So, given the rationalist conception, it is easy and natural to distinguish various ingredients of acceptability.
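A minimal sketch of this distinction, assuming a toy grammar that is not from K&B: the rules, the lexicon, and the names `is_grammatical` and `toy_acceptability` are all hypothetical and chosen only for illustration. Grammaticality here is a yes/no matter of whether the rules generate the string (checked with a standard CYK recognizer), while the made-up acceptability score is graded and mixes grammaticality with an extraneous factor, sentence length.

```python
from itertools import product

# Toy grammar in Chomsky normal form (illustrative assumption, not K&B's).
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("Det", "N"): {"NP"},
    ("NP", "PP"): {"NP"},
    ("P", "NP"): {"PP"},
    ("V", "NP"): {"VP"},
}
LEXICON = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "near": {"P"},
    "chased": {"V"},
}


def is_grammatical(words):
    """CYK recognition: True iff the toy grammar generates the word string."""
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds the categories that span words[i..j] inclusive.
    chart = [[set() for _ in range(n)] for _ in range(n)]
    for i, word in enumerate(words):
        chart[i][i] = set(LEXICON.get(word, ()))
    for span in range(2, n + 1):
        for start in range(n - span + 1):
            end = start + span - 1
            for split in range(start, end):
                pairs = product(chart[start][split], chart[split + 1][end])
                for left, right in pairs:
                    chart[start][end] |= BINARY_RULES.get((left, right), set())
    return "S" in chart[0][n - 1]


def toy_acceptability(words, length_penalty=0.05):
    """Hypothetical graded score: grammaticality is one ingredient among others."""
    base = 1.0 if is_grammatical(words) else 0.2
    return max(0.0, base - length_penalty * len(words))


short = "the dog chased the cat".split()
longer = "the dog near the cat chased the cat".split()
scrambled = "dog the chased cat the".split()

for sentence in (short, longer, scrambled):
    print(" ".join(sentence), is_grammatical(sentence), toy_acceptability(sentence))
```

In this sketch both `short` and `longer` are generated by the grammar, yet they receive different scores, and nothing in the score alone says which part of it is due to the grammar. That is the sense in which the ingredients of acceptability can be distinguished on the rule-based view without being reducible to it.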

A view that takes grammaticality to just be a representation of acceptability, the Harris view, finds this to be artificial at best and ill-founded at worst (see K&B quote of Harris p. 20). On a corpus-based view of Gs, sentences are expected to vary in acceptability along a cline (reflecting, for example, how likely they are to be found in a certain text environment). After all, Gs are just compact representations of precisely such facts. And this runs together all sorts of factors that appear diverse from the standard GG perspective. As K&B put it (21):

Statements of the likelihood of new forms occurring under certain conditions must express every feature of the situation that exerts an influence on likelihood of occurrence. This means that all sorts of grammatically extraneous features are reflected on a par with genuine grammatical constraints. For example, complexity of constituent structure, length of sentences, social mores, and so on often exerts a real influence on the probability that a certain n-tuple of morphemes will occur in the corpus.

Or to put this another way: a Rationalist conception of G allows for a “sharp and absolute distinction between the grammatical and the ungrammatical, and between the competence principles that determine the grammatical and anything else that combines with them to produce performance” (29). So, a Rationalist conception understands linguistic performance to be a complex interaction effect of discrete interacting systems. Grammaticality does not track the linguistic environment. Linguistic experience is gradient. It does not reflect the algebraic nature of the underlying sub-systems. ‘Acceptability’ tracks the gradience, ‘grammaticality’ the discrete algebra. Confusing the two threatens a return to structuralism and its attendant Empiricism.

Let me mention one other point that K&B makes that I found very helpful. They outline what a Structuralist discovery procedure (DP) is (15). It is “explicit procedures for segmenting and classifying utterances that would automatically apply to a corpus to organize it in a form that meets” four conditions:

1.     The G is a hierarchy of classes; lower units being temporal segments of speech event, the higher are classes or sequences of classes.
2.     The elements of each level are determined by their distributional features together with their representations at the immediately lower level.
3.     Information in the construction of a G flows “upward” from level to level, i.e. no information at a higher level can be used to determine an analysis at a lower level.
4.     The main distributional principles for determining class memberships at level L_i are complementary distribution and free variation at level L_{i-1}.

Noting the structure of a discovery procedure (DP) in (1-4) allows us to appreciate why Chomsky stressed the autonomy of levels in his early work. If, for example, the syntactic level is autonomous (i.e. not inferable from the distributional properties of other levels) then the idea that DPs could be adequate accounts of language learning evaporates.[3] And once one focuses on the rules relating articulation and interpretation, a DP for language with the properties in (1-4) becomes very implausible, or, as K&B nicely put it (33):

Given that actual speech is so messy, heterogeneous, fuzzy and filled with one or another performance error, the empiricist’s explanation of Chomskyan rules, as having been learned as a purely inductive generalization of a sample of actual speech is hard to take seriously to say the very least.[4]

So, if one is an Empiricist, then one will have to deny the idea of a G as a rule based system of the GG variety. Thus, it is no surprise that Empiricists discussing language like to emphasize the acceptability gradients characteristic of actual speech. Or, to put this in terms relevant to the discussion above, it is no surprise that Empiricists will understand ‘grammaticality’ as the limiting case of ‘acceptability.’

Ok, this post, once again, is far too long. Look at the paper. It’s really good and useful. It also is a useful prophylactic against recurring Empiricism and, unfortunately, we cannot have too much of that.

[1] Sadly, pages 18-19 are missing from the online version. It would be nice to repair this sometime in the future. If there is a student of Tom’s at U of Arizona reading this, maybe you can fix it.
[2] I would note that in this context corpus linguistics makes sense as an enterprise. It is entirely unclear whether it makes any sense once one gives up this structuralist perspective and adopts a Chomsky view of Gs and transformations. Furthermore, I am very skeptical that there exist Harris-like regularities over texts, even if normalized to kernel sentences. Chomsky’s observation that sentences are not “stimulus bound,” if accurate (and IMO it is), undermines the view that we can say anything at all about the distributions of sentences in texts. We cannot predict with any reliability what someone will say next (unless, of course, it is your mother), and even if we could in some stylized texts, it would tell us nothing about how the sentence could be felicitously used. In other words, there would be precious few generalizations across texts. Thus, I doubt that there are any interesting statistical regularities regarding the distribution of sentences in texts (at least understood as stretches of discourse).
            Btw, there is some evidence for this. It is well known that machines trained on one kind of corpus do a piss poor job of generalizing to a different kind of corpus. This is quite unexpected if figuring out the distribution of sentences in one text gave you a good idea of what would take place in others. Understanding how to order in a restaurant or make airline reservations does not carry over well to a discussion of Trump’s (and the rest of the GOP’s) execrable politics, for example. A long time ago, in a galaxy far far away I very polemically discussed these issues in a paper with Elan Dresher that still gives me chuckles when I read it. See Cognition 1976, 4, pp. 321-398.
[3] That this schema looks so much like those characteristic of Deep Learning suggests that it cannot be a correct general theory of language acquisition. It just won’t work, and we know this because it was tried before.
[4] That speech is messy is a problem, but not the only problem. The bigger one is that there is virtually no evidence in the PLD for many G properties (e.g. ECP effects, island effects, binding effects etc.). Thus the data is both degenerate and deficient.


  1. »my unrelenting obsession with forever banishing from the linguistics lexicon the phrase “grammaticality judgment.”«

    Thanks, Norbert, for saying this -- I share your obsession. It's amazing (and embarrassing) how often you see that oxymoron used in the professional literature; it's even in the subtitle of Schütze's recently re-released book on the matter. I think this sloppiness says a lot about how much linguists tend to care about the conceptual foundations of what they're doing, namely very little, in many cases.

    I also think the problem is deeper than you suggest here. Grammaticality is defined theory-internally, as a property of grammatical representations. Acceptability, on the other hand, is a purely intuitive notion that to my knowledge has no reasonably sharp, theoretically relevant definition, and that applies to vaguely defined stimuli ("sentences"). It's an entirely informal notion, but we're customarily ignoring this fact (vide all the fancy experiments designed to "measure acceptability"), while at the same time everybody, I think, concedes that all kinds of intractable cognitive factors enter into such judgments, so the default assumption should be that they tell you next to nothing of any interest. I don't know of any reason to suppose that there is any interesting correlation whatsoever between grammaticality and acceptability, whatever the latter may be, and it's hard to see how there could be given that they apply to very different things (representations vs. "inputs" or something like that).

    The obsession of our field with this undefined notion of "acceptability" goes back, I would speculate, to the early days of GG, when some sort of idealization was (tacitly?) accepted that essentially equated the speaker with an automaton that either accepts or rejects a given string. But back then, grammaticality was actually defined over strings (grammars defined sets of "well-formed" strings), so perhaps there was some foundation for making this leap of faith from competence to performance. But nowadays we don't understand natural-language grammars to generate strings ("well-formed sentences"), but rather hierarchical structures and derivative form-meaning pairs, which then feed other cognitive systems. So what justification do we have to assume that the "acceptability" (whatever that is) of stimuli bears any significant relation to the logically distinct notion of grammaticality?

    I should add that I don't think the notion of "acceptability" lacks a justified usage, namely when we use it to describe the fact that some externalized form does or doesn't permit a particular interpretation. Often, saying that something is "unacceptable" is just a shorthand for that (e.g., we say "What does John like and apples?" is unacceptable, but we really mean that it doesn't have the expected interpretation "which x : John likes x and apples"), and this seems to me to be closer to the kind of data that we want our theory to model (form-meaning correspondence). But it's not supposed to be a model of "acceptability" per se, and couldn't possibly be given that a grammar is by definition a competence system.

    I think all of this raises non-trivial questions for our field and the kinds of data it relies on, which hardly anybody cares to address, it seems.

    1. I find the suggested OK usage of 'acceptable' problematic because there are so many interpretable sentences that nonetheless have some kind of problem that can traditionally be described as 'grammatical', and which generative grammar has quite a lot to say about. Like when I asked the shop girl in Reykjavík for 'tveir kókar' and she looked at me archly and said 'tvær' (I had successfully guessed the plural of 'coke', but not the gender).

      Indeed I think you can rescue the concept of 'ungrammatical sentence' by reserving it for cases where either (a) there is no interpretation whatsoever or (b) there is an accessible interpretation, but the sentence has an observable acceptability problem of some kind, and your theory of that is that it violates a rule or principle of grammar, but not in such a severe way that no interpretation at all is available. But 'grammaticality judgement' is indeed a mistake.

    2. Avery: on your first point, I agree, but only once we are more precise about the "some kind of problem" part do we generate actual data for our theory.

      Not sure I understand the second point. "Ungrammatical sentence" is as much an oxymoron as "grammaticality judgment", unless we understand sentence to mean something very technical and theory-dependent (say, structured object with sound and meaning properties, or whatever kinds of objects your grammar generates). Why would "no interpretation" necessarily imply ungrammaticality? I know this is often assumed, but I've never seen it justified. Take "Who did John kiss Mary?". Does it have an interpretation (a natural one rather than one we arrive at by treating it as a puzzle)? Is it grammatical? That depends on the theory (e.g., if the grammar constructs it but Full Interpretation is a C-I principle, then the answer is yes). So I don't see how the inference from "no interpretation" (whatever that means!) to "ungrammatical" could be an a priori assumption.

    3. On point 1, we (or, more precisely, Chomsky in the early-mid 1950s) started with a fairly traditional idea of what 'grammar' was, & pushed things along from there on the basis of what seemed to make sense. But not necessarily in ways that can't be questioned to some extent. For example, we traditionally think of Island Constraints as part of 'grammar' aka 'competence', and the constraints on center embedding as something else, aka 'performance', but, (drawing on a convo I had with Howard Lasnik a rather long time ago), to the extent that the Island Constraints are universal, this division can be questioned.

      Because if the Island Constraints aren't learned, then they can be regarded as part of the computational infrastructure that grammars run on, similarly to the evidently limited and somewhat flaky pushdown store needed to run recursion, and both seem to have some points in common:
      a) they are plausibly explained in terms of processing limitations
      b) arguably, both can be somewhat overcome by practice, Swedes getting practice in ignoring Island Constraints, the authors of Ancient Greek literature and Early Modern German laws getting practice in violating the center embedding constraints.

      But nothing terrible is going to happen if the borders between grammar/competence and performance get pushed around in various ways, as long as people are trying to produce sensible accounts of the phenomena, rather than relegating them to wastebaskets or boxes in the garage to be ignored.

      As for 2, I take 'no interpretation' as an intuitive rather than a theoretical concept. For example, if a student wrote 'what did John disprove Mary' in a term paper (changing the verb to suit the context), you probably wouldn't know how to correct it, due to being unable to find any interpretation. Whereas papers by good non-native English-speaking students tend to have a certain number of easy to fix 'grammatical' errors, but nothing that is uninterpretable (at least not more than the native English speaking students produce). An interesting case wrt having/lacking interpretations is nonsense & surrealistic poetry, which I think can be characterized as having syntactic structures that control semantic interpretation in a normal way, but where the lexical items either lack meanings, or they don't fit together properly ('inflammable white ideas', very likely a steal from Chomsky by a Nobelista poet (Elytis, Greek)). So the notion of what an 'interpretation' is could clearly use some refinement.

      So, I think theory can elaborate the boundaries and shift them around, but I don't see it as necessary to get started. That might be part of the problem here ... presumably anybody can cook up something over the weekend that can be told to an introductory class, and you can in fact go a very long way with what can be cooked up in this way, to the extent that hardly anybody can be bothered to spend the time to work things out more carefully and extensively. Since we all have to be productive enough ... even retired people, in order to retain their institutional affiliations.

    4. I think I can push this a bit farther, and at least partially salvage 'grammaticality judgement'. On the basis that, when you concoct a sentence and submit it to your native speaker friends for a 'grammaticality judgement', you've hopefully done your best to make sure that it isn't unacceptable for 'nongrammatical' reasons, such as being contradictory, surrealistic, jabberwocky language, obscene or otherwise taboo or socially objectionable, containing stylistic clashes that you don't have enough literary skills in the language to get away with, etc. etc.

      So what are the grammatical reasons? My semi-flippant proposal is that they are the things that you can't easily explain to your intelligent relatives who haven't taken a syntax course. Everything on the list of 'non-grammatical' reasons is highly accessible to 'common sense' by anybody clever enough to get a BA, regardless of whether they have actually done this or not, whereas the 'grammatical' reasons why something might be bad are not.

    5. Thanks for your very interesting replies, Avery. As for point 2 of your first response, this is precisely the problem: notions like "(un)acceptable" or "no interpretation" are intuitive, and presumably reflect some underlying causes, but I don't see how this is any real reason (rather than a bet we're placing) to assume any meaningful connections to/correlations with technical notions like (un)grammaticality. Perhaps something interesting can be said in this context about your "easily correctible" errors like inflectional mistakes as opposed to unintelligible (?) sentences like "what did John disprove Mary", but even for the latter I see no justification to draw any conclusions about grammaticality.

      This relates to the second point you're raising, which outlines a view that (I suspect) many people would subscribe to: control for the plausibly 'nongrammatical' factors and you're left with the (more or less) direct expression of grammaticality. But that doesn't change the fact that the objects people 'judge' (sentences/strings/stimuli) are not equivalent to objects that could be (un)grammatical.

      And we cannot know in advance what the 'nongrammatical' factors are; perhaps your "semi-flippant proposal" is entirely correct, but it's certainly not obvious. To recycle my example from above: the ban against vacuous operators that we presumably see at work in cases like "what did John disprove Mary" (or the identity condition governing ellipsis, or Fox's Scope Economy condition, or...) are surely not "highly accessible to common sense", but if Chomsky is right and these are constraints imposed by external systems then they're not grammatical factors at all. (I'm not saying that Chomsky *is* right, I'm just using his claims as an example of how what we consider 'grammatical factors' is crucially dependent on the general architecture of grammar and interacting systems we assume.)

    6. I share some of Avery's thinking on this.

      There are examples routinely marked as * (e.g., *he donated the library some books, "*wang" as opposed to "winged", "*clinged" as opposed to "clung"), but they seem different from those such as "*car the" (for English speakers), and presumably universally, "*whom did you see John and __", "*Is the man who tall is happy?".

      For me, the former has a great deal to do with learning lexical idiosyncrasies: somehow we/children know they are idiosyncratic and that variation is to a great extent arbitrary. This is presumably the reaction of the Icelandic shopkeeper at Avery's grammatical gender gaffe. Speakers often say that although they themselves may not produce those strings, they can see other people doing it. But the latter concerns hard UG principles or language-specific properties of functional categories that are presumably established very early and uniformly across speakers. They seem to elicit a "harder" sense of violation: No one could conceivably talk that way.

    7. Trying to generalize/bloviate on about Charles' & Dennis' comments, if either Chomsky (or Everett!!) is right, language-as-we-encounter-it on the street is a sort of cognitive contraption made out of bits and pieces with various origins (some of which may be truly specific to language in some interesting way, but I don't think that this has been demonstrated adequately).

      My intuitive notion of "ungrammatical = bad for reasons that you can't easily explain to your relatives" is I think OK to get started, but proper work ought to deliver

      1. a more refined classification of reasons why a form+interpretation pair can be wrong (which might be based on core grammar vs periphery, or something completely different).
      2. an account of how people choose the interpretation they impose on an 'intelligible but unacceptable sentence'
      3. an account of what 'intelligibility' really is, given that the linguistic world includes jabberwocky and surrealistic utterances (and other strange things).

      I suggest that 3 is the easiest ... for young children and L2 learners, many utterances contain unknown words, or previously encountered words used in ways currently novel to the learner, so there need to be facilities for dealing with them (recall also that in many societies, multilingualism is very common, for a variety of reasons, and they're not all learned in childhood). I think that 2 is really more psychology than linguistics, and that linguists can get by with the ideas that people try to (a) minimize the number of errors in the structure and (b) maximize the plausibility of the interpretation.

      Leaving 1 as the job for us.
