Before Chomsky, researchers saw their job almost exclusively as the collection of data. All languages were seen to be composed of a set of meaningful sentences, each composed of a set of words, each of which was in turn composed of phonemes and morphemes. Each language also had a grammar which determined the ways in which words could be correctly combined to form sentences, and how the sentences were to be understood and pronounced. It was held that the best way to understand the over 2,500 languages said to exist was to collect and sort data about them so that eventually the patterns characterising the grammar of each language would emerge, and that then, interesting differences among different languages, and even groups of languages, might also emerge.
Chomsky’s revolutionary argument, begun in Syntactic Structures (1957) and subsequently developed in Aspects of the Theory of Syntax (1965) and Knowledge of Language (1986), was that all human beings are born with an innate grammar: a fixed set of mental rules that enables children to create and utter sentences they have never heard before. Chomsky asserted that language learning was a uniquely human capacity, a result of Homo sapiens’s possession of what Chomsky referred to as a Language Acquisition Device. Chomsky eventually claimed that language consists of a set of abstract principles that characterise the core grammars of all natural languages, and that the task of learning one’s L1 is thus simplified since one has an innate mechanism that constrains possible grammar formation. Children do not have to learn those features of the particular language to which they are exposed that are universal, because they know them already. The job of the linguist was to describe this generative, or universal, grammar as rigorously as possible.
It is important to emphasise that Chomsky’s theory has gone through various quite radical changes since 1957 (see Cook and Newson, 1996: 41). Perhaps the most stable part of Chomsky’s theory is expressed in his “Principles and Parameters” model, outlined in the early 1980s, and this will be discussed below.
The arguments for Universal Grammar (UG) start with the poverty of the stimulus argument, often referred to as the “logical problem of language learning”: children learning their first language cannot induce rules of grammar from the input they receive, so the knowledge of language which they manifest cannot be explained by appealing to the language they are exposed to. On the basis of degenerate input, children produce language which is far more complex and rule-based than could be expected, and which is very similar to that of adult native speakers of the same language variety, at an age when they have difficulty grasping abstract concepts. That their production is rule-based, and not mere imitation as the behaviourist view held, is shown by the fact that they frequently invent well-formed utterances of their own. That they have an innate capacity to discern well-formed utterances is shown by a number of different studies, for example the often-cited study (White, 1989) of L1 English learners’ use of “wanna”, where input does not explain how children know when the use of “wanna” is correct or not.
Chomsky’s model of language distinguished between competence and performance, that is, between the description of underlying knowledge and the use of language, influenced as the latter is by limits in the availability of computational resources, stress, tiredness, alcohol, etc. Chomsky is concerned with “the rules that specify the well-formed strings of minimal syntactically functioning units” and with “an ideal speaker-listener, in a completely homogeneous speech-community, who knows his language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of the language in actual performance” (Chomsky, 1965: 3).
As to the innateness of the language faculty, Chomsky points to the fact that language acquisition has nothing to do with intelligence, and that, despite the enormous complexities of this abstract knowledge, the vast majority of children successfully reach full linguistic competence by the age of five. Language, it is claimed, is separate from other aspects of cognition, and, according to Chomsky, is looked after by a special module of the mind. Thanks to this faculty of mind, language develops in more or less the same natural way as teeth or internal organs or limbs do.
Chomsky’s radical new approach to linguistics marked the beginning of an important shift not only in linguistics but also in SLA research. The shift was away from behaviourist assumptions and structural linguistics, away from the emphasis on the pedagogical implications of research, and towards an explanation of the phenomena of SLA as a research project in itself. Under the influence of Chomsky, a number of academics decided to deliberately ignore the implications for teaching in favour of developing a more rigorous theory that could deal more adequately with the phenomena of SLA.
In a series of steps (Chomsky 1980, 1981a, 1981b, 1986, 1987) Chomsky developed his Principles and Parameters Model, which, until fairly recently, was seen as the mature expression of his theory of Universal Grammar. The theory attempts to explain what linguistic knowledge consists of, and how it is acquired.
When a child experiences linguistic input, the parameter values of the universal grammar are set and this allows the child to understand and produce the specific language corresponding to the particular parameter settings. “Principles” are the universal, invariant design features of all human languages, while “parameters” constrain the limited possibilities for variation allowed. A parameter can have two or more values, and particular languages make different choices among the values allowed.
Chomsky’s “principles and parameters” model can be seen as an answer to the limitations of phrase structure grammars, which assume that sentences consist of phrases that have certain structures. Cook (1989) gives this lucid summary of structure-dependency. “To take an English example sentence “Max played the drums with Charlie Parker”, principles of phrase structure require every phrase in it to have a head of a syntactic category and permit it to have complements of various types; A Verb Phrase such as “played the drums” must have a head that is a verb, “play”, and may have a complement “the drums”, a Prepositional Phrase such as “with Charlie Parker” must have a head that is a preposition, “with”, and a complement “Charlie Parker”; Noun Phrases such as “Max”, “the drums”, and “Charlie Parker” must have noun heads and may, but in this case do not, have complements. This is not true only of English; the phrases of all languages consist of heads and possible complements – Japanese, Catalan, Gboudi and so on. The difference between the phrase structures of different languages lies in the order in which head and complement occur within the phrase; in English the head verb comes before the complement, the head preposition comes before its complement, while Japanese is the opposite. This variation in languages is captured by the head parameter, which has two settings “head first” and “head last” according to whether the head comes before or after the complement in the phrases of the language.”
Complementary to these phrase structure principles is the Projection Principle which claims that syntax and the lexicon are closely tied together. As well as knowledge of where the complement goes in the phrase, we need to know whether a complement is actually allowed, and this depends upon the lexical item that is used; hence the Projection Principle states that the English verb “play” must be specified as taking a complement (i.e. it is normally transitive); the lexical entry for the verb “faint” must specify it has no complement (i.e. it is intransitive), while that for the verb “give” must specify that it has two complements (i.e. direct and indirect objects). The question of whether the phrase structure of a sentence is grammatical is a matter not just of whether it conforms to the overall possible structures in the language but also whether it conforms to the particular structures associated with the lexical items in it; “Max played the drums” is grammatical because the verb occurs in the correct head-first position, compared to “Max the drums played” and because the verb “play” has an Object Noun Phrase following it, compared to “Max played”. (Cook, 1989: 169-170)
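The two ideas in Cook's summary can be sketched in a few lines of Python. This is only a toy illustration, not a piece of linguistic theory: the `LEXICON` table, the function names, and the choice to treat "play" as strictly requiring one complement (the text says only "normally transitive") are all assumptions made for the example.

```python
# Hypothetical toy lexicon: how many complements each verb's lexical entry
# requires, following the examples in the passage (faint: 0, play: 1, give: 2).
LEXICON = {"faint": 0, "play": 1, "give": 2}


def linearise(head: str, complements: list[str], head_first: bool) -> list[str]:
    """Order a head and its complements according to the head parameter."""
    if head_first:
        return [head, *complements]
    return [*complements, head]


def satisfies_projection(verb: str, complements: list[str]) -> bool:
    """Check that the verb appears with the number of complements
    that its (toy) lexical entry specifies."""
    return LEXICON[verb] == len(complements)


# Head-first setting (English-like) versus head-last setting (Japanese-like).
english_like = linearise("played", ["the drums"], head_first=True)
japanese_like = linearise("played", ["the drums"], head_first=False)

# "Max played the drums" satisfies the entry for "play"; "Max played" does not.
ok = satisfies_projection("play", ["the drums"])
missing_object = satisfies_projection("play", [])
```

Here a single boolean stands in for the head parameter, while the lexicon lookup stands in for the Projection Principle: a phrase is acceptable only if both the ordering and the lexical requirements are met.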
Other elements of UG include:
• Subjacency, which constrains the movement of categories. See Cook and Newson, 1996: 258-261.
• Case Theory, which constrains S-structures. See Cook and Newson, 1996: 222-227.
• C-command and Government Theory, which constrain a number of the subsystems, such as Case Theory. See Cook and Newson, 1996: 234-239.
• Binding Theory, which constrains the formation of NPs. See Cook and Newson, 1996: 250-256.
To sum up: “UG consists of a highly structured and restrictive system of principles with certain open parameters to be fixed by experience. As these parameters are fixed, a grammar is determined, what we may call a ‘core grammar’” (Chomsky, 1980: 67).
The principles are universal properties of syntax which constrain learners’ grammars, while parameters account for cross-linguistic syntactic variation, and parameter setting leads to the construction of a core grammar where all relevant UG principles are instantiated.
How is knowledge of the core grammar of language acquired? We have already noted that Chomsky takes a “nativist” approach: he sees language as an innate faculty of mind, a natural human endowment. This native endowment has been called by Chomsky the “Language Acquisition Device” (LAD), which can be seen as a “black box”: children receive a certain amount of language input from their environment, which is processed in some way by the LAD so that they end up with their linguistic competence. As Cook (1993) points out, the UG theory “fleshes out” the LAD “by establishing the crucial features of the input, the contents of the black box, and the properties of the resultant grammar” (Cook, 1993: 200). The consequence, as already indicated, is that the “what” and “how” questions merge and the process of language acquisition is simply one of selecting, rather than learning. We know the principles of grammar innately, and parameter settings are triggered by input. We should remember that the final steady state does not comprise only the knowledge of UG. Pragmatic competence, knowledge of peripheral grammar and of lexis are also involved, but lie outside the domain of Chomsky’s theories.
Given that the principles are already in place in the mind, learning focuses on setting parameters and acquiring vocabulary. Input is important because it acts as a trigger, but it does not in itself account for acquisition. The acquisition of grammatical competence is organised through principles and parameters. The learning of vocabulary is organised in the lexicon and guided by the Projection Principle. The properties of a language’s lexical items (stored in the lexicon) are projected onto the syntax. Language grows.
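The idea that input triggers a choice among pre-given options, rather than supplying rules to be induced, can be illustrated with a deliberately simple Python sketch. Everything here is an assumption made for illustration: the function name, the encoding of each heard phrase as a pair of roles, and above all the majority-vote rule, which is not something UG theory itself specifies.

```python
from collections import Counter
from typing import Optional


def set_head_parameter(phrases: list[tuple[str, str]]) -> Optional[str]:
    """Pick a value for the head parameter from observed phrases.

    Each phrase is encoded as the roles of its two elements in the order
    heard, e.g. ("head", "complement") for English "played the drums".
    The two possible values are fixed in advance; input only selects one.
    """
    votes = Counter()
    for first, second in phrases:
        if (first, second) == ("head", "complement"):
            votes["head-first"] += 1
        elif (first, second) == ("complement", "head"):
            votes["head-last"] += 1
    # No usable evidence, or evidence evenly split: the parameter stays unset.
    if not votes or votes["head-first"] == votes["head-last"]:
        return None
    return votes.most_common(1)[0][0]


# Toy input resembling what a learner of a head-first language might hear.
heard = [("head", "complement"), ("head", "complement"), ("complement", "head")]
setting = set_head_parameter(heard)
```

The point of the sketch is the shape of the process: the learner never constructs a new rule, but only chooses between "head-first" and "head-last", which is what the text means by acquisition as selection.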
Researchers in the field of UG seek to determine whether the various logical possibilities are found across languages. Greenberg, for example, discovered the universal that all languages with verb-subject-object word order have prepositions (McLaughlin, 1987: 83). This was an absolute universal because there were no exceptions. It led to the generalisation that VSO languages have prepositions, while non-VSO languages (SVO or SOV) can occur with or without prepositions. (Some non-VSO languages may have postpositions.) Another absolute universal is that all languages without exception have vowels. That all languages have nasal consonants is only a tendency, because there are a few that have no nasal consonants at all. Most languages do not have clicks, but there are a few that do.
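An implicational universal of the kind attributed to Greenberg ("if VSO, then prepositions") can be checked mechanically against a language sample. The sketch below uses a tiny hand-picked sample of five well-known languages purely for illustration; the `SAMPLE` table and the `violations` function are assumptions of this example, and five languages are obviously no test of a universal.

```python
# Tiny illustrative sample: language -> (basic word order, adposition type).
SAMPLE = {
    "Welsh": ("VSO", "prepositions"),
    "Irish": ("VSO", "prepositions"),
    "English": ("SVO", "prepositions"),
    "Japanese": ("SOV", "postpositions"),
    "Turkish": ("SOV", "postpositions"),
}


def violations(sample: dict[str, tuple[str, str]]) -> list[str]:
    """Return the languages that break 'VSO implies prepositions'.

    Non-VSO languages can never violate the implication, whichever
    adposition type they have.
    """
    return [
        name
        for name, (order, adpositions) in sample.items()
        if order == "VSO" and adpositions != "prepositions"
    ]


counterexamples = violations(SAMPLE)

# A made-up language, added only to show what a counterexample would look like.
with_fake = {**SAMPLE, "Hypothetical-X": ("VSO", "postpositions")}
fake_counterexamples = violations(with_fake)
```

An absolute universal corresponds to an empty list of violations across all known languages, whereas a tendency (such as having nasal consonants) would tolerate a small number of exceptions.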
The Chomskyan view of language differentiated core and peripheral grammar. The core grammar is that which grows in the mind, dependent upon the application of principles and setting of parameters. UG determines the core, but outside of that lies a vast range of language, for example, idiomatic expressions. Other features may be derived from other languages and other processes of language development (fashion, historical development, trends, accidents, inventions, etc.). The core grammar accounts for the relatively prototypical, unmarked elements of the language, while the periphery contains the marked elements.
This leads to a view of language acquisition that has the child acquiring the core grammar, the basic tool-kit of the language, through the principles and by setting parameter choices. As a result, unmarked forms are acquired through UG, while peripheral marked forms may be learned through experience. “Hence we may expect to find a continuum of markedness from core to periphery” (Cook, 1985: 6). Thus emerges an explanation for the acquisition of marked/unmarked forms. UG prescribes some obligatory structures, parameters allow limited choices, and the theory of markedness accounts for the acquisition of peripheral elements. This means that lexis becomes the key learning burden: “A large part of language learning is a matter of determining, from presented data, the elements of the lexicon and their properties” (Chomsky, 1982: 8, cited in Cook, 1985: 7).
In the last twenty years, Chomsky has made a number of serious modifications to his theory of UG. The Minimalist Program (Chomsky, 1995) and a view of semantics, part epistemological, part ontological, which he calls “Internalism” (Chomsky, 2000), are the most obvious examples. Those interested in following these developments are directed towards Cook and Newson (1996), the works of Vivian Cook already mentioned here, and Botha (1991). Despite Chomsky’s tendency to “move the goalposts”, the “Principles and Parameters” model of Universal Grammar quickly sketched above still best represents Chomsky’s contribution to attempts to explain language acquisition. It remains true to say that UG involves the following claims:
• A theory of linguistics is concerned with describing and explaining an individual’s knowledge of certain core principles of language, not with his or her or a community’s use of language.
• The main way to test such a theory is through the intuitions of native speakers about whether or not sentences in their language are well formed.
• The language faculty responsible for our linguistic knowledge is innate and autonomous, i.e. it is an independent cognitive module that interacts with, but does not derive from, other cognitive faculties. UG theory rests on a modular view of cognition which sees the mind not as a uniform system, but rather as containing a central processing system and a set of autonomous systems or modules that function largely independently of one another.
See Botha (1991) for an engaging account of Chomsky’s main critics. The views of three of Chomsky’s best-known critics, who between them cover the main charges made against him, will be briefly outlined here.
Piaget suggested in the 1920s (see Piaget, 1960) that a child goes through four qualitatively different stages in the process of his cognitive development. Until the age of two, the child is sorting out space, objects and causality. From around two to five years old his thought processes begin to use mental images arising from imitation or words. Language skills and reasoning from memory also develop in this second stage. From the ages of five to ten, again approximately speaking, the child can classify hierarchical structures and understand ordinal relations and the conservation of continuous properties like weight, quantity and volume. From around ten to fourteen years old the real world is seen as one of several possible worlds, logical thinking improves, and the child realises that appearances can be deceiving.
How does the child manage all this? Piaget says that the child’s knowledge develops by his interaction with his environment and by the use of two strategies: assimilation, where the child fits his new experiences into the established patterns of thought, and accommodation, where existing patterns are changed to account for novel aspects of reality. The tension between these two strategies for dealing with new information is resolved by what Piaget calls equilibration, which balances out the competing forces.
In Piaget’s opinion, there is no modularity of mind, no innate language faculty or any other specialised mechanism at work: the child creates his own concepts through interaction with the environment. In Piaget’s view, language is just one part of the knowledge the child acquires as he goes through his stages of development, constructing his understanding of the world for himself on the basis of dynamic interplay with the world around him.
Chomsky’s reply to Piaget was made publicly at the famous 1975 conference at Royaumont, where Piaget, Chomsky, Fodor, and others gathered to discuss the limitations of the genetic contribution to culture. Criticising Piaget’s four stages of development, Chomsky suggested that if children must pass through Piaget’s first stage of development before their language development takes place, then we would expect paraplegics to have a distorted path of language development, which, in fact, is not the case. (It is worth noting here that supporters of the UG theory often cite cases of abnormal children, those with serious cognitive and/or psychological problems, and “language savants” such as Williams syndrome children, as evidence for the independent, innate nature of the language faculty.) Chomsky went on to use the favourite UG argument against Piaget, the “logical” problem: how could Piaget (or anyone else for that matter) explain the poverty of the stimulus? No generalised learning strategies can, said Chomsky, ever meet this objection.
Sampson says that Chomsky sees a new-born child as being “just like a very learned man who is asleep; the knowledge is in there, it just needs stirring up a bit before it is available for use” (Sampson, 1997: 8). Against such a view, Sampson argues, with Locke, that experience gives us knowledge, and we do not have any particular ideas or knowledge “built in.”
In addressing the question of what explains language acquisition, if not Chomsky’s innate language learning device, Sampson argues that the essential feature of languages is their hierarchical structure. Children, like our ancestors, start with relatively crude systems of verbal communication, and gradually extend syntactic structures in a pragmatic way so as to allow them to express more ideas in a more sophisticated way. The way they build up the syntax is piecemeal; they concentrate on assembling a particular part of the system from individual components, and then put together the subassemblies. This gives them low level structures which are then combined, with modifications on the basis of input, into higher level structures, and so on.
Sampson’s argument has two main strands: gradual evolutionary processes have a strong tendency to produce tree structures, and (following Popper) knowledge develops in a conjectures-and-refutations evolutionary way. Sampson claims that these two strands are enough to explain language acquisition.
Another well-known critic of Chomsky, Elizabeth Bates, challenges the modular theory of mind and, more specifically, criticises the nativists’ use of the accounts of “language savants” and those with cognitive or language impairments to support their theory.
As for the poverty of the stimulus argument, Bates says “Linguists of a nativist orientation tend to recite this argument like a mantra, but we must remember that it is a conjecture not a proof.” (Bates, 2000: 6) Bates, who sees language as consisting of a network, or set of networks, says that neural network simulations of learning are still in their infancy, and that it is still not clear how much of human language learning they are able to capture, but she cites some research that challenges the poverty of the stimulus argument, and says that the neural network systems already constructed are able to generalise beyond the data and recover from error. “The point is, simply,” says Bates, “that the case for the unlearnability of language has not been settled one way or the other.” (Bates, 2000: 6)
An important criticism raised by many, and taken up by Bates, against Chomsky’s theory is that it is difficult to test. Bates argues that the introduction of parameters and parameter settings “serve to insulate UG from a rigorous empirical test.”
How does UG relate to SLA?
There are four main hypotheses:
• 1. There is no such thing as UG.
• 2. UG exists, but second language learners only have indirect access to it via the L1.
• 3. UG exists, but L2 learners only have partial access to it.
• 4. Second language learners have full access to UG.
As for hypothesis 1, those who deny the existence of UG (like Piaget, Sampson, and Bates) see no need to postulate a language module, and no need to look for linguistic universals either. O’Grady (1996) takes this approach, but a better example of such an approach is the Competition Model, first proposed by Bates and MacWhinney in 1983.
The best-known hypothesis regarding the second position, that UG exists, but that second language learners only have indirect access to it, is Bley-Vroman’s Fundamental Difference Hypothesis (Bley-Vroman, 1989a, 1989b). Bley-Vroman argues that the mind is modular, and that there exists a language faculty (UG) which is essential for the development of L1, but that UG is not directly at work in SLA. According to Bley-Vroman, adult second language learners do not have direct access to UG; what they know of universals is constructed through their L1, and they then have to use general problem-solving abilities, such as those that operate in non-modular learning tasks: hypothesis testing, inductive and deductive reasoning, analogy, etc. The Bley-Vroman approach provides an explanation for the “poverty of the stimulus”, or “logical”, problem of SLA: the complex L2 knowledge or interlanguage grammar which second language learners develop is (partly) a result of UG’s influence on L1.
The third hypothesis, partial access, claims that L2 learners have access to principles but not to the full range of parameters. Schachter (1988) and Clahsen and Muysken (1989) have argued this case. It differs from the “indirect access” position in that it predicts that no evidence of “wild grammars” will be found, and that L2 learners will not reset the values of parameters of the L2 when these differ from the L1 settings.
Finally, the full access hypothesis claims that UG is an important causal factor in SLA, although not, of course, the only one. Those adopting the full access view (e.g., Flynn, 1987) claim more than that the L1 UG affects the second language learning process. They claim that principles not applicable to the second language learner’s L1, but needed for the L2, will constrain the L2 learner’s interlanguage. For example, the principle of Subjacency, which constrains the kind of wh-movement permitted, is irrelevant to languages that lack wh-movement. While those adopting the partial access approach would claim that a Korean native speaker learning English would not be affected by the Subjacency Principle, since it is irrelevant to Korean, those taking a full access stance would expect the Subjacency principle to constrain the Korean learner’s interlanguage grammar. In regard to parameter re-setting, the full access position, contrary to the partial access position, suggests that while the learner may pass through a stage where the L1 setting is applied to the L2, he will eventually attain the L2 setting, assuming a sufficient amount of relevant input.
Attempts to use theories of UG to explain SLA have met with various criticisms.
First, the empirical evidence for the various positions that argue for some role for UG in the SLA process is mixed. Here are a few examples. A study by Ritchie (1978, cited in Ellis, 2008) of Japanese students of English gave “preliminary support to the assumption that linguistic universals are intact in the adult.” White (1989) reports on a study of Japanese learners of English who, despite having no knowledge of question formation involving complex subjects, successfully acquired this knowledge in English. White argues that the learners must have had access to the principle of structure-dependency. Flynn (1996, cited in Mitchell and Myles, 2004) reviewed research on Japanese learners of English, and claimed that it supported the view that UG constrains L2 acquisition. Mitchell and Myles (2004) also cite work by Thomas (1991), and by White, Travis, and Maclachlan (1992), in support of the full access to UG hypothesis.
On the other hand, a study by Bley-Vroman, Felix, and Ioup (1988) of Korean learners of English concluded that the results made it “extremely difficult to maintain the hypothesis that Universal Grammar is accessible to adult learners” (Bley-Vroman, Felix, and Ioup, 1988, cited in Ellis, 2008). A study by Meisel in 1997 of the acquisition of negation in French and German by L1 and L2 learners (cited in Mitchell and Myles, 1998) concludes that the UG principle of structure-dependency is not available to L2 learners. Schachter’s (1989) test on Subjacency gave much more doubtful results than White’s, which she says constitute a “serious challenge” to the claim that UG is available to adult learners.
In general, then, it seems that there is conflicting evidence for all positions, although Cook and Newson claim that there is “a great deal of evidence” that knowledge of some aspect of language has been acquired in an L2 “that is not learnable from input, that was not part of the learners’ L1 and that is unlikely to have been taught by language teachers” (Cook and Newson, 1996: 293).
Let us now deal with the doubts about empirical evidence. The problem here is that L2 learners do not begin at the same stage as do very young children in L1, nor is there any general homogeneity in their “end state” as there is in L1 acquisition. Ellis (2008) discusses various problems with grammaticality judgements that stem from the different beginning and end states in L2 learning. The first problem is how to ensure that the subjects have the requisite level of L2 proficiency to demonstrate whether or not a particular principle is operating in their interlanguage grammar: learners might violate a principle not because of non-availability of UG, but because the structure in question is beyond their present capacity. The second problem dealt with by Ellis is how to rule out the effects of the L1. If subjects act in accordance with UG, this might be because they have access to it, or because they are drawing on their L1. Thus it is necessary to use subjects whose L1 does not manifest the principle under investigation. White (1989) also accepts this problem and points out that since not all UG principles operate in all languages, the problem can be solved. A third problem is that of literacy. Birdsong (1989, cited in Ellis, 2008: 441) says that grammaticality judgement tests are not appropriate for learners with poor L2 literacy, and that differences in the metalinguistic skills of literate learners will affect responses.
Second, methodology. Cook (1993) states the problem of the methodology of UG-based SLA research thus: “What can count as data for knowledge of a second language? L1 acquisition starts from the single sentences accepted by native speakers on the deliberate assumption that there is a native speaker standard. Whatever the merits of an idealisation to a normalised native speaker, this is less convincing in an L2, since there is no clear norm of what a successful L2 learner should look like, other than the monolingual; L2 learners vary extremely in the level of language they attain while L1 children do not. L2 research can therefore be based with difficulty on the same kind of single sentence evidence used with L1 research…. Most researchers therefore resort to grammaticality judgements as their main source of data – a source of evidence that has to be treated with extreme caution as it is unclear how directly it taps the individual’s knowledge of language” (Cook, 1993: 482).
Thirdly, we have the problem of the empirical adequacy of UG when applied to SLA: to what degree is the UG theory falsifiable? Gass and Selinker (1994) raise a number of objections to the work of those taking a UG approach to SLA, all concerning falsification. They argue that while UG theory is well-defined, and thus able to make more accurate predictions, because of the changing nature of the linguistic constructs on which it is based, UG-based research is difficult to falsify. Upon being confronted with data apparently contradicting the predictions of UG access, it is always possible to argue that the underlying linguistic formulation was not the correct one. Furthermore, they argue, apart from “moving the goalposts”, UG researchers have not always heeded falsifying evidence. They suggest that when predictions are not borne out, there are three options: assume a no-access to UG position, say methodological problems are to blame, or assume the theory is false. Gass and Selinker state that in their opinion the third position seems the most likely, but surprisingly, give no reasons for this opinion.
Larsen-Freeman and Long (1991), in addition to misgivings about falsifiability, question three assumptions of Chomsky’s explanation of language acquisition. The first is that learning occurs quickly and is mostly complete by age five. “In fact, a good deal of complex syntax is not mastered until much later. English dative movement, for example, is not fully learned until about age sixteen” (Larsen-Freeman and Long, 1991: 236). Other examples of “late” acquisition include some WH questions and yes/no questions.
The second questionable assumption is that certain syntactic principles are unlearnable, and therefore innate. This, say the authors, is increasingly being challenged. “General cognitive notions and strategies, such as conservative hypothesis-formation, developmental sequences based on cumulative complexity, and avoidance of discontinuity are being used to re-examine such UG icons as structure-dependence, PD phenomena, Subjacency and binding principles” (Larsen-Freeman and Long, 1991: 236).
The third assumption made by Chomsky and challenged by Larsen-Freeman and Long is that the input available to learners is inadequate and thus implies innate linguistic knowledge. As with the second assumption, a different explanation can be offered by a general learning theory.
The principal problem seems to be that UG is essentially an attempt to describe a core grammar: it is not really a theory of learning at all. According to Chomsky we do not “learn” our I-Language in the usual sense of the word: we are born with linguistic competence, and all we need is some positive evidence to trigger particular parameters so that the particular version of UG corresponding to our L1 becomes instantiated in the mind. Thus the process of acquisition is not interesting; the main task is to describe the components of the core grammar. In SLA, on the other hand, we are interested in explaining the language learning process. We are also interested in a variety of phenomena such as variability, fossilisation, and individual differences, all of which are deliberately ruled out of a theory of UG because they have nothing to do with the acquisition of L1 linguistic competence.
In summary then, the limited domain of Chomsky’s theory means that there are many aspects of L1 acquisition that fall outside it: Chomsky has nothing to say about pragmatics and discourse; linguistically he concentrates heavily on syntax, and even there only on core grammar; and the acquisition of language-specific tense and case morphology, for example, is not included. Wolfe-Quintero comments: “UG may account for the successful acquisition of core grammar, but there is much more to language learning than that” (Wolfe-Quintero, 1996: 343). In the case of SLA, the limitations of a UG approach are even greater. Even assuming that UG exists, that UG theories of L1 acquisition are true, and that L2 learners have at least some access to UG, most of the questions that concern SLA researchers remain unanswered by Chomskyan theory; indeed they are not even addressed.
References can be found in the Suggested Reading and References section, at the end, under SLA stuff.