Empiricism and Its Exciting Alternatives


There’s a growing opinion among academics and pundits in the ELT industry that exposure to language in the environment is enough to explain how we learn languages. This is a rebuttal of what we might call the cognitive paradigm established in the 60s by Chomsky, according to which our knowledge of language can’t be explained that way. Following Chomsky’s criticisms of Skinner, there’s been a generally accepted view among experts in the field for the last 50 years or so that language learning is not one more example of learning by reinforced behaviour, but rather a special case of learning which draws on a unique property of the mind to interpret linguistic information.

Serious questions of epistemology are at issue here. They revolve around revisiting questions of whether we can speak with any sense about the mind, or if, as the empiricists insist, we can only talk “sensibly” (geddit) about measurable things presented to our senses. The quest for reliable knowledge led the extreme empiricists, the Logical Positivists, in the 20s and early 30s to insist that all talk of the mind had to be purged, and that only a handful of carefully-vetted sentences could be used if the chaos and confusion of normal discourse were to be overcome. They arrived at an inevitable dead end, climbed up Wittgenstein’s ladder, and fell into well-earned oblivion. The work of people like Tarski allowed the few scientists who might have been disconcerted by the positivists’ doubts to settle down, adopt a sensible, sorry, I mean common sense, “correspondence” view of truth, and continue their work, which rested on the view that there’s an objective world out there which we can dispassionately observe and, basing ourselves on empirical (i.e. factual, non-judgemental) data, theorise about the way it works, using rules of logic to guide us.


But those studying human behaviour have, quite rightly of course, had a hard job gaining admittance to the science club. If they wanted to be scientific they’d have to base themselves on empirical observation, wouldn’t they? Hence, Skinner and behaviourism, who and which confused empiricism as a philosophical movement with empirical research. They decided that human behaviour is best studied by observing what people observably do. How do they learn, then? They learn by doing things, by reacting to their environment. They form habits based on repeating the same behaviour in response to their environment. They have bigger brains than other creatures so their learning is more sophisticated. Reasoning is no more than a sophisticated reaction to a stimulus in the environment.

Chomsky questioned Skinner’s general learning theory and you all know how. But Chomsky’s view makes use of a raft of non-observable theoretical constructs which we allow in order for him to develop his theory. Pace Larsen-Freeman, they’re not metaphors, any more than gravity is a metaphor; they’re things we invent in order to explain phenomena which we’re trying to explain. Post Chomsky, the most widely accepted view of language learning is that it’s a process that goes on in the mind, itself a theoretical construct, and that it involves the processing of data. How we process the data is the stuff of lots of different theories which try to explain different bits of the process; none of the theories is complete (none offers a complete explanation of the phenomena under investigation in SLA) and none is firmly established. But most of the theories rely on Chomsky’s theory that we learn our first language thanks to an innate capacity of the mind to make sense of the linguistic data we get from the environment.


So here comes Emergentism, which returns to empiricism and its epistemological roots. It takes many forms; it’s been proposed by various academics like O’Grady, MacWhinney and N. Ellis, with varying results. Gregg has done his usual elegant job of pointing out the weaknesses of N. Ellis’ well-considered arguments (see my post on Emergentism) and MacWhinney seems to be making little progress. O’Grady, on the other hand, looks better every time I read his work, which I’ve only recently started to do, having got the tip-off from Kevin Gregg. I urge you, as Kevin urged me, to read O’Grady (2005) How Children Learn Language. It’s like listening to Glenn Gould play Bach; it’s crystal clear, high definition brilliance, one of the best books I’ve read in years. While it’s not based on an empiricist epistemology (far from it!), it totally rejects Chomsky’s UG and argues that a general learning device explains language learning. Actually, you need to read more of his stuff than just the book, but anyway ...


And then there’s Stefano Rastelli’s (2014) Discontinuity in Second Language Acquisition: The Switch between Statistical and Grammatical Learning. Mike Long put me on to this, and it’s superb. Long has written a review of Rastelli’s book which I hope will appear soon. In the review Long notes that “recent years have seen growing research interest in the potential of statistical learning and usage-based accounts of SLA by adults”. What Long finds so interesting is that Rastelli has dedicated a full book to his version of statistical learning, not just an article in a journal or a chapter in an edited collection.

Rastelli’s theory is that statistical learning is the initial way learners handle combinatorial grammar, i.e., regular co-occurrence relationships between forms that are overt in the input (not absent, like pro-drop, for example) and the meanings and functions of those forms. Combinatorial grammar comprises recurrent combinations of adjacent and non-adjacent whole words and morphemes. The form-function pairs can be stored and retrieved first as wholes, and then broken down into their component parts in order to be computed by abstract rules.

And get this: combinatorial grammar is learned twice, first by statistical learning and then by grammatical learning. This is the meaning of ‘discontinuity’ in his hypothesis. Statistical learning prepares the ground for grammar learning:

Statistics provides the L2 grammar the ‘environment’ to grow and develop. (Rastelli, 2014, p.42).

I hope Thornbury reads this; he might just find in Rastelli some long-missed support for his assertions about learning grammar for free, although that’s not exactly what Rastelli is saying.

So Rastelli rejects the notion that L2 development is continuous, a series of developmental stages as a result of increased exposure to L2 input:

  The core idea of discontinuity is that the process of adult acquisition of L2 grammar is not uniform and incremental but differentiated and redundant. To learn a second language, adults apply two different procedures to the same linguistic materials: redundancy means that the same language items may happen to be learned twice (2014: 5).

I’d like to say more, but I don’t want to steal Mike’s thunder, if I haven’t already done so. I hope that’s enough to whet your appetite.


Arguments about SLA are based partly on epistemological underpinnings that need to be declared and understood. Those like Hoey who say that we acquire all the knowledge we have about words on the basis of frequency of exposure, and Larsen-Freeman who says complexity theory explains it all, are, whether they appreciate it or not, adopting an empiricist epistemology. Consequently, their theories are doomed to failure unless they create crafty loopholes. But it is, it seems, possible to argue from a different, but still rational, cognitive perspective, that the poverty of the stimulus argument is wrong, if you know what you’re doing. Good scholars like O’Grady and Rastelli do it, and it’s very exciting.

8 thoughts on “Empiricism and Its Exciting Alternatives”

  1. Dear Geoff,

    I am glad to see your blog back online.

    Thank you for this summary and the recommended reading.

    Are you suggesting that Chomsky’s constructs are true, and that Hoey is erroneously holding on to an empiricist epistemology? I am not sure if Hoey ever said that all of what we know about a word comes from exposure. The term “all” seems to me too big. I have not heard or seen Hoey give any explanation on how humans experience and create meaning, where the interface of molecular biology and consciousness can be found, etc. I think you offer a short-hand explanation for this when you write that “most of the theories rely on Chomsky’s theory that we learn our first language thanks to an innate capacity of the mind to make sense of the linguistic data we get from the environment,” although I am not quite sure if mind and sense making are called to aid to explain the structural qualities of language or the awareness / semantic part. Nevertheless, these two ideas carry the entire load here– “the mind” and “making sense.” I think you point out that both terms refer to unknowns, or better, they are created. Unlike plugging in dummy values in a known formula, here we suggest dummy variables using known values (words or language input) to explain the results (language acquisition). I think it is this appeal to the ghost in the bottle that is problematic. I wonder, and you have often insisted on soundness in reasoning, if we have made much progress by coming up with a term, mind in this case, to provide backing for a claim, language learning. I am not sure if we are begging the question here.

    What makes Hoey interesting for me as a teacher is that, if he is right, even only partially, I have a way to gauge my teaching choices. Chomsky does not give me the same spill-over from linguistics to the applied side other than hoping to provide interesting encounters with the language. Can I look for students’ mind to help me teach (especially if it is only a theoretical construct)? Can the idea of interlanguage give me any practical help? I guess that’s why Hoey leads to Dellar (give students prefabricated chunks, rely on Corpus Linguistics for selections,…), and Chomsky to Krashen (you cannot teach grammar, grammar teaching is not effective, all you need is attractive input, etc.).



    1. Hi Thom,

      Thanks for your comments. A few points.

      1. I don’t think Chomsky’s constructs are true.

      2. You’re right: Hoey doesn’t say he adopts an empiricist epistemology and while his theory seems to imply it, maybe he doesn’t.

      3. You lose me a bit in your comments about the mind and making sense.

      4. I wonder how assuming that Hoey is right (assuming that subconscious priming explains language learning) helps your teaching practice. Dellar gives no justification for his assumption that getting learners to memorise lexical chunks leads to communicative competence. And, BTW, he says almost nothing about how the results of studies of corpora inform his selection of the chunks.


      1. Hi again,

        1. You object to the term, right? You would agree that they are the most adequate description at the moment.

        3. I am trying to get around this statement “…we learn our first language thanks to an innate capacity of the mind to make sense of the linguistic data…” This I find a most difficult idea. Did Chomsky get to this by way of inference, poverty of stimulus versus complexity–ergo there must be something we address with the word “mind” that makes it all possible? Or is it taken as a given before we deal with the problem, as I think you wrote in your post, “…they’re things we invent in order to explain phenomena which we’re trying to explain”.

        (I used the algebra to illustrate. Sometimes we have the formula and we plug in values to test the formula. This is the mind a priori scenario. On the other hand, we have the values at hand but lack the formula. So we look for ways to understand the underpinning relationships that can describe what we observe. I think the issue of gravity follows that route. The experience is everywhere at hand, but it took a while to express it mathematically.)

        When I read that section I felt like saying “wait, what do you mean by this.” It’s a major fault line in the entire model. Too much could be said here.

        4.1 Hoey got to priming as the responsible quality of the mind / brain that would partially explain language acquisition bottom up. Looking at text he noticed peculiar manners of how words behave. I think we can follow him on that. He started inducing principles and ended up with statements about the mind / brain. I’d suggest that the teacher can do a U-turn on that trail assuming priming, then the principles, and finally the concrete language items.

        4.2 In a way, I am also starting with the mind, or certain assumptions about the mind: it is a patterning organism, it likes to see order, chaos is discarded, noise is cancelled out, etc. Patterns are constituted by way of priming, or associative learning. (Words that occur together frequently become associated, e.g. verbs become primed to the point where they suggest nouns rather forcefully, etc.)

        4.3 The consequences for my own teaching are seen in the fact that I pack words in chunks, if at all possible. The priming process is still blank, students will not yet recognize or experience any propensity to produce “time” when they hear “take your…”, or “money” when they hear “making lots of…” As Hoey would point out, with every encounter these associations are strengthened, tuned, or weakened.

        4.4 Priming as default mode:
        When I teach, I keep in mind that whatever is being said or seen leaves a mark with students. Learning is not switched on with the teacher turning to the board, the book or the flash cards. Students pick up my greeting, my commenting on the room temperature, etc.

        4.5 I worry about what and how students record, review and recall language.

        4.6 I de-emphasize rule learning. The rules have no active ingredient that allows fluent production of speech.

        4.7 I use concordances to study language samples, to find useful chunks, or phrases. I do not prepare “my own” text anymore, or, I am more cautious.

        4.8 I read frequency lists and shift from rare and obscure to common words. (When I was in college (ESL student) I was asked to improve my vocabulary. We had to study words like parsimonious, procrastination, or sycophant, etc. I try not to do that with my students.)

        I apologize for the long reply. But I think I need to say some things to show that I do not just like the idea of priming, but that I actually spend brain and practice on it.

        I will not defend Dellar here, I think, having met him a couple times, I understand what he does; I rather defend my own views.



      2. Hi Thomas,

        1. A construct can’t, by definition, be true or false. Its purpose is to help articulate a theory or hypothesis.

        2. Chomsky used the construct of the LAD, a module of the mind, to explain how we have the knowledge that comprises linguistic competence. We know things about how language works that, he claims, can’t be explained by the linguistic information we get from the environment. IF there are things we know about language that we didn’t learn from the environment, then we must have learned them some other way, and Chomsky’s inference is the LAD. Empiricists say that there’s no need to invent a putative LAD because all the knowledge we have of language can be explained by statistical and usage models. The competition model is one example. (The work of N. Ellis and others is another. I personally don’t think Larsen-Freeman’s work can be taken as a contender.) So that’s the basic bone of contention. And, in the opinion of rationalists, the argument has to be resolved by appeal to empirical evidence which tests the two opposing theories. To date, Chomsky wins, in my opinion.

        3. Hoey is a respected expert in examining corpora and noticing patterns in texts. Building on lots of previous work, he makes a good case for the importance of collocation as an organising principle for English. After that, it all gets murky. Hoey carefully lists all the things we know about a word, but, IMO, priming is too general a construct to explain how we know all that. The general idea of statistical learning is that learners detect collocations in the linguistic environment, and use the resulting information to bootstrap their knowledge of the language, but how? And what about the results of the thousands of studies that show we learn things about language that go beyond collocation? What’s interesting about Rastelli’s (2014) theory is that he explains how. Whether he’s right or not is another question.

        And I don’t know how teachers “can do a U-turn on that trail assuming priming, then the principles, and finally the concrete language items.”

        4. You make some interesting teaching suggestions. Personally, I disagree with your reason for de-emphasizing rule learning. IMO, knowledge of “the rules” helps the fluent production of speech, but I agree (if this is what you’re saying) that explicit grammar teaching is over-emphasised in ELT.


      3. Just a quick clarification. When I refer to “collocation”, I include colligation and the other bits of Hoey’s very clear analysis.


      4. (I somehow can’t reply directly to your reply March 29, 2016 at 5:47 pm; that’s what I will refer to here)

        Hi again,

        2. (In a sense I can embrace the final point in Chomsky’s reasoning–there must be something that powers language learning. Intuitively this idea is nourished not just to make sense of language. The same seems to apply, to me, for music, or painting, or architecture, etc. Granted, language is the most intriguing of human “gifts.”)
        The problem I have with the LAD is that it should have a biological counterpart. After all, when we say mind we also mean brain. I remember an earlier dialogue where you commented that for Chomsky language grows just like an arm, which I found hard to believe. The ghost in the bottle needs to be pinned down, I think.

        3. I think with priming we have a possible candidate that can explain biologically HOW it happens. In its most primitive expression, “neurons that fire together, wire together”. I think Hoey cannot explain how words and the derived constructs (collocation, colligation, etc.) make sense– he just assumes it. That is, the words in “take – your – time” will have meaning prior to his analysis. I am willing to capitulate before that how. That is, how can we bridge the gap between cellular biology and conscious experience? I don’t have the faintest idea.

        I am not familiar with Rastelli’s (2014). But will try to get a hold of the book.

        The U turn: Hoey started looking at the artifact, and ended up with priming as the end point. As a teacher, I can assume that priming is an adequate description of language learning, elaborate a methodology that is informed by the principles Hoey lists (he actually presents these as hypotheses on p.13 in Lexical Priming) and end up with the bits of language that go into my teacher talk, texts, handouts, activities, anything in the classroom.

        4. Grammar teaching, abused, the addictive substance for language teachers–In my opinion, rules have no generative psychological force. What I mean by that is that in the moment of language production, the brain follows cellular networks that came about as a result of prior stimulation (the priming). There are no paths that would correspond to grammatical rules, at least as we find them in the ELT grammar canon. There are no neurons set apart for the aspect, or tense, etc. (Having said this, I do not mean to imply that pure empiricism can explain everything. After all, the body as receiving device of sense experience is extremely structured. The brain is indeed a-maze-ing. The slate has never been blank.)

        If what I am saying is wrong, I get the impression that even admitting the adequacy of the UG paradigm, grammar teaching / structural syllabi are not justified–as grammar evolves on a predetermined track automatically, hence, I understand, the idea of interlanguage. Interlanguage is more or less insensitive to grammar teaching.

        It has been one of my frustrations in ELT, the often failed attempt to wean teachers off the grammar syllabus: “What do I teach after I am done with the conditionals,” or “They haven’t done the past yet; how can you ask me to do present perfect,” or disguised PPP in a party costume.




  2. ‘They have bigger brains than other creatures so their learning is more sophisticated.’

    Here, surely, is the crux. Skinner, it seems, grants that our learning is more sophisticated than that of other animals. Chomsky wishes to investigate the nature of the sophistication. Skinner not. Different research interests do not entail different views. The whole ‘controversy’ may be nothing more than an academic turf war.


  3. Hi Patrick,

    I don’t think Chomsky’s argument with Skinner is best seen as nothing more than an academic turf war. It was, surely, an important argument about how we learn languages, and since the explanations offered by them were so different, at least one of them had to be wrong.

