I’m afraid I don’t know who “Patrick” is, because his avatar gives no information about him, but whoever he is, he regularly leaves comments on Scott Thornbury’s blog. Most recently, Patrick has given his opinion on the CEFR and on research into interlanguages. On the former, in comments on P is for Predictions, he says he likes the CEFR and that its growing influence throughout the world is “a good thing”. Dismissing criticism of the CEFR, Patrick says:
Fulcher’s recently cited criticism doesn’t stand up to analysis – it does however demonstrate that either he hasn’t read or understood or used the CEFR, or that he’s more interested in distorting its message in the interests of marketing his new theory: he doesn’t even get the quotations right. Sadly this kind of shoddy scholarship seems to be widespread in TESOL academia.
Re the way it was devised, yes there have been critics .. , but if the choice is between the judgement of experienced teachers or the pseudo-science of some particular rating method then I’ll go with the former any day (there are so many rating methods to choose from anyway).
The following week, commenting on Scott’s blog I is for Idiolects Patrick says:
The way I see it Scott is that ‘interlanguage’ is one of the uglier of many unnecessary neologisms invented by academics, presumably to give them a sense that they are forging a profession: there are plenty of plain English alternatives.
Yes you can invoke it as a reason for accounting for the mismatch between teaching and learning/acquiring. But there are a thousand and one other potential reasons for the mismatch that have nothing to do with ‘interlanguage’.
But the main reason that I dislike it so much is that it is so tightly bound up with the notion that languages are learnt via mechanistic pre-programmed stages. The much-touted initial research on this was a travesty, both in terms of its design and its interpretation. And although these initial errors were highlighted and debated over an extended period of time, and refuted by subsequent research, the fact that they still persist in some quarters indicates that they are ideologically driven.
From memory, Rod Ellis gives a pretty even-handed account of this fiasco, and probably Norbert Schmitt too, but many of the academics writing in that particular field seem happy just to perpetuate unanalysed myth.
Far be it from me to criticise anybody for expressing their views in a forthright way, but I’m afraid Patrick is talking so much baloney that his/her comments need a reply. It wouldn’t be polite for me to put my reply on Scott’s blog, hence this post.
Patrick has managed to get things completely back to front. It’s the CEFR which treats language learning as a mechanistic process where learners move along a series of stages in linear progression, and it’s the various hypotheses concerned with interlanguage development which insist that L2 learning, far from being linear, is in fact a process where lots of things are going on at the same time, and which exhibits plateaus, movement away from, not toward, the L2, and all sorts of U-shaped and zigzag trajectories. Let’s take a look.
The CEFR Framework
There are six levels in the CEF framework, from A1 to C2, each associated with a set of descriptors. At each level, a list of can-do statements describes what the learners can do, one example being what they can do when it comes to transactions to obtain goods and services.
So here we have the progression from ‘can’t-do-much’ to ‘can-do-it-all’ as described by a scale that is statistically determined, hierarchically structured, and linear. The assumed linearity of such scales is contradicted by SLA research findings, including those on interlanguage development, which show that learners do not actually acquire language in this way. As Fulcher says:
The pedagogic notion of “climbing the CEFR ladder” is therefore naïve in the extreme (Westhoff 2007: 678), and so attempts to produce benchmark samples showing typical performance at different levels inevitably fall prey to the critique that the system merely states analytic truths (Lantolf and Frawley 1985: 339), which are both circular and reductive (Fulcher 2008: 170-171).
Note here that I’m leaning on the work of Glenn Fulcher, who I’m sure would be upset to learn that he’s been accused of “shoddy scholarship” and of deliberately distorting the CEFR “message” in order to promote his own work.
Fulcher points out that the CEFR scales were made without any principled analysis of language use, and without reference to any explanation of how people learn an L2. The selection of descriptors by North was, as he admits (North, 1995), based solely on a theory of measurement, and it relied entirely on intuitive teacher judgments rather than any samples of learner performance. The CEFR scales are therefore “essentially a-theoretical” (Fulcher 2003: 112), a critique which North and Schneider (1998: 242-243) accept.
The CEFR scales don’t relate to any specific communicative context, or provide any comprehensive description of communicative language ability. If we return to the description of transactions to obtain goods and services, we note that ‘goods’ and ‘services’ are grouped together, and thus no distinction is drawn between, for example, buying fish and chips and buying a new car, which are, of course, qualitatively different communicative transactions. McCarthy & Carter (1994: 63) demonstrate the problem with these two examples:
Customer: I’m interested in looking at a piece of cod, please.
Server: Yes madam, would you like to come and sit down.
Customer: A Ford Escort 1.6L please, blue.
Server: Right, £10,760, please.
The CEFR can thus be seen as an unstructured, incomprehensive list of things that language users might want to get done in a range of contexts. Any, or none, of these might be relevant to a particular testing situation, and any, or none, of them might be linked to any particular task types.
I’ll touch on one other bit of the framework: “transactions”, where the descriptors that are used at each scale level again rely on ‘can-do’ statements to define the levels. Davidson & Fulcher (2007) discuss a number of problems that arise from trying to use these descriptors, and I’ve chosen just two as illustrations:
- The descriptors mix participant roles within a single level. At A2, for example, the learner can ‘ask for and provide’ goods and services, implying that they would be able to function as a shopkeeper or travel agent, as well as a procurer of goods and services.
- The distinction between levels is not at all clear, often referring to a vague notion of ‘complexity’ of the transaction. For example, at level B1 learners can deal with ‘most situations’ as well as ‘less routine’ situations. But there is no indication as to what kinds of ‘less routine’ situations a learner might not be able to deal with, and no definition of ‘less’, ‘more’ and ‘most’. A2 is characterized by ‘common’, ‘everyday’, ‘simple’, and ‘straightforward’ transactions, but the reader is left to infer what these presumably ‘more routine’ transactions might be.
I won’t attempt any proper critique of the CEFR here, but I hope that this evidence at least indicates that its weaknesses can’t be brushed aside as breezily as Patrick suggests. And we should also note that most leading scholars of language assessment view the reification of the CEFR as both theoretically unjustified and damaging. To quote Fulcher (2008: 170) again:
It is a short step for policy makers, from “the standard required for level X” to “level X is the standard required for…”, a step, which has already been taken by immigration departments in a number of European countries.
Few, except Patrick, perhaps, would deny that the original CEFR framework has undergone an unfortunate process of reification, so that the CEFR scales have now assumed the role of constants which are used in the exercise of power. Of course, this isn’t what Trim or North wanted, but we can’t, or at least we shouldn’t, deny that it’s happened.
As a final observation on Patrick’s remarks, Fulcher, far from misinterpreting “the CEFR message”, has more than once suggested that when the CEFR is seen as a heuristic model used at the practitioner’s discretion (as Trim and North intended), it can become a useful tool in test construction. But the context of language use is critical, since it’s the context that limits the inferences drawn from test scores, and restricts the range of decisions to which the score might be relevant.
Fulcher makes the basic distinction between ‘measurement-driven’ and ‘performance data-driven’ approaches to assessment. Measurement-driven approaches, like the CEFR, derive meaning from a scaling methodology which orders descriptors onto a single scale and relies on the opinion of judges as to the place of any descriptor on the scale. In contrast, performance data-driven approaches are based on observations of language performance, and on generating descriptors that bear a direct relationship with the original observations of language use. Meaning is derived from the link between performance and description. As Fulcher says:
“We argue that measurement-driven approaches generate impoverished descriptions of communication, while performance data-driven approaches have the potential to provide richer descriptions that offer sounder inferences from score meaning to performance in specified domains”.
Patrick’s comments on interlanguage research and its relevance to language teaching need less comment.
- He’s welcome to his harmless opinion that ‘interlanguage’ is an ugly unnecessary neologism invented by academics to bolster their confidence, although I wonder what “plenty of plain English alternatives” he has in mind.
- Likewise, to suggest that there are a thousand and one other potential reasons for the mismatch between teaching and learning that have nothing to do with ‘interlanguage’ is to say nothing of interest, since nobody would suggest otherwise.
As to disliking the term so much because it implies that “languages are learnt via mechanistic pre-programmed stages”, the research on interlanguages implies no such thing. In a post on the subject, I quote Doughty and Long (2003):
There is strong evidence for various kinds of developmental sequences and stages in interlanguage development, such as the well known four-stage sequence for ESL negation (Pica, 1983; Schumann, 1979), the six-stage sequence for English relative clauses (Doughty, 1991; Eckman, Bell, & Nelson, 1988; Gass, 1982), and sequences in many other grammatical domains in a variety of L2s (Johnston, 1985, 1997). The sequences are impervious to instruction, in the sense that it is impossible to alter stage order or to make learners skip stages altogether (e.g., R. Ellis, 1989; Lightbown, 1983). Acquisition sequences do not reflect instructional sequences, and teachability is constrained by learnability (Pienemann, 1984).
Interlanguage research is on-going and under constant critical review (see Han & Tarone, 2016). As the above quote indicates, it is not accurate to say that Rod Ellis regards all the research as a “fiasco”, and the suggestion that “many of the academics writing in that particular field seem happy just to perpetuate unanalysed myth” is unworthy of any response.
Doughty, C. and Long, M.H. (2003) Optimal Psycholinguistic Environments for Distance Foreign Language Learning. Downloadable here: http://llt.msu.edu/vol7num3/doughty/default.html
Fulcher, G. See here for all the works cited: http://languagetesting.info/zoom/search.php?zoom_sort=0&zoom_query=Fulcher&zoom_per_page=10&zoom_and=0
Han, Z.-H. & Tarone, E. (eds) (2016) Interlanguage: Forty Years Later. Amsterdam: Benjamins.
Tarone, E. (2006) Interlanguage. Downloadable here: http://socling.genlingnw.ru/files/ya/interlanguage%20Tarone.PDF