* Krashen 3: Scientific Method


Krashen has always insisted that his hypotheses are constructed in line with scientific method. In his comment on my post criticising his theory, he mentions the Higgs boson and Newton’s hypothesis about the existence of gravity, and says that his hypotheses are so easy to test that “one counterexample is enough to destroy them”. So before I look at criticisms made of Krashen’s theory of SLA, I’d like to look at some background issues concerning scientific method and theory assessment. The text is largely taken from my book Theory Construction in SLA.

We should resist attempts to formalise the scientific method. Reality cannot be fully apprehended; empirical data are often very hard to interpret; any individual instance of falsification can be challenged (and thousands of such instances are in fact ignored); there is no algorithm for hypothesis testing, and there is no hard and fast demarcation line between science and non-science. Science is conducted by people who can be, and indeed have been, prejudiced, corrupt, misguided, dishonest, ambitious, etc., and science takes place in history and within a community, so the development of a theory is neither steady nor linear. Nevertheless, while knowledge of the world is gained in all sorts of ways, I suggest that the most reliable knowledge comes from engaging in scientific research which leads to the development of theories which attempt to explain phenomena. These theories are developed with various rules of logic and language to guide the process and are scrutinised so as to discover flaws in terminology or reasoning, and to build the clearest, simplest version of the theory. Such theories should then lay themselves open to empirical tests.

Can researchers in SLA construct a scientific theory? Many natural scientists and philosophers of science, Popper, Lakatos and Feyerabend among them, deny scientific status to the areas involved in SLA research (psychology, cognitive psychology, sociology, anthropology, social psychology, linguistics, applied linguistics), and there are also numerous academics working in the field of SLA who think that the so-called scientific method is inappropriate for their work. If science is defined as the study of natural phenomena, then obviously SLA is not part of science, and neither for that matter is mathematics. There is an obvious difference between the natural sciences and those which study human behaviour, but science in general can be characterised by its insistence on the twin criteria of rational argument and empirical testing.


Scientists articulate problems, make observations, perform experiments, propose hypotheses, build theories and test them, all the while communicating their results to colleagues. The content of the messages that accumulate and that are available in the public domain (rather than the personal knowledge of individual scientists, their memories and thoughts) is what we can call scientific knowledge. Popper (1972) refers to it as “World 3”, and distinguishes it from subjective knowledge by saying that it is “knowledge without a knowing subject”. The objectivity of scientific knowledge stems from its being a social construct, not owing its origin to any particular individual but created communally. Einstein’s relativity theory established him as a great scientist, but the final product, the established theory and the body of evidence it has accumulated, belongs to humanity.

Scientific knowledge is gained primarily through observation, and the crucial principle here is that all observers are equivalent: anyone observing the event would agree on the report one person made of it. This fundamental principle of objective knowledge needs a lot of interpretation and qualification, but the criterion that all human beings are interchangeable as observers remains one of the most important pillars of science, which is why experiments are such an important tool for scientists. Experiments are observations carried out under controlled, reproducible conditions, and one of their chief functions is to allow others to carry out similar experiments in different places at different times. Mistakes and misunderstandings are cleared up, and promising hypotheses are tested through replication studies and theoretical criticism. The coherent and consistent set of beliefs generated by all this activity is the paradigm, the generally accepted theory in any given field which allows scientists to pursue their work in a systematic way. Of course, the paradigm is not necessarily close to any absolute truth; paradigms often contain fallacies and need unexpected discoveries or massive falsification of predictions to dislodge them.

A basic characteristic of science is the way its theories are scrutinised. Testing hypotheses is not a mechanical process whose outcome can be determined by simple logic. There are always questions about the reliability of the data, and of how the data should be interpreted, and in the end it is the expert judgement of the community that must decide if there is a good enough fit between the theory and the data. And different standards will be applied to different kinds of theories at different moments in their development. At the beginning of the development of a new theory, when there is little common ground, and where there are few accepted findings, a relatively unsubstantiated theory might be encouraged, despite flimsy empirical support or a lack of rigorous conceptualisation, because, for example, it is seen as a useful guide to future research. Sometimes, scientists may even choose to work with two contradictory models of the same system. Ziman (1978:67) points out that in the theory of atomic nuclei in the 1950s there were two theories, the “liquid drop model” and the “shell model”, which contradicted each other in terms of the behaviour of protons and neutrons. Both models earned Nobel prizes for their authors, and both are now part of a more complex but unified theory which deals with all the phenomena involved. It was therefore a wise decision for scientists working in this area in the 1950s to look for evidence that showed how the two theories could be reconciled, rather than assume that one of the two must be false.


By what criteria do we judge theories? What makes one theory “better” than a rival? First, like any text, a theory needs to be coherent and cohesive, and expressed in the clearest possible terms. It should also be consistent – there should be no internal contradictions. Theories can be compared by these initial criteria, which may help to expose fatal weaknesses or simply invite a better formulation. In the discussions among philosophers of science about the natural sciences, the big questions concern empirical adequacy, predictive ability, and so on. But in the field of SLA, there is a great deal of muddled thinking, there are poorly argued assertions, and badly defined terms. Consequently, discussions among researchers and academics in SLA often deal with conceptual issues. Similarly, research methodology is less of a problem in the natural sciences than it is in SLA, partly because in the former experiments are often easier to control, variables are easier to operationalise, etc. Whatever the reasons, it is certainly the case that when judging theories of SLA, we should favour those that are most rigorously formulated.

Once a theory passes the test of coherence, cohesiveness, consistency, and clarity, we may pass on to questions of falsifiability and empirical adequacy. Here, I need to give a very quick summary of Karl Popper’s approach to scientific method. Popper (1959; 1963) insists that in scientific investigation we start with problems, not with empirical observations, and that we then leap to a solution of the problem we have identified – in any way we like. This second anarchic stage is crucial to an understanding of Popper’s epistemology: when we are at the stage of coming up with explanations, with theories or hypotheses, then, in a very real sense, anything goes. Inspiration can come from lowering yourself into a bath of water, being hit on the head by an apple, or imbibing narcotics. It is at the next stage of the theory-building process that empirical observation comes in, and, according to Popper, its role is not to provide data that confirm the theory, but rather to find data that test it. Empirical observations should be carried out in attempts to falsify the theory: we should search high and low for a non-white swan, for an example of the sun rising in the West, etc. The implication is that, at this crucial stage in theory construction, the theory has to be formulated in such a way as to allow for empirical tests to be carried out: there must be, at least in principle, some empirical observation that could clash with the explanations and predictions that the theory offers. If the theory survives repeated attempts to falsify it, then we can hold on to it tentatively, but we will never know for certain that it is true. The bolder the theory (i.e. the more it exposes itself to testing, the more wide-ranging its consequences, the riskier it is) the better.
If the theory does not stand up to the tests, if it is falsified, then we need to re-define the problem, come up with an improved solution, a better theory, and then test it again to see if it stands up to empirical tests more successfully. These successive cycles are an indication of the growth of knowledge.

Popper (1972) gives the following diagram to explain his view:

P1 -> TT -> EE -> P2

P = problem; TT = tentative theory; EE = error elimination (empirical experiments to test the theory)

It can also be represented as a cycle diagram in which GP0, GP1 and GP2 correspond to P1, P2 and P3 (figure not available).


We begin with a problem (P1), which we should articulate as well as possible. We then propose a tentative theory (TT), that tries to explain the problem. We can arrive at this theory in any way we choose, but we must formulate it in such a way that it leaves itself open to empirical tests. The empirical tests and experiments (EE) that we devise for the theory have the aim of trying to falsify it. These experiments usually generate further problems (P2) because they contradict other experimental findings, or they clash with the theory’s predictions, or they cause us to widen our questions. The new problems give rise to a new tentative theory, the need for more empirical testing, and round we go again. Popper thus gives empirical experiments and observation a completely different role: their job now is to test a theory, not to prove it, and since this is a deductive approach it escapes the problem of induction. Popper takes advantage of the asymmetry between verification and falsification: while no number of empirical observations can ever prove a theory is true, just one such observation can prove that it is false. All you need is to find one black swan and the theory “All swans are white” is disproved. Falsifiability, said Popper, is the hallmark of a scientific theory, and allows us to make a demarcation line between science and non-science: if a theory does not make predictions that can be falsified, it is not scientific. According to such a demarcation, astronomy is scientific and astrology is not, since although there are millions of examples of true predictions made by astrologers, astrologers do not allow that false predictions constitute a challenge to their theory.
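For readers who find code easier to follow than prose, here is a minimal Python sketch of the asymmetry between verification and falsification described above. The names `TentativeTheory`, `error_elimination` and `assess` are my own illustrative inventions, not Popper's, and reducing an observation to a single colour word is a deliberate oversimplification.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class TentativeTheory:
    """TT: a universal claim, stated so that an observation can clash with it."""
    claim: str
    is_compatible: Callable[[str], bool]  # True if the observed colour fits the claim


def error_elimination(theory: TentativeTheory, observations: Iterable[str]) -> Optional[str]:
    """EE: search for a counterexample; return it, or None if none was found."""
    for colour in observations:
        if not theory.is_compatible(colour):
            return colour
    return None


def assess(theory: TentativeTheory, observations: Iterable[str]) -> str:
    """There is deliberately no 'verified' outcome: a theory is either
    falsified or retained tentatively."""
    counterexample = error_elimination(theory, observations)
    if counterexample is None:
        return "retained tentatively"
    return f"falsified by a {counterexample} swan"


# Hypothetical theory and data, used only for illustration.
all_white = TentativeTheory("All swans are white", lambda colour: colour == "white")

# Ten thousand confirming instances do not prove the claim...
sightings = ["white"] * 10_000
print(assess(all_white, sightings))

# ...but a single clashing observation refutes it and poses a new problem (P2).
print(assess(all_white, sightings + ["black"]))
```

The point of the design is that `assess` has no branch that returns "proved": however long the list of white swans, the best a theory can earn is tentative retention. Real cases are rarely this clean, of course, since any individual instance of falsification can itself be challenged.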

So that’s the bit about questions of falsifiability and empirical adequacy. Theories should lay themselves open to empirical testing: there must be a way that a theory can in principle be challenged by empirical observations, and ad hoc hypotheses that attempt to rescue a theory from “unwanted” findings are to be frowned on. The more a theory lays itself open to tests, the more risky it is, and the stronger it is. Risky theories tend to be the ones that make the most daring and surprising predictions, which is perhaps the most valued criterion of them all, and they are often also the ones that solve persistent problems in their domain. Generally speaking, the wider the scope of a theory, the better it is, although in practice many broad theories have little empirical content. There are often “depth versus breadth” issues, and yet again, how these two factors are weighted will depend on other factors in the particular situation where the theory finds itself. Simplicity, often referred to as Occam’s Razor, is another criterion for judging rival theories: ceteris paribus, the one with the simplest formula, and the smallest number of basic types of entity postulated, is to be preferred for reasons of economy.

Now we’re ready to examine criticisms which have been made of Krashen’s theory.

Note: When constructing a theory, researchers distinguish between phenomena and data, they use theoretical constructs, and they attempt to give causal explanations. I’ve dealt with these issues in the pages “Science and SLA” and “Theoretical Constructs”. You can also find a summary of the criteria by which I think theories should be assessed in the page called “General Rational Requirements for a Theory of SLA”.

Popper, K. R. (1972) Objective Knowledge. Oxford: Oxford University Press.

Popper, K. R. (1963) Conjectures and Refutations. London: Hutchinson.

Popper, K. R. (1959) The Logic of Scientific Discovery. London: Hutchinson.

Ziman, J. (1978) Reliable Knowledge. Cambridge: Cambridge University Press.
