TOEFL Score Interpretation D.W. McKeon, March 2008
Developed in the mid-1960s, the TOEFL has become the major standard for measuring proficiency in English as a second language throughout the world. Because of their widespread use in determining admission to colleges and universities, TOEFL scores have taken on a sort of magical aura. One rather unfortunate consequence is the emergence of innumerable TOEFL preparation courses (not to mention TOEFL preparation centers) throughout the world: the tail is indeed wagging the dog. Even more unfortunate is the temptation to view one’s TOEFL scores as an indicator of general intelligence or as a predictor of academic success, despite the TOEFL Board’s insistence to the contrary.

The original paper-based test (PBT) has been modified over the years to keep pace with contemporary emphases in ESL teaching methodology as well as with language testing research. The PBT has three subsections: Listening, Structure (Grammar), and Reading. A total score in the range of 310-677 is determined from the scaled scores of the subtests (31-67/68). The PBT continues to be administered around the world and now includes a mandatory 30-minute essay (Test of Written English, TWE) with a score range of 0 to 6. (Note: prior to 2005, the TWE was an optional subtest.)

In 1997, the TOEFL Board introduced a computer-based test, the CBT (total score range: 0-300), with the intention of gradually replacing the PBT. Apart from the supposed ease of electronic test-taking, the CBT made a few changes to the PBT, including: (1) linear testing (from easy to difficult) and computer-adaptive testing (tailored to the proficiency level of the individual test-taker); and (2) a compulsory essay, whose score was factored into the Structure/Writing section and was reported separately in an Essay subtest, equivalent to the TWE. The Listening, Structure/Writing, and Reading subtests each had a score range of 0-30.

However, with the rapid spread of the internet worldwide, the CBT has now been phased out (after September 2006) and replaced by the internet-based TOEFL (iBT). The iBT (total score range: 0-120) represents a substantial revision of its predecessors in two major respects: (1) it includes a Speaking subtest (expressions of opinion on particular topics and responses to questions based on reading and listening tasks); and (2) each subtest (Reading, Writing, Listening, and Speaking) involves integrated tasks, thus making the iBT a more realistic measure of a test-taker’s use of English. Interestingly, the preparers of the iBT have returned to the more traditional approach to item selection, with each test-taker answering the same set of questions. (See the official TOEFL website for more complete descriptions, testing information, etc. concerning the two existing versions of the TOEFL, the PBT and the iBT.)

II. Total Scores and Subscores

Score reports on all three versions of the TOEFL include a total score as well as subscores measuring individual language skills. In both the PBT and CBT, the total score is an average of the three scaled subscores, multiplied by 10. The iBT total score is an aggregate of the four subscores.
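
The arithmetic described above can be sketched in a few lines of Python. This is only an illustration of the two rules as stated here, not an official ETS scoring routine; the function names are hypothetical, and rounding to the nearest whole number is an assumption.

```python
def pbt_or_cbt_total(listening, structure, reading):
    """PBT/CBT rule as described above: the average of the three
    scaled subscores, multiplied by 10.
    Rounding to the nearest integer is an assumption of this sketch."""
    return round((listening + structure + reading) * 10 / 3)


def ibt_total(reading, listening, speaking, writing):
    """iBT rule as described above: the sum of the four subscores."""
    return reading + listening + speaking + writing


print(pbt_or_cbt_total(58, 58, 58))  # 580
print(pbt_or_cbt_total(31, 31, 31))  # 310, the bottom of the PBT range
print(ibt_total(30, 30, 30, 30))     # 120, the top of the iBT range
```

Note that very different subscore combinations can produce the same total, since only the sum of the subscores enters either calculation.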

As mentioned above, the iBT tests all the language skills in more integrated tasks than do the PBT and CBT versions. For example, instead of a separate Structure (Grammar) subtest, the test-taker’s command of grammar is evaluated in the Writing and Speaking subtests. Similarly, there are different kinds of tasks for each of the four iBT subtests. The material in the Listening section includes lectures and conversations; the Speaking section includes expressions of an opinion as well as responses to both reading and listening materials; the Reading section includes several passages from academic texts; and the Writing section involves both an integrated writing task, based on reading and listening materials, and an independent task of supporting an opinion on a topic. (See www.ets.org/Media/Tests/TOEFL/pdf/TOEFL_at_a_Glance.pdf)

III. Interpreting TOEFL Scores

In evaluating a candidate’s performance on the TOEFL (whether PBT, CBT, or iBT), it is important for admission committees to look not only at the total score but also at the subscores and the TWE/Writing score (listed separately in the PBT and CBT). To illustrate the importance, let’s say that two students achieve the same total score on the paper-based test (580), but their subscore profiles are quite different, thus reflecting differences in language skills (assuming high test validity):

Thus, whereas Student A performed rather poorly on the Listening test, he/she did very well on the Reading test, thereby pulling up the overall score. Student A might be expected to process written material well, but have considerable difficulty with listening tasks (including comprehending lectures, not to mention following and participating in class discussions!). On the other hand, Student B performed well on the Listening test, satisfactorily on the Reading test, but only marginally on the Grammar portion. In contrast to Student A, then, Student B would be expected to handle listening tasks without major difficulty but might struggle with writing assignments (grammatically, at least). In both cases, the Writing (TWE) scores must be taken into consideration in order to corroborate any tentative conclusion regarding a student’s writing ability. Certainly, any TWE score below 4.0 should raise major concern. To conclude, a total TOEFL score, by itself, does not indicate which of a candidate’s language skills may be stronger or weaker than the others.
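
To make the point concrete, here is a minimal Python sketch of two PBT profiles that share a total of 580. The subscores are hypothetical values chosen only to match the qualitative descriptions of Student A and Student B; they are not the figures from the original comparison. The helper names are likewise illustrative, and the total follows the PBT rule stated in Section II, with rounding to the nearest whole number assumed.

```python
# Hypothetical PBT subscores (illustrative only, not the original figures).
profiles = {
    "Student A": {"Listening": 50, "Structure": 58, "Reading": 66},
    "Student B": {"Listening": 62, "Structure": 52, "Reading": 60},
}


def total_score(subscores):
    """Average of the three scaled subscores, multiplied by 10 (rounded)."""
    return round(sum(subscores.values()) * 10 / 3)


def weakest_skill(subscores):
    """Name of the subtest with the lowest scaled score."""
    return min(subscores, key=subscores.get)


for name, subscores in profiles.items():
    print(name, total_score(subscores), weakest_skill(subscores))
# Student A 580 Listening
# Student B 580 Structure
```

Both totals are identical, yet the weakest skill differs, which is exactly the information a total score alone conceals.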

Below are two comparison charts showing the Graduate School’s required cutoff (total) scores for admission, exemption from the English Placement Test (EPT), and exemption from the SPEAK (or TEACH) Test for international teaching assistants with teaching responsibilities. (For a more detailed comparison of the total scores and subscores of the three versions of the TOEFL, see the TOEFL publication.)
B. Recommended Cutoff Subtest Scores for Admissions:

V. Other Factors Affecting TOEFL Scores

The number of times an applicant has taken the TOEFL may also be revealing. Occasionally, students take the test several times within a year or two, often without concurrent formal language instruction or even frequent opportunities to practice English. Although each test has novel items, a certain amount of test-wiseness is likely to affect the outcome.

Again, consider the fact that many students take TOEFL preparation courses. The question remains as to how much real language learning takes place amid the exposure to the finer points of the exam and strategies for successful test-taking. The official TOEFL website (www.ets.org/toefl) offers a detailed description, a virtual tour, and sample questions of the iBT as well as sample online practice tests (for a fee). (See www.ets.org/Media/Tests/TOEFL/pdf/TOEFL_Tips.pdf for a detailed description of the exam as well as tips for test-takers.)

Finally, the date on which the test was taken may be significant, especially if the score is more than a year old. Because the TOEFL is a test of language proficiency, ETS will not release scores more than two years old. What really matters is the applicant’s use of English since the exam was taken. In the case of transfers from another English-speaking university, we may assume that the student has made some improvement in his/her language ability, whereas for those who do not reside in an English-speaking environment, more information on their use of the language would be helpful.

VI. The New GRE Analytical Writing Section and the TOEFL

The new GRE Test includes an Analytical Writing section, consisting of two writing tasks: (1) presenting and supporting an opinion on an issue, and (2) analyzing the logical soundness of an argument. A combined score is determined using a 6-point scale (with holistic scoring).
It purports to measure “critical thinking and analytical writing skills rather than…grammar and mechanics”; more specifically, these writing skills are to be evaluated in terms of the following:
With respect to its use with ESL examinees, the GRE Board recommends that the Analytical Writing section be used to supplement information from the TOEFL (whether the TWE for the PBT, the Essay Writing subtest for the CBT, or the Writing subtest of the iBT), since the latter writing tests assess one’s command of language skills, not the expression of “high levels of thinking and analytical writing.” (See the publication “Guide to the Use of [GRE] Scores” at www.ets.org/Media/Tests/GRE/pdf/994994.pdf). One might argue that it is difficult, in practice, for examiners to separate effective analysis from clear linguistic expression, but the GRE Board is correct in concluding that “if ESL examinees don’t understand the task being posed to them, their performance will be affected.” Consequently, it seems safe to draw one conclusion: if an ESL examinee obtains a very high Analytical Writing score (5-6), he/she is likely to have a good command of English.