Abstract
This paper addresses the question ‘Will it make a difference if reading comprehension questions are set in learners’ L1 instead of English (L2)?’ Past studies addressing this issue have produced contradictory findings. Through a cross-sectional investigation of 3,426 middle school EFL students’ performance in English reading comprehension tests, this study shows that setting questions in learners’ L1 or L2 makes no significant difference to learners’ reading comprehension test results if their competences in L1 and L2 are both sufficient for the task. However, if their competence in L2 is inadequate while their competence in L1 is adequate, they tend to perform better when the questions are set in L1. The author suggests that EFL reading comprehension tests, especially those for beginning learners, set the questions in learners’ L1 whenever feasible.
Keywords: Assessing reading, language of the questions, L1 and L2.
Introduction
Assessing reading comprehension in TESOL (Teaching English to Speakers of Other Languages) involves a wide range of factors, ranging from the conception of reading (e.g. the nature of reading, reading process, reading outcome and levels of comprehension) to the framework for assessment (e.g. setting, assessment rubrics, the designed input and the expected response). This paper addresses issues concerning only the language of the questions, that is, the language that is used to present questions in testing ESL/EFL (English as a Second or Foreign Language) reading comprehension. The term ‘language of the questions’ may have two different senses: it may refer to the concrete use of language (i.e. the specific wording and structures) that presents the question; alternatively, it may also refer to the linguistic medium in which the questions are presented, that is, a natural language (e.g. English, French, Chinese etc). In this paper, the language of the questions refers to the second sense if it is not noted otherwise.
The language of the questions in an ESL/EFL reading comprehension test is open to two possible choices: 1) English, the same target language (L2) as the texts for reading; 2) the learners’ first or native language (L1). Using L1 or L2 as the language of the questions is a decision affected by a range of factors. It may be determined by the teaching method involved. For example, the Grammar-Translation method uses learners’ native language as the medium of instruction, and the language of the questions is usually the learners’ native language (L1), especially for beginners. The Direct Method, however, uses the target language exclusively in teaching, and the language of the questions has to be the target language (L2) (Richards & Rodgers, 2001). The decision on the language is also influenced by learners’ L1 backgrounds and their prior knowledge of English (L2). For example, if the intended learners are illiterate in L1 or do not share the same L1, it is impossible for the questions to be set in L1. However, if the difficulty of the language of the questions (Note: in the sense of the wording, structures and the expected type of response) does not exceed that of the text, the questions may then be set in L2 (Alderson, 2000).
Since the 1980s, according to Chen and Donin (1997), a concern has been evident among EFL teachers about the impact of the language of the questions on assessing learners’ performance in ESL/EFL reading comprehension. The key issue is which language is better for presenting the questions, learners’ L1 or L2, if learners share the same L1 and are already literate in their L1. An answer to the question will undoubtedly offer the designers of an ESL/EFL reading comprehension test a useful alternative, especially when the intended testees are beginning learners.
To answer this question, we need to examine what makes a good question in a reading comprehension test. According to Alderson (2000), the language of the questions (Note: in the sense of the wording, structures and the expected type of response) should not be harder to understand than the texts themselves. Therefore, if questions set in L1 are easier than those set in L2, the questions are better set in L1. Furthermore, if ESL/EFL learners are likely to ask themselves questions about the target L2 text in their native language, questions set in L1 will be more authentic and better, as they are more faithful to what is under examination (Shohamy, 1984). Finally, if learners’ prior L1 knowledge is a pedagogical resource for TESOL, setting the language of the questions in L1 will have greater pedagogical advantage.
Shohamy (1984) found that multiple-choice questions in L1 were easier than the same questions translated into L2. However, contrary to that study, Chen and Donin (1997) found that using the L1 or the L2 as the index of comprehension of L2 texts did not make any significant difference in the students’ L2 reading performance.
Why did the two studies differ in their findings? As noted earlier, many factors may affect a reading comprehension test. Statistically speaking, when all the variables except the designated one, the language of the questions, are controlled and assume roughly the same value, the difference in the learners’ performance in the test will yield an answer to the question of whether it is better to use L1 or L2 as the language of the questions. If this is so, what may account for the discrepancy between the two findings? A closer look at the two studies indicates that the possible cause of the discrepancy lies not in the design of the tests but in their sampling. Each study was based on a particular cohort of learners whose competence in L2 was similar enough for them to take part in that study. Their findings were therefore true of, and applicable to, only the populations from which their subjects were sampled. In other words, the discrepancy may arise because the findings apply to different populations.
Secondly, is it more authentic to set the language of the questions in L1? This is a question that concerns the relationships between language and thought. According to Gray (2001), we can survive in a wide range of conditions because we can think of ways to modify them to suit our needs. Thoughts, which serve to make sense of our environment and represent our knowledge of the world, are therefore crucial to our survival. Language, on the other hand, represents concepts and is fine-tuned with thought and knowledge to reflect our physical and social environments. It follows that language becomes central to human cognitive activities (Hampson & Morris, 1996). According to O'Malley and Chamot (1990), L2 learning is closely related to learners’ existing L1-referenced prior knowledge. In the course of meaning making in reading, beginning learners usually match the incoming information against prior knowledge and set up new connections between their existing L1-referenced concepts and the target L2 linguistic forms. L2 acquisition theories, no matter whether they follow a nativist, cognitive or sociocultural model, appear to agree that ‘learners’ performance in a second language is influenced by the language, or languages, that they already know’ (Mitchell & Myles, 1998, p.13). Such mediation by L1 at an early stage of L2 learning (Kroll & Sunderman, 2003) was also found in many earlier studies. Grabe (1991), for example, holds the view that learning L2 vocabulary may be considered ‘largely a matter of remembering a second label’ (p.387). Likewise, Jiang (2000) and Lin (2002) found in their investigations that L2 learners often resorted to their L1-referenced knowledge at the initial stages of L2 learning.
In addition, learners are found to ‘consciously and actively transfer information from their first language for use in the L2’ (O'Malley & Chamot, 1990, p.70) or to borrow from their L1 by translating and incorporating it with whatever implicit and explicit L2 knowledge is available when they cope with an L2 learning task (Ellis, 1990).
It is clear from the discussion of the relationship between language and learning that presenting reading comprehension questions in learners’ L1 is of particular pedagogical value in assessing their reading comprehension, especially with beginners. It coincides with learners’ learning strategies, elicits what actually goes on in their minds and achieves its authenticity via its links to learners’ L1-referenced mental representation of the meanings of the text that takes shape in the process of reading.
Thirdly, according to Harbord (1992), Deller and Rinvolucri (2002) and Wigglesworth (2002), the potential of students’ knowledge of L1 is a resource that should merit considerable attention in any attempt to develop a post-communicative approach to TESOL. Nation (2001) notes:
There is a general feeling that first language translations should not be used in the teaching and testing of vocabulary. This is quite wrong. Translation is one of a number of means of conveying meaning and in general is no better or worse than the use of pictures, real objects, definitions, L2 synonyms and so on. (p. 351)
What is more, according to Nation (2001), using L1 has the pedagogical advantage of being quick, simple and easily understood, and its use should be balanced against its disadvantage of reducing the time available for the use of the target language, English. It follows that setting the language of the questions in L1 is desirable in TESOL if it does not seriously reduce the time available for the use of English, provided that the intended learners share the same L1 and are of similar proficiency in their L1.
Method
The purpose of this study is to address the question ‘Will it make a difference if reading comprehension questions are set in learners’ L1 instead of English (L2)?’ It intends to explore the reason for the discrepancy in the findings reported in literature and thus offer ESL/EFL teachers a framework of reference if they need one in deciding whether to set reading comprehension questions in learners’ L1 or L2.
Hypothetically, as noted earlier in the paper, the discrepancy in the findings of the two earlier studies might be attributed to some difference in the variances of their samples. To confirm this hypothesis indirectly, the study needs to adopt cross-sectional sampling and test the same H0 hypothesis with differing samples, that is, that setting EFL reading comprehension questions in L1 or L2 makes no difference to learners’ results in the test. In other words, if the samples are taken from different sections of the population and differ significantly from each other in their L2 competence, testing the H0 hypothesis with these different samples may cast new light on the issue, because the findings may help to interpret the difference in the findings reported in the existing literature.
To secure a cross-sectional approach, this study engaged EFL learners from four different year-levels in China: Junior Year 2, Junior Year 3, Senior Year 1 and Senior Year 2. These four levels map onto Years 8-11 in western education systems and will be referred to as such in the remainder of the paper.
A total of 3,426 Chinese students at Years 8-11 from four different middle schools took eight different reading comprehension tests: 812 from Year 8, 947 from Year 9, 841 from Year 10 and 826 from Year 11. They had formally studied English as a foreign language for 1.5, 2.5, 3.5 and 4.5 years respectively. Their proficiency in English cannot be compared to any established framework such as TOEFL or IELTS because no such information is available.
Each of the eight tests used in the study comprised six English texts, and each text was followed by five questions. Four of the eight tests have their questions set in L1 and will be referred to as Series I tests, while the other four have their questions set in L2 and will be referred to as Series II tests hereafter. The two tests for the same year-level in Series I and II use identical English texts, and the questions following them are also the same except for the language of the questions, that is, one in L1 and the other in L2.
Results
The answers to the questions in the tests were categorical in nature: T or F for True/False questions; A, B, C or D for multiple-choice questions. To prepare for statistical analysis, they were quantified: incorrect answers were assigned a value of 0 and correct ones a value of 1.
The test scores were then tested for reliability, using SPSS Alpha (Version 11.5 for Windows) to examine their internal consistency. The statistics are presented in Table 1. It is clear from the table that the Alpha estimates of Series I and II tests demonstrate a clear internal consistency.
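The scoring step described above can be sketched in Python. This is a minimal illustration rather than the procedure actually used in the study: the function name `score_answers`, the answer key and the sample responses are all hypothetical, and treating a missing answer as incorrect is an assumption.

```python
def score_answers(responses, answer_key):
    """Convert categorical answers (T/F or A-D) into 0/1 item scores.

    A response scores 1 if it matches the key and 0 otherwise.
    A missing answer (None) is treated as incorrect, which is an assumption.
    """
    if len(responses) != len(answer_key):
        raise ValueError("responses and answer_key must have the same length")
    return [1 if given == correct else 0
            for given, correct in zip(responses, answer_key)]


# Hypothetical key for one text: two True/False and three multiple-choice items.
answer_key = ["T", "F", "A", "C", "D"]
student = ["T", "T", "A", None, "D"]

item_scores = score_answers(student, answer_key)
text_score = sum(item_scores)  # score for this text, out of 5
print(item_scores, text_score)
```

Summing the item scores for each text gives a score out of five per text, which is the scale on which text-level means are reported.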
Table 1: Coefficient Alpha estimates (Series I & II)
| | Year 8 | Year 9 | Year 10 | Year 11 |
|---|---|---|---|---|
| Series I | 0.7488 | 0.7654 | 0.6373 | 0.6620 |
| Series II | 0.7925 | 0.7831 | 0.7456 | 0.6694 |
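For readers who wish to see how a coefficient alpha of this kind is obtained, the following Python sketch computes it from a students-by-items matrix of 0/1 scores using the standard formula, alpha = k/(k-1) × (1 − sum of item variances / variance of total scores). The function name `cronbach_alpha` and the small data set are hypothetical; the study's raw responses are not available, so the values in Table 1 cannot be reproduced here.

```python
from statistics import variance


def cronbach_alpha(score_matrix):
    """Coefficient alpha for a students-by-items matrix of 0/1 scores."""
    n_items = len(score_matrix[0])
    if n_items < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item (column) across students.
    item_variances = [variance([row[i] for row in score_matrix])
                      for i in range(n_items)]
    # Sample variance of each student's total score.
    total_variance = variance([sum(row) for row in score_matrix])
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical data: five students, four items (each real test had 30 items).
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
]
alpha = cronbach_alpha(scores)
print(round(alpha, 4))
```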
Once the reliability of the tests had been confirmed, the test scores were used to test the H0 hypothesis that the two cohorts of participants who took the tests at each of the four year-levels have the same variance. In other words, the H0 hypothesis is that the students who did the Series I test and those who did the Series II test belonged to the same population and demonstrated the same variance in their performance in English reading comprehension. If the H0 hypothesis is sustained by the data, the difference between the means of the test scores is taken to be statistically insignificant, supporting the finding that setting the language of the questions in L1 or L2 does not lead to a significant difference in learners’ performance in ESL/EFL reading comprehension tests. If the H0 hypothesis is rejected, the opposite conclusion follows.
To test the H0 hypothesis, the homogeneity of variance test is used to find out whether the test scores collected for Series I and II tests have the same variance and thus belong to the same population. Since such a test is embedded in the t-test in SPSS, an independent-samples t-test was conducted at each of the four year-levels, although we were not interested in the statistical outcome of the t-tests (e.g. the t-values, degrees of freedom, means of variables, 95% confidence intervals of difference, etc.), because the H0 hypothesis to be tested concerned the homogeneity of variance of those variables, not the prediction of the values that their means might take. For the sake of clarity and relevance, the t-test statistics for the means have been left out and, for convenience in comparison, the results of the four homogeneity of variance tests are presented together in Table 2. ANOVA was not used because it is designed for comparing means from more than two samples (SPSS, 1999), and the comparison required for this study was between two cohorts at the same year-level doing EFL reading comprehension tests with the questions set in L1 or L2.
Table 2: Independent Samples Test
| Series I & Series II | F (Levene's Test for Equality of Variances) | Significance |
|---|---|---|
| Year 8 | 3.207 | .074 |
| Year 9 | .349 | .555 |
| Year 10 | 21.189 | .000 |
| Year 11 | 1.511 | .019 |
It is clear from Table 2 that the significance levels of the F statistics of the Levene’s tests vary across the four year-levels. Two are larger than .05 while the other two are smaller. According to the theory of the homogeneity of variance test, the H0 hypothesis of equal variance is sustained if the significance level of F is larger than .05 and discarded if it is smaller than .05. The significance values of F for Years 8 and 9 are larger than .05, indicating that the students doing Series I and II tests at Years 8 and 9 demonstrated the same variances. The significance values of F for Years 10 and 11 are smaller than .05, indicating that the students at Years 10 and 11 performed differently when the language of the questions was changed from L1 to L2. In other words, setting questions in L1 or L2 made no significant difference in Years 8 and 9 students’ test results and the H0 hypothesis was statistically sustained. However, the same hypothesis was not supported at Years 10 or 11, and the H0 hypothesis had to be discarded accordingly.
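The F statistic of a Levene-type test can be illustrated with a short Python sketch. It implements the mean-centred form of the statistic, in which each score is replaced by its absolute deviation from its own group mean and a one-way analysis of variance is run on those deviations. Whether this reproduces SPSS output exactly is an assumption, and the function name `levene_statistic` and the two small cohorts are hypothetical. The sketch returns only the statistic; the significance value is obtained by referring it to an F distribution with (k − 1, N − k) degrees of freedom, which is best left to a statistics package.

```python
def levene_statistic(*groups):
    """Mean-centred Levene statistic W for two or more groups of scores.

    W is referred to an F distribution with (k - 1, N - k) degrees of
    freedom; computing the p-value is left to a statistics library.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every score from its own group mean.
    deviations = []
    for g in groups:
        centre = sum(g) / len(g)
        deviations.append([abs(x - centre) for x in g])
    group_means = [sum(z) / len(z) for z in deviations]
    grand_mean = sum(sum(z) for z in deviations) / n_total
    # Between-group and within-group sums of squares of the deviations.
    between = sum(len(z) * (m - grand_mean) ** 2
                  for z, m in zip(deviations, group_means))
    within = sum((v - m) ** 2
                 for z, m in zip(deviations, group_means) for v in z)
    return (n_total - k) / (k - 1) * between / within


# Hypothetical total scores for two small cohorts (not the study's data).
series_1 = [1, 2, 3, 4, 5]
series_2 = [1, 3, 5, 7, 9]
w = levene_statistic(series_1, series_2)
print(round(w, 3))
```

Two cohorts with identical spread give a statistic of zero however far apart their means are, which shows that the test responds to differences in variance rather than to differences in mean.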
Statistically speaking, the homogeneity of variance test is designed to test whether quantitative variables are equal in their variance or whether their spreads differ significantly (SPSS, 1999). To find out more about the relationships between the variables themselves, the Pearson Correlation Coefficient was used to examine whether any linearity exists in their relationships. The underlying assumption was that there would be a strong linear relationship between Series I and II test results if setting the questions in L1 or L2 made no significant difference in learners’ reading comprehension. That is to say, one who did well when the questions were set in L1 would do equally well when they were set in L2.
The Pearson Correlation Coefficient could not be applied directly to individual test scores because, in order to avoid any unwanted influence of prior knowledge of the test, the participating students were asked to do only one test. Consequently, Series I and II test results came from two different cohorts of students and were not paired. The means of the scores therefore had to be used instead. As an illustration, consider the Year 8 Series I and II tests. 440 Year 8 students did Series I and another 372 Year 8 students did Series II. The means (i.e. MeansI and MeansII) of the scores of the two groups for the comprehension of the six texts in their Series I and II tests are tabulated in Table 3.
Table 3: Means of Series I and II Tests Scores at Year 8
| | Text 1 | Text 2 | Text 3 | Text 4 | Text 5 | Text 6 |
|---|---|---|---|---|---|---|
| Series I (MeansI) | 4.93 | 4.66 | 3.72 | 4.52 | 4.16 | 3.64 |
| Series II (MeansII) | 4.85 | 4.40 | 2.98 | 4.04 | 4.01 | 3.85 |
The Pearson Correlation Coefficient was then used to analyse the relationship between these two groups of 6 means (i.e. MeansI and MeansII). The correlation coefficients for the four year-levels were calculated in this way and, for convenience in comparison, are presented together in Table 4.
Table 4: Correlation Coefficients between Series I and II Score Means
| | | MeansI (Questions set in L1) and MeansII (Questions set in L2) |
|---|---|---|
| Year 8 | Pearson Correlation | .849(*) |
| | Significance (2-tailed) | .033 |
| | Number | 6 |
| Year 9 | Pearson Correlation | .781 |
| | Significance (2-tailed) | .067 |
| | Number | 6 |
| Year 10 | Pearson Correlation | .482 |
| | Significance (2-tailed) | .333 |
| | Number | 6 |
| Year 11 | Pearson Correlation | .237 |
| | Significance (2-tailed) | .652 |
| | Number | 6 |

Note: * Correlation is significant at the 0.05 level (2-tailed).
It is clear from Table 4 that the correlation coefficients vary considerably from one year-level to another. At Year 8, the correlation coefficient is as high as .849. However, the linearity diminishes when the year-level increases. The correlation coefficient for Year 9 is lower at .781 and yet still demonstrates a linear relationship. However, such linearity disappears when it comes to Years 10 and 11 as .482 and .237 do not indicate obvious linearity in their relationships.
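The Year 8 coefficient in Table 4 can be checked against the six pairs of text means reported in Table 3. The Python sketch below is an illustration only: the function name `pearson_r` and the variable names are hypothetical, and the t statistic is the standard test of a correlation against zero with n − 2 degrees of freedom, not output taken from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 3:
        raise ValueError("need two lists of equal length with at least 3 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cross = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cross / sqrt(ss_x * ss_y)


# Year 8 text means as reported in Table 3.
means_1 = [4.93, 4.66, 3.72, 4.52, 4.16, 3.64]  # questions set in L1
means_2 = [4.85, 4.40, 2.98, 4.04, 4.01, 3.85]  # questions set in L2

r = pearson_r(means_1, means_2)
# t statistic for testing r against zero, with n - 2 = 4 degrees of freedom.
t = r * sqrt(len(means_1) - 2) / sqrt(1 - r ** 2)
print(round(r, 3), round(t, 2))
```

The coefficient computed from the Table 3 means rounds to .849, in agreement with Table 4, and the t statistic exceeds 2.776, the two-tailed .05 critical value for four degrees of freedom, which is consistent with the asterisk in the table. With only six pairs of means per year-level, a coefficient has to be large before it reaches significance.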
Discussion
The homogeneity of variance tests of the participating students’ test results revealed an interesting finding: Years 8 and 9 students appeared to perform equally well regardless of whether the questions were set in L1 or L2, while Years 10 and 11 students’ performance differed significantly when the language of the questions was switched from L1 to L2. Similarly, the Pearson correlation coefficients of the means of those students’ test results indicated that Years 8 and 9 students tended to perform equally well whether the language of the questions was L1 or L2, while Years 10 and 11 students had significantly different results when the language of the questions changed from L1 to L2.
The conflicting analytic results posed two questions the study had to address: Why did Years 8 and 9 students differ significantly from Years 10 and 11 students? Why did linearity diminish when the year-level increased? The discrepancy could not be attributed to the design of the study, which was based on a census type of sampling and whose samples were sufficiently large: 1688 subjects for Series I tests and 1738 subjects for Series II tests, about 400 at each year-level. With so large a sample size, such factors as idiosyncratic differences among participating students in terms of their L2 reading abilities can be ignored statistically, and the analyses conducted through the Independent Samples T-Tests and Pearson Correlation Coefficient in SPSS are widely accepted in statistical studies.
Why can the H0 hypothesis find support at Years 8 and 9 but not at Years 10 or 11? To answer this question, we may look at the texts used in the tests. Due to the limitation of space, we shall look at 4 of the 24 texts used in the tests, that is, the first text in each of the four tests across the four year-levels (see Texts 1-4 in Appendix).
It is clear that the four texts differ in what Alderson (2000) called the topic, content, type and genre of a text. Text 1 (Year 8) is a dialogue on a bus, presuming on the part of the reader some prior knowledge of the commencement of a new semester at school and the protocol of greeting among fellow students. Text 2 (Year 9) is a story and expects the reader to have some prior knowledge of travelling by train. Text 3 (Year 10) and Text 4 (Year 11) are more demanding in terms of presumed prior conceptual and sociocultural knowledge: Text 3 requires prior knowledge of machinery and its functions in modern society, while Text 4 assumes the reader has some prior knowledge of America and American culture, such as the fact that the Statue of Liberty may be referred to as Miss Liberty.
Similarly, the comparison of the questions associated with the four texts also indicates a variety in difficulty. According to Alderson (2000), textually explicit, textually implicit and script-based questions vary in difficulty: from the least difficult to the most difficult. The questions for Texts 1 and 2 are either textually explicit or textually implicit and their expected responses are basically a matter of matching the responses with the texts as most of the expected information for the responses can be found in the texts. In contrast, those for Texts 3 and 4 are largely script-based and their expected responses require more inferential reasoning processes and more background knowledge. Therefore, the questions are of less difficulty for Years 8 and 9 students while those for Years 10 and 11 students are of greater difficulty.
According to Alderson (2000), texts that differ in topic, content, type and genre may facilitate or impede the reading process and thus vary in their demand on readers’ L2 competence. If a text is simple, students tend to respond equally well regardless of whether the questions are set in L1 or L2, as they can access the necessary knowledge and learning strategies through either L1 or L2. Following the theory of transfer, the L1 affects L2 learning through its influence on the hypotheses that learners construct (Ellis, 1994). During the process of reading, L2 readers need to constantly construct hypotheses about the meanings of the text (Rumelhart, 1994), and their L1-referenced knowledge is accessible when the L2 text is comprehensible. However, when the L2 text becomes linguistically and conceptually complex and less comprehensible, students who are more competent in L1 and less so in L2 will respond better to L1 questions because they may make use of the clues hidden in the more comprehensible questions; such L1 transfer will not be possible if the questions are set in L2. That is why the H0 hypothesis was statistically sustained at Years 8 and 9 but not at Years 10 or 11. Likewise, the increasing difficulty and challenge imposed by the texts across the four years may also account for the diminishing linearity in the relationship between MeansI and MeansII because, as the year-level increased, the availability of L2-referenced knowledge decreased more rapidly than that of L1-referenced knowledge, and the discrepancy between students’ L2 competence and the difficulty of the reading tasks increased accordingly.
In comparison with the findings of other studies, the findings of this study about Years 8 and 9 students’ performance in the reading comprehension tests agreed with Chen and Donin’s (1997), that is, readers do not perform better in L1 than in L2. However, the findings of this study about Years 10 and 11 students’ performance in the tests confirmed Shohamy’s findings (1984) that the questions set in L1 were easier than in L2. It appears that, apart from the influence of the linguistic distance between the L1 and L2 as indicated in Chen and Donin’s study (1997), the nature of the L2 reading tasks also played a very important role. This was particularly obvious in this study, that is, when the linguistic distance between the L1 and the L2 was the same across the four year-levels, it was the difference in the nature of the reading tasks and their varying demands on the readers’ responses that led to the difference in the students’ performance in their L2 reading comprehension tests.
Conclusion
This investigation of 3,426 EFL students’ responses to reading comprehension questions set in L1 or L2 showed that setting the questions in L1 or L2 would not make significant difference in students’ responses if the students’ competence in L1 and L2 was equally sufficient to meet the challenge imposed by the reading task. However, if they were more competent in L1 and if their competence in L2 was inadequate, they tended to score better if the questions were set in L1 as the additional clues in L1 enabled them to apply their L1-referenced prior knowledge and strategies to the new task, which would be impossible if the language of the questions was set in L2.
It must be noted that, although this study represents a step toward a better understanding of the impact of setting the language of the questions in learners’ L1 or L2 and has offered an account of the discrepancy in past findings about the language of the questions in ESL/EFL reading comprehension tests, it has its limitations. The relationships between the language of the questions and the three question types remain unclear. Further studies need to find out whether setting the language of the questions to L1 or L2 would make any significant difference in learners’ responses if the questions were textually explicit, textually implicit or script-based.
For all its limitations, this study has shown that, when the intended learners, especially EFL beginners, share the same L1 and are of similar proficiency in L1, it would be pedagogically better to set the reading comprehension questions in L1 than in L2, as it would be more authentic in matching learners’ reading strategies and might elicit more truthful and better responses. It would also be educationally beneficial, as more positive feedback is believed to be more constructive and more encouraging to language learners, especially beginners.
References
Alderson, J. C. (2000). Assessing reading. Cambridge: Cambridge University Press.
Chen, Q., & Donin, J. (1997). Discourse processing of first and second language biology texts: Effects of language proficiency and domain-specific knowledge. The Modern Language Journal, 81(2), 209-227.
Deller, S., & Rinvolucri, M. (2002). Using the mother tongue: Making the most of the learner's language in a second language classroom. London: First Person Publishing.
Ellis, R. (1990). Instructed second language acquisition. Oxford: Basil Blackwell.
Ellis, R. (1994). The study of second language acquisition. Oxford: Oxford University Press.
Grabe, W. (1991). Current developments in second language reading research. TESOL Quarterly, 25(3), 375-396.
Gray, P. (2001). Psychology (4th ed.). New York: Worth Publishers.
Hampson, P. J., & Morris, P. E. (1996). Understanding Cognition. Oxford: Blackwell.
Harbord, J. (1992). The use of the mother tongue in the classroom. ELT Journal, 46(4), 350-355.
Jiang, N. (2000). Lexical representation and development in a second language. Applied Linguistics, 21(1), 47-77.
Kroll, J. F., & Sunderman, G. (2003). Cognitive processes in L2 learners and bilinguals. In C. J. Doughty & M. H. Long (Eds.), The handbook of second language acquisition (pp. 104-129). London: Blackwell.
Lin, Z. (2002). Discovering EFL learners' perception of prior knowledge and its roles in EFL reading comprehension. Journal of Reading Research, 25, 172-190.
Mitchell, R., & Myles, F. (1998). Second language learning theories. London: Edward Arnold.
Nation, I. S. P. (2001). Learning vocabulary in another language. Cambridge: Cambridge University Press.
O'Malley, J. M., & Chamot, A. U. (1990). Learning strategies in second language acquisition. Cambridge: Cambridge University Press.
Richards, J. C., & Rodgers, T. S. (2001). Approaches and methods in language teaching (2nd ed.). Cambridge: Cambridge University Press.
Rumelhart, D. E. (1994). Toward an interactive model of reading. In R. B. Ruddell, M. R. Ruddell & H. Singer (Eds.), Theoretical models and processes of reading (4th ed., pp. 864-893). Newark: International Reading Association.
Shohamy, E. (1984). Does the testing method make a difference? The case of reading comprehension. Language Testing, 1(2), 147-170.
SPSS, I. (1999). SPSS Base 10.0 applications guide. Chicago: SPSS Inc.
Wigglesworth, G. (2002). The role of the first language in the second language classroom: Friend or foe. English Teaching, 57(1), 17-31.
Appendix (see PDF file)