THE IMPACT OF ADDITIONAL CLIL EXPOSURE ON ORAL ENGLISH PRODUCTION

This study aims at testing the effectiveness of additional CLIL (Content and Language Integrated Learning) exposure on the oral production of secondary school learners of English as a Foreign Language. CLIL learners, who had received a 30% increase in exposure by means of using English as a vehicular language, were compared to mainstream English students in a story-telling task. Analyses indicated that CLIL learners’ productions were holistically perceived to exhibit better fluency, lexis and grammar while no differences were found as regards content and pronunciation. Besides, although Non-CLIL learners’ productions were larger in quantity and longer in time, CLIL learners produced denser and more fluent narrations, as attested by their higher number of different words over total number of words, of words over turn, and of utterances over turn. Additionally, CLIL learners resorted to their first language to a lesser extent and demanded fewer vocabulary clarifications. Our findings thus go along with previous research which has revealed advantages of additional CLIL exposure on oral English production.


INTRODUCTION
Many schools in Spain are currently incorporating Foreign Language (FL) teaching programmes where English is used as a vehicular language to teach other school disciplines, the so-called Content and Language Integrated Learning (CLIL) programmes. This type of exposure aims at developing both subject and language knowledge (Marsh 1994), driven by the need to accommodate European Commission requirements on multilingual education with the purpose of facilitating communication and social cohesion amongst European citizens. [...] of language and content outcomes and ii) establish a consolidated educational approach based on CLIL tenets, namely intense exposure and real communication.
Regarding language outcomes (see Dalton-Puffer 2011; Ruiz de Zarobe 2011 for an extensive description), several comparative studies have investigated how CLIL and traditional FL courses compare in terms of FL proficiency achievement. Studies such as those by Sylvén (2004, 2006) in Sweden, Bürgi (2007) in Switzerland, Xanthou (2007) in Cyprus, or Jiménez Catalán, Ruiz de Zarobe and Cenoz (2006), Jiménez Catalán and Ojeda (2009), Jiménez Catalán and Ruiz de Zarobe (2009) and Moreno Espinosa (2009) in Spain have pointed to the superiority of students enrolled in CLIL settings over traditional EFL ones as far as vocabulary is concerned. The development of morphosyntactic skills has also been explored in CLIL vs. traditional FL environments, where significantly better outcomes have been attested for CLIL groups in countries such as Austria (Ackerl 2007; Hüttner and Rieder-Bünemann 2007) and Spain (Villarreal and García Mayo 2009; Martínez Adrián and Gutiérrez Mangado 2009). Studies on writing skills in Spain (Navés 2011; Ruiz de Zarobe 2010) also show positive results on the part of CLIL learners.
More relevant to the present work are those studies which have explored CLIL outcomes as regards oral skills. In fact, oral production has been acknowledged to be one of the linguistic aspects which may benefit most from methods which foster the use of the language in meaningful contexts (Block 2003), and CLIL is undoubtedly one of these. However, some authors point out that CLIL leads to "erratic results as far as speaking is concerned" (Van de Craen et al. 2007: 71). The few studies conducted along these lines so far have used diverse perspectives in the examination of oral production skills. Some studies have analysed overall oral proficiency by means of holistic methods (Lasagabaster 2008; Ruiz de Zarobe 2008), other studies have measured discourse production (Hüttner and Rieder-Bünemann 2007; Whittaker and Llinares 2009), while others have focused on pronunciation (Gallardo del Puerto et al. 2009; Rallo Fabra and Juan-Garau 2010). Lasagabaster (2008) conducted a holistic comparative analysis between CLIL and traditional FL groups of oral (and written) production in English by 198 secondary school students in the Basque Country. In this study the CLIL group significantly outscored the subjects in the traditional programme in all the variables analysed: pronunciation, vocabulary, grammar, fluency and content. Similar results were also obtained by Ruiz de Zarobe (2008) using the same instrument, though in a previous study Ruiz de Zarobe (2007) found no statistically significant differences between judges' holistic evaluations of CLIL and Non-CLIL learners' oral productions. However, it should be noted that none of these studies specifies whether extra-curricular exposure to the FL was controlled. Hüttner and Rieder-Bünemann (2007) also investigated the effect of CLIL on oral narrative competence in a comparative study with 44 secondary students in Vienna. This study examined narrative competence through an analysis of content development in the actions depicted. 
This study revealed that the CLIL group outperformed their Non-CLIL counterparts, as they referred to all plot elements and textualised conceptually complex elements to a slightly greater extent. However, this study has some limitations, such as the fact that differences between groups were not supported statistically and affective variables such as motivation were not controlled. In fact, it is important to note that affective variables may play a role, as CLIL learners have been purported to show lower inhibition levels when speaking (Dalton-Puffer, Hüttner, Schindelegger and Smit 2009), to exhibit less angst in the classroom (Dalton-Puffer 2009), and to have better attitudes towards FL learning and multilingualism (Lasagabaster 2009; Lasagabaster and Sierra 2009). Whittaker and Llinares (2009) conducted some preliminary work on oral production in the CLIL classroom with first-year secondary students. Although the authors acknowledge that their data have yet to be statistically analysed against a control group, they report a rise in oral fluency by the end of the year (these data are not provided, though) and comment that, as indicated by the number of words and error-free clauses, CLIL productions were as rich as those produced by traditional EFL learners in late secondary levels.
As for pronunciation, Gallardo del Puerto, Gómez Lacabex and García Lecumberri (2009) compared the degree of foreign accent of teenage students who were learning English through traditional classroom instruction with that of students learning in CLIL environments. Additionally, they tested the communicative effects of foreign accent, specifically the intelligibility and irritation produced by learners' accent in a narration task as perceived by a group of naïve native speakers of English. This study concluded that CLIL students' accent was judged to be more intelligible and less irritating than that of the students engaged in traditional FL lessons. However, and surprisingly, no differences in terms of degree of foreign accent itself were discovered. In the same vein, Rallo Fabra and Juan-Garau (2010) have recently conducted a study in which intelligibility and accentedness differences between CLIL and FL students were also explored longitudinally. This study analysed differences between the two groups over a year and also added a comparison to a group of English monolingual speakers matched in age. Preliminary results in a reading aloud task also indicated that CLIL students were more intelligible than the FL ones and that differences in accentedness were slight. Interestingly, no differences between the two testing times (1 year apart) were found in the CLIL group, indicating that one year of CLIL instruction may not be sufficient to improve aspects such as intelligibility or accentedness. The authors also suggest that, in fact, these aspects may not improve unless specific attention is directed towards them (see also García Lecumberri and Gallardo del Puerto 2003; Fullana 2006).
Given that existing research has produced inconsistent results on the effects of CLIL on oral skills (Van de Craen et al. 2007; Ruiz de Zarobe 2007) and that potential positive outcomes have been suggested to be less evident in secondary school students (Van de Craen et al. 2007), the present study aims at exploring the oral production of two secondary education groups which, having started learning English as a FL at the same age and presenting similar motivation rates, differ in methodological approach and in amount of exposure, since one of the groups has received CLIL instruction for 3-4 years in addition to traditional FL lessons.

PARTICIPANTS
The participants in this study were 28 Basque-Spanish bilingual children attending 3rd and 4th grade of secondary school, with a mean age of 14.6 (see table 1). Exposure to the foreign language outside school was controlled and, hence, the sample was selected by eliminating those learners who attended extra lessons or had stayed abroad in English-speaking countries. Participants received school instruction in Basque (the minority language in the community), whereas Spanish (the majority language in the Basque Country) and English (a foreign language in the community) were school subjects to which 4 and 3 hours per week were devoted respectively. All participants had started learning English when they were 8 years old. Learners were divided into two groups (CLIL group and Non-CLIL group) of 14 students each according to whether or not they were receiving extra exposure to English by means of CLIL. Both groups were made up of 10 students in their 6th year of English learning (3rd graders) and 4 students in their 7th year of English instruction (4th graders).
By the time of testing, the Non-CLIL group had received around 720 hours of traditional EFL teaching while the CLIL group had received this same exposure in addition to an average of 250 hours in the CLIL fashion, including subjects such as science, biology or geography/history (see table 1). The CLIL programme had been implemented in secondary school so the CLIL learners had been receiving content-based instruction in English for 3-4 years.

INSTRUMENTS
The participants took part in a story-telling activity in which they were individually presented with a series of wordless black and white vignettes: Frog, where are you? (Mayer 1969). Students had to look at the pictures and tell the interviewer the story in English. Productions were recorded on a digital audio tape recorder (TCD-D100) in a quiet room. Participants' productions were assessed holistically by 2 trained listeners on the following variables: pronunciation, use of vocabulary, grammatical correctness, fluency and content development, each on a 1 to 10 scale (Cenoz 1991). The assessment sheets were accompanied by guidelines which the judges could consult whenever needed. The two assessors (aged 30-35) were native Spanish-speaking postgraduates in English Studies and experienced language judges. A second analysis was also carried out so as to explore the productions quantitatively. The full outputs were transcribed and coded in CHILDES and a frequency count was computed for the following variables: total no. of words, total no. of words minus L1/s transfer (i.e. the total number of words used in English only), total no. of different words, total no. of utterances and total no. of turns, as well as for no. of different words over no. of words, no. of utterances over turn and no. of words over turn. The time the students used to narrate the complete story was also recorded (narration time), as were the Basque and Spanish words uttered by the student (L1/s transfer) and, finally, the number of interventions on the part of the person interviewing (interviewer turns). With regard to this last variable, interviewers were instructed to intervene only if the subject explicitly asked for lexical clarification. All these variables will be clustered in the results section in three main frameworks: variables which elucidate 'amount of production', namely total no. of words, total no. of words minus L1/s transfer, total no. of different words, total no. of utterances, total no. of turns and narration time (table 3); variables which reveal 'density of production', namely no. of different words over no. of words, no. of utterances over turn and no. of words over turn (table 4); and variables which reveal strategies which the students may use to compensate for lack of L2 resources, namely native language transfer (L1/s transfer) and appeal for vocabulary assistance (interviewer turns) (table 5).
Motivation towards the English language and the English lessons was controlled for by means of two tests: one which examined attitudes towards the English language on a 7-point Likert scale (Motivation test 1) and a 13-question test which mainly tested instrumental motivation, that is, the practicality of the language for the learners' future careers (Motivation test 2). Neither of these two variables (reported as percentages and standard deviations, SD, in Table 1) yielded significant differences between the CLIL and Non-CLIL groups, indicating that the groups exhibited a similar and rather positive motivation rate towards the English language.
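The three density measures described above can be illustrated with a minimal Python sketch. It is not the CHILDES procedure used in the study: the `Turn` structure, the speaker codes `CHI` and `INV`, the whitespace tokenisation and the toy transcript are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Turn:
    speaker: str           # hypothetical codes: "CHI" = learner, "INV" = interviewer
    utterances: list[str]  # utterances produced within this turn


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def density_measures(turns: list[Turn], learner: str = "CHI") -> dict:
    """Count words, different words, utterances and turns for the learner,
    then derive the three density ratios named in the text."""
    learner_turns = [t for t in turns if t.speaker == learner]
    utterances = [u for t in learner_turns for u in t.utterances]
    # Assumption: words are whitespace-separated and compared case-insensitively.
    words = [w.lower() for u in utterances for w in u.split()]
    return {
        "words": len(words),
        "different_words": len(set(words)),
        "utterances": len(utterances),
        "turns": len(learner_turns),
        "interviewer_turns": sum(1 for t in turns if t.speaker != learner),
        "different_words_over_words": _ratio(len(set(words)), len(words)),
        "utterances_over_turn": _ratio(len(utterances), len(learner_turns)),
        "words_over_turn": _ratio(len(words), len(learner_turns)),
    }


# Invented toy transcript, not data from the study.
toy_transcript = [
    Turn("CHI", ["the boy has a frog", "the frog is in a jar"]),
    Turn("INV", ["owl"]),
    Turn("CHI", ["the boy sees an owl"]),
]

if __name__ == "__main__":
    for name, value in density_measures(toy_transcript).items():
        print(f"{name}: {value}")
```

In this toy case the learner produces 16 words (11 different) in 3 utterances over 2 turns, so the ratio of different words to words is 11/16. A learner who is interrupted less often accumulates more words and utterances per turn, which is the sense in which the text speaks of denser production.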

HOLISTIC ASSESSMENT
A first and necessary analysis explored the reliability of the holistic assessment by correlating the scores provided by the two judges for the two groups. Moderate to high correlation coefficients were found in all variables assessed (pronunciation: r(26) = .63, p < .001; use of vocabulary: r(26) = .84, p < .001; grammatical correctness: r(26) = .93, p < .001; fluency: r(26) = .93, p < .001; content development: r(26) = .78, p < .001), indicating that both judges employed similar criteria in the evaluation. T-test analyses were computed so as to compare the means (range: 1-10) of the 5 variables assessed by the two judges in the holistic evaluation of students' oral productions according to the instruction received (CLIL and Non-CLIL). As can be seen in Table 2, the CLIL group outscored the Non-CLIL group in all variables analysed. The CLIL group significantly outscored the Non-CLIL group in grammar and fluency (t(26) = 2.94, p < .05 and t(26) = 2.10, p < .05 respectively), the former assessing grammatical accuracy and communicative effect and the latter continuity and speed of speech. Along the same lines, CLIL superiority turned out to be marginally significant (p = .084) in vocabulary.
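The two statistics reported above can be sketched in a few lines of Python. The sketch assumes a Pearson correlation for inter-judge reliability and a pooled-variance (Student's) independent-samples t-test for the group comparison; the text does not state which t-test variant was used, although the reported degrees of freedom, 26, match both n - 2 for 28 paired ratings and n1 + n2 - 2 for two groups of 14. The scores in the usage lines are invented.

```python
import math
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of ratings."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def student_t(group_a, group_b):
    """Independent-samples t statistic with pooled variance.

    Returns (t, df), where df = n_a + n_b - 2.
    """
    na, nb = len(group_a), len(group_b)
    df = na + nb - 2
    # Pooled variance: the two sample variances weighted by their own df.
    pooled = ((na - 1) * variance(group_a) + (nb - 1) * variance(group_b)) / df
    t = (mean(group_a) - mean(group_b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df


if __name__ == "__main__":
    # Invented ratings on the 1-10 scale, for illustration only.
    judge_1 = [6, 7, 5, 8, 6, 4]
    judge_2 = [6, 8, 5, 7, 6, 5]
    print("inter-judge r:", round(pearson_r(judge_1, judge_2), 2))

    clil = [7, 6, 8, 7, 6, 7]
    non_clil = [5, 6, 6, 5, 7, 5]
    t, df = student_t(clil, non_clil)
    print(f"t({df}) = {t:.2f}")
```

The sketch stops at the test statistic. Obtaining the p-value requires the t distribution, which a statistics package would normally supply.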
However, those variables which measured density of production ( [...]
Table 4. Density of production: mean scores, standard deviation (SD) and significance.
As for compensation strategies (Table 5), it can be observed that the CLIL group used the L1/s significantly less than the Non-CLIL group (t(25) = -3.31, p < .005). It can also be observed that interviewers interacted significantly more with the Non-CLIL group (interviewer turns: t(25) = -2.69, p < .05), an indicator that these subjects demanded lexical cues more often than the CLIL group.
Table 5. Compensation strategies: mean scores, standard deviation (SD) and significance.

DISCUSSION
The present study has provided comparative data on the oral output of students enrolled in a CLIL programme and students engaged in traditional foreign language lessons (Non-CLIL) so as to elucidate the potential effect of additional CLIL exposure. The holistic assessment of oral production has shown that CLIL students outscore traditional students in all variables, but that this superiority is significant only in grammar and fluency and marginally significant in vocabulary. These findings are in line with those studies reporting that in CLIL settings fluency and vocabulary development seem to be the areas which benefit most (Dalton-Puffer, [...] (Hüttner and Rieder-Bünemann 2007). There are several factors which can account for the lack of progress in pronunciation in these learning contexts. First, teachers are very often non-native speakers in these formal learning environments (Cenoz 2003) and we may find various levels of phonetic competence among these professionals. Second, pronunciation has been referred to as "the least useful of the basic language skills" (Quijada 1997), given that language teaching goals centre on the need to understand and be understood (intelligibility) rather than on attaining a native accent (Jenkins 2000; Levis 2005). In fact, research has shown that intelligibility is not necessarily at odds with foreign accent (Munro 2008). A third possible factor which may have contributed to the lack of advance in pronunciation is the fact that most of the English textbooks used in Basque secondary schools are characterised by a scarcity of exercises targeting pronunciation skills (Gallardo del Puerto 2005). Finally, a further sociolinguistic factor may be mentioned along these lines, namely the scant presence of native English in the media and entertainment given the strong presence of the dubbing industry in Spain (TV series and cartoons, films at theatres or video games). We did not observe differences in content development between the groups. 
These results may be related to the type of task used in this study and the administration mode. The picture-guided story-telling activity was presented to the students so that they would access the vignettes sequentially during the task. This procedure, along with the fact that the story was the same for all the subjects, may have limited the development of the plot or further development of characters or scenes. It should be noted that a less guided (maybe semi-guided) task may better explore potential differences in contextualization and in character, plot and scene development, and thus allow a more efficient assessment of content development skills. As other authors have pointed out (Hüttner and Rieder-Bünemann 2007), a further possible reason for this lack of differences in content development may be that of cognitive development, that is, the ability to extend and detail the story may progress independently of the amount or the type of instruction received.
The quantitative analysis provided in this study revealed interesting results. Unlike in Whittaker and Llinares (2009), simple frequency counts showed that the Non-CLIL students produced longer outputs: more words, more different words, more utterances and more turns (Table 3). It is interesting to note that when use of the L1s was controlled for, these differences were still present (Total No. of words-L1s transfer, in Table 3). A further important measurement, narration time, revealed that these students took significantly more time to tell the story than the CLIL group. However, when the data are explored in terms of density of production (Table 4), measured in number of different words over number of words, number of utterances over turn and number of words over turn, and in terms of compensation strategies (Table 5), measured in amount of L1 use and number of interviewer interventions, they reveal some advantages on the part of the CLIL group. First, their outputs are more compact, as they use more utterances and words in each turn. These data are in line with the higher fluency on the part of the CLIL group observed in the holistic assessment and show that the two analyses used in the present study (holistic and quantitative) report similar findings. Secondly, those variables which aimed at exploring compensation strategies revealed that there was a significantly higher use of words and expressions in Basque and Spanish in the Non-CLIL group, as well as many more interviewer turns, mainly in the form of vocabulary clarifications. This may account for the advantage of the Non-CLIL group in terms of 'quantity' of production observed in Table 3, but it actually reveals an advantage on the part of the CLIL group in 'quality' of production, understood in this study as a more fluent and denser narration as well as a better ability to limit access to the L1/s. 
This last finding could indicate that CLIL students either already knew the vocabulary they needed to tell the stories or showed a decrease in the use of the negotiation and repair strategies which characterise foreigner talk (Gass & Varonis 1991). In other words, it might be argued that CLIL learners rely more on target language-based knowledge or compensation strategies, which makes them less dependent on both the L1 and the interviewer.

CONCLUSIONS
Our study supports the findings of those investigations indicating that the CLIL approach is associated with better language outcomes (Ackerl 2007; Hüttner and Rieder-Bünemann 2007; Bürgi 2007; Sylvén 2004, 2006; Villarreal and García Mayo 2009; Xanthou 2007) and more particularly the findings of research indicating that oral production can be enhanced by CLIL (Hüttner and Rieder-Bünemann 2007; Lasagabaster 2008; Ruiz de Zarobe 2008; Whittaker and Llinares 2009). We have verified that additional CLIL exposure leads to better oral production in a story-telling task. More specifically, our CLIL learners have been found to display more fluent and denser speech characterized by better grammar and vocabulary, as well as lesser reliance on both the L1 and the interviewer's help, all of which make CLIL learners more efficient and independent speakers of the foreign language. However, as far as pronunciation is concerned, and in agreement with previous research (Gallardo del Puerto, Gómez Lacabex and García Lecumberri 2009; Rallo Fabra and Juan-Garau 2010), the positive effect of CLIL is not so clear, which confirms Van de Craen et al.'s (2007) conclusion regarding the controversial results of CLIL in the case of oral production.
Nonetheless, the present study is not without limitations. First, we have been unable to gather data from groups which differ only in type of instruction, that is, groups which have received the same number of hours of English but differ in the type of instruction (CLIL vs. traditional EFL). This is so given that CLIL methodologies are mainly being implemented in Spain by adding exposure to traditional English lessons rather than by replacing those hours with CLIL ones. As a result, the two groups analysed in the present study differ not only in type of exposure but also in amount of exposure, the CLIL group having received more instruction hours, which may be interpreted as a factor contributing to the superiority observed. Some researchers (Lasagabaster 2008; Navés 2011; Ruiz de Zarobe 2008, 2010) have tried to rule out the effect of amount of exposure by comparing students receiving additional CLIL exposure to traditional students enrolled one or two grades above. The results of these comparisons seem to indicate that, in spite of their younger age, CLIL students obtain better language outcomes than traditional students. Hence, we will try to address this type of comparison in future research. Alternatively, a comparison could be undertaken, if possible, between our CLIL students and a group of amount-of-exposure-matched Non-CLIL peers who started to learn English at an earlier age than their CLIL counterparts. A further limitation of our study relates to the nature of the instrument employed. We are aware that, since speakers were not given a set narration time in the story-telling activity, measurements such as the type-token ratio provided by CHILDES may be less reliable when interpreting density of vocabulary used (McKee et al. 2000), as those students who took longer to narrate the story are likely to have used a wider range of lexicon. 
In an attempt to control for this factor, we have provided narration time as a variable and, although we did not find significant differences in density (No. of different words/No. of words in table 4), we could see that the group using more narration time did have a higher ratio in this variable. The lack of significance in the present study may be due to the nature of the task: a picture-guided task, which could have masked these effects, as the pictures provided may have led the participants to access similar lexicon when compared to a free speech task or a story-telling task without guiding pictures.