Introduction
Following its introduction to the research field in the 1950s, corpus-based research has gained prominence, particularly in the fields of linguistics and language education. Corpora have been used increasingly in almost all branches of linguistics, including but not limited to lexicography, grammatical and lexico-grammatical studies, stylistics, pragmatics, semantics, sociolinguistics, discourse analysis, and language pedagogy. With its empirical nature, corpus data makes linguistic analysis more objective while also providing insight into the ways that language use differs according to register, genre, discipline, and mode of communication. The real and authentic occurrences of language provided by corpus data soon came to be recognized as a great resource for language pedagogy. This upsurge of interest in the use of corpora in language research and pedagogy has in turn led to the emergence of corpus-based learning materials for use in language learning classrooms.
The pedagogical applications of corpus use include both direct and indirect access to corpora. These various interaction patterns with corpora are referred to as Data-Driven Learning (DDL) (Boulton, 2010). Direct DDL, the dominant paradigm in earlier studies, gives language learners direct access to concordancing software as they explore the rules and patterns of specific linguistic items, whereas in indirect DDL learners use paper-based materials prepared by teachers or researchers on the basis of corpus data. While language learning practices in traditional classrooms depend heavily on explicit instruction and textbook use, a data-driven approach to learning provides room for guided discovery tasks in which the learner takes on the role of a researcher who treats language as data and undertakes tasks such as observation, critical reasoning, and analysis. A major component of this approach is that it shifts the role of the teacher from knowledge presenter to researcher. Leech (1994) defines this new role of the teacher as “Teacher the Seeker, who (like the academic scientist or scholar) knows only some of the answers and would like to know more” (p. 20).
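To make concrete what a learner sees in direct DDL, the following is a minimal sketch of a keyword-in-context (KWIC) display, the core output of concordancing software. It is an illustration only and is not taken from any of the tools or studies reviewed here; the function name `kwic`, the `width` parameter, and the sample sentences are all hypothetical.

```python
import re


def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of `keyword`.

    Hypothetical helper for illustration; real concordancers add sorting,
    tagging, and frequency information.
    """
    lines = []
    # Whole-word, case-insensitive match so "depend" does not hit "Independent".
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        # Slice up to `width` characters of context on each side of the match.
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keyword lines up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right:<{width}}")
    return lines


# Invented sample text standing in for a corpus.
sample = (
    "Students depend on feedback. Results depend heavily on the task. "
    "Outcomes may depend on training."
)

for line in kwic(sample, "depend", width=20):
    print(line)
```

Aligning the keyword in a central column is what allows a learner to scan vertically and notice recurring patterns, here the preposition that follows the verb. In indirect DDL, a teacher would prepare such lines in advance and hand them out on paper.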
Data-driven Learning: Key Factors to Consider in Classroom Implementation
The movement toward a more discovery-based type of learning in language education has brought with it novel approaches to curriculum and material design as well as assessment and evaluation. An ever-expanding body of scholarly research has demonstrated the potential of DDL-based instruction for enhancing the writing skills of L2 learners. The advantages of a corpus-assisted language learning approach occupy a prominent place in the DDL literature; however, a critical pedagogical issue is how language educators can ensure the effectiveness of DDL activities in their respective classrooms. The design of syllabi, materials, and activities requires considerable attention, and it should be informed by an awareness of the factors that influence the learning outcomes of DDL activities. To that end, the current paper reviews the literature and presents an overview of some of the important variables to consider in the design and implementation of DDL-based instruction in EFL/ESL writing classrooms.
Task Type
As research has shown, an important factor to take into consideration in instructional design is the task type used in DDL-based writing activities. Previous studies, for instance, investigated learning outcomes regarding the use of linking adverbials, collocations, idea generation and creative writing, and synonyms for adjectives. In most of these studies, error-correction tasks have been the central focus of the research (Charles, 2007). Similarly, in his study with 93 upper-intermediate EFL students, Tono (2016) investigated how corpus use could contribute to revising and correcting different types of errors. The students produced a short essay without using a dictionary, and the instructors gave coded feedback on errors in lexis and grammar. After the errors were classified into categories such as omission, addition, and misformation, the students were asked to consult corpus tools to correct them. While the study revealed positive changes in omission and addition errors, the accuracy rates for misformation errors were significantly lower. In other words, although a DDL approach could help resolve issues in various aspects of writing, it may not work equally well in correcting all types of errors. Therefore, it is the teacher's primary responsibility to make sure that the selected task is appropriate for students' levels and needs.
Methodology
Another important factor is the methodology employed in the process of DDL implementation. Many recent studies have demonstrated that direct and indirect uses of corpora each have their own distinctive advantages and disadvantages. A large body of direct DDL studies has revealed several positive changes in L2 learners' writing practices, including improvement in genre-specific language use, fewer linguistic errors in the use of collocational and colligational patterns, lexical development in the selection of contextual vocabulary, and an increased awareness of rhetorical language and lexico-grammatical patterning (Charles, 2007). The second strand of studies, which employed indirect DDL, revealed positive outcomes for the efficacy of printed corpus-based materials in language learning, such as enhanced noticing skills in collocation use and prepositional colligations and improved accuracy in the use of phrasal verbs, linking adverbials, and the passive voice, to name a few (Boulton, 2010). Most of these studies, however, explored the effectiveness of either approach exclusive of the other; only a few researchers have underscored the necessity of a comparative study of the benefits of direct corpus consultation as opposed to indirect application (Boulton, 2010). Yoon and Jo (2014), for example, undertook a small-scale study investigating the influence of both methods on students' error correction practices. The rates of self-correction were found to be higher for lower-level learners in indirect DDL activities. Direct DDL activities, on the other hand, proved to have positive results for learner autonomy among advanced learners. The study lends support to the findings of Boulton (2010): that is, indirect DDL instruction could be more appropriate for lower-level or novice L2 writers and could also serve as a transitional step to direct DDL.
Training in Corpus Consultation
Another necessary component of improving instruction and facilitating successful outcomes in DDL-oriented learning contexts is the availability of and access to extensive training in corpus consultation. Direct use of corpora, in particular, is typically assumed to be confined to advanced levels (Luo, 2016). Nevertheless, as previous studies have also suggested, training is essential for both lower-level and advanced students: appropriate training could reduce the challenges that lower-level students experience in using corpus tools, and it could also help both groups of learners exploit corpora more effectively. Numerous studies have found that training involving scaffolding prompts and teacher guidance improves students' performance in DDL-based activities, minimizes challenges in corpus search techniques, and helps develop a more positive attitude towards using corpus-based tools (Chang & Sun, 2009). However, a synthesis of the results from these studies points to a few principles to bear in mind to ensure the effectiveness of such training. First, a training session should be situated within the framework of learners' individual characteristics, needs, and abilities. Second, sessions should be arranged so that learners can recognize the complementary role that corpus tools play in their writing process and thus develop a meaningful partnership with those tools. Third, training should include a wide variety of activities, ranging from teacher modelling to group work and inductive/deductive tasks, before immersing students in individual exploratory tasks (Chang & Sun, 2009).
Selection of Corpus Tools
Lastly, the type and size of the corpus play a decisive role in the effectiveness of DDL activities. An analysis of the studies in the field suggests that the most popular tools used as reference sources in DDL tasks include general corpora, such as the British National Corpus (BNC), the Corpus of Contemporary American English (COCA), and the Collins COBUILD Corpus (Boulton, 2018). Others include specialized corpora, which provide register-, genre-, and discipline-specific information on language use. In addition to these tools, web-based corpus tools and search engines such as Google have also been used in recent studies. Each has its own merits and demerits: a general corpus has the distinctive advantage of catering to various student needs, whereas a specialized corpus compiled from discipline-specific writing samples can be particularly useful in EAP/ESP settings. Given that writing is more of an individual and private process, situating the corpus selection process within a learner-specific framework is instrumental. Developing such a framework is, of course, conditional upon tracking learner interaction with the corpus tools more closely.
Conclusion
As the review of the DDL literature suggests, task type, methodology, the availability of corpus consultation training, and the type of corpora chosen for DDL-based activities are important variables that need to be taken into consideration before designing DDL-based activities. However, it should also be noted that most of these studies have their own limitations. Further complementary analyses should be undertaken to explore the long-term benefits of corpus use in DDL classrooms. In addition, follow-up studies are needed to investigate whether learners have incorporated corpus use into their real-life writing tasks and/or independent study. Also, as most of these studies examined direct and indirect applications of corpora exclusive of one another, future research should alternate between the two approaches to provide a better understanding of their effects and to explore the extent of any possible impact.
References
Boulton, A. (2010). Data-driven learning: On paper, in practice. In T. Harris & M. Jaén (Eds.), Corpus linguistics in language teaching (pp. 17–52). Bern: Peter Lang.
Chang, W.-L., & Sun, Y.-C. (2009). Scaffolding and web concordancers as support for language learning. Computer Assisted Language Learning, 22(4), 283–302.
Charles, M. (2007). Reconciling top-down and bottom-up approaches to graduate writing: Using a corpus to teach rhetorical functions. Journal of English for Academic Purposes, 6, 289–302.
Luo, Q. (2016). The effects of data-driven learning activities on EFL learners’ writing development. Springer Open, 5(1255), 1–13.
Yoon, H., & Jo, J. W. (2014). Direct and indirect access to corpora: An exploratory case study comparing students’ error correction and learning strategy use in L2 writing. Language Learning and Technology, 18(1), 96–117.
Gözde Durgut has taught ESL/EFL for seven years and is currently a graduate student in the Applied Linguistics and TESOL program at the University of Alabama. Her research interests include teaching academic writing, corpus linguistics, ESL/EFL pedagogy, material design, and curriculum development.