Corpora Use in Second Language Learning

Hansol Lee (School of Education, University of California, Irvine)

Project Funding:
Republic of Korea Army

Unlike general vocabulary, which people acquire through their daily lives, second language (L2) vocabulary must be learned with few opportunities for exposure to L2 input. Because language learners need to understand multi-faceted disciplinary vocabulary in order to succeed, their lexical knowledge deserves particular attention. In response to this learning environment, the use of corpora (i.e., analyzing structured language datasets to understand how target vocabulary is used in authentic contexts) has been highlighted in L2 vocabulary learning since corpora emerged in their modern form alongside the development of computer technology. By exploring authentic linguistic data, students can, for example, find answers to their own linguistic questions about the contextual meaning, collocational patterns, or appropriate usage of target vocabulary.
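To make the idea of exploring a corpus concrete, the following is a minimal sketch of a keyword-in-context (KWIC) concordance, the kind of display a learner might scan to infer meaning and collocations. It is an illustration only: the function name `kwic`, the `width` parameter, and the three-sentence mini-corpus are hypothetical and do not come from the project or from any particular concordancing tool.

```python
import re


def kwic(sentences, target, width=30):
    """Return keyword-in-context lines for each occurrence of `target`.

    `sentences` is a list of strings (a hypothetical mini-corpus),
    `target` is the word to look up, and `width` is the number of
    characters of context shown on each side of the keyword.
    """
    # Match the target as a whole word, ignoring case.
    pattern = re.compile(r"\b" + re.escape(target) + r"\b", re.IGNORECASE)
    lines = []
    for sentence in sentences:
        for match in pattern.finditer(sentence):
            left = sentence[max(0, match.start() - width):match.start()]
            right = sentence[match.end():match.end() + width]
            # Right-align the left context so the keywords line up.
            lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# Hypothetical sample sentences, invented for illustration only.
corpus = [
    "The results indicate a significant effect of instruction.",
    "A significant proportion of learners improved.",
    "No difference was found between the two groups.",
]

for line in kwic(corpus, "significant"):
    print(line)
```

Aligning the keyword in a single column is the design choice that lets a learner compare what typically appears to its left and right (here, for instance, "significant effect" and "significant proportion").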

Information on ongoing activities.
Currently, I am working on an article that examines the overall effect of corpora use on L2 vocabulary learning. Corpora, which are structured collections of language data, have been widely used by language teachers, both to provide their students with authentic sample sentences and to give students opportunities to explore structured text autonomously. Empirical findings on corpora use, however, have not been consistent, and no recent meta-analysis across the broad range of corpora studies has attempted to reconcile these inconsistencies. We are therefore conducting a meta-analysis to synthesize, systematically and comprehensively, the findings of empirical research on the impact of corpora use on academic vocabulary learning. So far, a total of 20 empirical studies have met our criteria for inclusion (e.g., the use of a control group). Using the 97 effect sizes drawn from these studies, we will estimate the overall effect of corpora use on L2 vocabulary learning. We will also examine how mean effect sizes vary across different teaching and learning contexts. In so doing, we will discuss the implications of these findings for teachers deciding how to use corpora for different pedagogical purposes.
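As a rough illustration of what estimating an overall effect involves, the sketch below pools effect sizes with a standard DerSimonian-Laird random-effects model. This is a deliberate simplification and not the project's method: it treats every effect size as independent, whereas several effect sizes taken from the same study are not, which is the kind of dependency a multilevel model is meant to handle. The function name `random_effects_pool` and the input values are hypothetical and are not data from the included studies.

```python
import math


def random_effects_pool(effects, variances):
    """Pool effect sizes with the DerSimonian-Laird random-effects estimator.

    `effects` are standardized mean differences and `variances` are their
    sampling variances. Returns (pooled effect, standard error, tau^2).
    Assumes the effect sizes are independent (a simplification).
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and mean.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed_mean = sum(wi * g for wi, g in zip(w, effects)) / sum_w
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(wi * (g - fixed_mean) ** 2 for wi, g in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each sampling variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * g for wi, g in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return pooled, se, tau2


# Hypothetical effect sizes and variances, invented for illustration only.
example_effects = [0.20, 0.55, 0.90, 0.35]
example_variances = [0.04, 0.06, 0.05, 0.03]

pooled, se, tau2 = random_effects_pool(example_effects, example_variances)
print(f"pooled effect = {pooled:.3f}, 95% CI half-width = {1.96 * se:.3f}, tau^2 = {tau2:.3f}")
```

Variation across teaching and learning contexts could then be explored by pooling within subgroups or by adding moderators, again under the independence assumption stated above.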


Lee, H., Warschauer, M., & Lee, J. H. (in press). Advancing CALL research via data mining techniques: Unearthing hidden groups of learners in a corpus-based L2 vocabulary learning experiment. ReCALL.

Lee, H., Warschauer, M., & Lee, J. H. (2018). The effects of corpus use on second language vocabulary learning: A multilevel meta-analysis. Applied Linguistics. Advance online publication.

Lee, H., Warschauer, M., & Lee, J. H. (2017). The effects of concordance-based electronic glosses on L2 vocabulary learning. Language Learning & Technology, 21(2), 32–51.

Lee, Ho, Lee, Hansol, & Lee, J. H. (2016). Evaluation of electronic and paper textual glosses on second language vocabulary learning and reading comprehension. The Asia-Pacific Education Researcher, 25(4), 499–507.

Lee, H., & Lee, J. H. (2015). The effects of different glossing formats on foreign language vocabulary acquisition. The Asia-Pacific Education Researcher, 24(4), 591–601.

Lee, J. H., Lee, H., & Sert, C. (2015). A corpus approach for autonomous teachers and learners: Implementing an on-line concordancer on teachers’ laptops. Language Learning & Technology, 19(2), 1–15.

Lee, H., & Lee, J. H. (2013). Implementing glossing in mobile-assisted language learning environments: Directions and outlook. Language Learning & Technology, 17(3), 6–22.

For more information, please contact Hansol Lee (
