Analysing corpus-based criterial conjunctions for automatic proficiency classification

Ángeles Zarco-Tejada, Carmen Noya Gallardo, Mª Carmen Merino Ferradá, Isabel Calderón López


The linguistic profiling of L2 learning texts can serve as a model for the automatic proficiency assessment of new texts. Proficiency levels are distinguished by many different linguistic features, among which the use of cohesive devices can be criterial for level distinctions, whether in the number of conjunctions used (quantitative), in their type and variety (qualitative), or both. We have carried out such an analysis on a subgroup of the CLEC (CEFR-levelled English Corpus) using Coh-Metrix, a tool that computes cohesion and coherence metrics for written and spoken texts. Our results suggest that automatic proficiency-level assessment needs a deeper examination of conjunctions, one that relies on the analysis of conjunction types and conjunction variety together with an analysis of lexical choice. A variable based on familiarity ranks could help to predict proficiency-oriented levels of cohesion.
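The quantitative and qualitative views of conjunction use mentioned above can be illustrated with a minimal Python sketch. It is not the Coh-Metrix procedure: the conjunction inventory, the four type labels and the function name `conjunction_profile` are hypothetical choices made only for illustration, and single-word matching ignores ambiguous items such as "since".

```python
import re
from collections import Counter

# Hypothetical, deliberately small inventory for illustration only;
# this is NOT the connective list used by Coh-Metrix.
CONJUNCTIONS = {
    "additive": {"and", "also", "moreover", "furthermore"},
    "adversative": {"but", "however", "although", "whereas"},
    "causal": {"because", "so", "therefore", "since"},
    "temporal": {"then", "when", "after", "before"},
}


def conjunction_profile(text):
    """Return simple quantitative and qualitative conjunction measures."""
    # Naive tokenisation: lower-cased runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    # Map each conjunction form to its (assumed) type.
    lookup = {form: ctype for ctype, forms in CONJUNCTIONS.items() for form in forms}
    hits = [tok for tok in tokens if tok in lookup]
    by_type = Counter(lookup[tok] for tok in hits)
    return {
        "tokens": len(tokens),
        # Quantitative: raw count and incidence per 1,000 words.
        "conjunctions": len(hits),
        "per_1000_words": 1000 * len(hits) / len(tokens) if tokens else 0.0,
        # Qualitative: how many different forms and which types occur.
        "distinct_forms": len(set(hits)),
        "by_type": dict(by_type),
    }


sample = ("I like tea and coffee, but I stayed home because it rained. "
          "However, then we left.")
profile = conjunction_profile(sample)
print(profile)
```

Two texts with the same incidence per 1,000 words can still differ in `distinct_forms` and `by_type`, which is the kind of qualitative difference a count-only measure would miss.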


Cohesion; language assessment; corpus linguistics; L2 English learning texts; linguistic profiling; Coh-Metrix.

