The field of Data-Driven Learning (DDL), an approach to second language
learning in which the student interacts directly with corpus data, has made much progress in a matter of only a
few decades. However, certain frontiers have thus far remained underexplored, mostly as a result
of limited technological capabilities for a good portion of the field's existence. Until now, DDL has mainly centered
on text corpora, leaving aside such aspects of language learning as oral comprehension and speech production.
This doctoral dissertation presents the LITTERA corpus, and examines in depth how this English-Spanish parallel
literary speech corpus can be applied to language learning within the framework of DDL. The dissertation begins
with a general overview of the current state of DDL, followed by a detailed description of the creation and design of
the LITTERA corpus. Then a series of potential pedagogical exercises is presented, aimed at showing how
LITTERA can be applied to the learning of English phonology by Spanish-speaking students. The exercises set out
to examine how the different features of English prosody (co-articulatory phenomena such as linking, blending,
assimilation, elision, resyllabification, and palatalization, as well as vowel reduction) can be studied in the data to improve
students' oral comprehension and speech production. Furthermore, possible DDL question prompts are proposed
to explore the different features in the classroom.