568 research outputs found
Comparing the production of a formula with the development of L2 competence
This pilot study investigates the production of a formula with the development of L2 competence over proficiency levels of a spoken learner corpus. The results show that the formula
in beginner production data is likely being recalled holistically from learners’ phonological
memory rather than generated online, identifiable by virtue of its fluent production in the absence
of any other surface structure evidence of the formula’s syntactic properties. As learners’ L2
competence increases, the formula becomes sensitive to modifications which show structural
conformity at each proficiency level. The transparency between the formula’s modification
and learners’ corresponding L2 surface structure realisations suggests that it is the independent
development of L2 competence which integrates the formula into compositional language,
and ultimately drives the SLA process forward.
LeBenchmark 2.0: a Standardized, Replicable and Enhanced Framework for Self-supervised Representations of French Speech
Self-supervised learning (SSL) is at the origin of unprecedented improvements
in many different domains including computer vision and natural language
processing. Speech processing drastically benefitted from SSL as most of the
current domain-related tasks are now being approached with pre-trained models.
This work introduces LeBenchmark 2.0, an open-source framework for assessing and
building SSL-equipped French speech technologies. It includes documented,
large-scale and heterogeneous corpora with up to 14,000 hours of heterogeneous
speech, ten pre-trained SSL wav2vec 2.0 models containing from 26 million to
one billion learnable parameters shared with the community, and an evaluation
protocol made of six downstream tasks to complement existing benchmarks.
LeBenchmark 2.0 also presents unique perspectives on pre-trained SSL models for
speech with the investigation of frozen versus fine-tuned downstream models,
task-agnostic versus task-specific pre-trained models as well as a discussion
on the carbon footprint of large-scale model training. Comment: Under submission at Computer Science and Language. Preprint allowe
Vagueness Markers in Italian
Moving from a broad socio-pragmatic perspective, this study analyses how speakers of different ages use a class of items and constructions that codify intentional vagueness in Italian.
Items such as un po’ ‘a bit’, tipo ‘kind’, diciamo ‘let us say’, così ‘so’, e cose del genere ‘and things like that’, or cosa ‘thing’ constitute a class of linguistically heterogeneous means that often function in conversation as vagueness markers, i.e. elements by which speakers signal that their knowledge or communication is somehow only tentative, approximate and vague. Their use does not depend on language-systemic factors, but results from a more or less conscious choice by speakers to enhance conversation for different reasons, which include facilitating the flow of conversation, signifying a vague categorization, and, eventually, being polite.
Operating at the pragmatic level, vagueness markers represent elements that are readily available to speakers’ choices and contribute to characterizing individual and generational discourse styles. Through a corpus-based analysis of listeners’ phone-ins to a Milan radio station, this study investigates how vagueness markers are used by speakers of different ages in 1976 and in 2010, and how Italian discourse styles have evolved in the last forty years.
Northeastern Illinois University, Academic Catalog 2023-2024
Machine Learning Algorithm for the Scansion of Old Saxon Poetry
Several scholars have designed tools to perform the automatic scansion of poetry in many languages, but none of these tools
deals with Old Saxon or Old English. This project aims to be a first attempt to create a tool for these languages. We
implemented a Bidirectional Long Short-Term Memory (BiLSTM) model to perform the automatic scansion of Old Saxon
and Old English poems. Since this model uses supervised learning, we manually annotated the Heliand manuscript, and
we used the resulting corpus as a labeled dataset to train the model. In the evaluation, the algorithm
reached 97% accuracy and a weighted average of 99% for precision, recall and F1 score. In addition, we tested
the model with some verses from the Old Saxon Genesis and some from The Battle of Brunanburh, and we observed that
the model predicted almost all Old Saxon metrical patterns correctly but misclassified the majority of the Old English input
verses.
Approximation in Morphology
This Special Issue "Approximation in Morphology" has been collated from peer-reviewed papers presented at the ApproxiMo 'discontinuous' workshop (2022), which was held online between December 2021 and May 2022, and organized by Francesca Masini (Bologna), Muriel Norde (Berlin) and Kristel Van Goethem (Louvain)
Operatic Pasticcios in 18th-Century Europe
In Early Modern times, techniques of assembling, compiling and arranging pre-existing material were part of the established working methods in many arts. In the world of 18th-century opera, such practices helped operas become a commercial success, because substituting or compiling arias to fit the singers' abilities proved the best recipe for fulfilling the expectations of audiences. Known as »pasticcios« since the 18th century, these operas have long been considered inferior patchwork. The volume collects essays that reconsider the pasticcio, contextualize it, define its preconditions, look at its material aspects and uncover its aesthetic principles.
Vielfalt und Integration - diversitá ed integrazione - diversité et intégration: Sprache(n) in sozialen und digitalen Räumen: Eine Festschrift für Elisabeth Burr
This Festschrift for Elisabeth Burr centres on diversity and integration in linguistics and the digital humanities. The contributions touch on questions central to Burr's work: How can language and its variation be adequately described in relation to social and geographical factors? How can computational and digital approaches be used to that end? These questions are linked to topics that are both important to her and currently relevant, drawn from sociolinguistics, gender linguistics and corpus linguistics, dialectology and linguistic geography, and the digital humanities.
The contributors include Stefania Spina, Thomas Krefeld, Annette Gerstenberg, Lazslo Hinyadi, Carol Chiodo and Lauren Tilton, Manuel Burghardt, Øyvind Eide, Jürgen Hermes, Andreas Witt, Ray Siemens, Arianna Ciula, Alejandro Bía and Rob Evans.
- …