Quantifying the Dialect Gap and its Correlates Across Languages
Historically, researchers and consumers have noticed a decrease in quality
when applying NLP tools to minority variants of languages (e.g., Puerto Rican
Spanish or Swiss German), but studies exploring this have been limited to a
select few languages. Additionally, past studies have mainly been conducted in
a monolingual context, so cross-linguistic trends have not been identified and
tied to external factors. In this work, we conduct a comprehensive evaluation
of the most influential, state-of-the-art large language models (LLMs) across
two high-use applications, machine translation and automatic speech
recognition, to assess their functionality on the regional dialects of several
high- and low-resource languages. Additionally, we analyze how the regional
dialect gap is correlated with economic, social, and linguistic factors. The
impact of training data, including related factors like dataset size and its
construction procedure, is shown to be significant but not consistent across
models or languages, meaning a one-size-fits-all approach cannot be taken in
solving the dialect gap. This work will lay the foundation for advancing the
field of dialectal NLP by documenting evident disparities and identifying
possible pathways for addressing them through mindful data collection. Comment: Accepted to EMNLP Findings 202
Huqariq: A Multilingual Speech Corpus of Native Languages of Peru for Speech Recognition
The Huqariq corpus is a multilingual collection of speech from native
Peruvian languages. The transcribed corpus is intended for the research and
development of speech technologies to preserve endangered languages in Peru.
Huqariq is primarily designed for the development of automatic speech
recognition, language identification and text-to-speech tools. To collect the
corpus sustainably, we rely on crowdsourcing.
Huqariq includes four native languages of Peru, and it is expected that by the
end of the year 2022, it can reach up to 20 native languages out of the 48
native languages in Peru. The corpus has 220 hours of transcribed audio
recorded by more than 500 volunteers, making it the largest speech corpus for
native languages in Peru. In order to verify the quality of the corpus, we
present speech recognition experiments using 220 hours of fully transcribed
audio. Comment: Language Resources and Evaluation Conference (LREC 2022)
Pathways through early childhood education in Ethiopia, India and Peru: Rights, equity and diversity. Young Lives Working Paper 54
The potential of quality early childhood and primary education to help break inter-generational poverty cycles is widely recognised. My focus is on how far this potential is being translated into reality, through implementing positive early childhood policies in practice. The paper summarises evidence from Young Lives research into early transitions, based on both survey and in-depth qualitative research with 2,000 Young Lives younger cohort children in Ethiopia, Andhra Pradesh (India) and Peru. Primary education is still being consolidated in Ethiopia, and pre-school is a minority urban experience, mainly offered by the private sector. Peru offers a very different story, with a well-established government primary and pre-school system but concerns about quality and coordination between sectors. Andhra Pradesh offers the most complex set of challenges, with a long-established government system of ECCE, but an increasing trend towards use of private services, including amongst the poorest communities. The paper offers five broad conclusions, about the importance of: ensuring quality and equity in early education; better coordinated pre-school and school systems; targeting the most vulnerable and disadvantaged children; recognising the full range of equity issues; and ensuring more effective governance, including governance of the private sector
Social Justice Documentary: Designing for Impact
Explores current methodologies for assessing social issue documentary films by combining strategic design and evaluation of multiplatform outreach and impact, including documentaries' role in network- and field-building. Includes six case studies
Towards Responsive Schools Supporting Better Schooling for Disadvantaged Children
Teaching/Communication/Extension/Profession
Learning in developing economy clusters: The role of intermediary organisations
This thesis was submitted for the degree of Doctor of Philosophy and was awarded by Brunel University. Intermediary organisations play a distinctive, yet underestimated, role in the learning processes of developing economy clusters. This study situates itself in a new way of thinking about knowledge and innovation; one that emphasises learning as a social process, within communities that emerge through the development of shared practice. It finds that, while previous formulations of intermediaries have emphasised linking and accessing, in some contexts their roles are more fundamental and include community-building and coordinating common strategies.
For many agricultural clusters, reflecting a move in developing economies from 'import-substitution' towards a focus on exports, learning and innovation have become central. Where producers face challenges in knowledge generation and transfer (Bessant et al, 2003), clustering aids knowledge diffusion amongst them and stimulates the learning necessary to penetrate international markets (Schmitz and Nadvi, 1999; Humphrey and Schmitz, 2000). While opportunities sometimes exist for learning from global buyers, it is more common in natural-resource based clusters for the onus to be on producers to develop their own capabilities (Gomes, 2006). This study examines the contribution a diverse group of actors, categorised as intermediary organisations, make to this process.
The practice-based perspective (Amin and Cohendet, 2004) provides a framework through which the intermediary role is conceptualised, alongside insights from the innovation and network literatures (Howells, 2006; Burt, 2005). While these literatures predominantly focus on linking and accessing, intermediaries' roles are found, in certain developing economy contexts, to stretch wider. Through a case study of a Peruvian agricultural cluster, they are identified as performing a cluster-building role, by providing a platform for inter-firm cooperation. They also, through their ability to coordinate firm actions, facilitate opportunities for value chain learning. In addition, they provide new knowledge inputs to cluster actors, either through their own knowledge creation capabilities or their ability to translate and adapt existing knowledge.
Proceedings of the Fifth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2018)
Peer reviewed
The Andean Tribunal of Justice and Its Interlocutors: Understanding Preliminary Reference Patterns in the Andean Community
In the European Union, national courts have been key intermediaries in helping to bolster and expand the authority of the European Court of Justice through its preliminary reference mechanism. This article analyzes the role of national judges in the Andean Community, a regional legal system whose judicial institution - the Andean Tribunal of Justice (ATJ) - was modeled directly on its European predecessor. Our analysis is based on an original coding of every publicly available national court referral to the ATJ from 1987 to 2007 and interviews with over forty participants in the Andean legal system. We find that the relationship between the ATJ and national judges differs significantly from the relationship between the ECJ and its domestic judicial colleagues. As in Europe, references from national judges account for the vast majority of cases on the ATJ's docket. But unlike in Europe, national courts are mostly passive intermediaries. Our coding reveals that national judges do not pose provocative questions to the ATJ, and that there is significant cross-national variation in referral patterns. Interviews corroborate what the data suggests: national judges have a circumscribed understanding of what Andean law requires of them. More than 90% of references involve technical issues of Andean intellectual property (IP) law and the registration decisions of domestic IP administrative agencies. National judges have embraced the ATJ's active role in IP disputes because of the support of these agencies, which seek the Tribunal's guidance to interpret vague areas of Andean law. Outside the area of IP, national judges are far more reluctant, contributing to the limited penetration of Andean law into national legal orders. We conclude by comparing the role of national judges in Europe to their role in the Andean context, extracting broader insights about the role of national judges in building international rules of law
Pathways through early childhood education in Ethiopia, India and Peru: rights, equity and diversity
The potential of quality early childhood and primary education to help break intergenerational poverty cycles is widely recognised. My focus is on how far this potential is being translated into reality, through implementing positive early childhood policies in practice. The paper summarises evidence from Young Lives research into early transitions, based on both survey and in-depth qualitative research with 6,000 children from the Young Lives younger cohort in Ethiopia, Andhra Pradesh (India) and Peru. Primary education is still being consolidated in Ethiopia, and pre-school is a minority urban experience, mainly offered by the private sector. Peru offers a very different story, with a well-established government primary and pre-school system but concerns about quality and coordination between sectors. Andhra Pradesh offers the most complex set of challenges, with a long-established government system of ECCE, but an increasing trend towards use of private services, including amongst the poorest communities. The paper offers five broad conclusions, about the importance of: ensuring quality and equity in early education; better coordinated pre-school and school systems; targeting the most vulnerable and disadvantaged children; recognising the full range of equity issues; and ensuring more effective governance, including governance of the private sector