Data and methods for a visual understanding of sign languages
Signed languages are complete and natural languages used as the first or preferred mode of communication by millions of people worldwide. Unfortunately, however, they continue to be marginalized languages. Designing, building, and evaluating models that work on sign languages presents compelling research challenges and requires interdisciplinary and collaborative efforts. The recent advances in Machine Learning (ML) and Artificial Intelligence (AI) have the potential to enable better accessibility for sign language users and to narrow the existing communication barrier between the Deaf community and non-sign language users. However, recent AI-powered technologies still do not account for sign language in their pipelines. This is mainly because sign languages are visual languages that use manual and non-manual features to convey information, and do not have a standard written form. Thus, the goal of this thesis is to contribute to the development of new technologies that account for sign language by creating large-scale multimodal resources suitable for training modern data-hungry machine learning models, and by developing automatic systems for computer vision tasks that aim at a better visual understanding of sign languages.
Thus, in Part I we introduce the How2Sign dataset, a large-scale collection of multimodal and multiview sign language videos in American Sign Language. In Part II, we contribute to the development of technologies that account for sign languages. In Chapter 4, we present Spot-Align, a framework based on sign spotting methods that automatically annotates sign instances in continuous sign language. We further present the benefits of this framework and establish a baseline for the sign language recognition task on the How2Sign dataset. In Chapter 5, we leverage the different annotations and modalities of How2Sign to explore sign language video retrieval by learning cross-modal embeddings. Finally, in Chapter 6, we explore sign language video generation by applying Generative Adversarial Networks to the sign language domain, and assess whether and how well sign language users can understand automatically generated sign language videos by proposing an evaluation protocol based on How2Sign topics and English translations.
The recurrent evolution of extremely resistant xylem
Key message: Highly resistant xylem has evolved multiple times over the past 400 million years.
Context: Water is transported under tension in xylem and is consequently vulnerable to invasion by air and the formation of embolism. A debate has raged over whether embolism formation is non-reversible, occurring only at low water potentials, or a regular diurnal occurrence that is non-lethal because of a capacity to refill embolised conduits.
Aims: This commentary is on a recent article, which utilised new non-invasive imaging techniques for assessing the formation of embolism in xylem, finding that the xylem of Laurus nobilis was highly resistant to the formation of embolism.
Methods: The results of this discovery are placed in the context of knowledge from the diversity of species so far identified with xylem similarly highly resistant to embolism formation.
Results: The discovery that L. nobilis has xylem highly resistant to embolism formation adds to a body of literature suggesting that the resistance of xylem to embolism formation is a key adaptation utilised by many species native to seasonally dry environments. Highly resistant xylem has evolved numerous times across the angiosperm clade.
Conclusion: With more studies utilising similar observational and direct methods of assessing embolism resistance, further insight into the ecological and evolutionary relevance of this trait is imminent.
Patterns of the PRICE vowel in Liverpool English: History, Phonetics, and Corpus Phonology
Previous work on the Liverpool dialect has established that the PRICE vowel has an interesting phonological pattern; even so, there has never been a comprehensive study to confirm this claim. This dissertation provides an exploration of the PRICE vowel in Liverpool English through a corpus phonology approach. The present study finds that in the Liverpool dialect there are five PRICE vowel phonological patterns with a combination of four variants in three environments. These variants are: a raised-nucleus diphthong, a non-raised-nucleus diphthong, a lengthened-nucleus diphthong, and a monophthong; the conditioning environments are: before voiceless obstruents, voiced obstruents, and nasal consonants. A striking observation is that the phonological patterns seem to have restrictions on variant combinations, which supports the hypothesis that Liverpool English has phonological patterns, rather than a number of variants available for each environment independently of the other variants. Specifically, there is no phonological pattern with a raised variant that does not also have a monophthongal variant. Furthermore, an informant who produces a monophthong in the voiced environment necessarily has a monophthong before nasal consonants. The results of the present study may also suggest that there is phonological change in progress in the Liverpool PRICE vowel, as two of the phonological patterns are produced exclusively by younger females; many previous studies have suggested that younger women are the innovators in linguistic change. Finally, this dissertation takes a novel approach to explaining the origins of the PRICE vowel raising patterns: three of the current theories on the origins of raising patterns in English are evaluated and combined in a way that encompasses the subfields of historical linguistics, phonology, and dialectology in the final explanation.
Meta TCC
Meta TCC presents itself as an investigation into the reach of the theoretical resource that is the undergraduate thesis (trabalho de conclusão de curso) in the development of research in the visual arts. Starting from the analysis of a selection of work by the artist duo "Amanda & Isadora", the text seeks to interweave the understanding of its content with an awareness of its format. Mentions of institutionalized theorists such as Jean Baudrillard and Vilém Flusser are paired with studies by present-day researchers such as Hito Steyerl, Boris Groys, and Marisa Olsen. In this way, the content drawn from diverse instances and conceptual strands is presented analogously to the contemporary mechanisms of knowledge circulation that the text itself sets out to highlight. The text proposes that Amanda & Isadora's "Metaimagens" (2021-) are inserted into a discussion broader than the artists' individual work, insofar as questions that run through their artistic production are brought out over the course of the text.
Dialectology, phonology, diachrony: Liverpool English realisations of PRICE and MOUTH.
Dialect emergence or new-dialect formation in intensive contact situations has
been the subject of research for decades. Approaches to dialect emergence
have led to a more solid understanding of the origins of specific phonological
features. This line of research often approaches issues of new-dialect formation
and phonological feature development within the confines of one linguistic
subfield. However, new-dialect formation is a multifaceted phenomenon which
results from a combination of dialectological, phonological and historical
linguistic factors. The current thesis presents a comprehensive account of
phonological feature development in new-dialect formation from a combined
theoretical perspective by exploring historical and contemporary processes
in the emergence of phonologically-conditioned variation in the PRICE and
MOUTH lexical sets in Liverpool English.
This feature has been widely researched in other varieties of English and
has previously been attributed to new-dialect formation. However, little is
known about the patterns of PRICE and MOUTH in Liverpool English. The
current thesis relies on multiple methods of data collection (e.g. a combination
of fieldwork and corpus data), various quantitative methods, and detailed
acoustic analyses (e.g. formants and Euclidean distance in a two-dimensional
formant space) to investigate the precise details and the processes involved in
the emergence and development of PRICE and MOUTH patterns in Liverpool
English.
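The distance measure mentioned above can be illustrated with a short sketch. The helper below is a hypothetical function (not taken from the thesis) that computes the Euclidean distance between a vowel's onset and offset in a two-dimensional (F1, F2) formant space; the formant values are invented for illustration.

```python
import math

def trajectory_length(onset, offset):
    """Euclidean distance between the onset and offset of a vowel
    in a two-dimensional (F1, F2) formant space, in Hz."""
    return math.hypot(offset[0] - onset[0], offset[1] - onset[1])

# Illustrative (invented) formant values for two PRICE tokens:
# a full diphthong travels far in the vowel space, while a
# monophthongal realisation barely moves.
diphthong = trajectory_length((750, 1100), (400, 2100))
monophthong = trajectory_length((750, 1100), (730, 1150))
```

A measure of this kind reduces each token to a single scalar, which is what makes diphthong versus monophthong variation amenable to the quantitative comparisons described above.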
Liverpool English is thought to have emerged during the 19th century as
a result of extensive and prolonged immigration from the surrounding areas
of Lancashire and Cheshire, and from Ireland, Wales, and Scotland. However,
the specific timing, extent of immigration, and proportion of immigrant
populations have not been investigated in detail. The current thesis provides
the first in-depth analysis of historical census records in order to extend our
knowledge of the populations in Liverpool at the time of new-dialect formation.
The insights obtained from this analysis provide a more nuanced picture of
the development of Liverpool English. They are essential for determining
what dialects potentially contributed to dialect formation and the repertoire
of PRICE and MOUTH variants present at the time that these processes were
developing. The analysis of historical census records is further augmented
by using a combination of quantitative methods and historical corpora in
order to gain a fuller understanding of the processes involved in the formation
of these dialect features.
The contemporary investigation of PRICE and MOUTH in Liverpool English
shows that these patterns are separate, but related, and that their phonological
conditioning environments resemble those reported for cases of PRICE and
MOUTH variation in other varieties of English. I present a detailed overview
of the phonetics and phonology of PRICE and MOUTH variation in Liverpool
English, looking at a wide range of conditioning environments. This investigation
also reviews a range of different quantitative measurements useful
for research on variation involving diphthongs.
The origins of PRICE and MOUTH phonological patterns in Liverpool English
indicate that an approach combining different theoretical perspectives is
required to adequately explain the development of these patterns. The current
thesis suggests that PRICE and MOUTH phonologically conditioned variation
in Liverpool English initially resulted from variants of different dialects within
the dialect contact situation. However, some features of the contemporary
patterns developed following new-dialect formation as a by-product of phonetic
and phonological properties of diphthong production in certain following
environments. By approaching the development of these phonological features
in Liverpool English from a combination of theoretical perspectives, the
current thesis expands our understanding of emergent phonological features
in new-dialect formation.
Light stability of anthocyanins in blueberry juice and in the extract obtained from blueberry pomace
Sign language video retrieval with free-form textual queries
Systems that can efficiently search collections of sign language videos have been highlighted as a useful application of sign language technology. However, the problem of searching videos beyond individual keywords has received limited attention in the literature. To address this gap, in this work we introduce the task of sign language retrieval with textual queries: given a written query (e.g. a sentence) and a large collection of sign language videos, the objective is to find the signing video that best matches the written query. We propose to tackle this task by learning cross-modal embeddings on the recently introduced large-scale How2Sign dataset of American Sign Language (ASL). We identify that a key bottleneck in the performance of the system is the quality of the sign video embedding, which suffers from a scarcity of labelled training data. We therefore propose SPOT-ALIGN, a framework for interleaving iterative rounds of sign spotting and feature alignment to expand the scope and scale of available training data. We validate the effectiveness of SPOT-ALIGN for learning a robust sign video embedding through improvements in both sign recognition and the proposed video retrieval task.
This work was supported by the project PID2020-117142GB-I00, funded by MCIN/AEI/10.13039/501100011033, ANR project CorVis ANR-21-CE23-0003-01, and gifts from Google and Adobe. AD received support from la Caixa Foundation (ID 100010434), fellowship code LCF/BQ/IN18/11660029.
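As a rough sketch of the retrieval setup described above (assuming, for illustration, that text queries and videos have already been mapped into a shared embedding space; the function name is hypothetical, not the paper's API), ranking candidate videos against a written query reduces to a cosine-similarity search:

```python
import numpy as np

def rank_videos(query_emb, video_embs):
    """Rank sign language videos for one textual query by cosine
    similarity in a shared cross-modal embedding space."""
    q = query_emb / np.linalg.norm(query_emb)
    v = video_embs / np.linalg.norm(video_embs, axis=1, keepdims=True)
    scores = v @ q              # one similarity score per video
    return np.argsort(-scores)  # video indices, best match first

# Toy example: three "video" embeddings in a 4-d space; the query
# vector is constructed to lie closest to the third video.
videos = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0, 0.0],
                   [0.1, 0.2, 0.9, 0.3]])
query = np.array([0.1, 0.1, 1.0, 0.2])
ranking = rank_videos(query, videos)
```

The real system's difficulty lies in learning embeddings where this geometric proximity reflects semantic match, which is precisely what the scarce labelled data (and hence SPOT-ALIGN) addresses.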
Black women journalists and racism in televised sports journalism
This work analyses the presence of Black women sports journalists on the sports channels and programmes broadcast in Rio de Janeiro, and seeks to show how this group does not benefit from the inclusion and gender-equity strategies in sports journalism. We analyse how racism has historically placed Black women at a disadvantage relative to white women, and how the combined oppression of race and gender reduces the opportunities of Black women journalists in sport. To this end, we draw on bibliographic and historical review and on the experiences of working professionals. In addition to news reports, audio, and video available in the press, we include accounts given in interviews by two journalists from one of the channels selected for the research.