1,425 research outputs found

    Data and methods for a visual understanding of sign languages

    Signed languages are complete and natural languages used as the first or preferred mode of communication by millions of people worldwide. Unfortunately, they continue to be marginalized. Designing, building, and evaluating models that work on sign languages presents compelling research challenges and requires interdisciplinary and collaborative efforts. Recent advances in Machine Learning (ML) and Artificial Intelligence (AI) have the power to improve accessibility for sign language users and to narrow the existing communication barrier between the Deaf community and non-sign language users. However, recent AI-powered technologies still do not account for sign language in their pipelines. This is mainly because sign languages are visual languages that use manual and non-manual features to convey information and do not have a standard written form. The goal of this thesis is therefore to contribute to the development of new technologies that account for sign language, by creating large-scale multimodal resources suitable for training modern data-hungry machine learning models and by developing automatic systems for computer vision tasks that aim at a better visual understanding of sign languages. In Part I, we introduce the How2Sign dataset, a large-scale collection of multimodal and multiview sign language videos in American Sign Language. In Part II, we present in Chapter 4 a framework called Spot-Align, based on sign spotting methods, to automatically annotate sign instances in continuous sign language. We further present the benefits of this framework and establish a baseline for the sign language recognition task on the How2Sign dataset. In Chapter 5, we use the different annotations and modalities of How2Sign to explore sign language video retrieval by learning cross-modal embeddings. In Chapter 6, we explore sign language video generation by applying Generative Adversarial Networks to the sign language domain, and we assess if and how well sign language users can understand automatically generated sign language videos by proposing an evaluation protocol based on How2Sign topics and English translation.
    Teoria del Senyal i Comunicacions. Postprint (published version).

    The recurrent evolution of extremely resistant xylem

    Key message: Highly resistant xylem has evolved multiple times over the past 400 million years. Context: Water is transported under tension in xylem and is consequently vulnerable to invasion by air and the formation of embolism. A debate has raged over whether embolism formation is non-reversible, occurring only at low water potentials, or a regular diurnal occurrence that is non-lethal because of a capacity to refill embolised conduits. Aims: This commentary is on a recent article that used new non-invasive imaging techniques to assess the formation of embolism in xylem and found that the xylem of Laurus nobilis was highly resistant to embolism formation. Methods: The results of this discovery are placed in the context of knowledge from the diversity of species so far identified as having xylem similarly highly resistant to embolism formation. Results: The discovery that L. nobilis has xylem highly resistant to embolism formation adds to a body of literature suggesting that the resistance of xylem to embolism formation is a key adaptation used by many species native to seasonally dry environments. Highly resistant xylem has evolved numerous times across the angiosperm clade. Conclusion: With more studies using similar observational and direct methods of assessing embolism resistance, further insight into the ecological and evolutionary relevance of this trait is imminent.

    Patterns of the PRICE vowel in Liverpool English: History, Phonetics, and Corpus Phonology

    Previous work on the Liverpool dialect has suggested that the PRICE vowel has an interesting phonological pattern; however, there has never been a comprehensive study to confirm this claim. This dissertation explores the PRICE vowel in Liverpool English through a corpus phonology approach. The present study finds that in the Liverpool dialect there are five PRICE vowel phonological patterns, each combining some of four variants across three environments. These variants are: a raised nucleus diphthong, a non-raised nucleus diphthong, a lengthened nucleus diphthong, and a monophthong; and the conditioning environments are: before voiceless obstruents, before voiced obstruents, and before nasal consonants. A striking observation is that the phonological patterns seem to have restrictions on variant combinations, which supports the hypothesis that Liverpool English has phonological patterns, rather than a number of variants available for each environment independent of the other variants. Specifically, there is no phonological pattern with a raised variant that does not have a monophthongal variant. Furthermore, an informant who produces a monophthong in the voiced environment necessarily has a monophthong before nasal consonants. The results of the present study may also suggest that there is phonological change in progress in the Liverpool PRICE vowel, as two of the phonological patterns are produced exclusively by younger females. Many previous studies have suggested that younger women are the innovators in linguistic change. Finally, this dissertation takes a novel approach in explaining the origins of the PRICE vowel raising patterns. Three of the current theories on the origins of raising patterns in English are evaluated and combined in a way that encompasses the subfields of historical linguistics, phonology and dialectology in the final explanation.

    Meta tcc

    The Meta TCC presents itself as an investigation of the reach of the undergraduate final thesis (trabalho de conclusão de curso, TCC) as a theoretical resource in the development of research in the visual arts. Starting from an analysis of a selection of the work of the artist duo “Amanda & Isadora”, the text seeks to intertwine the understanding of its content with an awareness of its format. References to established theorists such as Jean Baudrillard and Vilém Flusser are paired with studies by contemporary researchers such as Hito Steyerl, Boris Groys and Marisa Olsen. In this way, content drawn from various conceptual instances and strands is presented in a manner analogous to the contemporary mechanisms of knowledge circulation that the text itself sets out to highlight. The text proposes that the “Metaimagens” (2021-) of Amanda & Isadora are part of a discussion broader than the artists' individual work, as issues that run through their artistic production are brought out over the course of the text.

    Dialectology, phonology, diachrony: Liverpool English realisations of PRICE and MOUTH.

    Dialect emergence or new-dialect formation in intensive contact situations has been the subject of research for decades. Approaches to dialect emergence have led to a more solid understanding of the origins of specific phonological features. This line of research often approaches issues of new-dialect formation and phonological feature development within the confines of one linguistic subfield. However, new-dialect formation is a multifaceted phenomenon which results from a combination of dialectological, phonological and historical linguistic factors. The current thesis presents a comprehensive account of phonological feature development in new-dialect formation from a combined theoretical perspective by exploring historical and contemporary processes in the emergence of phonologically-conditioned variation in the PRICE and MOUTH lexical sets in Liverpool English. This feature has been widely researched in other varieties of English and has previously been attributed to new-dialect formation. However, little is known about the patterns of PRICE and MOUTH in Liverpool English. The current thesis relies on multiple methods of data collection (e.g. a combination of fieldwork and corpus data), various quantitative methods, and detailed acoustic analyses (e.g. formants and Euclidean distance in a two-dimensional formant space) to investigate the precise details and the processes involved in the emergence and development of PRICE and MOUTH patterns in Liverpool English. Liverpool English is thought to have emerged during the 19th century as a result of extensive and prolonged immigration from the surrounding areas of Lancashire and Cheshire, and from Ireland, Wales, and Scotland. However, the specific timing, extent of immigration, and proportion of immigrant populations have not been investigated in detail.
The current thesis provides the first in-depth analysis of historical census records in order to extend our knowledge of the populations in Liverpool at the time of new-dialect formation. The insights obtained from this analysis provide a more nuanced picture of the development of Liverpool English. They are essential for determining which dialects potentially contributed to dialect formation and the repertoire of PRICE and MOUTH variants present at the time that these processes were developing. The analysis of historical census records is further augmented by using a combination of quantitative methods and historical corpora in order to gain a fuller understanding of the processes involved in the formation of these dialect features. The contemporary investigation of PRICE and MOUTH in Liverpool English shows that these patterns are separate, but related, and that their phonological conditioning environments resemble those reported for cases of PRICE and MOUTH variation in other varieties of English. I present a detailed overview of the phonetics and phonology of PRICE and MOUTH variation in Liverpool English, looking at a wide range of conditioning environments. This investigation also reviews a range of different quantitative measurements useful for research on variation involving diphthongs. The origins of PRICE and MOUTH phonological patterns in Liverpool English indicate that an approach combining different theoretical perspectives is required to adequately explain the development of these patterns. The current thesis suggests that PRICE and MOUTH phonologically conditioned variation in Liverpool English initially resulted from variants of different dialects within the dialect contact situation. However, some features of the contemporary patterns developed following new-dialect formation as a by-product of phonetic and phonological properties of diphthong production in certain following environments.
By approaching the development of these phonological features in Liverpool English from a combination of theoretical perspectives, the current thesis expands our understanding of emergent phonological features in new-dialect formation.

    Sign language video retrieval with free-form textual queries

    Systems that can efficiently search collections of sign language videos have been highlighted as a useful application of sign language technology. However, the problem of searching videos beyond individual keywords has received limited attention in the literature. To address this gap, we introduce the task of sign language retrieval with textual queries: given a written query (e.g. a sentence) and a large collection of sign language videos, the objective is to find the signing video that best matches the written query. We propose to tackle this task by learning cross-modal embeddings on the recently introduced large-scale How2Sign dataset of American Sign Language (ASL). We identify that a key bottleneck in the performance of the system is the quality of the sign video embedding, which suffers from a scarcity of labelled training data. We therefore propose SPOT-ALIGN, a framework for interleaving iterative rounds of sign spotting and feature alignment to expand the scope and scale of available training data. We validate the effectiveness of SPOT-ALIGN for learning a robust sign video embedding through improvements in both sign recognition and the proposed video retrieval task.
    This work was supported by the project PID2020-117142GB-I00, funded by MCIN/AEI/10.13039/501100011033, ANR project CorVis ANR-21-CE23-0003-01, and gifts from Google and Adobe. AD received support from la Caixa Foundation (ID 100010434), fellowship code LCF/BQ/IN18/11660029.
    Peer reviewed. Sustainable Development Goals: 10 - Reduced Inequalities; 10.2 - By 2030, empower and promote the social, economic and political inclusion of all, irrespective of age, sex, disability, race, ethnicity, origin, religion or economic or other status. Postprint (author's final draft).
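    The retrieval task stated in this abstract (rank a collection of videos against one written query in a shared embedding space) can be illustrated with a minimal sketch. This is not the authors' system: the learned text and video encoders are replaced by hand-written toy vectors, and the names `cosine`, `rank_videos`, `video_embs` and `query_emb` are hypothetical, introduced only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rank_videos(query_emb, video_embs):
    """Return video ids ordered from best to worst match for the query."""
    scored = [(cosine(query_emb, emb), vid) for vid, emb in video_embs.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [vid for _, vid in scored]


# Toy stand-ins for embeddings that trained encoders would normally produce.
video_embs = {
    "video_a": [0.9, 0.1, 0.0],
    "video_b": [0.1, 0.8, 0.3],
    "video_c": [0.0, 0.2, 0.9],
}
query_emb = [0.2, 0.9, 0.2]  # embedding of the written query (assumed)

ranking = rank_videos(query_emb, video_embs)
print(ranking)  # best-matching video id comes first
```

    In this toy setting the query vector points mostly along the second axis, so the video whose embedding does the same is ranked first. In a real cross-modal system the quality of that ranking depends on the embeddings, which is the bottleneck the abstract attributes to scarce labelled data.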

    Jornalistas negras e racismo no jornalismo esportivo televisivo [Black women journalists and racism in televised sports journalism]

    This study analyses the work of Black women sports journalists on the sports channels and programmes broadcast in Rio de Janeiro and seeks to show how this group does not benefit from inclusion and gender-equity strategies in sports journalism. We examine how racism has historically placed Black women at a disadvantage relative to white women and how race and gender oppression reduces the opportunities of Black women journalists in sport. To do so, we draw on a bibliographic and historical review and on the experiences of working professionals. In addition to reports, audio and video available in the press, we include accounts given in interviews by two journalists from one of the channels selected for the study.