    Efficient Spoken Language Recognition via Multilabel Classification

    Spoken language recognition (SLR) is the task of automatically identifying the language present in a speech signal. Existing SLR models are either too computationally expensive or too large to run effectively on devices with limited resources. For real-world deployment, a model should also gracefully handle unseen languages outside of the target language set, yet prior work has focused on closed-set classification where all input languages are known a priori. In this paper we address these two limitations: we explore efficient model architectures for SLR based on convolutional networks, and propose a multilabel training strategy to handle non-target languages at inference time. Using the VoxLingua107 dataset, we show that our models obtain competitive results while being orders of magnitude smaller and faster than current state-of-the-art methods, and that our multilabel strategy is more robust to unseen non-target languages than multiclass classification. Comment: Accepted to InterSpeech 202
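
To illustrate why a multilabel output can reject non-target languages while a multiclass output cannot, here is a minimal sketch. It is not the paper's model: the three-language set, the logits, and the 0.5 threshold are hypothetical values chosen for illustration. A softmax head always assigns the input to one of the target languages, whereas independent sigmoid scores can all fall below the threshold.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def predict_multiclass(logits, languages):
    """Closed-set decision: always returns one of the target languages."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return languages[best]

def predict_multilabel(logits, languages, threshold=0.5):
    """Per-language sigmoid scores; returns None if no language passes
    the (hypothetical) threshold, i.e. the input is treated as non-target."""
    scores = [sigmoid(x) for x in logits]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < threshold:
        return None
    return languages[best]

# Hypothetical target set and logits, for illustration only.
languages = ["en", "es", "ca"]
target_logits = [3.0, -2.0, -1.5]   # confident target-language input
unseen_logits = [-2.0, -1.0, -3.0]  # input from a language outside the set

multiclass_unseen = predict_multiclass(unseen_logits, languages)
multilabel_unseen = predict_multilabel(unseen_logits, languages)
multilabel_target = predict_multilabel(target_logits, languages)
```

With these made-up logits, the multiclass decision is forced onto a target language for the unseen input, while the multilabel decision abstains.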

    The protection of foreign women victims of gender-based violence in an irregular administrative situation: A practical study of the residence and work permit on exceptional grounds for victims of gender-based violence in the province of Barcelona

    Bachelor's thesis in Law. Universitat de Barcelona, academic year 2019-2020. Supervisor: Markus González Beilfuss. II Premi Clara Campoamor for the best bachelor's thesis with a gender perspective at the Universitat de Barcelona. Runner-up (accèssit) in the Legal Sciences branch. Data from the latest report of the General Council of the Judiciary (Consejo General del Poder Judicial) on gender-based violence show that it has a higher incidence among migrant women. Since 2009, the Organic Law on Foreigners (Ley Orgánica de Extranjería) has allowed a residence and work permit to be requested solely on the grounds of having been a victim of gender-based violence. This thesis aims to study the legal framework and practical application of these permits, as well as the specific vulnerability factors of this group of women, using intersectionality and the gender perspective as analytical paradigms. The practical application of the permit was studied through fieldwork consisting of an analysis of all case files relating to these permits processed in the province of Barcelona between 2015 and 2019, together with qualitative interviews with officials of the Government Sub-delegation in Barcelona (Subdelegación del Gobierno en Barcelona) who play a key role in processing them. The results show that the two main applicable legal texts (the Comprehensive Law against Gender-Based Violence and the Law on Foreigners) render migrant women invisible and prioritise their status as migrants over their status as victims, and that foreign women in this situation face vulnerability factors specific to their simultaneous condition as women and as migrants.
From the study of the practical application of these permits, some good practices of the Government Sub-delegation in Barcelona were identified that could be applied in other provinces, such as speedy and thorough processing and high approval rates. Specific groups of women who are being left outside the protection of this permit were also identified, as well as an over-representation among applicants of women from countries of origin for human trafficking. Finally, certain problems were revealed relating to the reporting of gender-based violence, to legal assistance, and to the switch to long-term residence permits once the protection period of this permit has ended.

    SALAI-Net: species-agnostic local ancestry inference network

    Availability and implementation: We provide an open source implementation and links to publicly available data at github.com/AI-sandbox/SALAI-Net. Data is publicly available as follows: https://www.internationalgenome.org (1000 Genomes), https://www.simonsfoundation.org/simons-genome-diversity-project (Simons Genome Diversity Project), https://www.sanger.ac.uk/resources/downloads/human/hapmap3.html (HapMap), ftp://ngs.sanger.ac.uk/production/hgdp/hgdp_wgs.20190516 (Human Genome Diversity Project) and https://www.ncbi.nlm.nih.gov/bioproject/PRJNA448733 (Canid genomes). Local ancestry inference (LAI) is the high resolution prediction of ancestry labels along a DNA sequence. LAI is important in the study of human history and migrations, and it is beginning to play a role in precision medicine applications including ancestry-adjusted genome-wide association studies (GWASs) and polygenic risk scores (PRSs). Existing LAI models do not generalize well between species, chromosomes or even ancestry groups, requiring re-training for each different setting. Furthermore, such methods can lack interpretability, which is an important element in each of these applications. We present SALAI-Net, a portable statistical LAI method that can be applied on any set of species and ancestries (species-agnostic), requiring only haplotype data and no other biological parameters. Inspired by identity by descent methods, SALAI-Net estimates population labels for each segment of DNA by performing a reference matching approach, which leads to an interpretable and fast technique. We benchmark our models on whole-genome data of humans and we test these models’ ability to generalize to dog breeds when trained on human data. SALAI-Net outperforms previous methods in terms of balanced accuracy, while generalizing between different settings, species and datasets. 
Moreover, it is up to two orders of magnitude faster and uses considerably less RAM than competing methods. This paper was published as part of a special issue financially supported by ECCB2022. Some of the computing for this project was performed on the Sherlock cluster at Stanford University. We would like to thank Stanford University and the Stanford Research Computing Center for providing computational resources and support that contributed to these research results. A.G.I. and D.M.M. received support from NIH under award R01HG010140. Conflict of Interest: AGI is a co-founder of Galatea Bio Inc. Peer reviewed. Sustainable Development Goals: 3 - Good Health and Well-being; 3.4 - By 2030, reduce by one third premature mortality from non-communicable diseases through prevention and treatment, and promote mental health and well-being. Postprint (published version)
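
The reference matching idea described in the abstract can be sketched in a few lines. This is a toy illustration, not SALAI-Net itself: the fixed window size, the plain allele-match similarity, and the function names `window_similarity` and `infer_local_ancestry` are all assumptions made for the example. For each window of the query haplotype, the label of the best-matching reference population is chosen.

```python
def window_similarity(query, reference):
    """Fraction of positions at which two equal-length allele windows agree."""
    return sum(1 for q, r in zip(query, reference) if q == r) / len(query)

def infer_local_ancestry(query, references, window_size):
    """Assign one ancestry label per window of the query haplotype.

    `references` maps a population label to a list of reference haplotypes
    (lists of 0/1 alleles) aligned to the query. For each window, the label
    whose best-matching reference is most similar to the query wins; ties
    go to the label seen first.
    """
    labels = []
    for start in range(0, len(query), window_size):
        end = start + window_size
        q_win = query[start:end]
        best_label, best_score = None, -1.0
        for label, haplotypes in references.items():
            score = max(window_similarity(q_win, h[start:end]) for h in haplotypes)
            if score > best_score:
                best_label, best_score = label, score
        labels.append(best_label)
    return labels

# Hypothetical toy data: two populations, eight SNP positions.
references = {
    "A": [[0, 0, 0, 0, 1, 1, 1, 1]],
    "B": [[1, 1, 1, 1, 0, 0, 0, 0]],
}
query = [0, 0, 0, 0, 0, 0, 0, 0]
ancestry = infer_local_ancestry(query, references, window_size=4)
```

Because each label comes from an explicit best match against a reference panel, the per-window decision can be inspected directly, which is the kind of interpretability the abstract refers to.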

    Bridging High-Quality Audio and Video via Language for Sound Effects Retrieval from Visual Queries

    Finding the right sound effects (SFX) to match moments in a video is a difficult and time-consuming task, and relies heavily on the quality and completeness of text metadata. Retrieving high-quality (HQ) SFX using a video frame directly as the query is an attractive alternative, removing the reliance on text metadata and providing a low barrier to entry for non-experts. Due to the lack of HQ audio-visual training data, previous work on audio-visual retrieval relies on YouTube (in-the-wild) videos of varied quality for training, where the audio is often noisy and the video of amateur quality. As such it is unclear whether these systems would generalize to the task of matching HQ audio to production-quality video. To address this, we propose a multimodal framework for recommending HQ SFX given a video frame by (1) leveraging large language models and foundational vision-language models to bridge HQ audio and video to create audio-visual pairs, resulting in a highly scalable automatic audio-visual data curation pipeline; and (2) using pre-trained audio and visual encoders to train a contrastive learning-based retrieval system. We show that our system, trained using our automatic data curation pipeline, significantly outperforms baselines trained on in-the-wild data on the task of HQ SFX retrieval for video. Furthermore, while the baselines fail to generalize to this task, our system generalizes well from clean to in-the-wild data, outperforming the baselines on a dataset of YouTube videos despite only being trained on the HQ audio-visual pairs. A user study confirms that people prefer SFX retrieved by our system over the baseline 67% of the time both for HQ and in-the-wild data. Finally, we present ablations to determine the impact of model and data pipeline design choices on downstream retrieval performance. Please visit our project website to listen to and view our SFX retrieval results. Comment: WASPAA 2023.
Project page: https://juliawilkins.github.io/sound-effects-retrieval-from-video/. 4 pages, 2 figures, 2 tables
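
At inference time, a contrastively trained retrieval system of this kind typically ranks candidate sounds by the similarity between the query embedding and each audio embedding in a shared space. The sketch below shows that ranking step only, assuming cosine similarity; the three-dimensional embeddings, the library entries, and the `retrieve` helper are hypothetical stand-ins for the outputs of the paper's encoders.

```python
import math

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def retrieve(frame_embedding, sfx_library, top_k=2):
    """Return the names of the top_k sound effects whose embeddings are
    most similar to the video-frame embedding."""
    ranked = sorted(
        sfx_library,
        key=lambda name: cosine(frame_embedding, sfx_library[name]),
        reverse=True,
    )
    return ranked[:top_k]

# Hypothetical embeddings; real ones would come from trained encoders.
sfx_library = {
    "rain": [0.9, 0.1, 0.0],
    "footsteps": [0.1, 0.9, 0.1],
    "engine": [0.0, 0.2, 0.9],
}
frame = [0.8, 0.3, 0.0]
results = retrieve(frame, sfx_library, top_k=2)
```

Ranking by cosine similarity is a common choice for contrastive embeddings because training usually normalizes the vectors, but the abstract does not state which similarity measure the authors use.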

    Predicting Audio Advertisement Quality

    Online audio advertising is a particular form of advertising used abundantly in online music streaming services. In these platforms, which tend to host tens of thousands of unique audio advertisements (ads), providing high-quality ads ensures a better user experience and results in longer user engagement. Therefore, the automatic assessment of these ads is an important step toward audio ad ranking and better audio ad creation. In this paper we propose one way to measure the quality of the audio ads using a proxy metric called Long Click Rate (LCR), which is defined by the amount of time a user engages with the follow-up display ad (that is shown while the audio ad is playing) divided by the impressions. We later focus on predicting the audio ad quality using only acoustic features such as harmony, rhythm, and timbre of the audio, extracted from the raw waveform. We discuss how the characteristics of the sound can be connected to concepts such as the clarity of the audio ad message, its trustworthiness, etc. Finally, we propose a new deep learning model for audio ad quality prediction, which outperforms the other discussed models trained on hand-crafted features. To the best of our knowledge, this is the first large-scale audio ad quality prediction study. Comment: WSDM '18 Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining, 9 pages
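
The abstract defines LCR only loosely, so the following is one possible reading rather than the authors' exact formula: a click counts as "long" when the user engages with the follow-up display ad for at least some minimum time, and LCR is the number of long clicks divided by impressions. The 30-second threshold and the function name `long_click_rate` are assumptions made for illustration.

```python
def long_click_rate(engagement_seconds, impressions, min_seconds=30.0):
    """Long clicks per impression.

    engagement_seconds: engagement time for each click on the display ad.
    impressions: number of times the audio ad was served.
    min_seconds: hypothetical threshold for counting a click as "long".
    """
    if impressions <= 0:
        raise ValueError("impressions must be positive")
    long_clicks = sum(1 for t in engagement_seconds if t >= min_seconds)
    return long_clicks / impressions

# Hypothetical numbers: four clicks out of 100 impressions, two of them long.
lcr = long_click_rate([5.0, 45.0, 120.0, 10.0], impressions=100)
```

Normalizing by impressions rather than by clicks means an ad that is rarely clicked scores low even when its few clicks are long.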

    Language-Guided Audio-Visual Source Separation via Trimodal Consistency

    We propose a self-supervised approach for learning to perform audio source separation in videos based on natural language queries, using only unlabeled video and audio pairs as training data. A key challenge in this task is learning to associate the linguistic description of a sound-emitting object to its visual features and the corresponding components of the audio waveform, all without access to annotations during training. To overcome this challenge, we adapt off-the-shelf vision-language foundation models to provide pseudo-target supervision via two novel loss functions and encourage a stronger alignment between the audio, visual and natural language modalities. During inference, our approach can separate sounds given text, video and audio input, or given text and audio input alone. We demonstrate the effectiveness of our self-supervised approach on three audio-visual separation datasets, including MUSIC, SOLOS and AudioSet, where we outperform state-of-the-art strongly supervised approaches despite not using object detectors or text labels during training. Comment: Accepted at CVPR 202

    Syphilis vaccine: challenges, controversies and opportunities

    Syphilis is a sexually or vertically (mother to fetus) transmitted disease caused by infection with Treponema pallidum subspecies pallidum (TPA). The incidence of syphilis has increased over the past years despite the fact that this bacterium is an obligate human pathogen, the infection route is well known, and the disease can be successfully treated with penicillin. As a complement to preventive campaigns and early treatment of infected individuals, development of a syphilis vaccine may be crucial for controlling disease spread and/or severity, particularly in countries where the effectiveness of the aforementioned measures is limited. In the last century, several vaccine prototypes have been tested in preclinical studies, mainly in rabbits. While none of them provided protection against infection, some prototypes prevented bacteria from disseminating to distal organs, attenuated lesion development, and accelerated lesion healing. In spite of these promising results, there is still some controversy regarding the identification of vaccine candidates and the characteristics of a syphilis-protective immune response. In this review, we describe what is known about the immune response to TPA and the main mechanisms used by this pathogen to evade it. Moreover, we emphasize the importance of integrating this knowledge, in conjunction with the characterization of outer membrane proteins (OMPs), to expedite the development of a syphilis vaccine that can protect against TPA infection.

    Key Figures on Alt Empordà 2022

    Demographics; Health; Economy; Territory. Key Figures on Alt Empordà 2022 presents a selection of statistics on Alt Empordà and the province of Girona. Whenever possible, the data are given at the municipal level; where sufficiently detailed information could not be obtained, the figures are given at the county (comarca) or provincial level. This document aims to provide a general overview of the state of the territory, using data from 2021 or the most recent data available, and to identify current trends. It includes data on socio-demographics, health, the economy, the environment and other resources of the territory. As was the case last year, the 2022 edition shows the influence of the COVID-19 epidemic on various indicators and trends.