Learning spectro-temporal representations of complex sounds with parameterized neural networks
Deep Learning models have become potential candidates for auditory
neuroscience research, thanks to their recent successes on a variety of
auditory tasks. Yet these models are often not interpretable enough to reveal
the exact computations they perform. Here, we propose a
parametrized neural network layer that computes specific spectro-temporal
modulations based on Gabor kernels (Learnable STRFs) and that is fully
interpretable. We evaluated predictive capabilities of this layer on Speech
Activity Detection, Speaker Verification, Urban Sound Classification and Zebra
Finch Call Type Classification. We found that models based on Learnable
STRFs perform on par with the respective toplines on all tasks, and obtain the
best performance for Speech Activity Detection. As this layer is fully
interpretable, we used quantitative measures to describe the distribution of
the learned spectro-temporal modulations. The filters adapted to each task and
focused mostly on low temporal and spectral modulations. The analyses show that
the filters learned on human speech have spectro-temporal parameters similar to
those measured directly in the human auditory cortex. Finally, we observed
that the tasks organized in a meaningful way: the human vocalization tasks were
close to each other, and bird vocalizations were far from both the human
vocalization and urban sound tasks.
Measuring the social and environmental impact of small socio-economic projects in Morocco
One of the principles of the Moroccan constitution is equal access for all citizens to the conditions that allow them to enjoy the rights to health, social protection, decent housing, work, access to water and a healthy environment, education, vocational training, and physical and artistic education (Article 31) [1]. Taking vulnerable people into account remains a complex issue. As victims of sociocultural prejudice, particularly regarding education, training, and employment, they often live in precarious conditions. Support for income-generating activities (IGAs) therefore makes it possible to address their specific needs and to improve their socioeconomic conditions and those of the people around them. To assess how these income-generating activities contribute to improving socioeconomic conditions and to compliance with environmental and social regulations, we undertook the present study through a concrete example: the creation of a tourist lodge.
Modelling the Quantum Capacitance of Single-layer and Bilayer Graphene
In this paper, we report the modelling of quantum capacitance in both single-layer and bilayer graphene devices in order to investigate its temperature dependence. The model accounts for electron and hole puddles caused by local fluctuations of the potential, and for the possibility of finite lifetimes of electronic states, and uses a Gaussian distribution to calculate the quantum capacitance. The simulations agree with experimental measurements, which supports the accuracy of the proposed model. The temperature dependence around the charge neutrality point is also reported for both single-layer and bilayer graphene.
Sampling strategies in Siamese Networks for unsupervised speech representation learning
Recent studies have investigated siamese network architectures for learning
invariant speech representations using same-different side information at the
word level. Here we systematically investigate an often-ignored component of
siamese networks: the sampling procedure (how pairs of same vs. different
tokens are selected). We show that sampling strategies taking into account
Zipf's Law, the distribution of speakers and the proportions of same and
different pairs of words significantly impact the performance of the network.
In particular, we show that word frequency compression improves learning across
a large range of variations in number of training pairs. This effect does not
apply to the same extent to the fully unsupervised setting, where the pairs of
same-different words are obtained by spoken term discovery. We apply these
results to pairs of words discovered using an unsupervised algorithm and show
an improvement over the state of the art in unsupervised representation learning
using siamese networks.
Comment: Conference paper at Interspeech 201
Impact of tumor-infiltrating lymphocytes on pathological complete response after neoadjuvant chemotherapy in patients with early triple-negative breast cancer
Description of the subject: Triple-negative breast cancer (TNBC), a breast cancer subtype, is characterized by the lack of estrogen and progesterone receptor expression and by the absence of human epidermal growth factor receptor 2 overexpression. Patients with a pathological complete response (pCR) have better disease-free and overall survival than those with residual disease. A high level of tumor-infiltrating lymphocytes (TILs) is associated with a higher response to neoadjuvant chemotherapy (NAC) and a better prognosis.
Objective: Evaluation of TILs and their predictive impact in early TNBC in an Algerian population.
Methods: We assessed TILs and correlated them with the pCR rate in 94 early TNBC patients treated from 2015 to 2017 who underwent breast microbiopsy, NAC, and then surgery.
Results: Among 94 early TNBC patients, 53 (56.4%) achieved pCR and 41 (43.6%) had residual disease. While some clinicopathological factors did not affect pCR, stromal TILs showed a significant correlation with pCR (P < 0.0001). The presence of CD3+, CD4+, CD8+ and CD20+ TILs was also significantly correlated with pCR (P < 0.0001, P = 0.001, P = 0.0003 and P = 0.0001, respectively).
Conclusion: Our data showed that TILs were significantly associated with pCR, suggesting that TILs are a predictive biomarker for pCR in early TNBC patients treated with NAC in our cohort.
XNMT: The eXtensible Neural Machine Translation Toolkit
This paper describes XNMT, the eXtensible Neural Machine Translation toolkit.
XNMT distinguishes itself from other open-source NMT toolkits by its focus on
modular code design, with the purpose of enabling fast iteration in research
and replicable, reliable results. In this paper we describe the design of XNMT
and its experiment configuration system, and demonstrate its utility on the
tasks of machine translation, speech recognition, and multi-tasked machine
translation/parsing. XNMT is available open-source at
https://github.com/neulab/xnmt
Comment: To be presented at AMTA 2018 Open Source Software Showcase
Learning Word Embeddings: Unsupervised Methods for Fixed-size Representations of Variable-length Speech Segments
Fixed-length embeddings of words are very useful for a variety of tasks in speech and language processing. Here we systematically explore two methods of computing fixed-length embeddings for variable-length sequences. We evaluate their susceptibility to phonetic and speaker-specific variability on English, a high-resource language, and Xitsonga, a low-resource language, using two evaluation metrics: ABX word discrimination and ROC-AUC on same-different phoneme n-grams. We show that a simple downsampling method supplemented with length information can outperform the variable-length input feature representation on both evaluations. Recurrent autoencoders, trained without supervision, can yield even better results at the expense of increased computational complexity.
Enregistrements de longue durée: Opportunités et défis
Technological advances have enabled lightweight, wearable recorders that collect audio (including speech) lasting up to a whole day. We provide a general description of the technique and lay out the advantages and drawbacks of this methodology. Field linguists may gain a uniquely naturalistic viewpoint of language use as people go about their everyday activities. However, due to their duration, noisiness, and likelihood of containing sensitive information, long-form recordings remain difficult to annotate manually. Open-source tools improve reproducibility and ease of use for researchers, and speech technologists can contribute to this end. Additionally, new approaches to human and automated annotation make the study of speech in long-form recordings increasingly feasible and promising.