6,577 research outputs found
The role of HG in the analysis of temporal iteration and interaural correlation
Interpreter identification in the Polish Interpreting Corpus
This paper describes the automated identification of interpreter voices in the Polish Interpreting Corpus (PINC). After a set of interpreter voice samples was collected, a deep neural network model was used to match every utterance in the corpus to a specific individual. The final result is very accurate and saves considerable time and human effort.
Towards End-to-End Acoustic Localization using Deep Learning: from Audio Signal to Source Position Coordinates
This paper presents a novel approach for indoor acoustic source localization
using microphone arrays and based on a Convolutional Neural Network (CNN). The
proposed solution is, to the best of our knowledge, the first published work in
which the CNN is designed to directly estimate the three-dimensional position
of an acoustic source, using the raw audio signal as the input information and
avoiding the use of handcrafted audio features. Given the limited amount of
available localization data, we propose in this paper a training strategy based
on two steps. We first train our network using semi-synthetic data generated
from close-talk speech recordings, in which we simulate the time delays and
distortion suffered by the signal as it propagates from the source to the array
of microphones. We then fine-tune this network using a small amount of real
data. Our experimental results show that this strategy is able to produce
networks that significantly improve on existing localization methods based on
SRP-PHAT strategies. In addition, our experiments show that our CNN
method exhibits better robustness to the speaker's gender and to
different window sizes than the other methods.
Comment: 18 pages, 3 figures, 8 tables
Modern Views of Machine Learning for Precision Psychiatry
In light of the NIMH's Research Domain Criteria (RDoC), the advent of
functional neuroimaging, novel technologies and methods provide new
opportunities to develop precise and personalized prognosis and diagnosis of
mental disorders. Machine learning (ML) and artificial intelligence (AI)
technologies are playing an increasingly critical role in the new era of
precision psychiatry. Combining ML/AI with neuromodulation technologies can
potentially provide explainable solutions in clinical practice and effective
therapeutic treatment. Advanced wearable and mobile technologies also call for
a new role of ML/AI for digital phenotyping in mobile mental health. In this
paper, we provide a comprehensive review of ML methodologies and
applications by combining neuroimaging, neuromodulation, and advanced mobile
technologies in psychiatry practice. Additionally, we review the role of ML in
molecular phenotyping and cross-species biomarker identification in precision
psychiatry. We further discuss explainable AI (XAI) and causality testing in a
closed-human-in-the-loop manner, and highlight the ML potential in multimedia
information extraction and multimodal data fusion. Finally, we discuss
conceptual and practical challenges in precision psychiatry and highlight ML
opportunities in future research.
Detection of fake images generated by deep learning
Over the last few years, the amount of audiovisual content produced has increased continually
with technological development. Along with this growth comes the availability of the same
information through the numerous devices that any individual holds, including smartphones,
laptops, tablets, and smart TVs, in an entirely free and open manner. This type of content is
considered an element of authenticity since it represents a record of reality. For example, in court,
photos frequently determine the jury's course of action since what is available is a recorded
picture that validates a narrative and usually does not leave room for doubts. However, with the
advancement of Deep Learning (DL) algorithms, a new and dangerous trend known as
Deepfakes has begun to emerge. For example, a deepfake can be a video or an image of a person
in which their face or body is totally or partially modified to appear to be someone else. This
technique is often used for manipulation, blackmailing, and spreading false information.
Recognizing this danger, this study aims to uncover the patterns that deepfakes
exhibit in order to assess authenticity as accurately as possible, using machine learning and deep
learning algorithms. To achieve the highest level of accuracy, these algorithms were trained on
datasets that included both real and fake photos. The outcomes demonstrate that deepfakes
can be accurately identified and that the optimal model may be selected based on the specific
requirements of the application.