6,577 research outputs found

    Interpreter identification in the Polish Interpreting Corpus

    Get PDF
    This paper describes the automated identification of interpreter voices in the Polish Interpreting Corpus (PINC). After collecting a set of voice samples from the interpreters, a deep neural network model was used to match every utterance in the corpus with a specific individual. The final result is highly accurate and provides a considerable saving of time compared with human judgment.
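
    The paper's matching code is not reproduced in this listing; the following is only a minimal sketch of the general technique it describes, assuming speaker embeddings have already been extracted for each utterance and each enrolled interpreter by some deep speaker-embedding network (the embedding dimension and the toy data below are hypothetical):

    import numpy as np

    def assign_interpreters(utterance_embs: np.ndarray,
                            reference_embs: np.ndarray) -> np.ndarray:
        """Assign each utterance to the closest enrolled interpreter.

        utterance_embs: (n_utterances, dim) embeddings of corpus utterances.
        reference_embs: (n_interpreters, dim) one reference embedding per interpreter.
        Returns one interpreter index per utterance.
        """
        # L2-normalise so that the dot product equals cosine similarity.
        u = utterance_embs / np.linalg.norm(utterance_embs, axis=1, keepdims=True)
        r = reference_embs / np.linalg.norm(reference_embs, axis=1, keepdims=True)
        similarity = u @ r.T                    # (n_utterances, n_interpreters)
        return similarity.argmax(axis=1)

    # Toy usage with random vectors standing in for real embeddings.
    rng = np.random.default_rng(0)
    refs = rng.normal(size=(5, 192))            # 5 enrolled interpreters
    utts = refs[[2, 0, 4]] + 0.1 * rng.normal(size=(3, 192))
    print(assign_interpreters(utts, refs))      # -> [2 0 4]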

    Towards End-to-End Acoustic Localization using Deep Learning: from Audio Signal to Source Position Coordinates

    Full text link
    This paper presents a novel approach to indoor acoustic source localization using microphone arrays, based on a Convolutional Neural Network (CNN). To the best of our knowledge, the proposed solution is the first published work in which a CNN is designed to directly estimate the three-dimensional position of an acoustic source from the raw audio signal, avoiding hand-crafted audio features. Given the limited amount of available localization data, we propose a two-step training strategy. We first train the network on semi-synthetic data generated from close-talk speech recordings, in which we simulate the time delays and distortion that the signal undergoes as it propagates from the source to the microphone array. We then fine-tune the network on a small amount of real data. Our experimental results show that this strategy produces networks that significantly outperform existing localization methods based on SRP-PHAT strategies. In addition, our experiments show that the CNN method is more robust to changes in speaker gender and analysis window size than the other methods.

    Comment: 18 pages, 3 figures, 8 tables
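
    The abstract describes the network only at a high level; below is a minimal PyTorch sketch of the general idea, a 1-D CNN mapping raw multi-channel frames directly to 3-D coordinates. The number of microphones, layer sizes, and frame length are illustrative assumptions, not the paper's exact configuration:

    import torch
    import torch.nn as nn

    class RawAudioLocalizer(nn.Module):
        """Regress a 3-D source position directly from raw multi-channel audio."""
        def __init__(self, n_mics: int = 12):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv1d(n_mics, 64, kernel_size=7, stride=2), nn.ReLU(),
                nn.Conv1d(64, 128, kernel_size=5, stride=2), nn.ReLU(),
                nn.Conv1d(128, 128, kernel_size=3, stride=2), nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),        # collapse the time axis
            )
            self.regressor = nn.Linear(128, 3)  # (x, y, z) in metres

        def forward(self, waveforms: torch.Tensor) -> torch.Tensor:
            # waveforms: (batch, n_mics, n_samples) raw audio frames
            h = self.features(waveforms).squeeze(-1)
            return self.regressor(h)

    # Two-step idea from the abstract: pretrain on semi-synthetic data, then
    # fine-tune on the small amount of real recordings (e.g. with a lower
    # learning rate), minimising an MSE loss against the true coordinates.
    model = RawAudioLocalizer(n_mics=12)
    frames = torch.randn(4, 12, 16000)          # a batch of 1-second windows at 16 kHz
    print(model(frames).shape)                  # torch.Size([4, 3])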

    Modern Views of Machine Learning for Precision Psychiatry

    Full text link
    In light of the NIMH's Research Domain Criteria (RDoC) and the advent of functional neuroimaging, novel technologies and methods provide new opportunities to develop precise and personalized prognosis and diagnosis of mental disorders. Machine learning (ML) and artificial intelligence (AI) technologies are playing an increasingly critical role in the new era of precision psychiatry. Combining ML/AI with neuromodulation technologies can potentially provide explainable solutions in clinical practice and effective therapeutic treatment. Advanced wearable and mobile technologies also call for a new role of ML/AI in digital phenotyping for mobile mental health. In this article, we provide a comprehensive review of ML methodologies and applications that combine neuroimaging, neuromodulation, and advanced mobile technologies in psychiatric practice. Additionally, we review the role of ML in molecular phenotyping and cross-species biomarker identification for precision psychiatry. We further discuss explainable AI (XAI) and causality testing in a closed human-in-the-loop manner, and highlight the potential of ML for multimedia information extraction and multimodal data fusion. Finally, we discuss conceptual and practical challenges in precision psychiatry and highlight opportunities for ML in future research.
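
    As a concrete, entirely hypothetical illustration of one theme the review highlights, multimodal data fusion, the sketch below concatenates per-modality feature blocks (imaging, wearable, clinical) into one vector per subject and fits a single classifier; the feature dimensions and synthetic labels are assumptions for illustration only:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(1)
    n_subjects = 200
    # Hypothetical per-subject feature blocks from three modalities.
    imaging  = rng.normal(size=(n_subjects, 30))   # e.g. regional fMRI summaries
    wearable = rng.normal(size=(n_subjects, 10))   # e.g. actigraphy-derived features
    clinical = rng.normal(size=(n_subjects, 5))    # e.g. questionnaire scores
    labels   = rng.integers(0, 2, size=n_subjects) # synthetic diagnostic labels

    # Early fusion: concatenate modality features into one vector per subject.
    fused = np.hstack([imaging, wearable, clinical])
    clf = LogisticRegression(max_iter=1000).fit(fused, labels)
    print("training accuracy:", clf.score(fused, labels))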

    Detection of fake images generated by deep learning

    Get PDF
    Over the last few years, the amount of audiovisual content produced has continually increased as technology develops. Along with this growth comes the availability of the same information through the numerous devices any individual holds, including smartphones, laptops, tablets, and smart TVs, in an entirely free and open manner. This type of content is considered an element of authenticity, since it represents a record of reality. For example, in court, photographs frequently determine the jury's course of action, since what is available is a recorded picture that validates a narrative and usually leaves no room for doubt. However, with the advancement of Deep Learning (DL) algorithms, a new and dangerous trend known as deepfakes has begun to emerge. A deepfake can be, for example, a video or an image of a person in which their face or body is totally or partially modified to appear to be someone else's. This technique is often used for manipulation, blackmail, and the spreading of false information. Recognizing the danger of this problem, this study aims to uncover the patterns that deepfakes exhibit in order to assess authenticity as accurately as possible, using machine learning and deep learning algorithms. To reach the highest level of accuracy, these algorithms were trained on datasets that included both real and fake photos. The outcomes demonstrate that deepfakes can be accurately identified and that the optimal model can be selected based on the specific requirements of the application.
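
    The abstract does not name a specific architecture; the sketch below is only a minimal illustration of the general approach, a small convolutional classifier trained to separate real from fake images, with all layer sizes and the input resolution assumed for illustration:

    import torch
    import torch.nn as nn

    class DeepfakeDetector(nn.Module):
        """Minimal binary real-vs-fake image classifier (illustrative sizes)."""
        def __init__(self):
            super().__init__()
            self.backbone = nn.Sequential(
                nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
                nn.Conv2d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.head = nn.Linear(128, 1)        # one logit: fake vs. real

        def forward(self, images: torch.Tensor) -> torch.Tensor:
            h = self.backbone(images).flatten(1)
            return self.head(h)                  # train with BCEWithLogitsLoss

    model = DeepfakeDetector()
    batch = torch.randn(8, 3, 128, 128)          # a batch of RGB face crops
    print(torch.sigmoid(model(batch)).shape)     # per-image fake probability, (8, 1)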