Search CORE

20 research outputs found

Improvements of Silent Speech Interface Algorithms

Author: Honarmandi Shandiz Amin
Publication venue
Publication date
Field of study

SZTE Doktori Értekezések Repozitórium (SZTE Repository of Dissertations)

A Survey on Deep Learning in Medical Image Analysis

Author: Bejnordi Babak Ehteshami
Ciompi Francesco
Ghafoorian Mohsen
Kooi Thijs
Litjens Geert
Setio Arnaud Arindra Adiyoso
Sánchez Clara I.
van der Laak Jeroen A. W. M.
van Ginneken Bram
Publication venue: 'Elsevier BV'
Publication date: 01/01/2017
Field of study

Deep learning algorithms, in particular convolutional networks, have rapidly become a methodology of choice for analyzing medical images. This paper reviews the major deep learning concepts pertinent to medical image analysis and summarizes over 300 contributions to the field, most of which appeared in the last year. We survey the use of deep learning for image classification, object detection, segmentation, registration, and other tasks and provide concise overviews of studies per application area. Open challenges and directions for future research are discussed.Comment: Revised survey includes expanded discussion section and reworked introductory section on common deep architectures. Added missed papers from before Feb 1st 201

arXiv.org e-Print Archive

Radboud Repository

Exploring the Landscape of Ubiquitous In-home Health Monitoring: A Comprehensive Survey

Author: Etemad Ali
Pourpanah Farhad
Publication venue
Publication date: 22/06/2023
Field of study

Ubiquitous in-home health monitoring systems have become popular in recent years due to the rise of digital health technologies and the growing demand for remote health monitoring. These systems enable individuals to increase their independence by allowing them to monitor their health from the home and by allowing more control over their well-being. In this study, we perform a comprehensive survey on this topic by reviewing a large number of literature in the area. We investigate these systems from various aspects, namely sensing technologies, communication technologies, intelligent and computing systems, and application areas. Specifically, we provide an overview of in-home health monitoring systems and identify their main components. We then present each component and discuss its role within in-home health monitoring systems. In addition, we provide an overview of the practical use of ubiquitous technologies in the home for health monitoring. Finally, we identify the main challenges and limitations based on the existing literature and provide eight recommendations for potential future research directions toward the development of in-home health monitoring systems. We conclude that despite extensive research on various components needed for the development of effective in-home health monitoring systems, the development of effective in-home health monitoring systems still requires further investigation.Comment: 35 pages, 5 figure

arXiv.org e-Print Archive

IberSPEECH 2020: XI Jornadas en Tecnología del Habla and VII Iberian SLTech

Author: Cardeñoso Payo Valentín
Escudero Mancebo David
González Ferreras César
Publication venue: 'International Speech Communication Association'
Publication date: 25/03/2021
Field of study

IberSPEECH2020 is a two-day event, bringing together the best researchers and practitioners in speech and language technologies in Iberian languages to promote interaction and discussion. The organizing committee has planned a wide variety of scientific and social activities, including technical paper presentations, keynote lectures, presentation of projects, laboratories activities, recent PhD thesis, discussion panels, a round table, and awards to the best thesis and papers. The program of IberSPEECH2020 includes a total of 32 contributions that will be presented distributed among 5 oral sessions, a PhD session, and a projects session. To ensure the quality of all the contributions, each submitted paper was reviewed by three members of the scientific review committee. All the papers in the conference will be accessible through the International Speech Communication Association (ISCA) Online Archive. Paper selection was based on the scores and comments provided by the scientific review committee, which includes 73 researchers from different institutions (mainly from Spain and Portugal, but also from France, Germany, Brazil, Iran, Greece, Hungary, Czech Republic, Ucrania, Slovenia). Furthermore, it is confirmed to publish an extension of selected papers as a special issue of the Journal of Applied Sciences, “IberSPEECH 2020: Speech and Language Technologies for Iberian Languages”, published by MDPI with fully open access. In addition to regular paper sessions, the IberSPEECH2020 scientific program features the following activities: the ALBAYZIN evaluation challenge session.Red Española de Tecnologías del Habla. Universidad de Valladoli

Repositorio Documental de la Universidad de Valladolid

On Improving Generalization of CNN-Based Image Classification with Delineation Maps Using the CORF Push-Pull Inhibition Operator

Author: Antonisse Joey
Azzopardi George
Bennabhaktula Swaroop
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 31/10/2021
Field of study

Deployed image classification pipelines are typically dependent on the images captured in real-world environments. This means that images might be affected by different sources of perturbations (e.g. sensor noise in low-light environments). The main challenge arises by the fact that image quality directly impacts the reliability and consistency of classification tasks. This challenge has, hence, attracted wide interest within the computer vision communities. We propose a transformation step that attempts to enhance the generalization ability of CNN models in the presence of unseen noise in the test set. Concretely, the delineation maps of given images are determined using the CORF push-pull inhibition operator. Such an operation transforms an input image into a space that is more robust to noise before being processed by a CNN. We evaluated our approach on the Fashion MNIST data set with an AlexNet model. It turned out that the proposed CORF-augmented pipeline achieved comparable results on noise-free images to those of a conventional AlexNet classification model without CORF delineation maps, but it consistently achieved significantly superior performance on test images perturbed with different levels of Gaussian and uniform noise

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Beszédtudomány 2020

Author
Publication venue: Nyelvtudományi Intézet
Publication date: 01/01/2020
Field of study

REAL-J

Emotion-aware voice interfaces based on speech signal processing

Author: Ma Yong
Publication venue: Ludwig-Maximilians-Universität München
Publication date: 28/11/2022
Field of study

Voice interfaces (VIs) will become increasingly widespread in current daily lives as AI techniques progress. VIs can be incorporated into smart devices like smartphones, as well as integrated into autos, home automation systems, computer operating systems, and home appliances, among other things. Current speech interfaces, however, are unaware of users’ emotional states and hence cannot support real communication. To overcome these limitations, it is necessary to implement emotional awareness in future VIs. This thesis focuses on how speech signal processing (SSP) and speech emotion recognition (SER) can enable VIs to gain emotional awareness. Following an explanation of what emotion is and how neural networks are implemented, this thesis presents the results of several user studies and surveys. Emotions are complicated, and they are typically characterized using category and dimensional models. They can be expressed verbally or nonverbally. Although existing voice interfaces are unaware of users’ emotional states and cannot support natural conversations, it is possible to perceive users’ emotions by speech based on SSP in future VIs. One section of this thesis, based on SSP, investigates mental restorative eﬀects on humans and their measures from speech signals. SSP is less intrusive and more accessible than traditional measures such as attention scales or response tests, and it can provide a reliable assessment for attention and mental restoration. SSP can be implemented into future VIs and utilized in future HCI user research. The thesis then moves on to present a novel attention neural network based on sparse correlation features. The detection accuracy of emotions in the continuous speech was demonstrated in a user study utilizing recordings from a real classroom. In this section, a promising result will be shown. In SER research, it is unknown if existing emotion detection methods detect acted emotions or the genuine emotion of the speaker. Another section of this thesis is concerned with humans’ ability to act on their emotions. In a user study, participants were instructed to imitate five fundamental emotions. The results revealed that they struggled with this task; nevertheless, certain emotions were easier to replicate than others. A further study concern is how VIs should respond to users’ emotions if SER techniques are implemented in VIs and can recognize users’ emotions. The thesis includes research on ways for dealing with the emotions of users. In a user study, users were instructed to make sad, angry, and terrified VI avatars happy and were asked if they would like to be treated the same way if the situation were reversed. According to the results, the majority of participants tended to respond to these unpleasant emotions with neutral emotion, but there is a diﬀerence among genders in emotion selection. For a human-centered design approach, it is important to understand what the users’ preferences for future VIs are. In three distinct cultures, a questionnaire-based survey on users’ attitudes and preferences for emotion-aware VIs was conducted. It was discovered that there are almost no gender diﬀerences. Cluster analysis found that there are three fundamental user types that exist in all cultures: Enthusiasts, Pragmatists, and Sceptics. As a result, future VI development should consider diverse sorts of consumers. In conclusion, future VIs systems should be designed for various sorts of users as well as be able to detect the users’ disguised or actual emotions using SER and SSP technologies. Furthermore, many other applications, such as restorative eﬀects assessments, can be included in the VIs system

Digitale Hochschulschriften der LMU

XVII. Magyar Számítógépes Nyelvészeti Konferencia

Author
Publication venue
Publication date: 01/01/2021
Field of study

University of Szeged

Intra-nodal Caching Assisted UAV Based Data Acquisition from Wireless Mobile Ad-hoc Sensor Networks

Author: Chaudhry U
ETAI 2021 Conference Proceedings
Publication venue
Publication date: 01/09/2021
Field of study

Queen Mary Research Online