5 research outputs found

Human-technology integration with industrial conversational agents: A conceptual architecture and a taxonomy for manufacturing

Authors: Colabianchi S., Costantino F., Tedeschi A.
Publication date: 01/01/2023

Conversational agents are systems with great potential to enhance human-computer interaction in industrial settings. Although the number of applications of conversational agents in many fields is growing, there is no shared view of the elements to design and implement for chatbots in the industrial field. The paper combines many research contributions into an integrated conceptual architecture for developing industrial conversational agents using Nickerson's methodology. The conceptual architecture consists of five core modules; every module consists of specific elements and approaches. Furthermore, the paper defines a taxonomy from the study of empirical applications of manufacturing conversational agents. Indeed, some applications of chatbots in manufacturing are available, but they have never been collected in a single study. The paper fills this gap by analyzing the empirical cases and presenting a qualitative analysis, with verification of the proposed taxonomy. The contribution of the article is mainly to illustrate the elements needed for the development of a conversational agent in manufacturing: researchers and practitioners can use the proposed conceptual architecture and taxonomy to more easily investigate, define, and develop all the elements for chatbot implementation.

Archivio della ricerca- Università di Roma La Sapienza

Survey of deep representation learning for speech emotion recognition

Authors: Jurdak Raja, Khalifa Sara, Latif Siddique, Qadir Junaid, Rana Rajib, Schuller Björn W.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2021

Traditionally, speech emotion recognition (SER) research has relied on manually handcrafted acoustic features using feature engineering. However, the design of handcrafted features for complex SER tasks requires significant manual effort, which impedes generalisability and slows the pace of innovation. This has motivated the adoption of representation learning techniques that can automatically learn an intermediate representation of the input signal without any manual feature engineering. Representation learning has led to improved SER performance and enabled rapid innovation. Its effectiveness has further increased with advances in deep learning (DL), which has facilitated deep representation learning, where hierarchical representations are automatically learned in a data-driven manner. This paper presents the first comprehensive survey on the important topic of deep representation learning for SER. We highlight various techniques and related challenges, and identify important future areas of research. Our survey bridges the gap in the literature, since existing surveys either focus on SER with hand-engineered features or on representation learning in the general setting without focusing on SER.

OPUS Augsburg

Queensland University of Technology ePrints Archive

University of Southern Queensland ePrints

Deep Learning-based Speech Enhancement for Real-life Applications

Author: Abdallah Abdelhafiz Nossier S.
Publication venue: University of East London
Publication date: 01/01/2023

Speech enhancement is the process of improving speech quality and intelligibility by suppressing noise. Inspired by the outstanding performance of the deep learning approach for speech enhancement, this thesis aims to add to this research area through the following contributions. The thesis presents an experimental analysis of different deep neural networks for speech enhancement, to compare their performance and investigate factors and approaches that improve the performance. The outcomes of this analysis facilitate the development of better speech enhancement networks in this work. Moreover, this thesis proposes a new deep convolutional denoising autoencoder-based speech enhancement architecture, in which strided and dilated convolutions were applied to improve the performance while keeping network complexity to a minimum. Furthermore, a two-stage speech enhancement approach is proposed that reduces distortion, by performing a speech denoising first stage in the frequency domain, followed by a second speech reconstruction stage in the time domain. This approach was proven to reduce speech distortion, leading to better overall quality of the processed speech in comparison to state-of-the-art speech enhancement models. Finally, the work presents two deep neural network speech enhancement architectures for hearing aids and automatic speech recognition, as two real-world speech enhancement applications. A smart speech enhancement architecture was proposed for hearing aids, which is an integrated hearing aid and alert system. This architecture enhances both speech and important emergency noise, and only eliminates undesired noise. The results show that this idea is applicable to improve the performance of hearing aids.
On the other hand, the architecture proposed for automatic speech recognition solves the mismatch issue between speech enhancement and automatic speech recognition systems, leading to a significant reduction in the word error rate of a baseline automatic speech recognition system, provided by Intelligent Voice for research purposes. In conclusion, the results presented in this thesis show promising performance for the proposed architectures for real-time speech enhancement applications.

UEL Research Repository at University of East London