Attention-Inspired Artificial Neural Networks for Speech Processing: A Systematic Review

Author: Garcia-Constantino Matias
Hernández-Nolasco José Adán
Pancardo Pablo
Zacarias-Morales Noel
Publication venue: 'MDPI AG'
Publication date: 01/01/2021

Artificial Neural Networks (ANNs) were inspired by the neural networks in the human brain and have been widely applied in speech processing. Application areas of ANNs include speech recognition, speech emotion recognition, language identification, speech enhancement, and speech separation, amongst others. Likewise, given that speech processing performed by humans involves complex cognitive processes known as auditory attention, a growing number of papers propose ANNs supported by deep learning algorithms in conjunction with some mechanism to achieve symmetry with the human attention process. However, while these ANN approaches include attention, there is no categorization of the attention mechanisms integrated into the deep learning algorithms or of their relation to human auditory attention. Therefore, we consider it necessary to review the different attention-inspired ANN approaches to show both academic and industry experts the available models for a wide variety of applications. Based on the PRISMA methodology, we present a systematic review of the literature published since 2000 in which deep learning algorithms are applied to diverse problems related to speech processing. In this paper, 133 research works are selected and the following aspects are described: (i) most relevant features, (ii) ways in which attention has been implemented, (iii) their hypothetical relationship with human attention, and (iv) the evaluation metrics used. Additionally, the four publications most closely related to human attention were analyzed and their strengths and weaknesses were determined.

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Ulster University's Research Portal

Bridging the Granularity Gap for Acoustic Modeling

Author: Hu Chi
Jiao Chengbo
Liu Xiaoqian
Ma Anxiang
Wang Huizhen
Xiao Tong
Xu Chen
Zeng Xin
Zhang Yuhao
Zhu JingBo
Publication date: 26/05/2023

While the Transformer has become the de facto standard for speech, modeling fine-grained frame-level features remains an open challenge for capturing long-distance dependencies and distributing the attention weights. We propose Progressive Down-Sampling (PDS), which gradually compresses the acoustic features into coarser-grained units containing more complete semantic information, like text-level representation. In addition, we develop a representation fusion method to alleviate information loss that occurs inevitably during high compression. In this way, we compress the acoustic features into 1/32 of the initial length while achieving better or comparable performances on the speech recognition task. And as a bonus, it yields inference speedups ranging from 1.20× to 1.47×. By reducing the modeling burden, we also achieve competitive results when training on the more challenging speech translation task. Comment: ACL 2023 Finding
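As a rough illustration of how a 1/32 length reduction can be reached progressively, the sketch below halves the frame sequence five times by averaging neighbouring frames. The stage count, the stride of 2, the mean pooling, and the function names are all assumptions made for illustration; the abstract does not specify how PDS performs each compression step, and the representation fusion method is not modelled here.

```python
def downsample_once(frames, stride=2):
    """Average each run of `stride` consecutive frames into one coarser unit."""
    pooled = []
    for start in range(0, len(frames), stride):
        window = frames[start:start + stride]
        dim = len(window[0])
        pooled.append([sum(f[d] for f in window) / len(window) for d in range(dim)])
    return pooled


def progressive_downsample(frames, num_stages=5, stride=2):
    """Apply the pooling step repeatedly and record the length after each stage.

    Five stride-2 stages give 2**5 = 32x compression (an assumed configuration).
    """
    lengths = [len(frames)]
    for _ in range(num_stages):
        frames = downsample_once(frames, stride)
        lengths.append(len(frames))
    return frames, lengths


# Toy input: 640 frames with 2 features each (hypothetical data).
frames = [[float(t), float(-t)] for t in range(640)]
compressed, lengths = progressive_downsample(frames)
print(lengths)          # sequence length after each stage
print(len(compressed))  # final number of coarse-grained units
```

With 640 input frames, the sequence shrinks by half at every stage and ends at 20 units, which is 1/32 of the initial length.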

arXiv.org e-Print Archive

Machine Reading at Scale: A Search Engine for Scientific and Academic Research

Author: Oliveira Nuno
Praça Isabel
Sousa Norberto
Publication venue: 'MDPI AG'
Publication date: 01/01/2022

The Internet, much like our universe, is ever-expanding. Information, in the most varied formats, is continuously added to the point of information overload. Consequently, the ability to navigate this ocean of data is crucial in our day-to-day lives, with familiar tools such as search engines carving a path through this unknown. In the research world, articles on a myriad of topics with distinct complexity levels are published daily, requiring specialized tools to facilitate the access and assessment of the information within. Recent endeavors in artificial intelligence, and in natural language processing in particular, can be seen as potential solutions for breaking information overload and providing enhanced search mechanisms by means of advanced algorithms. While the advent of transformer-based language models has contributed to a more comprehensive analysis of both text-encoded intents and true document semantic meaning, there is simultaneously a need for additional computational resources. Information retrieval methods can act as low-complexity, yet reliable, filters to feed heavier algorithms, thus reducing computational requirements substantially. In this work, a new search engine is proposed, addressing machine reading at scale in the context of scientific and academic research. It combines state-of-the-art algorithms for information retrieval and reading comprehension tasks to extract meaningful answers from a corpus of scientific documents. The solution is then tested on two current and relevant topics, cybersecurity and energy, proving that the system is able to perform under distinct knowledge domains while achieving competent performance. This work has received funding from the following projects: UIDB/00760/2020 and UIDP/00760/2020.

Repositório Científico do Instituto Politécnico do Porto

Directory of Open Access Journals

Vision-based pavement marking detection and condition assessment: a case study

Author: Chen Mengcheng
Shou Wenchi
Wang Jun
Wang Xiangyu
Wu Peng
Xu Shuyuan
Publication venue: 'MDPI AG'
Publication date: 01/01/2021

Pavement markings are an effective way of conveying regulations and guidance to drivers. They are the most fundamental way to communicate with road users and thus contribute greatly to safety and order on roads. However, due to increasingly extensive traffic demand, pavement markings are subject to a series of deterioration issues (e.g., wear and tear). Markings in poor condition typically appear blurred or are even missing in certain places. The need for proper maintenance strategies on roadway markings, such as repainting, can only be determined based on a comprehensive understanding of their as-is worn condition. Given that an efficient, automated and accurate approach to collecting such condition information is lacking in practice, this study proposes a vision-based framework for pavement marking detection and condition assessment. A hybrid feature detector and a threshold-based method were used for line marking identification and classification. For each identified line marking, its worn/blurred severity level was then quantified in terms of worn percentage at a pixel level. The damage estimation results were compared to manual measurements for evaluation, indicating that the proposed method is capable of providing indicative knowledge about the as-is condition of pavement markings. This paper demonstrates the promising potential of computer vision in the infrastructure sector, in terms of implementing a wider range of managerial operations for roadway management.
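One plausible reading of "worn percentage at a pixel level" is the share of pixels inside an identified marking region that no longer show paint. The sketch below computes that ratio from two binary masks. The mask representation, the function name `worn_percentage`, and the formula itself are assumptions for illustration; the abstract does not define the measure.

```python
def worn_percentage(expected_mask, paint_mask):
    """Return the percentage of marking pixels that show no remaining paint.

    expected_mask: rows of 0/1 values, 1 where the line marking should be.
    paint_mask:    rows of 0/1 values, 1 where paint is still detected.
    Both masks are hypothetical inputs with the same shape.
    """
    expected = 0
    worn = 0
    for expected_row, paint_row in zip(expected_mask, paint_mask):
        for is_marking, has_paint in zip(expected_row, paint_row):
            if is_marking:
                expected += 1
                if not has_paint:
                    worn += 1
    if expected == 0:
        raise ValueError("expected_mask contains no marking pixels")
    return 100.0 * worn / expected


# Toy 4x4 example: the marking covers the two middle columns (8 pixels),
# and paint is missing from 2 of them.
expected_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
]
paint_mask = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 0],
]
print(worn_percentage(expected_mask, paint_mask))  # 25.0
```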

Deakin Research Online

Western Sydney ResearchDirect

espace@Curtin

Neural Natural Language Generation: A Survey on Multilinguality, Multimodality, Controllability and Learning

Author: Apostol Elena-Simona
Babii Andrii
Berend Gábor
Calixto Iacer
Erdem Aykut
Erdem Erkut
Frank Anette
Gatt Albert
Korvel Gražina
Kuyu Menekse
Lloret Elena
Martinčić-Ipšić Sanda
Parcalabescu Letitia
Truică Ciprian-Octavian
Turuta Oleksii
Yagcioglu Semih
Šandrih Branislava
Publication venue: 'AI Access Foundation'
Publication date: 06/04/2022

Developing artificial learning systems that can understand and generate natural language has been one of the long-standing goals of artificial intelligence. Recent decades have witnessed impressive progress on both of these problems, giving rise to a new family of approaches. In particular, the advances in deep learning over the past couple of years have led to neural approaches to natural language generation (NLG). These methods combine generative language learning techniques with neural-network-based frameworks. With a wide range of applications in natural language processing, neural NLG (NNLG) is a new and fast-growing field of research. In this state-of-the-art report, we investigate the recent developments and applications of NNLG to their full extent from a multidimensional view, covering critical perspectives such as multimodality, multilinguality, controllability and learning strategies. We summarize the fundamental building blocks of NNLG approaches from these aspects and provide detailed reviews of commonly used preprocessing steps and basic neural architectures. This report also focuses on the seminal applications of these NNLG models such as machine translation, description generation, automatic speech recognition, abstractive summarization, text simplification, question answering and generation, and dialogue generation. Finally, we conclude with a thorough discussion of the described frameworks by pointing out some open research directions. This work has been partially supported by the European Commission ICT COST Action “Multi-task, Multilingual, Multi-modal Language Generation” (CA18231). AE was supported by BAGEP 2021 Award of the Science Academy. EE was supported in part by TUBA GEBIP 2018 Award. BP is in part funded by Independent Research Fund Denmark (DFF) grant 9063-00077B. IC has received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Sklodowska-Curie grant agreement No 838188.
EL is partly funded by Generalitat Valenciana and the Spanish Government through projects PROMETEU/2018/089 and RTI2018-094649-B-I00, respectively. SMI is partly funded by UNIRI project uniri-drustv-18-20. GB is partly supported by the Ministry of Innovation and the National Research, Development and Innovation Office within the framework of the Hungarian Artificial Intelligence National Laboratory Programme. COT is partially funded by the Romanian Ministry of European Investments and Projects through the Competitiveness Operational Program (POC) project “HOLOTRAIN” (grant no. 29/221 ap2/07.04.2020, SMIS code: 129077) and by the German Academic Exchange Service (DAAD) through the project “AWAKEN: content-Aware and netWork-Aware faKE News mitigation” (grant no. 91809005). ESA is partially funded by the German Academic Exchange Service (DAAD) through the project “Deep-Learning Anomaly Detection for Human and Automated Users Behavior” (grant no. 91809358).

Repositorio Institucional de la Universidad de Alicante

Past, Present, and Future of EEG-Based BCI Applications

Author: Muhammad Naveed
Muhammad Yar
Värbu Kaido
Publication venue: 'MDPI AG'
Publication date: 01/04/2022

An electroencephalography (EEG)-based brain–computer interface (BCI) is a system that provides a pathway between the brain and external devices by interpreting EEG. EEG-based BCI applications were initially developed for medical purposes, with the aim of facilitating the return of patients to normal life. Beyond this initial aim, EEG-based BCI applications have also gained increasing significance in the non-medical domain, improving the lives of healthy people, for instance by making them more efficient and collaborative and by supporting self-development. The objective of this review is to give a systematic overview of the literature on EEG-based BCI applications from 2009 to 2019. The systematic literature review was prepared based on three databases: PubMed, Web of Science and Scopus. This review was conducted following the PRISMA model. In this review, 202 publications were selected based on specific eligibility criteria. The distribution of the research between the medical and non-medical domains was analyzed and further categorized into fields of research within the reviewed domains. The equipment used for gathering EEG data and the signal processing methods were also reviewed. Additionally, current challenges in the field and possibilities for the future were analyzed.

Directory of Open Access Journals

Teesside University's Research Repository

PubMed Central

How well do deep learning-based methods for land cover classification and object detection perform on high resolution remote sensing imagery?

Author: Han L
Han L
Zhang X
Zhu L
Publication venue: 'MDPI AG'
Publication date: 28/01/2020

Land cover information plays an important role in mapping ecological and environmental changes in Earth's diverse landscapes for ecosystem monitoring. Remote sensing data have been widely used for the study of land cover, enabling efficient mapping of changes of the Earth's surface from space. Although the availability of high-resolution remote sensing imagery increases significantly every year, traditional land cover analysis approaches based on pixel and object levels are not optimal. Recent advances in deep learning have achieved remarkable success in the image recognition field and have shown potential in high spatial resolution remote sensing applications, including classification and object detection. In this paper, a comprehensive review of land cover classification and object detection approaches using high-resolution imagery is provided. Through two case studies, we demonstrated the application of state-of-the-art deep learning models to high spatial resolution remote sensing data for land cover classification and object detection and evaluated their performance against traditional approaches. For a land cover classification task, the deep-learning-based methods provide an end-to-end solution by using both spatial and spectral information. They have shown better performance than the traditional pixel-based method, especially for the categories of different vegetation. For an object detection task, the deep-learning-based object detection method achieved more than 98% accuracy in a large area; its high accuracy and efficiency could relieve the burden of the traditional, labour-intensive method. However, considering the diversity of remote sensing data, more training datasets are required in order to improve the generalisation and the robustness of deep-learning-based models.

E-space: Manchester Metropolitan University's Research Repository

Deep learning-based change detection in remote sensing images: a review

Author: Asad Muhammad
Aslam Muhammad
Cao Guo
Khan Zia
Shafique Ayesha
Publication venue: 'MDPI AG'
Publication date: 11/02/2022

Images gathered from different satellites are widely available these days due to the fast development of remote sensing (RS) technology. These images significantly enhance the data sources of change detection (CD). CD is a technique for recognizing the dissimilarities between images acquired at distinct intervals and is used for numerous applications, such as urban area development, disaster management, land cover object identification, etc. In recent years, deep learning (DL) techniques have been used extensively in change detection processes, where they have achieved great success because of their practical applications. Some researchers have even claimed that DL approaches outperform traditional approaches and enhance change detection accuracy. Therefore, this review focuses on supervised, unsupervised, and semi-supervised deep learning techniques for different change detection datasets, such as SAR, multispectral, hyperspectral, VHR, and heterogeneous images, and highlights their advantages and disadvantages. In the end, some significant challenges are discussed to understand the context of improvements in change detection datasets and deep learning models. Overall, this review will be beneficial for the future development of CD methods.

Multidisciplinary Digital Publishing Institute

Research Repository and Portal - University of the West of Scotland