Comparison of echo state network output layer classification methods on noisy data
Echo state networks are a recently developed type of recurrent neural network
where the internal layer is fixed with random weights, and only the output
layer is trained on specific data. Echo state networks are increasingly being
used to process spatiotemporal data in real-world settings, including speech
recognition, event detection, and robot control. A strength of echo state
networks is the simple method used to train the output layer - typically a
collection of linear readout weights found using a least squares approach.
Although straightforward to train and having a low computational cost to use,
this method may not yield acceptable accuracy performance on noisy data.
This study compares the performance of three echo state network output layer
methods to perform classification on noisy data: using trained linear weights,
using sparse trained linear weights, and using trained low-rank approximations
of reservoir states. The methods are investigated experimentally on both
synthetic and natural datasets. The experiments suggest that using regularized
least squares to train linear output weights is superior on data with low
noise, but using the low-rank approximations may significantly improve accuracy
on datasets contaminated with higher noise levels.
Comment: 8 pages. International Joint Conference on Neural Networks (IJCNN 2017)
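The readout training the abstract describes can be sketched in a few lines. The following is a minimal illustration, not the paper's implementation: all sizes, the toy sign-of-a-sine classification task, and the regularization strength `lam` are assumptions chosen for the example, and only the ridge-regression readout (the "regularized least squares" baseline) is shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy sizes -- illustrative, not taken from the paper.
n_inputs, n_reservoir, n_outputs = 1, 50, 2
washout = 20  # discard initial transient states

# Fixed random reservoir: input and recurrent weights are never trained.
W_in = rng.uniform(-0.5, 0.5, (n_reservoir, n_inputs))
W = rng.uniform(-0.5, 0.5, (n_reservoir, n_reservoir))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # spectral radius below 1

def run_reservoir(u):
    """Collect reservoir states for an input sequence u of shape (T, n_inputs)."""
    x = np.zeros(n_reservoir)
    states = []
    for t in range(len(u)):
        x = np.tanh(W_in @ u[t] + W @ x)
        states.append(x.copy())
    return np.array(states)

# Noisy toy classification data: the label is the sign of a sine wave.
T = 200
u = np.sin(np.linspace(0, 8 * np.pi, T))[:, None] + 0.1 * rng.standard_normal((T, 1))
y = np.eye(n_outputs)[(u[:, 0] > 0).astype(int)]  # one-hot targets

X = run_reservoir(u)[washout:]
Y = y[washout:]

# Regularized least squares (ridge) readout: W_out = (X^T X + lam*I)^-1 X^T Y
lam = 1e-2
W_out = np.linalg.solve(X.T @ X + lam * np.eye(n_reservoir), X.T @ Y)

pred = np.argmax(X @ W_out, axis=1)
acc = np.mean(pred == np.argmax(Y, axis=1))
print(f"training accuracy: {acc:.2f}")
```

Only `W_out` is learned; this cheap closed-form solve is the low-cost training method the abstract contrasts with the sparse and low-rank alternatives.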
Relative Positional Encoding for Transformers with Linear Complexity
Recent advances in Transformer models allow for unprecedented sequence
lengths, due to linear space and time complexity. In the meantime, relative
positional encoding (RPE) was proposed as beneficial for classical Transformers
and consists in exploiting lags instead of absolute positions for inference.
Still, RPE is not available for the recent linear-variants of the Transformer,
because it requires the explicit computation of the attention matrix, which is
precisely what is avoided by such methods. In this paper, we bridge this gap
and present Stochastic Positional Encoding as a way to generate PE that can be
used as a replacement to the classical additive (sinusoidal) PE and provably
behaves like RPE. The main theoretical contribution is to make a connection
between positional encoding and cross-covariance structures of correlated
Gaussian processes. We illustrate the performance of our approach on the
Long-Range Arena benchmark and on music generation.
Comment: ICML 2021 (long talk) camera-ready. 24 pages
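For context, the "classical additive (sinusoidal) PE" that the proposed stochastic encoding replaces can be sketched as below. This is the standard Transformer formulation, not the paper's method; the sequence length and model dimension are arbitrary example values.

```python
import numpy as np

def sinusoidal_pe(seq_len, d_model):
    """Classical additive positional encoding:
    PE[pos, 2i]   = sin(pos / 10000^(2i/d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model))
    (d_model is assumed even here.)
    """
    pos = np.arange(seq_len)[:, None]
    i = np.arange(0, d_model, 2)[None, :]
    angles = pos / (10000 ** (i / d_model))
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)  # even dimensions
    pe[:, 1::2] = np.cos(angles)  # odd dimensions
    return pe

pe = sinusoidal_pe(128, 64)
# These vectors are simply added to the token embeddings before attention,
# encoding *absolute* positions -- which is what RPE replaces with lags.
print(pe.shape)
```

Because these encodings depend only on absolute position, relating two tokens by their lag requires the explicit attention matrix, which is exactly what linear-complexity Transformers avoid, hence the gap the paper addresses.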
A review of automatic drum transcription
In Western popular music, drums and percussion are an important means to emphasize and shape the rhythm, often defining the musical style. If computers were able to analyze the drum part in recorded music, it would enable a variety of rhythm-related music processing tasks. In particular, the detection and classification of drum sound events by computational methods is considered an important and challenging research problem in the broader field of Music Information Retrieval. Over the last two decades, several authors have attempted to tackle this problem under the umbrella term Automatic Drum Transcription (ADT). This paper presents a comprehensive review of ADT research, including a thorough discussion of the task-specific challenges, categorization of existing techniques, and evaluation of several state-of-the-art systems. To provide more insights on the practice of ADT systems, we focus on two families of ADT techniques, namely methods based on Nonnegative Matrix Factorization and Recurrent Neural Networks. We explain the methods’ technical details and drum-specific variations and evaluate these approaches on publicly available datasets with a consistent experimental setup. Finally, the open issues and under-explored areas in ADT research are identified and discussed, providing future directions in this field.
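The NMF family of ADT methods mentioned above factorizes a magnitude spectrogram V into spectral templates W (one per drum) and temporal activations H, whose peaks indicate drum onsets. The sketch below illustrates the idea on a synthetic spectrogram; the template shapes, sizes, onset pattern, and threshold are all invented for the example, and only the simplest variant (fixed templates, Euclidean-distance multiplicative updates) is shown.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for a magnitude spectrogram (freq bins x time frames).
n_bins, n_frames, n_drums = 40, 100, 2

# Hypothetical spectral templates: 'kick' = low-frequency, 'snare' = broadband.
templates = np.zeros((n_bins, n_drums))
templates[:10, 0] = 1.0            # kick: energy in the lowest bins
templates[:, 1] = 0.3
templates[15:30, 1] = 1.0          # snare: broadband with a mid-band peak

# Ground-truth activations: impulses at known onset frames.
H_true = np.zeros((n_drums, n_frames))
H_true[0, ::16] = 1.0              # kick every 16 frames
H_true[1, 8::16] = 1.0             # snare on the off-beats
V = templates @ H_true + 0.01 * rng.random((n_bins, n_frames))

# NMF with fixed templates W: only the activations H are updated,
# using the multiplicative rule that minimizes ||V - W H||^2.
W = templates
H = rng.random((n_drums, n_frames))
for _ in range(200):
    H *= (W.T @ V) / (W.T @ W @ H + 1e-9)

# Onsets = frames where an activation exceeds a simple per-drum threshold.
onsets = H > 0.5 * H.max(axis=1, keepdims=True)
print("kick onsets:", np.where(onsets[0])[0])
```

Real ADT systems refine this with adaptive templates, better divergences, and dedicated onset pickers, but the decomposition-then-threshold structure is the same.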
Automatic Speech Recognition for Low-resource Languages and Accents Using Multilingual and Crosslingual Information
This thesis explores methods to rapidly bootstrap automatic speech recognition systems for languages which lack resources for speech and language processing. We focus on approaches that allow using data from multiple languages to improve performance for those languages at different levels, such as feature extraction, acoustic modeling, and language modeling. On the application side, this thesis also includes research on non-native and Code-Switching speech.
Script Effects as the Hidden Drive of the Mind, Cognition, and Culture
This open access volume reveals the hidden power of the script we read in and how it shapes and drives our minds, ways of thinking, and cultures. Expanding on the Linguistic Relativity Hypothesis (i.e., the idea that language affects the way we think), this volume proposes the “Script Relativity Hypothesis” (i.e., the idea that the script in which we read affects the way we think) by offering a unique perspective on the effect of script (alphabets, morphosyllabaries, or multi-scripts) on our attention, perception, and problem-solving. Once we become literate, fundamental changes occur in our brain circuitry to accommodate the new demand for resources. The powerful effects of literacy have been demonstrated by research on literate versus illiterate individuals, as well as cross-scriptal transfer, indicating that literate brain networks function differently, depending on the script being read. This book identifies the locus of differences between the Chinese, Japanese, and Koreans, and between the East and the West, as the neural underpinnings of literacy. To support the “Script Relativity Hypothesis”, it reviews a vast corpus of empirical studies, including anthropological accounts of human civilization, social psychology, cognitive psychology, neuropsychology, applied linguistics, second language studies, and cross-cultural communication. It also discusses the impact of reading from screens in the digital age, as well as the impact of bi-script or multi-script use, which is a growing trend around the globe. As a result, our minds, ways of thinking, and cultures are now growing closer together, not farther apart. 
Examines the origin, emergence, and co-evolution of written language, the human mind, and culture within the purview of script effects; investigates how the scripts we read over time shape our cognition, mind, and thought patterns; provides a new outlook on the four representative writing systems of the world; discusses the consequences of literacy for the functioning of the mind.
Music Encoding Conference Proceedings 2021, 19–22 July, 2021 University of Alicante (Spain): Onsite & Online
This document includes the papers and posters presented at the Music Encoding Conference 2021, held in Alicante from 19 to 22 July 2021. Funded by project Multiscore, MCIN/AEI/10.13039/50110001103