51,436 research outputs found
VLSI implementation of an energy-aware wake-up detector for an acoustic surveillance sensor network
We present a low-power VLSI wake-up detector for a sensor network that uses acoustic signals to localize ground-based vehicles. The detection criterion is the degree of low-frequency periodicity in the acoustic signal, and the periodicity is computed from the "bumpiness" of the autocorrelation of a one-bit version of the signal. We then describe a CMOS ASIC that implements the periodicity estimation algorithm. The ASIC is functional and its core consumes 835 nanowatts. It was integrated into an acoustic enclosure and deployed in field tests with synthesized sounds and ground-based vehicles.
Authors: Goldberg, David H. (Johns Hopkins University, United States); Andreou, Andreas (Johns Hopkins University, United States); Julian, Pedro Marcelo (Consejo Nacional de Investigaciones Científicas y Técnicas, Argentina; Universidad Nacional del Sur, Departamento de Ingeniería Eléctrica y de Computadoras, Argentina); Pouliquen, Philippe O. (Johns Hopkins University, United States); Riddle, Laurence (Signal Systems Corporation, United States); Rosasco, Rich (Signal Systems Corporation, United States)
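The detection criterion above (low-frequency periodicity measured from the autocorrelation of a one-bit version of the signal) can be sketched in software. The abstract does not give the ASIC's exact "bumpiness" formula, so the measure below (peak autocorrelation magnitude at non-trivial lags) and all names are illustrative assumptions:

```python
import numpy as np

def periodicity_score(x, min_lag=50, max_lag=400):
    """Hypothetical sketch of the wake-up criterion: quantize the signal
    to one bit, autocorrelate, and measure how strongly the autocorrelation
    "bumps" at non-trivial lags. The peak magnitude used here is an
    illustrative stand-in for the ASIC's actual bumpiness measure."""
    b = np.sign(x - np.mean(x))                      # one-bit version of the signal
    n = len(b)
    # normalized autocorrelation over a band of lags excluding the trivial peak at 0
    r = np.array([np.dot(b[:n - lag], b[lag:]) / (n - lag)
                  for lag in range(min_lag, max_lag)])
    return float(np.max(np.abs(r)))

rng = np.random.default_rng(0)
t = np.arange(8000) / 8000.0                         # 1 s at 8 kHz (assumed rate)
engine = np.sign(np.sin(2 * np.pi * 40 * t))         # periodic 40 Hz "engine" tone
noise = rng.standard_normal(8000)                    # aperiodic background
# periodicity_score(engine) is near 1; periodicity_score(noise) is near 0
```

One appeal of the one-bit scheme for low-power hardware is that each autocorrelation product of ±1 samples reduces to a sign-agreement (XNOR) rather than a multiplication.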
Phone-aware Neural Language Identification
Pure acoustic neural models, particularly the LSTM-RNN model, have shown great potential in language identification (LID). However, phonetic information has been largely overlooked by most existing neural LID models, although it has been used with great success in conventional phonetic LID systems. We present a phone-aware neural LID architecture: a deep LSTM-RNN LID system that accepts output from an RNN-based ASR system. By utilizing this phonetic knowledge, LID performance can be significantly improved. Interestingly, even if the test language is not involved in the ASR training, the phonetic knowledge still makes a large contribution. Our experiments on four languages within the Babel corpus demonstrate that the phone-aware approach is highly effective.
Comment: arXiv admin note: text overlap with arXiv:1705.0315
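A minimal sketch of how phone information from an ASR front end might be fed to the LID network: each acoustic frame is augmented with the per-frame phone posterior vector, and the fused frames become the LSTM-RNN input. The concatenation scheme, dimensions, and names here are illustrative assumptions, not the paper's code:

```python
import numpy as np

# Toy dimensions (assumptions): 120 frames, 40 filterbank bins, 42 phones.
T, F, P = 120, 40, 42
rng = np.random.default_rng(1)

fbank = rng.standard_normal((T, F))    # stand-in acoustic features
logits = rng.standard_normal((T, P))   # stand-in per-frame ASR phone logits

# Softmax over the phone axis -> phone posteriors per frame.
phone_post = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

# Phone-aware input: acoustic frame concatenated with phone posteriors.
fused = np.concatenate([fbank, phone_post], axis=1)   # shape (T, F + P)
```

Because the phone posteriors describe sound classes rather than any single language's lexicon, this kind of fusion can plausibly help even when the test language was absent from ASR training, consistent with the abstract's observation.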
On the efficient representation and execution of deep acoustic models
In this paper we present a simple and computationally efficient quantization
scheme that enables us to reduce the resolution of the parameters of a neural
network from 32-bit floating point values to 8-bit integer values. The proposed
quantization scheme leads to significant memory savings and enables the use of
optimized hardware instructions for integer arithmetic, thus significantly
reducing the cost of inference. Finally, we propose a "quantization aware"
training process that applies the proposed scheme during network training and
find that it allows us to recover most of the loss in accuracy introduced by
quantization. We validate the proposed techniques by applying them to a long short-term memory-based acoustic model on an open-ended large-vocabulary speech recognition task.
Comment: Accepted conference paper: The Annual Conference of the International Speech Communication Association (Interspeech), 2016
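A minimal sketch of the kind of float32-to-int8 mapping the abstract describes. The per-tensor affine scale/offset scheme below is an assumption for illustration; the paper's exact quantizer may differ in detail:

```python
import numpy as np

def quantize_int8(w):
    """Map 32-bit float weights onto 8-bit integers via a per-tensor
    affine scale and offset (an illustrative scheme, not necessarily
    the paper's exact one)."""
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / 255.0 or 1.0          # guard against a constant tensor
    q = np.round((w - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize(q, scale, lo):
    """Recover approximate float weights from the 8-bit codes."""
    return q.astype(np.float32) * scale + lo

w = np.linspace(-1.0, 1.0, 1000).astype(np.float32)
q, scale, lo = quantize_int8(w)
w_hat = dequantize(q, scale, lo)              # max error is at most scale / 2
```

In a "quantization aware" training loop, a quantize-dequantize round trip like `dequantize(*quantize_int8(w))` would be applied in the forward pass so the network learns parameters that tolerate the rounding error, which matches the recovery of accuracy the abstract reports.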
A Compact and Discriminative Feature Based on Auditory Summary Statistics for Acoustic Scene Classification
One of the biggest challenges of acoustic scene classification (ASC) is to
find proper features to better represent and characterize environmental sounds.
Environmental sounds generally involve more sound sources while exhibiting less
structure in temporal spectral representations. However, the background of an
acoustic scene exhibits temporal homogeneity in acoustic properties, suggesting
it could be characterized by distribution statistics rather than temporal
details. In this work, we investigate using auditory summary statistics as the feature for ASC tasks. The inspiration comes from a recent neuroscience study showing that the human auditory system tends to perceive sound textures through time-averaged statistics. Based on these statistics, we further propose using linear discriminant analysis to eliminate redundancies among the statistics while keeping the discriminative information, providing an extremely compact representation for acoustic scenes. Experimental results show the outstanding performance of the proposed feature over conventional handcrafted features.
Comment: Accepted as a conference paper of Interspeech 201
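A toy sketch of time-averaged summary statistics as a fixed-length scene feature. The band decomposition, the chosen statistics (per-band mean, variance, skewness), and all names are simplifying assumptions; the paper's full auditory statistics are richer, and it additionally compacts the feature with linear discriminant analysis:

```python
import numpy as np

def summary_stats(subband_env):
    """Time-averaged statistics of subband envelopes as a scene feature
    (a simplified stand-in for full auditory summary statistics).
    subband_env: (n_bands, n_frames) array of envelope values."""
    m = subband_env.mean(axis=1)                       # per-band mean
    c = subband_env - m[:, None]
    var = (c ** 2).mean(axis=1)                        # per-band variance
    skew = (c ** 3).mean(axis=1) / np.maximum(var, 1e-12) ** 1.5
    # Concatenate into one fixed-length vector, independent of clip length.
    return np.concatenate([m, var, skew])

rng = np.random.default_rng(2)
env = np.abs(rng.standard_normal((32, 500)))           # toy 32-band envelopes
feat = summary_stats(env)                              # shape (96,)
```

Because every statistic is an average over time, the feature is invariant to reordering frames, which is exactly the temporal-homogeneity property of scene backgrounds the abstract appeals to.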