Change blindness: eradication of gestalt strategies

Author: Goddard Paul
Wilson Steve
Publication venue: 'Pion Ltd'
Publication date: 01/08/2011

Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al, 2003 Vision Research 43 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest that Gestalt grouping is not used as a strategy in these tasks, and it gives further weight to the argument that objects may be stored in and retrieved from a pre-attentional store during this task.
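As a quick consistency check on the reported statistic, the p-value for F(1,4)=2.565 can be recomputed by hand. The sketch below is not from the abstract: it relies on the identity that an F(1, v) statistic is the square of a Student's t statistic with v degrees of freedom, and on the closed-form t distribution for v = 4. The function name is hypothetical.

```python
import math

def p_value_f_1_4(f_stat: float) -> float:
    """Upper-tail p-value for an F statistic with (1, 4) degrees of freedom.

    Assumption: F(1, 4) = t(4)^2, so the upper-tail F probability equals
    the two-tailed t probability. For 4 degrees of freedom this has the
    closed form p = 1 - (3s - s^3) / 2, with s = t / sqrt(t^2 + 4).
    """
    t = math.sqrt(f_stat)
    s = t / math.sqrt(t * t + 4.0)
    return 1.0 - (3.0 * s - s ** 3) / 2.0

# Approximately 0.185, in line with the p-value quoted in the abstract.
print(round(p_value_f_1_4(2.565), 3))
```

With only 4 denominator degrees of freedom the test has little power, so a non-significant result of this size is weak evidence for the absence of an effect.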

University of Lincoln Institutional Repository

NAS-VAD: Neural Architecture Search for Voice Activity Detection

Author: Ko Jong Hwan
Park Jinhyeok
Rho Daniel
Publication venue: 'International Speech Communication Association'
Publication date: 29/03/2022

Various neural network-based approaches have been proposed for more robust and accurate voice activity detection (VAD). Manual design of such neural architectures is an error-prone and time-consuming process, which prompted the development of neural architecture search (NAS), which automatically designs and optimizes network architectures. While NAS has been successfully applied to improve performance in a variety of tasks, it has not yet been exploited in the VAD domain. In this paper, we present the first work that utilizes NAS approaches on the VAD task. To effectively search architectures for the VAD task, we propose a modified macro structure and a new search space with a much broader range of operations that includes attention operations. The results show that the network structures found by the proposed NAS framework outperform previous manually designed state-of-the-art VAD models in various noise-added and real-world-recorded datasets. We also show that the architectures searched on a particular dataset achieve improved generalization performance on unseen audio datasets. Our code and models are available at https://github.com/daniel03c1/NAS_VAD. Comment: Submitted to Interspeech 202

arXiv.org e-Print Archive

Cueing in a perceptual task causes long-lasting interference that generalizes across context to affect only late perceptual learning and is remediated by the passage of time

Author: Ramesh Rajan
Publication date: 10/10/2008

Perceptual learning, the improvement in sensory discriminations with practice, is also subject to stimulus-specific interference from temporal jitter in a learning session or manipulations applied between or immediately after sessions. We demonstrate a novel form of perceptual interference where even a brief cueing exposure to a complex speech-in-noise task produces a forward interference on subsequent speech-in-noise learning. This potent interference generalizes across cueing context but specifically affects only late learning in the subsequent task, is resistant to the remediating effects of sleep and persists across an overnight delay involving sleep, and can be evoked by a single exposure 1 day before the learning. Learning in the speech-in-noise task is due to generalized improvements in discriminating and extracting signals (speech) from noise, and we hypothesize that the forward interference represents interference with improvements in access to higher-level representations in rapid perception of ecologically familiar complex signals such as speech from background noise.

Nature Precedings

A Survey of Bandwidth Optimization Techniques and Patterns in VoIP Services and Applications

Author: Agbanusi Nneka Chikazo
Daniel Uchenna Peter
Danjuma Kwetishe Joro
Publication date: 01/03/2014

This article surveys the various techniques adopted for optimising bandwidth for VoIP services over the period 1999-2014. One way to improve bandwidth use is silence suppression, which represses the silent portions (packets) of a voice conversation using a Voice Activity Detection algorithm; by so doing, the transmission rate during the inactive periods of speech is reduced, and thus the mean transmission rate can be reduced. A second measure is packet header reduction, a process of multiplexing and de-multiplexing packet headers to curb excesses. Voice/Packet Header compression is considered the most productive of all the techniques, offering a scheme where VoIP packet headers are compressed from 40 bytes to as little as 2 bytes. When combined with aggregation, compression potentially yields a compressed size as small as 1 byte. In either case, bandwidth savings are achieved using compression and decompression codecs of varying data and bit rates. It is envisaged that an improvement in the performance of codecs would yield better results in Voice over broadband networks. Comment: 8 pages, 7 figures. ISSN (Print): 1694-0814 | ISSN (Online): 1694-078
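The effect of header compression and silence suppression on per-call bandwidth can be illustrated with simple arithmetic. The sketch below is not taken from the survey: the 20-byte payload, the 50 packets per second, and the 50% speech activity factor are illustrative assumptions, link-layer overhead is ignored, and the function name is hypothetical. Only the 40-byte and 2-byte header sizes come from the abstract.

```python
def voip_bandwidth_kbps(payload_bytes: int,
                        header_bytes: int,
                        packets_per_second: int,
                        activity: float = 1.0) -> float:
    """Mean one-way bandwidth of a voice stream in kbit/s.

    activity is the fraction of time packets are actually sent; values
    below 1.0 model silence suppression driven by a VAD (an assumption
    made for illustration, not a figure from the survey).
    """
    bits_per_packet = (payload_bytes + header_bytes) * 8
    return bits_per_packet * packets_per_second * activity / 1000.0

# Assumed stream: 20-byte payload every 20 ms (50 packets per second).
uncompressed = voip_bandwidth_kbps(20, 40, 50)              # 40-byte header
compressed = voip_bandwidth_kbps(20, 2, 50)                 # 2-byte header
with_vad = voip_bandwidth_kbps(20, 2, 50, activity=0.5)     # plus silence suppression

print(uncompressed, compressed, with_vad)
```

Under these assumptions the header dominates the uncompressed packet (40 of 60 bytes), which is why shrinking it has a larger effect than any plausible change to the payload.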

arXiv.org e-Print Archive

UCL Discovery

Survey of Noise Estimation Algorithms for Speech Enhancement Using Spectral Subtraction

Author: Miss. Anuja Chougule
Dr. Mrs. V. V. Patil
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 31/12/2014

Speech enhancement means speech improvement, and it is performed using various techniques and algorithms. Over the past several years, attention has focused on the problem of enhancing speech degraded by additive background noise. Many applications require background suppression. The spectral-subtractive algorithm is one of the first algorithms proposed for additive background noise, and it has gone through many modifications with time. For the spectral subtraction method, noise estimation is important, and various noise estimation algorithms exist for it. All these noise estimation algorithms are important for removing background noise.
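The dependence of spectral subtraction on a noise estimate can be sketched in a few lines. This is a minimal illustration, not one of the surveyed algorithms: it assumes the magnitude spectra have already been computed, estimates noise by averaging frames assumed to be speech-free, and uses an over-subtraction factor and spectral floor whose default values are arbitrary. All function and parameter names are hypothetical.

```python
from typing import List, Sequence

def estimate_noise(noise_frames: Sequence[Sequence[float]]) -> List[float]:
    """Average the magnitude spectra of frames assumed to contain only noise."""
    count = len(noise_frames)
    return [sum(bin_values) / count for bin_values in zip(*noise_frames)]

def spectral_subtract(frame: Sequence[float],
                      noise: Sequence[float],
                      alpha: float = 1.0,
                      floor: float = 0.02) -> List[float]:
    """Subtract the scaled noise estimate from each magnitude bin.

    Bins that would fall below floor * original magnitude are clamped
    to that floor, which avoids negative magnitudes.
    """
    return [max(m - alpha * n, floor * m) for m, n in zip(frame, noise)]

# Two noise-only frames with two frequency bins each (made-up numbers).
noise_estimate = estimate_noise([[1.0, 2.0], [3.0, 2.0]])
enhanced = spectral_subtract([5.0, 1.0], noise_estimate)
print(noise_estimate, enhanced)
```

A complete system would pair this with a short-time Fourier transform, reuse the noisy phase for resynthesis, and update the noise estimate over time, which is where the choice of noise estimation algorithm matters.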

International Journal on Recent and Innovation Trends in Computing and Communication

Features for voice activity detection: a comparative analysis

Author: Gerhard Schmidt
Markus Buck
Simon Graf
Tobias Herbig
Publication venue: Springer Nature
Publication date: 01/01/2015

Springer - Publisher Connector

The role of HG in the analysis of temporal iteration and interaural correlation

Author: Barrett DJK
Hall DA
Publication date: 01/01/2004

Nottingham Trent Institutional Repository (IRep)

Auditory-Motor Adaptation to Frequency-Altered Auditory Feedback Occurs When Participants Ignore Feedback

Author: Hawco Colin
Jones Jeffery A.
Keough Dwayne Nicholas
Publication venue: Scholars Commons @ Laurier
Publication date: 01/03/2013

Background: Auditory feedback is important for accurate control of voice fundamental frequency (F0). The purpose of this study was to address whether task instructions could influence the compensatory responding and sensorimotor adaptation that has been previously found when participants are presented with a series of frequency-altered feedback (FAF) trials. Trained singers and musically untrained participants (nonsingers) were informed that their auditory feedback would be manipulated in pitch while they sang the target vowel /ɑ/. Participants were instructed to either ‘compensate’ for or ‘ignore’ the changes in auditory feedback. Whole-utterance auditory feedback manipulations were either gradually presented (‘ramp’) in -2 cent increments down to -100 cents (1 semitone) or were suddenly (‘constant’) shifted down by 1 semitone. Results: Singers and nonsingers could not suppress their compensatory responses to FAF, nor could they reduce the sensorimotor adaptation observed during both the ramp and constant FAF trials. Conclusions: Compared to previous research, these data suggest that musical training is effective in suppressing compensatory responses only when FAF occurs after vocal onset (500-2500 ms). Moreover, our data suggest that compensation and adaptation are automatic and are influenced little by conscious control.
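The size of these pitch shifts is easier to picture in hertz. The sketch below uses the standard definition of the cent (1200 cents per octave) to convert a shift in cents to a frequency ratio and to list the ramp steps described above. The 220 Hz baseline F0 is a hypothetical value chosen for illustration, not one reported in the study, and the function name is an assumption.

```python
def cents_to_ratio(cents: float) -> float:
    """Frequency ratio for a pitch shift in cents (1200 cents = 1 octave)."""
    return 2.0 ** (cents / 1200.0)

# Ramp condition: -2 cent steps down to -100 cents (1 semitone).
ramp_cents = list(range(-2, -101, -2))

baseline_f0_hz = 220.0  # hypothetical sung pitch, for illustration only
shifted_f0_hz = baseline_f0_hz * cents_to_ratio(-100)

print(len(ramp_cents), ramp_cents[-1], round(shifted_f0_hz, 2))
```

A full semitone shift lowers the feedback frequency by roughly 5.6%, while each 2-cent ramp step changes it by only about 0.1%.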

Wilfrid Laurier University