Change blindness: eradication of gestalt strategies
Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task in which there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al, 2003 Vision Research 43 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this, we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) objects may be stored in, and retrieved from, a pre-attentional store during this task, which gives further weight to that argument.
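As a quick arithmetic check on the reported statistic, the sketch below recomputes the p-value for F(1,4)=2.565. It relies on the fact that an F statistic with one numerator degree of freedom is the square of a t statistic, and on the closed-form CDF of Student's t with 4 degrees of freedom. The function name `p_value_f_1_4` is hypothetical and is not taken from the abstract; the result should land close to the reported p=0.185.

```python
import math


def p_value_f_1_4(f_stat: float) -> float:
    """Upper-tail p-value for an F statistic with (1, 4) degrees of freedom.

    With one numerator degree of freedom, F = t**2, so the p-value equals the
    two-sided tail probability of Student's t with 4 degrees of freedom.
    For 4 degrees of freedom the t CDF has the closed form used below.
    """
    t = math.sqrt(f_stat)
    s = t * t + 4.0
    cdf = 0.5 + t / (2.0 * math.sqrt(s)) * (1.0 + 2.0 / s)
    return 2.0 * (1.0 - cdf)


# Statistic reported in the abstract: F(1,4) = 2.565
p = p_value_f_1_4(2.565)
print(f"p = {p:.4f}")
```

The closed form only applies to this specific pair of degrees of freedom; a general F test would need a statistics library.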
NAS-VAD: Neural Architecture Search for Voice Activity Detection
Various neural network-based approaches have been proposed for more robust
and accurate voice activity detection (VAD). Manual design of such neural
architectures is an error-prone and time-consuming process, which prompted the
development of neural architecture search (NAS), which automatically designs and
optimizes network architectures. While NAS has been successfully applied to
improve performance in a variety of tasks, it has not yet been exploited in the
VAD domain. In this paper, we present the first work that utilizes NAS
approaches on the VAD task. To effectively search architectures for the VAD
task, we propose a modified macro structure and a new search space with a much
broader range of operations that includes attention operations. The results
show that the network structures found by the proposed NAS framework outperform
previous manually designed state-of-the-art VAD models in various noise-added
and real-world-recorded datasets. We also show that the architectures searched
on a particular dataset achieve improved generalization performance on unseen
audio datasets. Our code and models are available at
https://github.com/daniel03c1/NAS_VAD.
Comment: Submitted to Interspeech 202
Cueing in a perceptual task causes long-lasting interference that generalizes across context to affect only late perceptual learning and is remediated by the passage of time
Perceptual learning, the improvement in sensory discriminations with practice, is also subject to stimulus-specific interference from temporal jitter in a learning session or from manipulations applied between or immediately after sessions. We demonstrate a novel form of perceptual interference in which even a brief cueing exposure to a complex speech-in-noise task produces forward interference on subsequent speech-in-noise learning. This potent interference generalizes across cueing context but specifically affects only late learning in the subsequent task, is resistant to the remediating effects of sleep, persisting across an overnight delay, and can be evoked by a single exposure 1 day before the learning. Learning in the speech-in-noise task is due to generalized improvements in discriminating and extracting signals (speech) from noise, and we hypothesize that the forward interference represents interference with improvements in access to higher-level representations in rapid perception of ecologically familiar complex signals such as speech in background noise.
A Survey of Bandwidth Optimization Techniques and Patterns in VoIP Services and Applications
This article surveys the techniques adopted for optimising bandwidth for VoIP
services over the period 1999-2014. One measure is silence suppression, which
represses the silent portions (packets) of a voice conversation using a Voice
Activity Detection algorithm; by so doing, the transmission rate during the
inactive periods of speech is reduced, and thus the mean transmission rate can
be reduced. A second measure is packet header reduction, which defines a
process of multiplexing and de-multiplexing packet headers to curb excesses.
Voice/Packet Header compression is considered the most productive of all the
techniques, offering a scheme where VoIP packets are compressed from 40 bytes
to as few as 2 bytes. When combined with aggregation, compression potentially
yields a compressed size as small as 1 byte. In either case, bandwidth savings
are achieved using compression and decompression codecs of varying data and
bit rates. It is envisaged that improved codec performance would further
enhance results in Voice over broadband networks.
Comment: 8 pages, 7 figures. ISSN (Print): 1694-0814 | ISSN (Online):
1694-078
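To make the header-compression figures above concrete, here is a minimal sketch of the per-call bandwidth arithmetic. It assumes the 40 bytes and 2 bytes refer to per-packet header overhead. The 20-byte payload and the rate of 50 packets per second are illustrative assumptions, not values from the survey, and the function name `call_bandwidth_bps` is hypothetical.

```python
def call_bandwidth_bps(header_bytes: int, payload_bytes: int,
                       packets_per_second: int) -> int:
    """Bits per second for one direction of a call, ignoring link-layer framing."""
    return (header_bytes + payload_bytes) * 8 * packets_per_second


# Illustrative assumptions (not from the survey): 20-byte voice payload,
# one packet every 20 ms.
PAYLOAD_BYTES = 20
PACKETS_PER_SECOND = 50

uncompressed = call_bandwidth_bps(40, PAYLOAD_BYTES, PACKETS_PER_SECOND)
compressed = call_bandwidth_bps(2, PAYLOAD_BYTES, PACKETS_PER_SECOND)
saving = 1 - compressed / uncompressed

print(f"uncompressed: {uncompressed} bit/s")
print(f"compressed:   {compressed} bit/s")
print(f"saving:       {saving:.1%}")
```

Under these assumptions the header dominates the packet, which is why shrinking it has a larger effect than it would with bigger payloads.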
Survey of Noise Estimation Algorithms for Speech Enhancement Using Spectral Subtraction
Speech enhancement aims to improve the quality of speech, and it can be performed using a variety of techniques and algorithms. Over the past several years, attention has focused on the problem of enhancing speech degraded by additive background noise, since many applications require background suppression. The spectral subtraction algorithm was one of the first algorithms proposed for additive background noise, and it has gone through many modifications over time. Spectral subtraction depends on noise estimation, for which various algorithms exist. All of these noise estimation algorithms are important for removing background noise.
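The basic idea of spectral subtraction can be sketched as follows. This is a minimal illustration that works on magnitude spectra that have already been computed, not one of the surveyed algorithms. It assumes the first few frames are speech-free and uses their average as the noise estimate. The names `estimate_noise` and `spectral_subtract`, and the parameters `alpha` (over-subtraction factor) and `beta` (spectral floor), are hypothetical choices for this sketch.

```python
from typing import List

Spectrum = List[float]  # magnitude per frequency bin for one frame


def estimate_noise(frames: List[Spectrum], n_noise_frames: int) -> Spectrum:
    """Average the first n_noise_frames magnitude spectra (assumed speech-free)."""
    n_bins = len(frames[0])
    return [
        sum(frame[k] for frame in frames[:n_noise_frames]) / n_noise_frames
        for k in range(n_bins)
    ]


def spectral_subtract(frames: List[Spectrum], noise: Spectrum,
                      alpha: float = 1.0, beta: float = 0.01) -> List[Spectrum]:
    """Subtract the scaled noise estimate from each bin.

    Each result is clamped to a small floor (beta * noise) so that no
    magnitude becomes negative.
    """
    return [
        [max(mag - alpha * n, beta * n) for mag, n in zip(frame, noise)]
        for frame in frames
    ]


# Toy input: two noise-only frames, then one frame with energy in bin 1.
frames = [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 5.0, 1.0]]
noise = estimate_noise(frames, n_noise_frames=2)
cleaned = spectral_subtract(frames, noise)
print(cleaned[2])
```

In a full pipeline the cleaned magnitudes would be recombined with the noisy phase and inverted back to a waveform. The noise estimate would typically be updated over time rather than fixed from the opening frames.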
The role of HG in the analysis of temporal iteration and interaural correlation
Auditory-Motor Adaptation to Frequency-Altered Auditory Feedback Occurs When Participants Ignore Feedback
Background
Auditory feedback is important for accurate control of voice fundamental frequency (F0). The purpose of this study was to address whether task instructions could influence the compensatory responding and sensorimotor adaptation that has been previously found when participants are presented with a series of frequency-altered feedback (FAF) trials. Trained singers and musically untrained participants (nonsingers) were informed that their auditory feedback would be manipulated in pitch while they sang the target vowel /ɑ/. Participants were instructed to either ‘compensate’ for, or ‘ignore’ the changes in auditory feedback. Whole utterance auditory feedback manipulations were either gradually presented (‘ramp’) in -2 cent increments down to -100 cents (1 semitone) or were suddenly (‘constant’) shifted down by 1 semitone.
Results
Results indicated that singers and nonsingers could not suppress their compensatory responses to FAF, nor could they reduce the sensorimotor adaptation observed during both the ramp and constant FAF trials.
Conclusions
Compared to previous research, these data suggest that musical training is effective in suppressing compensatory responses only when FAF occurs after vocal onset (500-2500 ms). Moreover, our data suggest that compensation and adaptation are automatic and are influenced little by conscious control.
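For readers unfamiliar with cents, the sketch below converts the feedback shifts described above into frequency ratios, using the standard definition of 1200 cents per octave. The 220 Hz starting F0 is an illustrative assumption, as is treating the ramp as one -2 cent step per entry. The abstract does not state how the steps map onto trials.

```python
def cents_to_ratio(cents: float) -> float:
    """Frequency ratio for a pitch shift in cents (1200 cents = 1 octave)."""
    return 2.0 ** (cents / 1200.0)


# 'Ramp' condition: -2 cent increments down to -100 cents (1 semitone).
ramp_cents = [-2 * step for step in range(1, 51)]  # -2, -4, ..., -100

F0_HZ = 220.0  # illustrative assumption, not a value from the study
ramp_hz = [F0_HZ * cents_to_ratio(c) for c in ramp_cents]

# 'Constant' condition: a sudden shift down by 1 semitone (-100 cents).
constant_hz = F0_HZ * cents_to_ratio(-100)

print(f"first ramp step: {ramp_hz[0]:.2f} Hz")
print(f"last ramp step:  {ramp_hz[-1]:.2f} Hz")
print(f"constant shift:  {constant_hz:.2f} Hz")
```

Both conditions end at the same shifted frequency. They differ only in whether the shift is reached gradually or all at once.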