7,203 research outputs found

    Frame Theory for Signal Processing in Psychoacoustics

    This review chapter aims to strengthen the link between frame theory and signal processing tasks in psychoacoustics. On the one hand, the basic concepts of frame theory are presented and some proofs are provided to explain those concepts in some detail. The goal is to reveal to hearing scientists how this mathematical theory could be relevant for their research. In particular, we focus on frame theory in a filter bank approach, which is probably the most relevant viewpoint for audio signal processing. On the other hand, basic psychoacoustic concepts are presented to stimulate mathematicians to apply their knowledge in this field.

    Practical Hidden Voice Attacks against Speech and Speaker Recognition Systems

    Voice Processing Systems (VPSes), now widely deployed, have been made significantly more accurate through the application of recent advances in machine learning. However, adversarial machine learning has similarly advanced and has been used to demonstrate that VPSes are vulnerable to the injection of hidden commands - audio obscured by noise that is correctly recognized by a VPS but not by human beings. Such attacks, though, are often highly dependent on white-box knowledge of a specific machine learning model and limited to specific microphones and speakers, making their use across different acoustic hardware platforms (and thus their practicality) limited. In this paper, we break these dependencies and make hidden command attacks more practical through model-agnostic (black-box) attacks, which exploit knowledge of the signal processing algorithms commonly used by VPSes to generate the data fed into machine learning systems. Specifically, we exploit the fact that multiple source audio samples have similar feature vectors when transformed by acoustic feature extraction algorithms (e.g., FFTs). We develop four classes of perturbations that create unintelligible audio and test them against 12 machine learning models, including 7 proprietary models (e.g., Google Speech API, Bing Speech API, IBM Speech API, Azure Speaker API, etc.), and demonstrate successful attacks against all targets. Moreover, we successfully use our maliciously generated audio samples in multiple hardware configurations, demonstrating effectiveness across both models and real systems. In so doing, we demonstrate that domain-specific knowledge of audio signal processing represents a practical means of generating successful hidden voice command attacks.

    Investigating Perceptual Congruence Between Data and Display Dimensions in Sonification

    The relationships between sounds and their perceived meaning and connotations are complex, making auditory perception an important factor to consider when designing sonification systems. Listeners often have a mental model of how a data variable should sound during sonification, and this model is not considered in most data:sound mappings. This can lead to mappings that are difficult to use and can cause confusion. To investigate this issue, we conducted a magnitude estimation experiment to map how roughness, noise and pitch relate to the perceived magnitude of stress, error and danger. These parameters were chosen due to previous findings that suggest perceptual congruency between these auditory sensations and conceptual variables. Results from this experiment show that polarity and scaling preference are dependent on the data:sound mapping. This work provides polarity and scaling values that may be directly utilised by sonification designers to improve auditory displays in areas such as accessible and mobile computing, process-monitoring and biofeedback.

    Paranoid Schizophrenia Negative Symptoms Features in Case of Presence of Musical Ear

    In our work, we propose a candidate prognostic criterion that, at the onset of the disease, can provide sufficient evidence to predict the form and severity of negative symptoms in schizophrenia. Aim. To investigate the influence of the presence of a musical ear on the severity of deficit symptoms in paranoid schizophrenia. The study was conducted at the third clinical department of the Lviv Regional Clinical Psychiatric Hospital during 2015. Forty patients with the paranoid form of schizophrenia, aged 18 to 35, were examined: group I comprised 20 patients with a developed musical ear (average age 28.60±1.01 years) and group II comprised 20 patients with no musical ear (average age 27.30±1.15 years). The main methods used to study the groups were clinical-psychopathological, pathopsychological, and statistical. Negative symptoms were assessed using the Positive and Negative Syndrome Scale (PANSS), specifically its PANSS-NS subscale. The significance of differences between the means of independent groups was assessed using the Mann-Whitney test, and the relative parameters of the distribution structure were compared using the chi-square test. Analysis of the results shows that in patients with a developed musical ear, the level of deficit symptoms on the PANSS-NS subscale is 2.2 times lower (p < 0.01) than in patients without a developed musical ear: 2.04±0.14 against 4.46±0.17 points, respectively.
    A comparison of the key PANSS-NS indicators in patients with paranoid schizophrenia and a developed musical ear highlighted the manifestations of "Difficulty in abstract thinking" (N5 – 2.35±0.15 points), "Lack of spontaneity and flow of conversation" (N6 – 2.30±0.15 points) and "Stereotyped thinking" (N7 – 2.20±0.16 points). In patients with a musical ear, all of these negative symptoms had significantly lower scores, ranging from absent (1 point) to mild (3 points). A rating of absent (1 point) was most common for N4 "Passive/apathetic social withdrawal" (35.00±10.67 % of patients), minimal severity (2 points) for N1 "Blunted affect" (75.00±9.68 % of patients; p < 0.05 compared with the proportions scoring 1 and 3 points), and mild severity (3 points) for N5 (45.00±11.12 % of patients; p < 0.05 compared with the proportion scoring 1 point). Among patients with paranoid schizophrenia without a musical ear, the largest proportion (70.00±10.25 %, p < 0.05 compared with the proportion scoring 6 points) had a high severity (5 points) of rigidity and stereotyped thinking (N7). The data obtained demonstrate the influence of the presence of a musical ear on the deficit syndrome, as well as on the forms and severity of negative symptoms in paranoid schizophrenia.

    MLP: a MATLAB toolbox for rapid and reliable auditory threshold estimation

    In this paper, we present MLP, a MATLAB toolbox for auditory threshold estimation via the adaptive maximum likelihood procedure proposed by David Green (1990, 1993). This adaptive procedure is particularly appealing for psychologists who need to estimate thresholds with a good degree of accuracy and in a short time. Together with a description of the toolbox, the current text provides an introduction to threshold estimation theory and a theoretical explanation of the maximum likelihood adaptive procedure. MLP comes with a graphical interface and several built-in, classic psychoacoustics experiments that are ready to use at a mouse click.

    Urgency is a Non-monotonic Function of Pulse Rate

    Magnitude estimation was used to assess the experience of urgency in pulse-train stimuli (pulsed white noise) ranging from 3.13 to 200 Hz. At low pulse rates, pulses were easily resolved. At high pulse rates, pulses fused together, leading to a tonal sensation with a clear pitch level. Urgency ratings followed a nonmonotonic (polynomial) function with local maxima at 17.68 and 200 Hz. The same stimuli were also used in response time and pitch scaling experiments. Response times were negatively correlated with urgency ratings. Pitch scaling results indicated that urgency of pulse trains is mediated by the perceptual constructs of speed and pitch.

    Virtual Reality and Sound Localization

    Psychoacoustics is the scientific study of sound perception. Within this field, Virtual Reality is a technique that uses two synthesis speakers to simulate a sine tone coming from anywhere in open space. Using this method, it is possible to independently control specific binaural cues in a free-field environment. This study analyzes listener responses to these controlled sine tones to investigate the relative importance of certain binaural cues at different frequencies.

    The Effect of Interchannel Time Difference on Localisation in Vertical Stereophony

    Listening tests were conducted in order to analyse the localisation of band-limited stimuli in vertical stereophony. The test stimuli were seven octave bands of pink noise, with centre frequencies ranging from 125 to 8000 Hz, as well as broadband pink noise. Stimuli were presented from vertically arranged loudspeakers either monophonically or as vertical phantom images, created with the upper loudspeaker delayed with respect to the lower by 0, 0.5, 1, 5 and 10 ms (i.e. interchannel time difference). The experimental data obtained showed that localisation under the aforementioned conditions is generally governed by the so-called “pitch-height” effect, with the high frequency stimuli generally being localised significantly higher than the low frequency stimuli for all conditions. The effect of interchannel time difference on localisation judgments was found to be significant for both the 1000–4000 Hz octave bands and the broadband pink noise; it is suggested that this was related to the effects of comb filtering. Additionally, no evidence could be found to support the existence of the precedence effect in vertical stereophony.

    Product Sound Design: An Inter-Disciplinary Approach?

    The practice of product sound design is relatively new within the field of product development. Consequently, the responsibilities and the role of a (sound) designer are not very clear. However, practice shows that various disciplines such as design engineering, acoustics, psychoacoustics, psychology, and musicology contribute to the improvement of product sounds. We propose that sound design should be conducted by experts who have knowledge in the aforementioned fields. In other words, we suggest that product sound design should be an independent field that encompasses an inter-disciplinary approach. Keywords: sound design; sound designer; product sounds; design processes; multi-disciplinary; inter-disciplinary