194 research outputs found

    Proceedings of the 7th Sound and Music Computing Conference

    Get PDF
    Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010

    Single channel overlapped-speech detection and separation of spontaneous conversations

    Get PDF
    PhD ThesisIn the thesis, spontaneous conversation containing both speech mixture and speech dialogue is considered. The speech mixture refers to speakers speaking simultaneously (i.e. the overlapped-speech). The speech dialogue refers to only one speaker is actively speaking and the other is silent. That Input conversation is firstly processed by the overlapped-speech detection. Two output signals are then segregated into dialogue and mixture formats. The dialogue is processed by speaker diarization. Its outputs are the individual speech of each speaker. The mixture is processed by speech separation. Its outputs are independent separated speech signals of the speaker. When the separation input contains only the mixture, blind speech separation approach is used. When the separation is assisted by the outputs of the speaker diarization, it is informed speech separation. The research presents novel: overlapped-speech detection algorithm, and two speech separation algorithms. The proposed overlapped-speech detection is an algorithm to estimate the switching instants of the input. Optimization loop is adapted to adopt the best capsulated audio features and to avoid the worst. The optimization depends on principles of the pattern recognition, and k-means clustering. For of 300 simulated conversations, averages of: False-Alarm Error is 1.9%, Missed-Speech Error is 0.4%, and Overlap-Speaker Error is 1%. Approximately, these errors equal the errors of best recent reliable speaker diarization corpuses. The proposed blind speech separation algorithm consists of four sequential techniques: filter-bank analysis, Non-negative Matrix Factorization (NMF), speaker clustering and filter-bank synthesis. Instead of the required speaker segmentation, effective standard framing is contributed. Average obtained objective tests (SAR, SDR and SIR) of 51 simulated conversations are: 5.06dB, 4.87dB and 12.47dB respectively. For the proposed informed speech separation algorithm, outputs of the speaker diarization are a generated-database. The database associated the speech separation by creating virtual targeted-speech and mixture. The contributed virtual signals are trained to facilitate the separation by homogenising them with the NMF-matrix elements of the real mixture. Contributed masking optimized the resulting speech. Average obtained SAR, SDR and SIR of 341 simulated conversations are 9.55dB, 1.12dB, and 2.97dB respectively. Per the objective tests of the two speech separation algorithms, they are in the mid-range of the well-known NMF-based audio and speech separation methods

    Change blindness: eradication of gestalt strategies

    Get PDF
    Arrays of eight, texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task where there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al, 2003 Vision Research 43149–164]. Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial position of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference seen in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) it gives further weight to the argument that objects may be stored and retrieved from a pre-attentional store during this task

    Exploring Animal Behavior Through Sound: Volume 1

    Get PDF
    This open-access book empowers its readers to explore the acoustic world of animals. By listening to the sounds of nature, we can study animal behavior, distribution, and demographics; their habitat characteristics and needs; and the effects of noise. Sound recording is an efficient and affordable tool, independent of daylight and weather; and recorders may be left in place for many months at a time, continuously collecting data on animals and their environment. This book builds the skills and knowledge necessary to collect and interpret acoustic data from terrestrial and marine environments. Beginning with a history of sound recording, the chapters provide an overview of off-the-shelf recording equipment and analysis tools (including automated signal detectors and statistical methods); audiometric methods; acoustic terminology, quantities, and units; sound propagation in air and under water; soundscapes of terrestrial and marine habitats; animal acoustic and vibrational communication; echolocation; and the effects of noise. This book will be useful to students and researchers of animal ecology who wish to add acoustics to their toolbox, as well as to environmental managers in industry and government

    Magnetic Tape Recording for the Eighties

    Get PDF
    The practical and theoretical aspects of state-of-the-art magnetic tape recording technology are reviewed. Topics covered include the following: (1) analog and digital magnetic tape recording, (2) tape and head wear, (3) wear testing, (4) magnetic tape certification, (5) care, handling, and management of magnetic tape, (6) cleaning, packing, and winding of magnetic tape, (7) tape reels, bands, and packaging, (8) coding techniques for high-density digital recording, and (9) tradeoffs of coding techniques

    On the Recognition of Emotion from Physiological Data

    Get PDF
    This work encompasses several objectives, but is primarily concerned with an experiment where 33 participants were shown 32 slides in order to create ‗weakly induced emotions‘. Recordings of the participants‘ physiological state were taken as well as a self report of their emotional state. We then used an assortment of classifiers to predict emotional state from the recorded physiological signals, a process known as Physiological Pattern Recognition (PPR). We investigated techniques for recording, processing and extracting features from six different physiological signals: Electrocardiogram (ECG), Blood Volume Pulse (BVP), Galvanic Skin Response (GSR), Electromyography (EMG), for the corrugator muscle, skin temperature for the finger and respiratory rate. Improvements to the state of PPR emotion detection were made by allowing for 9 different weakly induced emotional states to be detected at nearly 65% accuracy. This is an improvement in the number of states readily detectable. The work presents many investigations into numerical feature extraction from physiological signals and has a chapter dedicated to collating and trialing facial electromyography techniques. There is also a hardware device we created to collect participant self reported emotional states which showed several improvements to experimental procedure

    Exploring Animal Behavior Through Sound: Volume 1

    Get PDF
    This open-access book empowers its readers to explore the acoustic world of animals. By listening to the sounds of nature, we can study animal behavior, distribution, and demographics; their habitat characteristics and needs; and the effects of noise. Sound recording is an efficient and affordable tool, independent of daylight and weather; and recorders may be left in place for many months at a time, continuously collecting data on animals and their environment. This book builds the skills and knowledge necessary to collect and interpret acoustic data from terrestrial and marine environments. Beginning with a history of sound recording, the chapters provide an overview of off-the-shelf recording equipment and analysis tools (including automated signal detectors and statistical methods); audiometric methods; acoustic terminology, quantities, and units; sound propagation in air and under water; soundscapes of terrestrial and marine habitats; animal acoustic and vibrational communication; echolocation; and the effects of noise. This book will be useful to students and researchers of animal ecology who wish to add acoustics to their toolbox, as well as to environmental managers in industry and government
    • …
    corecore