26,653 research outputs found

    Gatecrashing the Visual Cocktail Party: How Visual and Semantic Similarity Modulate the Own Name Benefit in the Attentional Blink

    The "visual cocktail party effect" refers to superior report of a participant's own name, under conditions of inattention. An early selection account suggests this advantage stems from enhanced visual processing (Treisman, 1960; Shapiro, Caldwell & Sorensen, 1997). A late selection account suggests the advantage occurs when semantic information allowing identification as ones own name is retrieved (Deutsch & Deutsch 1963; Mack & Rock 1998). In the context of Inattentional Blindness (IB) the advantage does not generalise to a minor modification of a participants own name, despite extensive visual similarity, supporting the late selection account (Mack & Rock 1998). The current study applied the name modification manipulation in the context of the Attentional Blink (AB). Participants were presented with rapid streams of names, and identifed a white target name, whilst also reporting the presence of one of two possible probes. The probe names appeared either close (the third item following the target: lag 3), or far in time from the target (the eight item following the target: lag 8). The results revealed a robust AB; reports of the probe were reduced at lag 3 relative to lag 8. The AB was also greatly reduced for the own name compared to another name; a visual cocktail party effect. In contrast to the findings of Mack and Rock for IB the reduced AB extended to the modified own name. The results suggest different loci for the visual cocktail party effect in the AB (word recognition) compared to IB (semantic processing)

    Cortical transformation of spatial processing for solving the cocktail party problem: a computational model

    In multisource, "cocktail party" sound environments, human and animal auditory systems can use spatial cues to effectively separate and follow one source of sound over competing sources. While mechanisms to extract spatial cues such as interaural time differences (ITDs) are well understood in precortical areas, how such information is reused and transformed in higher cortical regions to represent segregated sound sources is not clear. We present a computational model describing a hypothesized neural network that spans spatial cue detection areas and the cortex. This network is based on recent physiological findings that cortical neurons selectively encode target stimuli in the presence of competing maskers based on source locations (Maddox et al., 2012). We demonstrate that key features of cortical responses can be generated by the model network, which exploits spatial interactions between inputs via lateral inhibition, enabling the spatial separation of target and interfering sources while allowing monitoring of a broader acoustic space when there is no competition. We present the model network along with testable experimental paradigms as a starting point for understanding the transformation and organization of spatial information from midbrain to cortex. This network is then extended to suggest engineering solutions that may be useful for hearing-assistive devices in solving the cocktail party problem.
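    As a rough illustration of the mechanism the abstract describes (and not the authors' published model), the sketch below shows how a bank of spatially tuned channels with subtractive lateral inhibition can respond broadly to a lone source yet sharpen around competing source locations. The tuning width, channel spacing, and inhibition strength are all assumed values.

```python
import numpy as np

# Spatial channels spanning the frontal azimuth range (assumed layout).
azimuths = np.linspace(-90, 90, 37)

def tuning(preferred, source, width=30.0):
    """Gaussian spatial tuning of a channel to a sound-source azimuth."""
    return np.exp(-0.5 * ((preferred - source) / width) ** 2)

def cortical_response(sources, inhibition=0.8):
    """Feed-forward drive from all sources, then subtractive lateral
    inhibition: each channel is suppressed by the mean activity of the
    other channels, which sharpens the representation around source
    locations when competing sources are present."""
    drive = sum(tuning(azimuths, s) for s in sources)
    lateral = inhibition * (drive.sum() - drive) / (len(azimuths) - 1)
    return np.clip(drive - lateral, 0.0, None)

# With a single source the network responds broadly (broad monitoring);
# with a competing masker the responses are sharpened around each source.
single = cortical_response([0.0])
competing = cortical_response([0.0, 60.0])
```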

    Vocal Processing with Spectral Analysis

    A well-known signal processing issue is the "cocktail party problem," which refers to the need to separate individual speakers from a mixture of voices. A solution to this problem could provide insight into signal separation in a variety of signal processing fields. In this study, a method of vocal signal processing was examined to determine whether principal component analysis of spectral data could be used to characterize differences between speakers and whether these differences could be used to separate mixtures of vocal signals. Processing was done on a set of voice recordings from thirty different speakers to create a projection matrix that could be used by an algorithm to identify the source of an unknown recording from one of the thirty speakers. Two different identification algorithms were tested. The first had an average correct prediction rate of 15.69%, while the second had an average correct prediction rate of 10.47%. Additionally, one principal component derived from the processing provided a notable distinction between principal values for male and female speakers: males tended to produce positive principal values, while females tended to produce negative values. The success of the algorithm could be improved by implementing differentiation between time segments of speech and segments of silence. The incorporation of this distinction into the signal processing method was recommended as a topic for future study.
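    A minimal sketch of the kind of pipeline the abstract describes, assuming average magnitude spectra as features, PCA via SVD for the projection matrix, and a nearest-centroid match; the function names and parameters are illustrative and not drawn from the study itself.

```python
import numpy as np

def spectral_features(signal, frame=1024):
    """Average magnitude spectrum over fixed-length frames of one recording."""
    n = len(signal) // frame
    frames = signal[:n * frame].reshape(n, frame)
    return np.abs(np.fft.rfft(frames, axis=1)).mean(axis=0)

def fit_projection(spectra, k=10):
    """PCA via SVD on mean-centred spectra; returns the mean and a
    projection matrix whose columns are the top-k principal directions."""
    mean = spectra.mean(axis=0)
    _, _, vt = np.linalg.svd(spectra - mean, full_matrices=False)
    return mean, vt[:k].T

def identify(unknown_spectrum, mean, proj, centroids):
    """Nearest-centroid speaker match in principal-component space;
    `centroids` maps speaker label -> mean projected feature vector."""
    z = (unknown_spectrum - mean) @ proj
    labels = list(centroids)
    dists = [np.linalg.norm(z - centroids[label]) for label in labels]
    return labels[int(np.argmin(dists))]
```

    The sign of a single projected component can then be inspected directly, which is how a male/female split like the one reported above would surface in such a representation.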

    Brain activity during shadowing of audiovisual cocktail party speech, contributions of auditory-motor integration and selective attention

    Selective listening to cocktail-party speech involves a network of auditory and inferior frontal cortical regions. However, cognitive and motor cortical regions are differentially activated depending on whether the task emphasizes semantic or phonological aspects of speech. Here we tested whether processing of cocktail-party speech differs when participants perform a shadowing (immediate speech repetition) task compared to an attentive listening task in the presence of irrelevant speech. Participants viewed audiovisual dialogues with concurrent distracting speech during functional imaging. Participants either attentively listened to the dialogue, overtly repeated (i.e., shadowed) attended speech, or performed visual or speech motor control tasks in which they did not attend to speech and responses were not related to the speech input. Dialogues were presented with good or poor auditory and visual quality. As a novel result, we show that attentive processing of speech activated the same network of sensory and frontal regions during listening and shadowing. However, in the superior temporal gyrus (STG), peak activations during shadowing were posterior to those during listening, suggesting that an anterior-posterior distinction between motor and perceptual processing of speech is present already at the level of the auditory cortex. We also found that activations along the dorsal auditory processing stream were specifically associated with the shadowing task. These activations are likely due to complex interactions between perceptual, attention-dependent speech processing and motor speech generation that matches the heard speech. Our results suggest that interactions between perceptual and motor processing of speech rely on a distributed network of temporal and motor regions rather than on any specific anatomical landmark, as some previous studies have suggested.

    Speech Signal Enhancement in Cocktail Party Scenarios by Deep Learning based Virtual Sensing of Head-Mounted Microphones

    The cocktail party effect refers to the human auditory system's ability to attend to a single conversation while filtering out all other background noise. To mimic this ability for people with hearing loss, scientists integrate beamforming algorithms into the signal processing path of hearing aids or cochlear implant audio processors. Although these algorithms' performance depends strongly on the number and spatial arrangement of the microphones, most devices are equipped with a small number of microphones mounted close to each other on the audio processor housing. We measured and evaluated the impact of the number and spatial arrangement of hearing-aid or head-mounted microphones on the performance of the established Minimum Variance Distortionless Response (MVDR) beamformer in cocktail party scenarios. The measurements revealed that the optimal microphone placement exploits monaural cues (the pinna effect), is close to the target signal, and creates a large distance spread through its spatial arrangement. However, this placement is impractical for hearing aid or implant users, as it includes microphone positions such as on the forehead. To overcome placement at impractical positions, we propose deep-learning-based virtual sensing to estimate the corresponding audio signals. The results of objective measures and a subjective listening test with 20 participants showed that the virtually sensed microphone signals significantly improved speech quality, especially in cocktail party scenarios with low signal-to-noise ratios. Subjective speech quality was assessed using a 3-alternative forced choice procedure to determine which of the presented speech mixtures was most pleasant to understand. Hearing aid and cochlear implant (CI) users might benefit from the presented approach using virtually sensed microphone signals, especially in noisy environments.
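    The beamformer named in the abstract has a standard narrowband textbook form, w = R^-1 d / (d^H R^-1 d), where R is the noise covariance matrix and d the target steering vector. The sketch below implements that generic formulation per frequency bin; it is not the authors' specific configuration, and the microphone delays and the random covariance data are placeholder assumptions.

```python
import numpy as np

def mvdr_weights(noise_cov, steering):
    """Narrowband MVDR weights w = R^-1 d / (d^H R^-1 d) for one frequency
    bin, given an M x M noise covariance matrix and an M-element steering
    vector: the target direction is passed undistorted while output power
    from other directions is minimized."""
    r_inv = np.linalg.pinv(noise_cov)   # pseudo-inverse guards against ill-conditioning
    num = r_inv @ steering
    return num / (steering.conj() @ num)

def steering_vector(mic_delays, freq):
    """Free-field steering vector from per-microphone delays in seconds."""
    return np.exp(-2j * np.pi * freq * np.asarray(mic_delays))

# Placeholder data: 4 microphones, 200 noise-only frames at one frequency
# bin (random numbers standing in for real STFT coefficients).
X_noise = np.random.randn(4, 200) + 1j * np.random.randn(4, 200)
R = X_noise @ X_noise.conj().T / X_noise.shape[1]
w = mvdr_weights(R, steering_vector([0.0, 1e-4, 2e-4, 3e-4], freq=1000.0))
# Beamformer output for a mixture frame x of shape (4,): y = w.conj() @ x
```

    Virtual sensing, as proposed in the paper, would supply estimated signals for microphone positions (such as the forehead) that cannot physically be worn, which this generic formulation would then consume like any real microphone channel.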