78 research outputs found

    Perceptual Similarities Amongst Novel, 3D Objects

    MEG, PSYCHOPHYSICAL AND COMPUTATIONAL STUDIES OF LOUDNESS, TIMBRE, AND AUDIOVISUAL INTEGRATION

    Natural scenes and ecological signals are inherently complex, and our understanding of their perception and processing is incomplete. For example, a speech signal not only contains information at various frequencies, it is also not static: the signal is concurrently modulated in time. In addition, an auditory signal may be paired with additional sensory information, as in the case of audiovisual speech. In order to make sense of the signal, a human observer must process the information provided by low-level sensory systems and integrate it across sensory modalities and with cognitive information (e.g., object identification information, phonetic information). The observer must then create functional relationships between the signals encountered to form a coherent percept. The neuronal and cognitive mechanisms underlying this integration can be quantified in several ways: by taking physiological measurements, by assessing behavioral output for a given task, and by modeling signal relationships. While ecological tokens are complex in ways that exceed our current understanding, progress can be made by utilizing synthetic signals that capture specific essential features of ecological signals. The experiments presented here cover five aspects of complex signal processing using approximations of ecological signals: (i) auditory integration of complex tones comprising different frequencies and component power levels; (ii) audiovisual integration approximating that of human speech; (iii) behavioral measurement of signal discrimination; (iv) signal classification via simple computational analyses; and (v) neuronal processing of synthesized auditory signals approximating speech tokens. To investigate neuronal processing, magnetoencephalography (MEG) is employed to assess cortical processing non-invasively. Behavioral measures are employed to evaluate observer acuity in signal discrimination and to test the limits of perceptual resolution. Computational methods are used to examine the relationships, in perceptual space and in physiological processing, between synthetic auditory signals, using features of the signals themselves as well as biologically motivated models of auditory representation. Together, the various methodologies and experimental paradigms advance our understanding of how the complex interactions within ecological signal structure are perceived and processed.
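
    As a concrete illustration of stimulus type (i), the sketch below synthesizes a complex tone from a set of component frequencies and per-component power levels. It is a minimal example of the kind of signal described, not the study's actual stimuli; the frequencies, levels, duration, and sampling rate are illustrative assumptions.

        import numpy as np

        def complex_tone(freqs_hz, levels_db, dur_s=0.5, fs=44100):
            """Sum sinusoidal components with given frequencies and
            power levels (dB relative to the strongest component)."""
            t = np.arange(int(dur_s * fs)) / fs
            amps = 10.0 ** (np.asarray(levels_db) / 20.0)
            tone = sum(a * np.sin(2 * np.pi * f * t)
                       for f, a in zip(freqs_hz, amps))
            return tone / np.max(np.abs(tone))  # normalize to avoid clipping

        # Illustrative parameters: four harmonics with decreasing power.
        tone = complex_tone([440, 880, 1320, 1760], [0, -3, -6, -9])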

    Re-Sonification of Objects, Events, and Environments

    Digital sound synthesis allows the creation of a great variety of sounds. Focusing on interesting or ecologically valid sounds for music, simulation, aesthetics, or other purposes narrows the otherwise vast digital audio palette. Tools for creating such sounds range from arbitrary methods of altering recordings to precise simulations of vibrating objects. In this work, methods of sound synthesis by re-sonification are considered. Re-sonification, herein, refers to the general process of analyzing, possibly transforming, and resynthesizing or reusing recorded sounds in meaningful ways to convey information. Applied to soundscapes, re-sonification is presented as a means of conveying activity within an environment. Applied to the sounds of objects, this work examines modeling the perception of objects as well as their physical properties, and the ability to simulate interactive events with such objects. To create soundscapes that re-sonify geographic environments, a method of automated soundscape design is presented. Using recorded sounds that are classified based on acoustic, social, semantic, and geographic information, this method produces stochastically generated soundscapes to re-sonify selected geographic areas. Drawing on prior knowledge, local sounds and those deemed similar make up a locale's soundscape. In the context of re-sonifying events, this work examines processes for modeling and estimating the excitations of sounding objects: plucking, striking, rubbing, and any interaction that imparts energy into a system and thereby shapes the resultant sound. A method of estimating a linear system's input, constrained to a signal subspace, is presented and applied toward improving the estimation of percussive excitations for re-sonification. To work toward robust recording-based modeling and re-sonification of objects, new implementations of banded waveguide (BWG) models are proposed for object modeling and sound synthesis. Previous implementations of BWGs use arbitrary model parameters and may produce a range of simulations that do not match digital waveguide or modal models of the same design. Subject to linear excitations, some of the models proposed here behave identically to other equivalently designed physical models. Under nonlinear interactions, such as bowing, many of the proposed implementations exhibit improvements in the attack characteristics of synthesized sounds.
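
    Once the system and the subspace are fixed, input estimation constrained to a signal subspace can be posed as ordinary least squares. The sketch below assumes a known FIR impulse response and a given basis for the excitation subspace; it is a generic formulation for illustration, not the dissertation's actual algorithm.

        import numpy as np
        from scipy.linalg import toeplitz

        def estimate_input(y, h, basis):
            """Least-squares estimate of a linear system's input x from
            output y = h * x, constrained to the column space of `basis`."""
            n = basis.shape[0]
            # Convolution matrix H such that H @ x == np.convolve(h, x).
            col = np.concatenate([h, np.zeros(n - 1)])
            row = np.zeros(n)
            row[0] = h[0]
            H = toeplitz(col, row)
            # Solve min_c ||H @ basis @ c - y||, then map back to input space.
            c, *_ = np.linalg.lstsq(H @ basis, y, rcond=None)
            return basis @ c

        # With, e.g., a basis of decaying exponentials for percussive strikes,
        # the constraint regularizes an otherwise ill-posed deconvolution.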

    Final Report to NSF of the Standards for Facial Animation Workshop

    The human face is an important and complex communication channel. It is a very familiar and sensitive object of human perception. The facial animation field has grown greatly in the past few years, as fast computer graphics workstations have made the modeling and real-time animation of hundreds of thousands of polygons affordable and almost commonplace. Many applications have been developed, such as teleconferencing, surgery, information assistance systems, games, and entertainment. To address these different problems, different approaches to both animation control and modeling have been developed.

    INTERACTIONS WITH LANGUAGE IN HUMAN OLFACTORY PERCEPTION

    People are notoriously bad at identifying odors by name. Why might this be? Theories range from competition for cognitive resources to poor neural connectivity to inferiority at the level of sensory transduction and perception. Here we suggest that human olfaction on its own is a measurably precise, rich, and nuanced sense, and further, that the addition of labels automatically and irresistibly changes people’s experience of odors. In the context of this thesis, we use language as a tool for understanding olfaction specifically; conversely, the study of olfaction can be used as a tool for understanding perception more generally. Difficulty naming odors can be an exploitable feature rather than a bug in the system: it means that certain aspects of perception and cognition that are entangled in other sensory systems are separable in olfaction. In this thesis, we address an important gap in olfactory understanding, specifically the way odors interact with language. In Chapter 1, we found that behavioral similarity ratings for a set of everyday odors showed high agreement across participants. Adding labels to odors caused systematic shifts in response patterns, inducing people to incorporate more conceptual and physical features of source objects into their evaluation of odors. In Chapter 2, we extended these findings by asking whether the shifts in similarity responses reflected perceptual or conceptual changes. We found a dissociation between mental representations of odors and olfactory perception: despite the reliable changes in odor experience reported by participants, we found no change in performance on an odor mixture discrimination task when labels were added to the odor stimuli. In Chapter 3, we evaluated the types of guesses people made when trying to identify odors without any visual or contextual clues. Follow-up analyses demonstrated that odor naming ability is widely distributed, even within a relatively homogeneous test population (and even after controlling for low-level olfactory perceptual ability and general verbal ability), and that some odor stimuli are reliably easier to name than others. Taken together, these results suggest there is greater depth and complexity to human olfactory experience than previously thought. Similarity ratings of odors are not only malleable under verbal context; they are separable from olfactory perception, and they reflect previously unknown dimensions of odor experience.
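
    Cross-participant agreement of the kind reported in Chapter 1 is commonly quantified by correlating the off-diagonal entries of participants' similarity matrices. The sketch below shows that generic computation; the data shapes and the choice of Spearman correlation are assumptions, not details taken from the thesis.

        import numpy as np
        from itertools import combinations
        from scipy.stats import spearmanr

        def inter_rater_agreement(sim_matrices):
            """Mean pairwise Spearman correlation between participants'
            odor-similarity matrices (upper triangles only)."""
            iu = np.triu_indices(sim_matrices[0].shape[0], k=1)
            vecs = [m[iu] for m in sim_matrices]
            rhos = [spearmanr(a, b)[0] for a, b in combinations(vecs, 2)]
            return float(np.mean(rhos))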

    Neural models of inter-cortical networks in the primate visual system for navigation, attention, path perception, and static and kinetic figure-ground perception

    Vision provides the primary means by which many animals distinguish foreground objects from their background and coordinate locomotion through complex environments. The present thesis focuses on mechanisms within the visual system that afford figure-ground segregation and self-motion perception. These processes are modeled as emergent outcomes of dynamical interactions among neural populations in several brain areas. This dissertation specifies and simulates how border-ownership signals emerge in cortex, and how the medial superior temporal area (MSTd) represents path of travel and heading in the presence of independently moving objects (IMOs). Neurons in visual cortex that signal border-ownership, the perception that a border belongs to a figure and not its background, have been identified, but the underlying mechanisms have remained unclear. A model is presented that demonstrates that inter-areal interactions across model visual areas V1-V2-V4 afford border-ownership signals similar to those reported in electrophysiology for visual displays containing figures defined by luminance contrast. Competition between model neurons with different receptive field sizes is crucial for reconciling the occlusion of one object by another. The model is extended to determine border-ownership when object borders are kinetically defined, and to detect the location and size of shapes despite the curvature of their boundary contours. Navigation in the real world requires humans to travel along curved paths. Many perceptual models have been proposed that focus on heading, which specifies the direction of travel along straight paths, but not on path curvature. In primates, MSTd has been implicated in heading perception. A model of V1, the medial temporal area (MT), and MSTd is developed herein that demonstrates how MSTd neurons can simultaneously encode path curvature and heading. Human judgments of heading are accurate in rigid environments but are biased in the presence of IMOs. The model presented here explains the bias through recurrent connectivity in MSTd and avoids the use of differential motion detectors, which, although used in existing models to discount the motion of an IMO relative to its background, are not biologically plausible. Reported modulation of the MSTd population due to attention is explained through competitive dynamics between subpopulations responding to bottom-up and top-down signals.
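
    The competitive dynamics invoked in the final sentence can be illustrated with a minimal two-population rate model coupled by mutual inhibition. The parameters and the rectified-linear form below are illustrative assumptions, not the dissertation's equations.

        import numpy as np

        def simulate_competition(drive_a, drive_b, steps=2000, dt=0.001,
                                 tau=0.02, w_inh=1.5):
            """Two rate units with mutual inhibition: the more strongly
            driven population suppresses the other."""
            r = np.zeros((steps, 2))
            for t in range(1, steps):
                net = np.array([drive_a - w_inh * r[t - 1, 1],
                                drive_b - w_inh * r[t - 1, 0]])
                r[t] = r[t - 1] + (dt / tau) * (-r[t - 1] + np.maximum(net, 0.0))
            return r

        # With drive_a > drive_b, unit 0 wins and suppresses unit 1.
        rates = simulate_competition(1.0, 0.8)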

    Neural Representations of a Real-World Environment

    The ability to represent the spatial structure of the environment is critical for successful navigation. Extensive research using animal models has revealed the existence of specialized neurons that appear to code for spatial information in their firing patterns. However, little is known about which regions of the human brain support representations of large-scale space. To address this gap in the literature, we performed three functional magnetic resonance imaging (fMRI) experiments aimed at characterizing the representations of locations, headings, landmarks, and distances in a large environment for which our subjects had extensive real-world navigation experience: their college campus. We scanned University of Pennsylvania students while they made decisions about places on campus and then tested for spatial representations using multivoxel pattern analysis and fMRI adaptation. In Chapter 2, we tested for representations of the navigator's current location and heading, information necessary for self-localization. In Chapter 3, we tested whether these location and heading representations were consistent across perception and spatial imagery. Finally, in Chapter 4, we tested for representations of landmark identity and the distances between landmarks. Across the three experiments, we observed that specific regions of medial temporal and medial parietal cortex supported long-term memory representations of navigationally relevant spatial information. These results serve to elucidate the functions of these regions and offer a framework for understanding the relationship between spatial representations in the medial temporal lobe and in high-level visual regions. We discuss our findings in the context of the broader spatial cognition literature, including implications for studies of both humans and animal models.
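
    Multivoxel pattern analysis, in its simplest form, asks whether a classifier can decode a condition (here, a campus location) from a region's voxel activity patterns at above-chance accuracy. The sketch below shows that generic recipe with scikit-learn on placeholder data; the array shapes, classifier choice, and cross-validation scheme are assumptions, not the experiments' actual analysis pipeline.

        import numpy as np
        from sklearn.svm import LinearSVC
        from sklearn.model_selection import cross_val_score

        rng = np.random.default_rng(0)
        patterns = rng.standard_normal((120, 500))   # (trials, voxels), placeholder
        locations = np.repeat(np.arange(8), 15)      # 8 campus locations x 15 trials

        # Above-chance cross-validated accuracy would indicate that the region's
        # multivoxel pattern carries information about the navigator's location.
        scores = cross_val_score(LinearSVC(dual=False), patterns, locations, cv=5)
        print(scores.mean())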

    The role of visual processing in haptic representation - Recognition tasks with novel 3D objects

    In perceiving and recognizing everyday objects we combine information from different senses (a multisensory process), yet past research has concentrated almost entirely on vision. We can also touch objects in order to acquire a whole series of information about them, and the combination of these two sensory modalities provides more complete information about the explored object. I therefore first analyzed the available literature on visual and haptic object representation and recognition separately; I then concentrated on crossmodal visuo-haptic object representation. Finally, I present and discuss the results of the three experiments I conducted during my Ph.D. studies. These results appear consistent, as previously proposed by several authors (Newell et al., 2005; Cattaneo et al., 2008; Lacey et al., 2009), with the existence of a supramodal object representation that is independent of the encoding sensory modality.

    A Parametric Sound Object Model for Sound Texture Synthesis

    This thesis deals with the analysis and synthesis of sound textures based on parametric sound objects. An overview is provided of the acoustic and perceptual principles of textural acoustic scenes, and the technical challenges for analysis and synthesis are considered. Four essential processing steps for sound texture analysis are identified, and existing sound texture systems are reviewed using the four-step model as a guideline. A theoretical framework for analysis and synthesis is proposed. A parametric sound object synthesis (PSOS) model is introduced, which is able to describe individual recorded sounds through a fixed set of parameters. The model, which applies to harmonic and noisy sounds, is an extension of spectral modeling and uses spline curves to approximate spectral envelopes, as well as the evolution of parameters over time. In contrast to standard spectral modeling techniques, this representation uses the concept of objects instead of concatenated frames, and it provides a direct mapping between sounds of different lengths. Methods for automatic and manual conversion are shown. An evaluation is presented in which the ability of the model to encode a wide range of different sounds was examined. Although there are aspects of sounds that the model cannot accurately capture, such as polyphony and certain types of fast modulation, the results indicate that high-quality synthesis can be achieved for many different acoustic phenomena, including instruments and animal vocalizations. In contrast to many other forms of sound encoding, the parametric model facilitates various techniques of machine learning and intelligent processing, including sound clustering and principal component analysis. Strengths and weaknesses of the proposed method are reviewed, and possibilities for future development are discussed.
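
    As a sketch of one named ingredient of PSOS, spline approximation of a spectral envelope, the code below fits a smoothing spline to the log-magnitude spectrum of a single analysis frame. The windowing, smoothing factor, and function names are illustrative assumptions; the actual PSOS parameterization is more elaborate.

        import numpy as np
        from scipy.interpolate import UnivariateSpline

        def envelope_spline(frame, fs, smooth=50.0):
            """Fit a smoothing spline to a frame's log-magnitude spectrum,
            giving a compact, resolution-independent envelope."""
            windowed = frame * np.hanning(len(frame))
            mag = np.abs(np.fft.rfft(windowed))
            freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
            log_mag = 20 * np.log10(mag + 1e-12)  # avoid log(0)
            return UnivariateSpline(freqs, log_mag, s=smooth * len(freqs))

        # The spline can be evaluated at any frequency, so envelopes of sounds
        # of different lengths map onto a common, fixed-size parameterization:
        # env = envelope_spline(frame, 44100); env(np.linspace(0, 22050, 64))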

    Change blindness: eradication of gestalt strategies

    Arrays of eight texture-defined rectangles were used as stimuli in a one-shot change blindness (CB) task in which there was a 50% chance that one rectangle would change orientation between two successive presentations separated by an interval. CB was eliminated by cueing the target rectangle in the first stimulus, reduced by cueing in the interval, and unaffected by cueing in the second presentation. This supports the idea that a representation was formed that persisted through the interval before being 'overwritten' by the second presentation (Landman et al., 2003, Vision Research 43, 149–164). Another possibility is that participants used some kind of grouping or Gestalt strategy. To test this we changed the spatial positions of the rectangles in the second presentation by shifting them along imaginary spokes (by ±1 degree) emanating from the central fixation point. There was no significant difference in performance between this and the standard task [F(1,4)=2.565, p=0.185]. This may suggest two things: (i) Gestalt grouping is not used as a strategy in these tasks, and (ii) further weight is given to the argument that objects may be stored in and retrieved from a pre-attentional store during this task.
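
    The spoke manipulation can be expressed as a radial shift in polar coordinates around fixation. The sketch below assumes item positions given in degrees of visual angle relative to fixation and a random ±1° sign per item; whether the sign was randomized per rectangle is an assumption, not a detail from the abstract.

        import numpy as np

        def spoke_shift(xy_deg, shift_deg=1.0, seed=None):
            """Shift each item radially along its 'spoke' from fixation
            by +/- shift_deg of visual angle."""
            rng = np.random.default_rng(seed)
            x, y = xy_deg[:, 0], xy_deg[:, 1]
            ecc = np.hypot(x, y)                 # eccentricity of each item
            theta = np.arctan2(y, x)             # spoke direction
            ecc += rng.choice([-1.0, 1.0], size=ecc.shape) * shift_deg
            return np.column_stack([ecc * np.cos(theta), ecc * np.sin(theta)])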