
    Detecting multiple, simultaneous talkers through localising speech recorded by ad-hoc microphone arrays

    This paper proposes a novel approach to detecting multiple, simultaneous talkers in multi-party meetings by localising active speech sources recorded with an ad-hoc microphone array. Cues indicating the relative distance between sources and microphones are derived from speech signals and room impulse responses recorded by each of the microphones distributed at unknown locations within a room. Multiple active sources are localised by analysing a surface formed from these cues evaluated at different locations within the room. The number of localised active sources per frame or utterance is then counted to estimate when multiple sources are active. The proposed approach does not require prior information about the number and locations of sources or microphones. Synchronisation between microphones is also not required. A meeting scenario with competing speakers is simulated, and results show that simultaneously active sources can be detected with an average accuracy of 75% and that the number of active sources is counted correctly 65% of the time.
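
    The counting step can be pictured with a small sketch: assume the relative-distance cues have already been evaluated on a grid of candidate room locations, forming a surface; active sources then correspond to prominent local maxima of that surface. The grid, threshold, and neighbourhood size below are illustrative choices, not parameters from the paper.

        import numpy as np
        from scipy.ndimage import maximum_filter

        def count_active_sources(cue_surface, rel_threshold=0.5, neighbourhood=5):
            # Treat each sufficiently prominent local maximum of the cue surface
            # as one active source (an illustrative peak-picking heuristic).
            local_max = cue_surface == maximum_filter(cue_surface, size=neighbourhood)
            strong = cue_surface >= rel_threshold * cue_surface.max()
            peaks = np.argwhere(local_max & strong)
            return len(peaks), peaks

        # Toy example: a synthetic surface with two bumps, i.e. two simultaneous talkers.
        xx, yy = np.meshgrid(np.linspace(0, 5, 100), np.linspace(0, 4, 100))
        surface = (np.exp(-((xx - 1.5)**2 + (yy - 2.0)**2))
                   + np.exp(-((xx - 3.5)**2 + (yy - 1.0)**2)))
        count, locations = count_active_sources(surface)
        print(count)  # 2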

    Listening to Distances and Hearing Shapes: Inverse Problems in Room Acoustics and Beyond

    A central theme of this thesis is using echoes to achieve useful, interesting, and sometimes surprising results. One should have no doubts about the echoes' constructive potential; it is, after all, demonstrated masterfully by Nature. Just think about the bat's intriguing ability to navigate in unknown spaces and hunt for insects by listening to echoes of its calls, or about similar (albeit less well-known) abilities of toothed whales, some birds, shrews, and ultimately people. We show that, perhaps contrary to conventional wisdom, multipath propagation resulting from echoes is our friend. When we think about it the right way, it reveals essential geometric information about the sources-channel-receivers system. The key idea is to think of echoes as being more than just delayed and attenuated peaks in 1D impulse responses; they are actually additional sources with their corresponding 3D locations. This transformation allows us to forget about the abstract room, and to replace it by more familiar point sets. We can then engage the powerful machinery of Euclidean distance geometry. A problem that always arises is that we do not know a priori the matching between the peaks and the points in space, and solving the inverse problem is achieved by echo sorting, a tool we developed for learning correct labelings of echoes. This has applications beyond acoustics, whenever one deals with waves and reflections, or more generally, time-of-flight measurements. Equipped with this perspective, we first address the "Can one hear the shape of a room?" question, and we answer it with a qualified "yes". Even a single impulse response uniquely describes a convex polyhedral room, whereas a more practical algorithm to reconstruct the room's geometry uses only first-order echoes and a few microphones. Next, we show how different problems of localization benefit from echoes. The first one is multiple indoor sound source localization. Assuming the room is known, we show that discretizing the Helmholtz equation yields a system of sparse reconstruction problems linked by the common sparsity pattern. By exploiting the full bandwidth of the sources, we show that it is possible to localize multiple unknown sound sources using only a single microphone. We then look at indoor localization with known pulses from the geometric echo perspective introduced previously. Echo sorting enables localization in non-convex rooms without a line-of-sight path, and localization with a single omni-directional sensor, which is impossible without echoes. A closely related problem is microphone position calibration; we show that echoes can help even without assuming that the room is known. Using echoes, we can localize arbitrary numbers of microphones at unknown locations in an unknown room using only one source at an unknown location (for example a finger snap) and get the room's geometry as a byproduct. Our study of source localization outgrew the initial form factor when we looked at source localization with spherical microphone arrays. Spherical signals appear well beyond spherical microphone arrays; for example, any signal defined on Earth's surface lives on a sphere. This resulted in the first slight departure from the main theme: we develop the theory and algorithms for sampling sparse signals on the sphere using finite rate-of-innovation principles and apply them to various signal processing problems on the sphere.
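
    The echo-sorting test can be sketched using the standard Euclidean distance matrix (EDM) property that an EDM of points in 3D has rank at most five: a candidate assignment of one first-order echo per microphone is scored by how close the augmented EDM is to that rank. The toy setup, variable names, and the plain rank-residual score below are illustrative simplifications, not the thesis's exact implementation.

        import numpy as np

        def edm(X):
            # Squared Euclidean distance matrix of the points stored as columns of X.
            G = X.T @ X
            g = np.diag(G)
            return g[:, None] - 2 * G + g[None, :]

        def echo_sorting_score(D_mic, echo_dists):
            # Append the candidate echo distances (one per microphone) to the
            # microphone EDM and measure how far the augmented matrix is from
            # rank 5, the maximum rank of an EDM of points in 3D. Lower is better.
            n = D_mic.shape[0]
            D_aug = np.zeros((n + 1, n + 1))
            D_aug[:n, :n] = D_mic
            D_aug[:n, n] = D_aug[n, :n] = np.asarray(echo_dists)**2
            s = np.linalg.svd(D_aug, compute_uv=False)
            return s[5:].sum()

        # Toy example: 5 microphones and one (image) source with exact distances.
        rng = np.random.default_rng(0)
        mics = rng.uniform(0, 5, size=(3, 5))                 # 3D positions as columns
        source = np.array([2.0, -1.0, 3.0])
        true_d = np.linalg.norm(mics - source[:, None], axis=0)
        wrong_d = true_d[[1, 0, 2, 3, 4]]                     # a mislabelled assignment

        D_mic = edm(mics)
        print(echo_sorting_score(D_mic, true_d))    # close to zero
        print(echo_sorting_score(D_mic, wrong_d))   # noticeably larger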

    Shapes from Echoes: Uniqueness from Point-to-Plane Distance Matrices

    We study the problem of localizing a configuration of points and planes from the collection of point-to-plane distances. This problem models simultaneous localization and mapping from acoustic echoes as well as the notable "structure from sound" approach to microphone localization with unknown sources. In our earlier work we proposed computational methods for localization from point-to-plane distances and noted that such localization suffers from various ambiguities beyond the usual rigid-body motions; in this paper we provide a complete characterization of uniqueness. We enumerate the equivalence classes of configurations which lead to the same distance measurements as a function of the number of planes and points, and we algebraically characterize the related transformations in both 2D and 3D. Here we only discuss uniqueness; computational tools and heuristics for practical localization from point-to-plane distances using sound will be addressed in a companion paper. Comment: 13 pages, 13 figures.
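
    For concreteness, the measurements in question can be written as follows (a standard formulation with signed distances; the unsigned case takes absolute values), where the points x_i are the columns of X, each plane j has unit normal n_j and offset q_j, and D collects the point-to-plane distances:

        \[
            D_{ij} = n_j^{\top} x_i + q_j,
            \qquad\text{i.e.}\qquad
            D = X^{\top} N + \mathbf{1}\, q^{\top},
            \qquad \|n_j\| = 1 .
        \]

    Uniqueness then asks when D determines X, N, and q up to the equivalence classes characterized in the paper.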

    Seeing with sound: Investigating the behavioural applications and neural correlates of human echolocation

    Some blind humans use the reflected echoes from self-produced signals to perceive their silent surroundings. Although the use of echolocation is well documented in animals such as bats and dolphins, comparatively little is known about human echolocation. The overarching goal of the work presented in this thesis was to shed light on some of the basic functions of human echolocation, including the perception of shape, size, and material. I addressed these aspects of echolocation using behavioural psychophysics and neuroimaging. In Chapter 2 I show that blind echolocators were able to accurately identify the shape of 2D objects, but that their ability to do so depended on the use of head and body movements to ‘scan’ the objects’ edges. I suggest that these scanning movements may be similar to the many saccades made by sighted individuals when visually surveying an object or scene. In Chapter 3 I addressed the possibility that object size perception via echolocation shows size constancy – a perceptual phenomenon associated with vision. The results revealed that an expert echolocator accurately perceived the true physical size of objects independent of their distance, even though changes to distance directly affect size-related echo information. The results of this study highlight the ‘visual’ nature of echolocation, and suggest more parallels between the two modalities than previously known or theorized. Chapter 4 presents the results of a functional neuroimaging study aimed at uncovering the neural correlates of material processing via echolocation. By having echolocators listen to recordings of echoes reflected from surfaces of different materials, I show not only that they can determine the material properties of objects, but also that the neural processing underlying this ability may make use of a visual- and auditory-material processing area in the parahippocampal cortex. Taken together, the work presented in the current thesis describes some of the recent contributions to our understanding of human echolocation, with a particular emphasis on its apparent parallels with vision and visual processing. The results of this work show that accurate and reliable information can be extracted from echoes, thus supporting echolocation as a viable resource for the blind.

    Practical and Rich User Digitization

    A long-standing vision in computer science has been to evolve computing devices into proactive assistants that enhance our productivity, health and wellness, and many other facets of our lives. User digitization is crucial in achieving this vision, as it allows computers to intimately understand their users, capturing activity, pose, routine, and behavior. Today's consumer devices - like smartphones and smartwatches - provide a glimpse of this potential, offering coarse digital representations of users with metrics such as step count, heart rate, and a handful of human activities like running and biking. Even these very low-dimensional representations are already bringing value to millions of people's lives, but there is significant potential for improvement. At the other end, professional, high-fidelity, comprehensive user digitization systems exist: for example, motion capture suits and multi-camera rigs that digitize our full body and appearance, and scanning machines such as MRI that capture our detailed anatomy. However, these carry significant user practicality burdens, such as financial, privacy, ergonomic, aesthetic, and instrumentation considerations, that preclude consumer use. In general, the higher the fidelity of capture, the lower the user's practicality. Most conventional approaches strike a balance between user practicality and digitization fidelity. My research aims to break this trend, developing sensing systems that increase user digitization fidelity to create new and powerful computing experiences while retaining or even improving user practicality and accessibility, allowing such technologies to have a societal impact. Armed with such knowledge, our future devices could offer longitudinal health tracking, more productive work environments, full-body avatars in extended reality, and embodied telepresence experiences, to name just a few domains. Comment: PhD thesis.

    Recognition of activities of daily living

    Activities of daily living (ADL) are the things we normally do in daily life, including any daily activity such as feeding ourselves, bathing, dressing, grooming, work, homemaking, and leisure. The ability or inability to perform ADLs can be used as a very practical measure of human capability in many types of disorder and disability. Oftentimes in a health care facility, with the help of observations by nurses and self-reporting by residents, professional staff manually collect ADL data and enter them into the system. Technologies in smart homes can provide some solutions for detecting and monitoring a resident’s ADL. Typically, multiple sensors are deployed, such as surveillance cameras in the smart home environment and contact sensors affixed to the resident’s body. These traditional technologies incur costly and laborious sensor deployment, and body-worn contact sensors are uncomfortable and inconvenient for residents. This work presents a novel system, facilitated via mobile devices, to collect and analyze mobile data pertaining to users’ ADL. By employing only one smartphone, this system, named the ADL recognition system, significantly reduces set-up costs and saves manpower. It encapsulates rather sophisticated technologies under the hood, such as an agent-based information management platform integrating both the mobile end and the cloud, observer patterns, and a time-series-based motion analysis mechanism over sensory data. As a single-point deployment system, the ADL recognition system provides further benefits, enabling the replay of users’ daily ADL routines in addition to the timely assessment of their life habits.
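
    As a rough illustration of the kind of time-series motion analysis a single-phone system like this can perform, the sketch below windows the accelerometer magnitude and maps simple per-window statistics to coarse activity labels. The window length, thresholds, and labels are illustrative placeholders, not the values used by the ADL recognition system.

        import numpy as np

        def sliding_windows(signal, win, hop):
            # Split a 1-D sensor stream into overlapping windows.
            starts = range(0, len(signal) - win + 1, hop)
            return np.stack([signal[s:s + win] for s in starts])

        def classify_motion(acc_xyz, fs=50, win_s=2.0, hop_s=1.0):
            # Per-window standard deviation of the acceleration magnitude,
            # thresholded into coarse activity levels (illustrative only).
            mag = np.linalg.norm(acc_xyz, axis=1)              # includes gravity, in m/s^2
            windows = sliding_windows(mag, int(win_s * fs), int(hop_s * fs))
            labels = []
            for w in windows:
                if w.std() < 0.3:
                    labels.append("still")                     # e.g. resting, sleeping
                elif w.std() < 3.0:
                    labels.append("light activity")            # e.g. grooming, homemaking
                else:
                    labels.append("vigorous activity")         # e.g. exercise
            return labels

        # Example with synthetic data: 10 s of rest followed by 10 s of movement at 50 Hz.
        rng = np.random.default_rng(1)
        rest = np.tile([0.0, 0.0, 9.81], (500, 1)) + rng.normal(0, 0.05, (500, 3))
        move = np.tile([0.0, 0.0, 9.81], (500, 1)) + rng.normal(0, 2.0, (500, 3))
        print(classify_motion(np.vstack([rest, move])))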

    American Square Dance Vol. 41, No. 1 (Jan. 1986)

    Monthly square dance magazine that began publication in 1945.