121 research outputs found

    FOOTSTEP DETECTION AND CLASSIFICATION USING DISTRIBUTED MICROPHONES

    ABSTRACT This paper addresses footstep detection and classification with multiple microphones distributed on the floor. We propose to introduce geometrical features, such as the position and velocity of a sound source, for classification; these are estimated by amplitude-based localization, which, unlike a conventional microphone array technique, does not require precise inter-microphone time synchronization. To classify various types of sound events, we introduce four types of features: time-domain, spectral, and cepstral features in addition to the geometrical features. We constructed a prototype system for footstep detection and classification based on the proposed ideas, with eight microphones aligned in a 2-by-4 grid. Preliminary classification experiments showed that classification accuracy for four types of sound sources (walking footstep, running footstep, handclap, and utterance) remains above 70% even when the signal-to-noise ratio is as low as 0 dB. We also confirmed two advantages of the proposed footstep detection and classification. One is that the proposed features can be applied to the classification of sound sources other than footsteps. The other is that the multichannel approach further improves noise robustness by selecting the best microphone among those available and by providing geometrical information on the sound source.
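    The amplitude-based localization described above can be sketched as a grid search under a 1/r attenuation model: the source position is the grid point whose predicted per-microphone amplitude pattern best matches the observed one. The microphone coordinates, grid extent, and step size below are illustrative assumptions, not the paper's actual setup.

    ```python
    import numpy as np

    # Hypothetical 2-by-4 microphone grid positions in metres (assumed layout).
    mic_pos = np.array([[x, y] for y in (0.0, 1.0) for x in (0.0, 1.0, 2.0, 3.0)])

    def locate_by_amplitude(amps, grid_step=0.05):
        """Grid-search the source position that best explains the observed
        per-microphone RMS amplitudes under a 1/r attenuation model."""
        xs = np.arange(-0.5, 3.5, grid_step)
        ys = np.arange(-0.5, 1.5, grid_step)
        obs = amps / np.linalg.norm(amps)
        best, best_err = None, np.inf
        for x in xs:
            for y in ys:
                r = np.linalg.norm(mic_pos - [x, y], axis=1) + 1e-6
                pred = 1.0 / r                    # expected relative amplitude
                pred /= np.linalg.norm(pred)      # compare shapes, not scale
                err = np.sum((pred - obs) ** 2)
                if err < best_err:
                    best, best_err = (x, y), err
        return np.array(best)

    # Synthetic check: a source at (1.5, 0.5) yields amplitudes following 1/r.
    true = np.array([1.5, 0.5])
    amps = 1.0 / (np.linalg.norm(mic_pos - true, axis=1) + 1e-6)
    est = locate_by_amplitude(amps)
    ```

    Differencing successive position estimates over time would then give the velocity feature mentioned in the abstract.
    
    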

    An unsupervised acoustic fall detection system using source separation for sound interference suppression

    We present a novel unsupervised fall detection system that uses the acoustic signals (footstep sounds) collected during an elderly person's normal activities to construct a data description model that distinguishes falls from non-falls. The measured acoustic signals are first processed with a source separation (SS) technique to remove possible interference from other background sound sources. Mel-frequency cepstral coefficient (MFCC) features are then extracted from the processed signals and used to construct a data description model based on a one-class support vector machine (OCSVM), which is finally applied to distinguish fall from non-fall sounds. Experiments on a recorded dataset confirm that our proposed fall detection system achieves better performance than existing single-microphone-based methods, especially under a high level of interference from other sound sources.
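    The OCSVM stage can be roughly sketched with scikit-learn's OneClassSVM trained only on "normal" feature vectors. The real system extracts MFCCs from source-separated audio; the synthetic 13-D features and the nu/gamma settings here are illustrative assumptions, not the paper's configuration.

    ```python
    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.svm import OneClassSVM

    rng = np.random.default_rng(0)

    # Stand-ins for 13-D MFCC vectors: normal everyday footstep sounds vs.
    # acoustically distinct candidate falls (assumed distributions).
    normal = rng.normal(0.0, 1.0, size=(500, 13))
    falls = rng.normal(4.0, 1.0, size=(20, 13))

    # Train the one-class model on normal activity only (unsupervised).
    scaler = StandardScaler().fit(normal)
    ocsvm = OneClassSVM(nu=0.05, kernel="rbf", gamma="scale")
    ocsvm.fit(scaler.transform(normal))

    # +1 = inside the learned "normal" region, -1 = outlier (candidate fall).
    pred_normal = ocsvm.predict(scaler.transform(normal))
    pred_falls = ocsvm.predict(scaler.transform(falls))
    ```

    The `nu` parameter bounds the fraction of training points treated as outliers, which is why a small fraction of normal sounds is still flagged.
    
    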

    Acoustic Based Footstep Detection in Pervasive Healthcare

    Passive detection of footsteps in domestic settings can enable assistive technologies that monitor the mobility patterns of older adults in their home environment. Acoustic footstep detection is a promising approach for nonintrusive detection of footsteps. So far there has been limited work on developing robust acoustic footstep detection systems that can operate in noisy home environments. In this paper, we propose a novel application of an attention-based recurrent deep neural network to detect human footsteps in noisy, overlapping audio streams. The model is trained on synthetic data that simulates the acoustic scene in a home environment. To evaluate performance, we reproduced two footstep detection models from the literature and compared them using the newly developed Polyphonic Sound Detection Score (PSDS). Our model achieved the highest PSDS and is close to the highest score achieved by generic indoor AED models in DCASE. The proposed system is designed both to detect and track footsteps within a home setting and to enhance state-of-the-art digital healthcare solutions for empowering older adults to live autonomously in their own homes.
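    A detector of this kind outputs per-frame footstep probabilities, which must be converted into discrete (onset, offset) events before event-based scoring such as PSDS. A minimal post-processing sketch follows; the threshold, hop size, and minimum duration are assumed values, not those used in the paper.

    ```python
    import numpy as np

    def frames_to_events(probs, threshold=0.5, hop_s=0.02, min_dur_s=0.06):
        """Turn per-frame footstep probabilities into (onset, offset) events,
        in seconds, by thresholding and dropping events shorter than min_dur_s."""
        active = probs >= threshold
        events, start = [], None
        for i, a in enumerate(active):
            if a and start is None:
                start = i                      # event onset frame
            elif not a and start is not None:
                events.append((start * hop_s, i * hop_s))
                start = None
        if start is not None:                  # event still open at stream end
            events.append((start * hop_s, len(active) * hop_s))
        return [(s, e) for s, e in events if e - s >= min_dur_s]

    # One sustained detection plus one too-short blip that gets filtered out.
    probs = np.array([0.1, 0.2, 0.9, 0.95, 0.8, 0.9, 0.1, 0.6, 0.1])
    events = frames_to_events(probs)
    ```

    Real systems often add smoothing (e.g. a median filter over `active`) before extracting events, to reduce fragmentation.
    
    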

    Simultaneous Localization and Mapping with Power Network Electromagnetic Field

    Various sensing modalities have been exploited for indoor location sensing, each of which, however, has well-understood limitations. This paper presents the first systematic study on using the electromagnetic field (EMF) induced by a building's electric power network for simultaneous localization and mapping (SLAM). The basis of this work is a measurement study showing that the power network EMF, sensed either by a customized sensor or by a smartphone's microphone acting as a side-channel sensor, is spatially distinct and temporally stable. Based on this, we design a SLAM approach that can reliably detect loop closures from EMF sensing results. With the EMF feature map constructed by SLAM, we also design an efficient online localization scheme for resource-constrained mobile devices. Evaluation in three indoor spaces shows that the power network EMF is a promising modality for location sensing on mobile devices, able to run in real time and achieve sub-meter accuracy.
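    Loop-closure detection over a spatially distinct, temporally stable fingerprint can be illustrated as a similarity search over previously recorded fingerprints: if the current fingerprint closely matches an old, non-adjacent one, the device has likely revisited that location. The feature dimensionality, cosine-similarity threshold, and synthetic fingerprints below are assumptions; the paper's actual EMF features and matching method are not specified here.

    ```python
    import numpy as np

    def cosine(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    def detect_loop_closures(fingerprints, sim_threshold=0.95, min_gap=5):
        """Flag pairs of poses whose fingerprints are nearly identical,
        skipping temporally adjacent poses (min_gap), suggesting a revisit."""
        closures = []
        for i in range(len(fingerprints)):
            for j in range(i + min_gap, len(fingerprints)):
                if cosine(fingerprints[i], fingerprints[j]) >= sim_threshold:
                    closures.append((i, j))
        return closures

    rng = np.random.default_rng(1)
    base = rng.normal(size=(8, 32))          # distinct fingerprints along a path
    # The last pose revisits pose 0 (fingerprint plus small measurement noise).
    path = np.vstack([base, base[0] + rng.normal(0.0, 0.01, 32)])
    closures = detect_loop_closures(path)
    ```

    In a full SLAM pipeline, each detected closure would add a constraint to the pose graph before optimization.
    
    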

    Latitude, longitude, and beyond:mining mobile objects' behavior

    Rapid advancements in Micro-Electro-Mechanical Systems (MEMS) and wireless communications have resulted in a surge in data generation. Mobility data is one of the many forms of data ubiquitously collected by location sensing devices. Extensive knowledge about the behavior of humans and wildlife is buried in raw mobility data. This knowledge can be used to realize numerous viable applications, ranging from wildlife movement analysis to location-based recommendation systems, urban planning, and disaster relief. In light of the above, this thesis mainly focuses on providing data analytics for understanding the behavior and interaction of mobile entities (humans and animals). To this end, the main research question to be addressed is: how can the behaviors and interactions of mobile entities be determined from mobility data acquired by (mobile) wireless sensor nodes in an accurate and efficient manner? To answer this question, both application requirements and technological constraints are considered in this thesis. On the one hand, application requirements call for accurate data analytics that uncover hidden information about the individual behavior and social interaction of mobile entities, and that deal with the uncertainties in mobility data. Technological constraints, on the other hand, require these data analytics to be efficient in their energy consumption and to have a low memory footprint and low processing complexity.

    Presence studies as an evaluation method for user experiences in multimodal virtual environments


    Classification Of Civilian Vehicle Sounds Using A Large Database Of Vehicle Sounds

    We have completed the building of an extensive database of civilian vehicle sounds. The database consists of correlated acoustic and seismic signatures of a large number (exceeding 850) of civilian vehicles. Each acoustic signature is obtained through two high-quality microphones separated by 25 feet, whose signals are exactly synchronized. In this work, spectral and tristimulus features of civilian vehicle sounds are computed and then further processed using principal component analysis. The "super" features derived from principal component analysis are then used for classification. In this research effort, the performance of a quadratic classifier is compared with that of a neural network classifier. Results presented here show that the neural network classifier outperforms the quadratic classifier in distinguishing the sounds of vehicles of both different and the same brands. Classification errors are usually small (at times 0%).
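    The PCA-then-classify pipeline can be sketched with scikit-learn using synthetic stand-ins for the spectral/tristimulus feature vectors; the class distributions, number of principal components, and classifier settings below are illustrative assumptions, not the study's configuration.

    ```python
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier

    rng = np.random.default_rng(0)

    # Synthetic stand-ins for 20-D feature vectors of two vehicle classes.
    n, d = 400, 20
    X = np.vstack([rng.normal(0.0, 1.0, (n, d)), rng.normal(1.5, 1.0, (n, d))])
    y = np.array([0] * n + [1] * n)
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25, random_state=0)

    # "Super" features: project onto the leading principal components.
    pca = PCA(n_components=5).fit(Xtr)
    Ztr, Zte = pca.transform(Xtr), pca.transform(Xte)

    # Compare a quadratic classifier against a small neural network.
    qda = QuadraticDiscriminantAnalysis().fit(Ztr, ytr)
    mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000,
                        random_state=0).fit(Ztr, ytr)
    acc_qda = qda.score(Zte, yte)
    acc_mlp = mlp.score(Zte, yte)
    ```

    On real vehicle sounds the two classifiers would be compared on held-out recordings per brand, which is where the study reports the neural network's advantage.
    
    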

    STARSS23: An Audio-Visual Dataset of Spatial Recordings of Real Scenes with Spatiotemporal Annotations of Sound Events

    While the direction of arrival (DOA) of sound events is generally estimated from multichannel audio data recorded with a microphone array, sound events usually originate from visually perceptible source objects; e.g., the sound of footsteps comes from the feet of a walker. This paper proposes an audio-visual sound event localization and detection (SELD) task, which uses multichannel audio and video information to estimate the temporal activation and DOA of target sound events. Audio-visual SELD systems can detect and localize sound events using signals from a microphone array together with audio-visual correspondence. We also introduce an audio-visual dataset, Sony-TAu Realistic Spatial Soundscapes 2023 (STARSS23), which consists of multichannel audio data recorded with a microphone array, video data, and spatiotemporal annotations of sound events. Sound scenes in STARSS23 are recorded with instructions that guide participants to ensure adequate activity and occurrences of sound events. STARSS23 also provides human-annotated temporal activation labels and human-confirmed DOA labels, which are based on the tracking results of a motion capture system. Our benchmark results demonstrate the benefits of using visual object positions in audio-visual SELD tasks. The data is available at https://zenodo.org/record/7880637. (27 pages, 9 figures; accepted for publication in the NeurIPS 2023 Track on Datasets and Benchmarks.)
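    A basic building block of audio-only DOA estimation from a microphone pair is the time difference of arrival (TDOA), found at the peak of the cross-correlation between the two channels. The sketch below uses an assumed sampling rate, microphone spacing, and synthetic broadband source, and is not the STARSS23 baseline method.

    ```python
    import numpy as np

    def tdoa_xcorr(x1, x2, fs):
        """Delay of x2 relative to x1, in seconds, from the cross-correlation
        peak (numpy's 'full' mode spans lags -(len(x1)-1) .. len(x2)-1)."""
        corr = np.correlate(x2, x1, mode="full")
        lag = np.argmax(corr) - (len(x1) - 1)
        return lag / fs

    fs = 16000                               # assumed sampling rate (Hz)
    d = 0.2                                  # assumed microphone spacing (m)
    c = 343.0                                # speed of sound (m/s)

    rng = np.random.default_rng(0)
    src = rng.normal(size=2048)              # broadband source, e.g. a footstep
    delay = 6                                # mic 2 hears the source 6 samples later
    x1 = src
    x2 = np.concatenate([np.zeros(delay), src[:-delay]])

    tau = tdoa_xcorr(x1, x2, fs)             # ≈ delay / fs
    # Far-field DOA from the TDOA, measured from the array broadside.
    theta = np.degrees(np.arcsin(np.clip(c * tau / d, -1.0, 1.0)))
    ```

    Array-based SELD systems generalize this idea to many microphone pairs (or learn spatial cues directly), and the audio-visual task in the paper adds video-derived object positions on top of such audio cues.
    
    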