    Deep Learning for Audio Signal Processing

    Given the recent surge in developments of deep learning, this article provides a review of the state-of-the-art deep learning techniques for audio signal processing. Speech, music, and environmental sound processing are considered side-by-side, in order to point out similarities and differences between the domains, highlighting general methods, problems, key references, and potential for cross-fertilization between areas. The dominant feature representations (in particular, log-mel spectra and raw waveform) and deep learning models are reviewed, including convolutional neural networks, variants of the long short-term memory architecture, as well as more audio-specific neural network models. Subsequently, prominent deep learning application areas are covered, i.e. audio recognition (automatic speech recognition, music information retrieval, environmental sound detection, localization and tracking) and synthesis and transformation (source separation, audio enhancement, generative models for speech, sound, and music synthesis). Finally, key issues and future questions regarding deep learning applied to audio signal processing are identified. Comment: 15 pages, 2 PDF figures

    Adaptive sampling in autonomous marine sensor networks

    Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution, June 2006. In this thesis, an innovative architecture for real-time adaptive and cooperative control of autonomous sensor platforms in a marine sensor network is described in the context of the autonomous oceanographic network scenario. This architecture has three major components: an intelligent, logical sensor that provides high-level environmental state information to a behavior-based autonomous vehicle control system; a new approach to behavior-based control of autonomous vehicles using multiple objective functions that allows reactive control in complex environments with multiple constraints; and an approach to cooperative robotics that is a hybrid between the swarm cooperation and intentional cooperation approaches. The mobility of the sensor platforms is a key advantage of this strategy, allowing dynamic optimization of the sensor locations with respect to the classification or localization of a process of interest, including processes that are time-varying, not spatially isotropic, and for which action is required in real time. Experimental results are presented for a 2-D target tracking application in which fully autonomous surface craft using simulated bearing sensors acquire and track a moving target in open water. In the first example, a single sensor vehicle adaptively tracks a target while simultaneously relaying the estimated track to a second vehicle acting as a classification platform. In the second example, two spatially distributed sensor vehicles adaptively track a moving target by fusing their sensor information to form a single target track estimate. In both cases the goal is to adapt the platform motion to minimize the uncertainty of the target track parameter estimates.
The link between the sensor platform motion and the target track estimate uncertainty is fully derived, and this information is used to develop the behaviors for the sensor platform control system. The experimental results clearly illustrate the significant processing gain that spatially distributed sensors can achieve over a single sensor when observing a dynamic phenomenon, as well as the viability of behavior-based control for dealing with uncertainty in complex situations in marine sensor networks. Supported by the Office of Naval Research, with a 3-year National Defense Science and Engineering Grant Fellowship and research assistantships through the Generic Ocean Array Technology Sonar (GOATS) project, contract N00014-97-1-0202 and contract N00014-05-G-0106 Delivery Order 008, PLUSNET: Persistent Littoral Undersea Surveillance Network.

    Reflection-Aware Sound Source Localization

    We present a novel, reflection-aware method for 3D sound localization in indoor environments. Unlike prior approaches, which are mainly based on continuous sound signals from a stationary source, our formulation is designed to localize the position instantaneously from signals within a single frame. We consider direct sound and indirect sound signals that reach the microphones after reflecting off surfaces such as ceilings or walls. We then generate and trace direct and reflected acoustic paths using inverse acoustic ray tracing and utilize these paths with Monte Carlo localization to estimate a 3D sound source position. We have implemented our method on a robot with a cube-shaped microphone array and tested it against different settings with continuous and intermittent sound signals with a stationary or a mobile source. Across different settings, our approach can localize the sound with an average distance error of 0.8 m tested in a room of 7 m by 7 m area with 3 m height, including a mobile and non-line-of-sight sound source. We also reveal that the modeling of indirect rays increases the localization accuracy by 40% compared to only using direct acoustic rays. Comment: Submitted to ICRA 2018. The working video is available at (https://youtu.be/TkQ36lMEC-M)

    An objective based classification of aggregation techniques for wireless sensor networks

    Wireless Sensor Networks have gained immense popularity in recent years due to their ever-increasing capabilities and wide range of critical applications. A large body of research has been dedicated to finding ways to use the limited resources of these sensor nodes efficiently. One of the common ways to minimize energy consumption has been aggregation of input data. We note that every aggregation technique has an improvement objective to achieve with respect to the output it produces. Each technique is designed to achieve some target, e.g. reduce data size, minimize transmission energy, or enhance accuracy. This paper presents a comprehensive survey of aggregation techniques that can be used in a distributed manner to improve the lifetime and energy conservation of wireless sensor networks. The main contribution of this work is the proposal of a novel classification of such techniques based on the type of improvement they offer when applied to WSNs. Due to the existence of a myriad of definitions of aggregation, we first review the meaning of the term aggregation as it can be applied to WSNs. The concept is then associated with the proposed classes. Each class of techniques is divided into a number of subclasses, and a brief literature review of related work in WSNs for each of these is also presented.

    Engineering data compendium. Human perception and performance. User's guide

    The concept underlying the Engineering Data Compendium was the product of a research and development program (Integrated Perceptual Information for Designers project) aimed at facilitating the application of basic research findings in human performance to the design of military crew systems. The principal objective was to develop a workable strategy for: (1) identifying and distilling information of potential value to system design from the existing research literature, and (2) presenting this technical information in a way that would aid its accessibility, interpretability, and applicability by systems designers. The present four volumes of the Engineering Data Compendium represent the first implementation of this strategy. This is the first volume, the User's Guide, containing a description of the program and instructions for its use.

    Audio‐Visual Speaker Tracking

    Target motion tracking has found application in interdisciplinary fields, including but not limited to surveillance and security, forensic science, intelligent transportation systems, driving assistance, monitoring of prohibited areas, medical science, robotics, action and expression recognition, individual speaker discrimination in multi‐speaker environments, and video conferencing in the fields of computer vision and signal processing. Among these applications, speaker tracking in enclosed spaces has been gaining relevance due to the widespread advances of devices and technologies and the necessity for seamless solutions in real‐time tracking and localization of speakers. However, speaker tracking is a challenging task in real‐life scenarios, as several distinctive issues influence the tracking process, such as occlusions and an unknown number of speakers. One approach to overcome these issues is to use multi‐modal information, as it conveys complementary information about the state of the speakers compared to single‐modal tracking. To use multi‐modal information, several approaches have been proposed, which can be classified into two categories, namely deterministic and stochastic. This chapter aims at providing multimedia researchers with a state‐of‐the‐art overview of tracking methods that combine multiple modalities to accomplish various multimedia analysis tasks, classifying them into different categories and listing new and future trends in this field.

    Tracking the Fine Scale Movements of Fish using Autonomous Maritime Robotics: A Systematic State of the Art Review

    This paper provides a systematic state of the art review on tracking the fine scale movements of fish with the use of autonomous maritime robotics. Knowledge of migration patterns and the localization of specific species of fish at a given time is vital to many aspects of conservation. This paper reviews these technologies and provides insight into what systems are being used and why. The review results show that complex systems that use deep learning techniques are used more often than simpler design approaches. Most results found in the study involve Autonomous Underwater Vehicles, which generally require the most complex array of sensors. The results also provide insight into future research such as methods involving swarm intelligence, which has seen an increase in use in recent years. This synthesis of current and future research will be helpful to research teams working to create an autonomous vehicle intended to track, navigate or survey.

    Towards an optimal design for ecosystem-level ocean observatories

    Four operational factors, together with high development cost, currently limit the use of ocean observatories in ecological and fisheries applications: 1) limited spatial coverage; 2) limited integration of multiple types of technologies; 3) limitations in the experimental design for in situ studies; and 4) potential unpredicted bias in monitoring outcomes due to the infrastructure’s presence and functioning footprint. To address these limitations, we propose a novel concept of a standardized “ecosystem observatory module” structure composed of a central node and three tethered satellite pods together with permanent mobile platforms. The module would be designed with a rigid spatial configuration to optimize overlap among multiple observation technologies each providing 360° coverage around the module, including permanent stereo-video cameras, acoustic imaging sonar cameras, horizontal multi-beam echosounders and a passive acoustic array. The incorporation of multiple integrated observation technologies would enable unprecedented quantification of macrofaunal composition, abundance and density surrounding the module, as well as the ability to track the movements of individual fishes and macroinvertebrates. Such a standardized modular design would allow for the hierarchical spatial connection of observatory modules into local module clusters and larger geographic module networks, providing synoptic data within and across linked ecosystems suitable for fisheries and ecosystem level monitoring on multiple scales. Peer Reviewed. Postprint (author's final draft)