589 research outputs found

    Design exploration and performance strategies towards power-efficient FPGA-based achitectures for sound source localization

    Get PDF
    Many applications rely on MEMS microphone arrays for locating sound sources prior to their execution. Those applications not only are executed under real-time constraints but also are often embedded on low-power devices. These environments become challenging when increasing the number of microphones or requiring dynamic responses. Field-Programmable Gate Arrays (FPGAs) are usually chosen due to their flexibility and computational power. This work intends to guide the design of reconfigurable acoustic beamforming architectures, which are not only able to accurately determine the sound Direction-Of-Arrival (DoA) but also capable to satisfy the most demanding applications in terms of power efficiency. Design considerations of the required operations performing the sound location are discussed and analysed in order to facilitate the elaboration of reconfigurable acoustic beamforming architectures. Performance strategies are proposed and evaluated based on the characteristics of the presented architecture. This power-efficient architecture is compared to a different architecture prioritizing performance in order to reveal the unavoidable design trade-offs

    Localization of sound sources : a systematic review

    Get PDF
    Sound localization is a vast field of research and advancement which is used in many useful applications to facilitate communication, radars, medical aid, and speech enhancement to but name a few. Many different methods are presented in recent times in this field to gain benefits. Various types of microphone arrays serve the purpose of sensing the incoming sound. This paper presents an overview of the importance of using sound localization in different applications along with the use and limitations of ad-hoc microphones over other microphones. In order to overcome these limitations certain approaches are also presented. Detailed explanation of some of the existing methods that are used for sound localization using microphone arrays in the recent literature is given. Existing methods are studied in a comparative fashion along with the factors that influence the choice of one method over the others. This review is done in order to form a basis for choosing the best fit method for our use

    Sound Event Localization, Detection, and Tracking by Deep Neural Networks

    Get PDF
    In this thesis, we present novel sound representations and classification methods for the task of sound event localization, detection, and tracking (SELDT). The human auditory system has evolved to localize multiple sound events, recognize and further track their motion individually in an acoustic environment. This ability of humans makes them context-aware and enables them to interact with their surroundings naturally. Developing similar methods for machines will provide an automatic description of social and human activities around them and enable machines to be context-aware similar to humans. Such methods can be employed to assist the hearing impaired to visualize sounds, for robot navigation, and to monitor biodiversity, the home, and cities. A real-life acoustic scene is complex in nature, with multiple sound events that are temporally and spatially overlapping, including stationary and moving events with varying angular velocities. Additionally, each individual sound event class, for example, a car horn can have a lot of variabilities, i.e., different cars have different horns, and within the same model of the car, the duration and the temporal structure of the horn sound is driver dependent. Performing SELDT in such overlapping and dynamic sound scenes while being robust is challenging for machines. Hence we propose to investigate the SELDT task in this thesis and use a data-driven approach using deep neural networks (DNNs). The sound event detection (SED) task requires the detection of onset and offset time for individual sound events and their corresponding labels. In this regard, we propose to use spatial and perceptual features extracted from multichannel audio for SED using two different DNNs, recurrent neural networks (RNNs) and convolutional recurrent neural networks (CRNNs). We show that using multichannel audio features improves the SED performance for overlapping sound events in comparison to traditional single-channel audio features. The proposed novel features and methods produced state-of-the-art performance for the real-life SED task and won the IEEE AASP DCASE challenge consecutively in 2016 and 2017. Sound event localization is the task of spatially locating the position of individual sound events. Traditionally, this has been approached using parametric methods. In this thesis, we propose a CRNN for detecting the azimuth and elevation angles of multiple temporally overlapping sound events. This is the first DNN-based method performing localization in complete azimuth and elevation space. In comparison to parametric methods which require the information of the number of active sources, the proposed method learns this information directly from the input data and estimates their respective spatial locations. Further, the proposed CRNN is shown to be more robust than parametric methods in reverberant scenarios. Finally, the detection and localization tasks are performed jointly using a CRNN. This method additionally tracks the spatial location with time, thus producing the SELDT results. This is the first DNN-based SELDT method and is shown to perform equally with stand-alone baselines for SED, localization, and tracking. The proposed SELDT method is evaluated on nine datasets that represent anechoic and reverberant sound scenes, stationary and moving sources with varying velocities, a different number of overlapping sound events and different microphone array formats. The results show that the SELDT method can track multiple overlapping sound events that are both spatially stationary and moving

    A Secure Real-time Multimedia Streaming through Robust and Lightweight AES Encryption in UAV Networks for Operational Scenarios in Military Domain

    Get PDF
    multimodal data encryption and decryption for security applications in protected environments like espionage, situational awareness, monitoring, and counter-UAV. Data is captured from drones equipped with microphone arrays and cameras. This is performed by exploiting acoustic event analysis, video tracking, and recognition, performed on a ground station. All the communications are delivered in a secure data channel. Integrity and secrecy of the sensitive data acquired by drones must be guaranteed until the data is delivered in real-time from UAVs to the destination node. A possible data exploit may cause critical problems if the data is intercepted by malicious attackers. Being the drones equipped with low energy consuming devices with low computational power, like single-board-computers, a real-time lightweight application-level AES encryption, in addition to the MAC encryption of the wireless communication channel, has been considered. In the experiment, the encryption and decryption process has been optimized, even under adverse transmission conditions ensuring continuous data encryption even if some packets are lost or the connection is repeatedly dropped and reestablished

    Biologically inspired learning system

    Get PDF
    Learning Systems used on robots require either a-priori knowledge in the form of models, rules of thumb or databases or require that robot to physically execute multitudes of trial solutions. The first requirement limits the robot’s ability to operate in unstructured changing environments, and the second limits the robot’s service life and resources. In this research a generalized approach to learning was developed through a series of algorithms that can be used for construction of behaviors that are able to cope with unstructured environments through adaptation of both internal parameters and system structure as a result of a goal based supervisory mechanism. Four main learning algorithms have been developed, along with a goal directed random exploration routine. These algorithms all use the concept of learning from a recent memory in order to save the robot/agent from having to exhaustively execute all trial solutions. The first algorithm is a reactive online learning algorithm that uses a supervised learning to find the sensor/action combinations that promote realization of a preprogrammed goal. It produces a feed forward neural network controller that is used to control the robot. The second algorithm is similar to first in that it uses a supervised learning strategy, but it produces a neural network that considers past values, thus providing a non-reactive solution. The third algorithm is a departure from the first two in that uses a non-supervised learning technique to learn the best actions for each situation the robot encounters. The last algorithm builds a graph of the situations encountered by agent/robot in order to learn to associate the best actions with sensor inputs. It uses an unsupervised learning approach based on shortest paths to a goal situation in the graph in order to generate a non-reactive feed forward neural network. Test results were good, the first and third algorithms were tested in a formation maneuvering task in both simulation and onboard mobile robots, while the second and fourth were tested simulation

    From data acquisition to data fusion : a comprehensive review and a roadmap for the identification of activities of daily living using mobile devices

    Get PDF
    This paper focuses on the research on the state of the art for sensor fusion techniques, applied to the sensors embedded in mobile devices, as a means to help identify the mobile device user’s daily activities. Sensor data fusion techniques are used to consolidate the data collected from several sensors, increasing the reliability of the algorithms for the identification of the different activities. However, mobile devices have several constraints, e.g., low memory, low battery life and low processing power, and some data fusion techniques are not suited to this scenario. The main purpose of this paper is to present an overview of the state of the art to identify examples of sensor data fusion techniques that can be applied to the sensors available in mobile devices aiming to identify activities of daily living (ADLs)

    Contextual Beamforming: Exploiting Location and AI for Enhanced Wireless Telecommunication Performance

    Full text link
    The pervasive nature of wireless telecommunication has made it the foundation for mainstream technologies like automation, smart vehicles, virtual reality, and unmanned aerial vehicles. As these technologies experience widespread adoption in our daily lives, ensuring the reliable performance of cellular networks in mobile scenarios has become a paramount challenge. Beamforming, an integral component of modern mobile networks, enables spatial selectivity and improves network quality. However, many beamforming techniques are iterative, introducing unwanted latency to the system. In recent times, there has been a growing interest in leveraging mobile users' location information to expedite beamforming processes. This paper explores the concept of contextual beamforming, discussing its advantages, disadvantages and implications. Notably, the study presents an impressive 53% improvement in signal-to-noise ratio (SNR) by implementing the adaptive beamforming (MRT) algorithm compared to scenarios without beamforming. It further elucidates how MRT contributes to contextual beamforming. The importance of localization in implementing contextual beamforming is also examined. Additionally, the paper delves into the use of artificial intelligence schemes, including machine learning and deep learning, in implementing contextual beamforming techniques that leverage user location information. Based on the comprehensive review, the results suggest that the combination of MRT and Zero forcing (ZF) techniques, alongside deep neural networks (DNN) employing Bayesian Optimization (BO), represents the most promising approach for contextual beamforming. Furthermore, the study discusses the future potential of programmable switches, such as Tofino, in enabling location-aware beamforming
    • …
    corecore