MicNest: Long-Range Instant Acoustic Localization of Drones in Precise Landing
We present MicNest: an acoustic localization system enabling precise landing of aerial drones. Drone landing is a crucial step in a drone's operation, especially as high-bandwidth wireless networks, such as 5G, enable beyond-line-of-sight operation in a shared airspace and applications such as instant asset delivery with drones gain traction. In MicNest, multiple microphones are deployed on a landing platform in carefully devised configurations. The drone carries a speaker transmitting purposefully-designed acoustic pulses. The drone may be localized as long as the pulses are correctly detected. Doing so is challenging: i) because of limited transmission power, propagation attenuation, background noise, and propeller interference, the Signal-to-Noise Ratio (SNR) of received pulses is intrinsically low; ii) the pulses experience non-linear Doppler distortion due to the physical drone dynamics while airborne; iii) as location information is to be used during landing, the processing latency must be reduced to effectively feed the flight control loop. To tackle these issues, we design a novel pulse detector, Matched Filter Tree (MFT), whose idea is to convert pulse detection to a tree search problem. We further present three practical methods to accelerate tree search jointly. Our real-world experiments show that MicNest is able to localize a drone 120 m away with 0.53% relative localization error at 20 Hz location update frequency
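As a rough illustration of the baseline that the abstract builds on, the sketch below runs a plain matched filter: it correlates a known pulse template against a noisy recording and reports the lag with the highest score. This is not the Matched Filter Tree itself, and it does not model Doppler distortion. The chirp parameters, noise level, sample rate, and all function names are assumptions chosen for illustration only.

```python
import math
import random


def make_chirp(n, f0, f1, fs):
    """Linear chirp of n samples sweeping f0..f1 Hz at sample rate fs."""
    duration = n / fs
    k = (f1 - f0) / duration  # sweep rate in Hz per second
    return [math.sin(2 * math.pi * (f0 * t + 0.5 * k * t * t))
            for t in (i / fs for i in range(n))]


def matched_filter(received, template):
    """Slide the template over the signal; return the correlation at each lag."""
    m = len(template)
    return [sum(received[lag + j] * template[j] for j in range(m))
            for lag in range(len(received) - m + 1)]


def detect_pulse(received, template):
    """Return the lag with the largest correlation score."""
    scores = matched_filter(received, template)
    return max(range(len(scores)), key=lambda lag: scores[lag])


# Hypothetical test signal: a 256-sample chirp added to Gaussian noise.
rng = random.Random(0)
fs = 48_000
template = make_chirp(256, 2_000, 6_000, fs)
true_offset = 700
received = [rng.gauss(0.0, 0.5) for _ in range(2_000)]
for j, s in enumerate(template):
    received[true_offset + j] += s

estimated_offset = detect_pulse(received, template)
print("true offset:", true_offset, "estimated offset:", estimated_offset)
```

A single fixed template like this loses correlation gain once the pulse is stretched or compressed by motion, which is presumably why the abstract recasts detection as a search over candidate filters.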
Smartphone-based vehicle telematics: a ten-year anniversary
This is the author accepted manuscript. The final version is available from the publisher via the DOI in this record. Just as it has irrevocably reshaped social life, the fast growth of smartphone ownership is now beginning to revolutionize the driving experience and change how we think about automotive insurance, vehicle safety systems, and traffic research. This paper summarizes the first ten years of research in smartphone-based vehicle telematics, with a focus on user-friendly implementations and the challenges that arise due to the mobility of the smartphone. Notable academic and industrial projects are reviewed, and system aspects related to sensors, energy consumption, and human-machine interfaces are examined. Moreover, we highlight the differences between traditional and smartphone-based automotive navigation, and survey the state of the art in smartphone-based transportation mode classification, vehicular ad hoc networks, cloud computing, driver classification, and road condition monitoring. Future advances are expected to be driven by improvements in sensor technology, evidence of the societal benefits of current implementations, and the establishment of industry standards for sensor fusion and driver assessment.
Ultra-Low-Power IoT Solutions for Sound Source Localization: Combining Mixed-Signal Processing and Machine Learning
With the prevalence of smartphones, pedestrians and joggers today often walk or run while listening to music. Since they are deprived of auditory stimuli that could provide important cues to dangers, they are at a much greater risk of being hit by cars or other vehicles. We start this research into building a wearable system that uses multichannel audio sensors embedded in a headset to help detect and locate cars from their honks and engine and tire noises. Based on this detection, the system can warn pedestrians of the imminent danger of approaching cars. We demonstrate that using a segmented architecture and implementation consisting of headset-mounted audio sensors, front-end hardware that performs signal processing and feature extraction, and machine-learning-based classification on a smartphone, we are able to provide early danger detection in real time, from up to 80m distance, with greater than 80% precision and 90% recall, and alert the user on time (about 6s in advance for a car traveling at 30mph).
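The quoted lead time can be checked with simple arithmetic: a car at 30 mph covers the 80 m detection range in roughly 6 s. The helper below is a minimal sketch that assumes constant speed and a straight-line approach; the function name is hypothetical.

```python
MPH_TO_MPS = 1609.344 / 3600.0  # one mile is 1609.344 m, one hour is 3600 s


def warning_time_s(distance_m, speed_mph):
    """Seconds until a vehicle at constant speed covers distance_m."""
    return distance_m / (speed_mph * MPH_TO_MPS)


lead_time = warning_time_s(80.0, 30.0)
print(f"{lead_time:.2f} s")  # prints "5.97 s"
```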
The time delay between audio signals in a microphone array is the most important feature for sound-source localization. This work also presents a polarity-coincidence, adaptive time-delay estimation (PCC-ATDE) mixed-signal technique that uses 1-bit quantized signals and a negative-feedback architecture to directly determine the time delay between signals in the analog inputs and convert it to a digital number. This direct conversion, without a multibit ADC and further digital-signal processing, allows for ultra-low power consumption. A prototype chip in 0.18μm CMOS with 4 analog inputs consumes 78nW with a 3-channel 8-bit digital time-delay output while sampling at 50kHz with a 20μs resolution and 6.06 ENOB. We present a theoretical analysis for the nonlinear, signal-dependent feedback loop of the PCC-ATDE. A delay-domain model of the system is developed to estimate the power bandwidth of the converter and predict its dynamic response. Results are validated with experiments using real-life stimuli, captured with a microphone array, that demonstrate the technique’s ability to localize a sound source. The chip is further integrated in an embedded platform and deployed as an audio-based vehicle-bearing IoT system.
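The core idea of polarity-coincidence delay estimation can be sketched in software: quantize both channels to 1 bit, then pick the lag at which the signs agree most often. The sketch below is an open-loop, exhaustive search, not the negative-feedback mixed-signal loop described in the abstract. The signal model, noise level, and function names are assumptions. At a 50 kHz sample rate one sample period is 20 μs, which is consistent with the resolution quoted above.

```python
import random


def sign_bits(samples):
    """1-bit quantization: keep only the polarity of each sample."""
    return [1 if s >= 0 else -1 for s in samples]


def polarity_coincidence(a_bits, b_bits, lag):
    """Fraction of samples where a[n] and b[n + lag] have the same sign."""
    pairs = [(a_bits[n], b_bits[n + lag])
             for n in range(len(a_bits))
             if 0 <= n + lag < len(b_bits)]
    return sum(1 for x, y in pairs if x == y) / len(pairs)


def estimate_delay(a, b, max_lag):
    """Lag (in samples) of b relative to a that maximizes polarity coincidence."""
    a_bits, b_bits = sign_bits(a), sign_bits(b)
    return max(range(-max_lag, max_lag + 1),
               key=lambda lag: polarity_coincidence(a_bits, b_bits, lag))


# Hypothetical two-microphone capture: a broadband source reaches
# microphone B 7 samples after microphone A, plus independent sensor noise.
rng = random.Random(1)
fs_hz = 50_000
true_delay = 7
source = [rng.gauss(0.0, 1.0) for _ in range(4_000)]
mic_a = [s + rng.gauss(0.0, 0.2) for s in source[true_delay:]]
mic_b = [s + rng.gauss(0.0, 0.2) for s in source[:-true_delay]]

delay_samples = estimate_delay(mic_a, mic_b, max_lag=10)
delay_us = delay_samples / fs_hz * 1e6
print(delay_samples, "samples =", delay_us, "microseconds")
```

Because only sign agreement is counted, the estimator needs no multiplications and no multibit samples, which hints at why a hardware version can run at very low power.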
Finally, we investigate the signal’s envelope, an important feature for a host of applications enabled by machine-learning algorithms. Conventionally, the raw analog signal is digitized first, followed by feature extraction in the digital domain. This work presents an ultra-low-power envelope-to-digital converter (EDC) consisting of a passive switched-capacitor envelope detector and an inseparable successive approximation-register analog-to-digital converter (ADC). The two blocks integrate directly at different sampling rates without a buffer between them thanks to the ping-pong operation of their sampling capacitors. An EDC prototype was fabricated in 180nm CMOS. It provides 7.1 effective bits of ADC resolution and supports input signal bandwidth up to 5kHz and an envelope bandwidth up to 50Hz while consuming 9.6nW
Deep Learning for Environmentally Robust Speech Recognition: An Overview of Recent Developments
Eliminating the negative effect of non-stationary environmental noise is a
long-standing research topic for automatic speech recognition that still
remains an important challenge. Data-driven supervised approaches, including
ones based on deep neural networks, have recently emerged as potential
alternatives to traditional unsupervised approaches and with sufficient
training, can alleviate the shortcomings of the unsupervised methods in various
real-life acoustic environments. In this light, we review recently developed,
representative deep learning approaches for tackling non-stationary additive
and convolutional degradation of speech with the aim of providing guidelines
for those involved in the development of environmentally robust speech
recognition systems. We separately discuss single- and multi-channel techniques
developed for the front-end and back-end of speech recognition systems, as well
as joint front-end and back-end training frameworks
Acoustic Sensing: Mobile Applications and Frameworks
Acoustic sensing has attracted significant attention from both academia and industry due to its ubiquity. Since smartphones and many IoT devices are already equipped with microphones and speakers, it requires nearly zero additional deployment cost. Acoustic sensing is also versatile. For example, it can detect obstacles for distracted pedestrians (BumpAlert), remember indoor locations through recorded echoes (EchoTag), and also understand the touch force applied to mobile devices (ForcePhone).
In this dissertation, we first propose three acoustic sensing applications, BumpAlert, EchoTag, and ForcePhone, and then introduce a cross-platform sensing framework called LibAS. LibAS is designed to facilitate the development of acoustic sensing applications. For example, LibAS can let developers prototype and validate their sensing ideas and apps on commercial devices without the detailed knowledge of platform-dependent programming. LibAS is shown to require less than 30 lines of code in Matlab to implement the prototype of ForcePhone on Android/iOS/Tizen/Linux devices.
PhD, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies
https://deepblue.lib.umich.edu/bitstream/2027.42/143971/1/yctung_1.pd
Acoustic Localization System for Precise Drone Landing
We present MICNEST: an acoustic localization system enabling precise drone landing. In MICNEST, multiple microphones are deployed on a landing platform in carefully devised configurations. The drone carries a speaker transmitting purposefully-designed acoustic pulses. The drone may be localized as long as the pulses are correctly detected. Doing so is challenging: i) because of limited transmission power, propagation attenuation, background noise, and propeller interference, the Signal-to-Noise Ratio (SNR) of received pulses is intrinsically low; ii) the pulses experience non-linear Doppler distortion due to the physical drone dynamics; iii) as location information is used during landing, the processing latency must be reduced to effectively feed the flight control loop. To tackle these issues, we design a novel pulse detector, Matched Filter Tree (MFT), whose idea is to convert pulse detection to a tree search problem. We further present three practical methods to accelerate tree search jointly. Our experiments show that MICNEST can localize a drone 120 m away with 0.53% relative localization error at 20 Hz location update frequency. For navigating drone landing, MICNEST can achieve a success rate of 94 %. The average landing error (distance between landing point and target point) is only 4.3 cm
Collaborative, Intelligent, and Adaptive Systems for the Low-Power Internet of Things
With the emergence of the Internet of Things (IoT), more and more devices are getting equipped with communication capabilities, often via wireless radios. Their deployments pave the way for new and mission-critical applications: cars will communicate with nearby vehicles to coordinate at intersections; industrial wireless closed-loop systems will improve operational safety in factories; while swarms of drones will coordinate to plan collision-free trajectories. To achieve these goals, IoT devices will need to communicate, coordinate, and collaborate over the wireless medium. However, these envisioned applications necessitate new characteristics that current solutions and protocols cannot fulfill: IoT devices require consistency guarantees from their communication and demand adaptive behavior in complex and dynamic environments. In this thesis, we design, implement, and evaluate systems and mechanisms to enable safe coordination and adaptivity for the smallest IoT devices. To ensure consistent coordination, we bring fault-tolerant consensus to low-power wireless communication and introduce Wireless Paxos, a flavor of the Paxos algorithm specifically tailored to low-power IoT. We then present STARC, a wireless coordination mechanism for intersection management combining commit semantics with synchronous transmissions. To enable adaptivity in the wireless networking stack, we introduce Dimmer and eAFH. Dimmer combines Reinforcement Learning and Multi-Armed Bandits to adapt its communication parameters and counteract the adverse effects of wireless interference at runtime while optimizing energy consumption in normal conditions. eAFH provides dynamic channel management in Bluetooth Low Energy by excluding and dynamically re-including channels in scenarios with mobility.
Finally, we demonstrate with BlueSeer that a device can classify its environment, i.e., recognize whether it is located in a home, office, street, or transport, solely from received Bluetooth Low Energy signals fed into an embedded machine learning model. BlueSeer therefore increases the intelligence of the smallest IoT devices, allowing them to adapt their behaviors to their current surroundings
Acoustic-channel attack and defence methods for personal voice assistants
Personal Voice Assistants (PVAs) are increasingly used as an interface to digital environments. Voice commands are used to interact with phones, smart homes or cars. In the US alone the number of smart speakers such as Amazon’s Echo and Google Home has grown by 78% to 118.5 million and 21% of the US population own at least one device. Given the increasing dependency of society on PVAs, security and privacy of these has become a major concern of users, manufacturers and policy makers. Consequently, a steep increase in research efforts addressing security and privacy of PVAs can be observed in recent years. While some security and privacy research applicable to the PVA domain predates their recent increase in popularity and many new research strands have emerged, research dedicated to PVA security and privacy is lacking. The most important interaction interface between users and a PVA is the acoustic channel, and acoustic-channel-related security and privacy studies are desirable and required. The aim of the work presented in this thesis is to enhance the cognition of security and privacy issues of PVA usage related to the acoustic channel, to propose principles and solutions to key usage scenarios to mitigate potential security threats, and to present a novel type of dangerous attack which can be launched only by using a PVA alone. The five core contributions of this thesis are: (i) a taxonomy is built for the research domain of PVA security and privacy issues related to the acoustic channel. An extensive research overview on the state of the art is provided, describing a comprehensive research map for PVA security and privacy. It is also shown in this taxonomy where the contributions of this thesis lie; (ii) Work has emerged aiming to generate adversarial audio inputs which sound harmless to humans but can trick a PVA into recognising harmful commands.
The majority of work has been focused on the attack side, and little work exists on how to defend against this type of attack. A defence method against white-box adversarial commands is proposed and implemented as a prototype. It is shown that a defence Automatic Speech Recognition (ASR) can work in parallel with the PVA’s main one, and adversarial audio input is detected if the difference in the speech decoding results between the two ASRs surpasses a threshold. It is demonstrated that an ASR that differs in architecture and/or training data from the PVA’s main ASR is usable as protection ASR; (iii) PVAs continuously monitor conversations which may be transported to a cloud back end where they are stored, processed and maybe even passed on to other service providers. A user has limited control over this process when a PVA is triggered without the user’s intent or a PVA belongs to others. A user is unable to control the recording behaviour of surrounding PVAs, unable to signal privacy requirements and unable to track conversation recordings. An acoustic tagging solution is proposed aiming to embed additional information into acoustic signals processed by PVAs. A user employs a tagging device which emits an acoustic signal when PVA activity is assumed. Any active PVA will embed this tag into their recorded audio stream. The tag may signal a cooperating PVA or back-end system that a user has not given a recording consent. The tag may also be used to trace when and where a recording was taken if necessary. A prototype tagging device based on PocketSphinx is implemented. Using Google Home Mini as the PVA, it is demonstrated that the device can tag conversations and the tagging signal can be retrieved from conversations stored in the Google back-end system; (iv) Acoustic tagging provides users the capability to signal their permission to the back-end PVA service, and another solution inspired by Denial of Service (DoS) is proposed as well for protecting user privacy.
Although PVAs are very helpful, they are also continuously monitoring conversations. When a PVA detects a wake word, the immediately following conversation is recorded and transported to a cloud system for further analysis. An active protection mechanism is proposed: reactive jamming. A Protection Jamming Device (PJD) is employed to observe conversations. Upon detection of a PVA wake word the PJD emits an acoustic jamming signal. The PJD must detect the wake word faster than the PVA such that the jamming signal still prevents wake word detection by the PVA. An evaluation of the effectiveness of different jamming signals and overlap between wake words and the jamming signals is carried out. 100% jamming success can be achieved with an overlap of at least 60% with a negligible false positive rate; (v) Acoustic components (speakers and microphones) on a PVA can potentially be re-purposed to achieve acoustic sensing. This has significant security and privacy implications due to the key role of PVAs in digital environments. The first active acoustic side-channel attack is proposed. Speakers are used to emit human inaudible acoustic signals and the echo is recorded via microphones, turning the acoustic system of a smartphone into a sonar system. The echo signal can be used to profile user interaction with the device. For example, a victim’s finger movement can be monitored to steal Android unlock patterns. The number of candidate unlock patterns that an attacker must try to authenticate herself to a Samsung S4 phone can be reduced by up to 70% using this novel unnoticeable acoustic side-channel.
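The detection rule in contribution (ii), flagging audio when two ASR systems decode it too differently, can be sketched as a comparison of transcripts. The abstract does not say how the difference is measured, so the sketch below assumes a word-level edit distance normalized by the longer transcript. The threshold value and all function names are hypothetical.

```python
def word_edit_distance(ref, hyp):
    """Levenshtein distance between two word lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def disagreement(main_text, defence_text):
    """Edit distance normalized by the longer transcript's word count."""
    a, b = main_text.lower().split(), defence_text.lower().split()
    if not a and not b:
        return 0.0
    return word_edit_distance(a, b) / max(len(a), len(b))


def is_adversarial(main_text, defence_text, threshold=0.5):
    """Flag the input when the two decoders disagree by more than threshold.

    The default threshold is an illustrative assumption, not a value
    taken from the thesis.
    """
    return disagreement(main_text, defence_text) > threshold


# Benign audio: both decoders roughly agree.
print(is_adversarial("turn on the lights", "turn on the light"))
# Adversarial audio: the main ASR hears a command, the defence ASR does not.
print(is_adversarial("open the front door", "play some jazz music"))
```

The intuition is that an adversarial perturbation tuned against one model tends not to transfer cleanly to a model with a different architecture or training data, so the two transcripts diverge.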