Deep learning-based denoising streamed from mobile phones improves speech-in-noise understanding for hearing aid users
The hearing loss of almost half a billion people is commonly treated with
hearing aids. However, current hearing aids often do not work well in
real-world noisy environments. We present a deep learning based denoising
system that runs in real time on iPhone 7 and Samsung Galaxy S10 (25ms
algorithmic latency). The denoised audio is streamed to the hearing aid,
resulting in a total delay of around 75ms. In tests with hearing aid users
having moderate to severe hearing loss, our denoising system improves audio
across three tests: 1) listening for subjective audio ratings, 2) listening for
objective speech intelligibility, and 3) live conversations in a noisy
environment for subjective ratings. Subjective ratings increase by more than
40% for both the listening test and the live conversation, compared to a fitted
hearing aid as a baseline. Speech reception thresholds, measuring speech
understanding in noise, improve by 1.6 dB SRT. Ours is the first denoising
system that is implemented on a mobile device, streamed directly to users'
hearing aids using only a single channel as audio input while improving user
satisfaction on all tested aspects, including speech intelligibility. This
includes overall preference of the denoised and streamed signal over the
hearing aid, thereby accepting the higher latency for the significant
improvement in speech understanding
Smartphone Apps in the Context of Tinnitus: Systematic Review
Smartphones containing sophisticated high-end hardware and offering high computational capabilities at extremely manageable costs have become mainstream and an integral part of users' lives. Widespread adoption of smartphone devices has encouraged the development of many smartphone applications, resulting in a well-established ecosystem, which is easily discoverable and accessible via respective marketplaces of differing mobile platforms. These smartphone applications are no longer exclusively limited to entertainment purposes but are increasingly established in the scientific and medical field. In the context of tinnitus, the ringing in the ear, these smartphone apps range from relief, management, and self-help all the way to interfacing with external sensors to better understand the phenomenon. In this paper, we aim to bring forth the smartphone applications in and around tinnitus. Based on the PRISMA guidelines, we systematically analyze and investigate the current state of smartphone apps that are directly applied in the context of tinnitus. In particular, we explore Google Scholar, CiteSeerX, Microsoft Academics, and Semantic Scholar for the identification of scientific contributions. Additionally, we search and explore Google’s Play and Apple's App Stores to identify relevant smartphone apps and their respective properties. This review work (1) gives an up-to-date overview of existing apps, and (2) lists and discusses scientific literature pertaining to the smartphone apps used within the context of tinnitus
Fog Computing in Medical Internet-of-Things: Architecture, Implementation, and Applications
In the era when the market segment of Internet of Things (IoT) tops the chart
in various business reports, it is widely envisioned that the field of
medicine will gain a large benefit from the explosion of wearables and
internet-connected sensors that surround us to acquire and communicate
unprecedented data on symptoms, medication, food intake, and daily-life
activities impacting one's health and wellness. However, IoT-driven healthcare
would have to overcome many barriers, such as: 1) There is an increasing demand
for data storage on cloud servers where the analysis of the medical big data
becomes increasingly complex, 2) The data, when communicated, are vulnerable to
security and privacy issues, 3) The communication of the continuously collected
data is not only costly but also energy hungry, 4) Operating and maintaining
the sensors directly from the cloud servers are non-trivial tasks. This book
chapter defines Fog Computing in the context of medical IoT. Conceptually, Fog
Computing is a service-oriented intermediate layer in IoT, providing the
interfaces between the sensors and cloud servers for facilitating connectivity,
data transfer, and queryable local database. The centerpiece of Fog computing
is a low-power, intelligent, wireless, embedded computing node that carries out
signal conditioning and data analytics on raw data collected from wearables or
other medical sensors and offers efficient means to serve telehealth
interventions. We implemented and tested a fog computing system using the
Intel Edison and Raspberry Pi that allows acquisition, computing, storage and
communication of the various medical data such as pathological speech data of
individuals with speech disorders, Phonocardiogram (PCG) signal for heart rate
estimation, and Electrocardiogram (ECG)-based Q, R, S detection.
Comment: 29 pages, 30 figures, 5 tables. Keywords: Big Data, Body Area
Network, Body Sensor Network, Edge Computing, Fog Computing, Medical
Cyberphysical Systems, Medical Internet-of-Things, Telecare, Tele-treatment,
Wearable Devices, Chapter in Handbook of Large-Scale Distributed Computing in
Smart Healthcare (2017), Springe
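The fog-node idea above (analyze raw sensor data locally, then forward only a compact result) can be illustrated with a minimal sketch. It assumes a simple threshold-based peak picker for heart-rate estimation; this is a stand-in chosen for brevity, not the detection method used in the chapter, and the names `detect_peaks`, `heart_rate_bpm`, and `summarize_for_cloud` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def detect_peaks(samples: Sequence[float], threshold: float) -> List[int]:
    """Return indices of local maxima that exceed `threshold`.

    Assumption: a plain threshold rule stands in for a real PCG/ECG detector.
    """
    peaks = []
    for i in range(1, len(samples) - 1):
        is_local_max = samples[i] > samples[i - 1] and samples[i] >= samples[i + 1]
        if samples[i] > threshold and is_local_max:
            peaks.append(i)
    return peaks


def heart_rate_bpm(samples: Sequence[float], sample_rate_hz: float,
                   threshold: float) -> Optional[float]:
    """Estimate heart rate from the mean interval between detected peaks."""
    peaks = detect_peaks(samples, threshold)
    if len(peaks) < 2:
        return None  # not enough beats to measure an interval
    intervals = [(b - a) / sample_rate_hz for a, b in zip(peaks, peaks[1:])]
    mean_interval = sum(intervals) / len(intervals)
    return 60.0 / mean_interval


def summarize_for_cloud(samples: Sequence[float], sample_rate_hz: float,
                        threshold: float) -> Dict[str, object]:
    """Build the small record a fog node might upload instead of raw samples."""
    return {
        "n_samples": len(samples),
        "heart_rate_bpm": heart_rate_bpm(samples, sample_rate_hz, threshold),
    }


# Synthetic signal: one spike per second at a 100 Hz sampling rate.
signal = [0.0] * 500
for idx in range(50, 500, 100):
    signal[idx] = 1.0
summary = summarize_for_cloud(signal, sample_rate_hz=100.0, threshold=0.5)
```

In this sketch the node uploads a two-field record rather than 500 raw samples, which is the kind of reduction in communicated data that the abstract motivates.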
FedTherapist: Mental Health Monitoring with User-Generated Linguistic Expressions on Smartphones via Federated Learning
Psychiatrists diagnose mental disorders through patients' use of language.
Still, due to data privacy, existing passive mental health monitoring systems
use alternative features such as activity, app usage, and location via mobile
devices. We propose FedTherapist, a mobile mental health monitoring system that
utilizes continuous speech and keyboard input in a privacy-preserving way via
federated learning. We explore multiple model designs by comparing their
performance and overhead for FedTherapist to overcome the complex nature of
on-device language model training on smartphones. We further propose a
Context-Aware Language Learning (CALL) methodology to effectively utilize
smartphones' large and noisy text for mental health signal sensing. Our
IRB-approved evaluation of the prediction of self-reported depression, stress,
anxiety, and mood from 46 participants shows higher accuracy of FedTherapist
compared with the performance with non-language features, achieving 0.15 AUROC
improvement and 8.21% MAE reduction.
Comment: Accepted to the 2023 Conference on Empirical Methods in Natural
Language Processing (EMNLP 2023
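As a rough illustration of the federated learning setting described above, the following sketch shows one aggregation round in which a server combines model parameters trained on each phone without receiving any raw text. It assumes sample-count-weighted averaging in the style of FedAvg; the abstract does not state which aggregation rule FedTherapist uses, and `federated_average` is a hypothetical helper.

```python
from typing import List, Sequence


def federated_average(client_weights: Sequence[Sequence[float]],
                      client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local sample counts.

    Assumption: FedAvg-style aggregation; only parameters leave the device.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(client_weights[0])
    averaged = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share one parameter shape")
        for i, w in enumerate(weights):
            averaged[i] += w * size / total
    return averaged


# Two phones: the second holds three times as much local data.
global_weights = federated_average([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Weighting by sample count lets phones with more local data contribute more to the shared model, while the text itself stays on the device.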
Opportunistic and Context-aware Affect Sensing on Smartphones: The Concept, Challenges and Opportunities
Opportunistic affect sensing offers unprecedented potential for capturing
spontaneous affect ubiquitously, obviating biases inherent in the laboratory
setting. Facial expression and voice are two major affective displays; however,
most affect sensing systems on smartphones avoid them because of their high power
requirements. Encouragingly, with the recent advent of low-power DSP (Digital
Signal Processing) co-processor and GPU (Graphics Processing Unit) technology,
audio and video sensing are becoming more feasible. To properly evaluate
opportunistically captured facial expression and voice, contextual information
about the dynamic audio-visual stimuli needs to be inferred. This paper
discusses recent advances of affect sensing on the smartphone and identifies
the key barriers and potential solutions of implementing opportunistic and
context-aware affect sensing on smartphone platforms
Prioritizing Content of Interest in Multimedia Data Compression
Image and video compression techniques make data transmission and storage in digital multimedia systems more efficient and feasible for the system's limited storage and bandwidth. Many generic image and video compression techniques such as JPEG and H.264/AVC have been standardized and are now widely adopted. Despite their great success, we observe that these standard compression techniques are not the best solution for data compression in special types of multimedia systems such as microscopy videos and low-power wireless broadcast systems. In these application-specific systems where the content of interest in the multimedia data is known and well-defined, we should re-think the design of a data compression pipeline. We hypothesize that by identifying and prioritizing multimedia data's content of interest, new compression methods can be invented that are far more effective than standard techniques. In this dissertation, a set of new data compression methods based on the idea of prioritizing the content of interest has been proposed for three different kinds of multimedia systems. I will show that the key to designing efficient compression techniques in these three cases is to prioritize the content of interest in the data. The definition of the content of interest of multimedia data depends on the application. First, I show that for microscopy videos, the content of interest is defined as the spatial regions in the video frame whose pixels contain more than noise. Keeping data in those regions at high quality and discarding other information yields a novel microscopy video compression technique. Second, I show that for a Bluetooth low energy beacon based system, practical multimedia data storage and transmission is possible by prioritizing content of interest. I designed custom image compression techniques that preserve edges in a binary image, or foreground regions of a color image of indoor or outdoor objects.
Last, I present a new indoor Bluetooth low energy beacon based augmented reality system that integrates a 3D moving object compression method that prioritizes the content of interest.
Doctor of Philosoph
The Caltech CSN project collects sensor data from thousands of personal devices for realtime response to dangerous earthquakes
The proliferation of smartphones and other powerful sensor-equipped consumer devices enables a new class of Web application: community sense and response (CSR) systems, distinguished from standard Web applications by their use of community-owned commercial sensor hardware. Just as social networks connect and share human-generated content, CSR systems gather, share, and act on sensory data from users' Internet-enabled devices. Here, we discuss the Caltech Community Seismic Network (CSN) as a prototypical CSR system harnessing accelerometers in smartphones and consumer electronics, including the systems and algorithmic challenges of designing, building, and evaluating a scalable network for real-time awareness of dangerous earthquakes
Private Communication Detection via Side-Channel Attacks
Private communication detection (PCD) enables an ordinary network user to discover communication patterns (e.g., call time, length, frequency, and initiator) between two or more private parties. Analysis of communication patterns between private parties has historically been a powerful tool used by intelligence, military, law-enforcement and business organizations because it can reveal the strength of ties between these parties. Ordinary users are assumed to have neither eavesdropping capabilities (e.g., the network may employ strong anonymity measures) nor the legal authority (e.g., no ability to issue a warrant to network providers) to collect private-communication records. We show that PCD is possible by ordinary users merely by sending packets to various network end-nodes and analyzing the responses. Three approaches for PCD are proposed based on a new type of side channel caused by resource contention, and defenses are proposed. The Resource-Saturation PCD exploits the resource contention (e.g., a fixed-size buffer) by sending carefully designed packets and monitoring different responses. Its effectiveness has been demonstrated on three commercial closed-source VoIP phones. The Stochastic PCD shows that timing side channels in the form of probing responses, which are caused by distinct resource-contention responses when different applications run in end nodes, enable effective PCD despite network and proxy-generated noise (e.g., jitter, delays). It was applied to WiFi and Instant Messaging for resource contention in the radio channel and the keyboard, respectively. Similar analysis enables practical Sybil node detection. Finally, the Service-Priority PCD utilizes the fact that 3G/2G mobile communication systems give higher priority to voice service than data service. This allows detection of the busy status of smartphones, and then discovery of their call records by correlating the busy status.
This approach was successfully applied to iPhone and Android phones in AT&T's network. An additional, unanticipated finding was that an Internet user could disable a 2G phone's voice service by probing it with short enough intervals (e.g., 1 second). PCD defenses can be traditional side-channel countermeasures or PCD-specific ones, e.g., monitoring and blocking suspicious periodic network traffic
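The correlation step mentioned for the Service-Priority PCD can be pictured with a small analysis sketch. It assumes each phone's busy status has already been reduced to one boolean per probe slot, and it uses a Jaccard overlap plus a minimum run length as the correlation rule; both choices, and the names `busy_overlap_score` and `likely_call_slots`, are illustrative assumptions rather than the procedure from the thesis.

```python
from typing import List, Sequence, Tuple


def busy_overlap_score(busy_a: Sequence[int], busy_b: Sequence[int]) -> float:
    """Jaccard overlap of the slots in which each phone was observed busy."""
    if len(busy_a) != len(busy_b):
        raise ValueError("timelines must cover the same probe slots")
    both = sum(1 for a, b in zip(busy_a, busy_b) if a and b)
    either = sum(1 for a, b in zip(busy_a, busy_b) if a or b)
    return both / either if either else 0.0


def likely_call_slots(busy_a: Sequence[int], busy_b: Sequence[int],
                      min_run: int = 2) -> List[Tuple[int, int]]:
    """Return [start, end) slot ranges where both phones stay busy together."""
    if len(busy_a) != len(busy_b):
        raise ValueError("timelines must cover the same probe slots")
    runs: List[Tuple[int, int]] = []
    start = None
    for i, (a, b) in enumerate(zip(busy_a, busy_b)):
        if a and b:
            if start is None:
                start = i  # a shared busy run begins here
        else:
            if start is not None and i - start >= min_run:
                runs.append((start, i))
            start = None
    # Close a run that extends to the end of the timeline.
    if start is not None and len(busy_a) - start >= min_run:
        runs.append((start, len(busy_a)))
    return runs


# Hypothetical busy timelines, one entry per probe slot.
phone_a = [0, 1, 1, 1, 0, 0, 1, 0]
phone_b = [0, 1, 1, 1, 0, 1, 0, 0]
score = busy_overlap_score(phone_a, phone_b)
shared = likely_call_slots(phone_a, phone_b)
```

The same view also suggests why the defenses named in the abstract target the probing itself: without periodic probes there are no busy timelines to correlate.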