    Fog Computing in Medical Internet-of-Things: Architecture, Implementation, and Applications

In an era when the Internet of Things (IoT) market segment tops the chart in various business reports, the field of medicine stands to gain enormously from the explosion of wearables and internet-connected sensors that surround us, acquiring and communicating unprecedented data on symptoms, medication, food intake, and the daily-life activities that impact one's health and wellness. However, IoT-driven healthcare must overcome several barriers: 1) there is an increasing demand for data storage on cloud servers, where the analysis of medical big data becomes increasingly complex; 2) the data, when communicated, are vulnerable to security and privacy issues; 3) communicating the continuously collected data is not only costly but also energy hungry; 4) operating and maintaining the sensors directly from the cloud servers is a non-trivial task. This book chapter defines Fog Computing in the context of medical IoT. Conceptually, Fog Computing is a service-oriented intermediate layer in IoT, providing interfaces between the sensors and cloud servers to facilitate connectivity, data transfer, and a queryable local database. The centerpiece of Fog Computing is a low-power, intelligent, wireless, embedded computing node that carries out signal conditioning and data analytics on raw data collected from wearables or other medical sensors, and offers an efficient means to serve telehealth interventions. We implemented and tested a fog computing system using the Intel Edison and Raspberry Pi that allows acquisition, computing, storage, and communication of various medical data, such as pathological speech data of individuals with speech disorders, phonocardiogram (PCG) signals for heart rate estimation, and electrocardiogram (ECG)-based Q, R, S detection.
Comment: 29 pages, 30 figures, 5 tables. Keywords: Big Data, Body Area Network, Body Sensor Network, Edge Computing, Fog Computing, Medical Cyberphysical Systems, Medical Internet-of-Things, Telecare, Tele-treatment, Wearable Devices. Chapter in Handbook of Large-Scale Distributed Computing in Smart Healthcare (2017), Springer.
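As a flavour of the on-node data analytics described above, here is a minimal sketch of ECG R-peak detection in Python, of the sort a fog node could run locally before forwarding summary statistics to the cloud. It is illustrative only: the chapter's actual implementation is not shown, and the filter band, thresholds, and synthetic test signal are assumptions.

```python
# Minimal sketch of ECG R-peak detection such as a fog node might run locally.
# Illustrative example, not the chapter's implementation: the band-pass range,
# sampling rate, and peak-picking thresholds are assumptions.
import numpy as np
from scipy.signal import butter, filtfilt, find_peaks

def detect_r_peaks(ecg, fs=360):
    """Return sample indices of R peaks in a raw single-lead ECG."""
    # Band-pass 5-15 Hz to emphasise the QRS complex and suppress drift/noise.
    b, a = butter(2, [5 / (fs / 2), 15 / (fs / 2)], btype="band")
    filtered = filtfilt(b, a, ecg)
    # Square the signal so deflections of either polarity become positive maxima.
    energy = filtered ** 2
    # Enforce a ~200 ms refractory period between detected beats.
    peaks, _ = find_peaks(energy,
                          height=0.3 * energy.max(),
                          distance=int(0.2 * fs))
    return peaks

if __name__ == "__main__":
    # Synthetic test record: impulses at 1.2 Hz, i.e. ~72 bpm.
    fs = 360
    ecg = np.zeros(10 * fs)
    ecg[::int(fs / 1.2)] = 1.0
    peaks = detect_r_peaks(ecg, fs)
    # Heart rate from the median inter-beat interval.
    bpm = 60 * fs / np.median(np.diff(peaks))
    print(f"Estimated heart rate: {bpm:.0f} bpm")
```

Running the detection on the node and transmitting only beat timestamps or heart-rate summaries, rather than the raw waveform, is one way such a system addresses the communication-cost and energy barriers listed above.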

    Non-linear analysis of cello pitch and timbre

Thesis (M.S.), Massachusetts Institute of Technology, Dept. of Architecture, 1992. Includes bibliographical references (leaves 62-65). By Andrew Choon-ki Hong.

    Portfolio of Compositions with Commentaries

This portfolio analyses the creative means by which a number of audio and visual compositions were realised. It attempts to dissect the influential factors in the creation of such pieces and to explore the technological processes involved in their creation. It is a personal analysis of a body of work that represents a hybrid of influences spanning several years. It is supported by three DVDs, which contain the audio and visual material and the software files used in the composition and performance of these works.

    Automatic annotation of musical audio for interactive applications

PhD thesis. As machines become more and more portable and part of our everyday life, it becomes apparent that developing interactive and ubiquitous systems is an important aspect of new music applications created by the research community. We are interested in developing a robust layer for the automatic annotation of audio signals, to be used in various applications, from music search engines to interactive installations, and in various contexts, from embedded devices to audio content servers. We propose adaptations of existing signal processing techniques to a real-time context. Amongst these annotation techniques, we concentrate on low- and mid-level tasks such as onset detection, pitch tracking, tempo extraction and note modelling. We present a framework to extract these annotations and evaluate the performance of different algorithms. The first task is to detect onsets and offsets in audio streams within short latencies. The segmentation of audio streams into temporal objects enables various manipulations and analyses of metrical structure. Evaluations of different algorithms and their adaptation to real time are described. We then tackle the problem of fundamental frequency estimation, again trying to reduce both the delay and the computational cost. Different algorithms are implemented for real time and tested on monophonic recordings and complex signals. Spectral analysis can be used to label the temporal segments; the estimation of higher-level descriptions is approached. Techniques for the modelling of note objects and the localisation of beats are implemented and discussed. Applications of our framework include live and interactive music installations and, more generally, tools for composers and sound engineers. Speed optimisations may bring a significant improvement to various automated tasks, such as automatic classification and recommendation systems. We describe the design of our software solution, for our research purposes and in view of its integration within other systems.
Supported by the EU-FP6-IST-507142 project SIMAC (Semantic Interaction with Music Audio Contents) and EPSRC grants GR/R54620 and GR/S75802/01.
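As one concrete illustration of the kind of low-level annotation the thesis evaluates, here is a minimal spectral-flux onset detector in Python/NumPy. This sketches a standard textbook approach, not the thesis's own implementation; the frame size, hop size, and median-based threshold are assumptions.

```python
# Minimal spectral-flux onset detector: onsets show up as frames where the
# magnitude spectrum gains energy relative to the previous frame.
# Frame/hop sizes and the threshold rule are illustrative assumptions.
import numpy as np

def onset_times(x, sr, frame=1024, hop=512):
    """Return onset times (seconds) for a mono signal x sampled at sr."""
    window = np.hanning(frame)
    n_frames = 1 + (len(x) - frame) // hop
    flux = np.zeros(n_frames)
    prev_mag = np.zeros(frame // 2 + 1)
    for i in range(n_frames):
        seg = x[i * hop:i * hop + frame] * window
        mag = np.abs(np.fft.rfft(seg))
        # Spectral flux: sum of positive magnitude increases between frames.
        flux[i] = np.sum(np.maximum(mag - prev_mag, 0.0))
        prev_mag = mag
    # Simple peak picking: local maxima above an adaptive (median) threshold.
    thresh = 1.5 * np.median(flux)
    onsets = [i for i in range(1, n_frames - 1)
              if flux[i] > thresh
              and flux[i] >= flux[i - 1] and flux[i] > flux[i + 1]]
    return [i * hop / sr for i in onsets]
```

A real-time variant would compute the flux frame by frame as samples arrive and pick peaks with only a short lookahead, which is exactly the latency constraint the abstract emphasises.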

    Proceedings of the Linux Audio Conference 2018

These proceedings contain all papers presented at the Linux Audio Conference 2018. The conference took place at c-base, Berlin, from June 7th to 10th, 2018, and was organized in cooperation with the Electronic Music Studio at TU Berlin.

    Perceptual strategies in active and passive hearing of neotropical bats

Basic spectral and temporal sound properties, such as frequency content and timing, are evaluated by the auditory system to build an internal representation of the external world and to generate auditory-guided behaviour. Using echolocating bats as a model system, I investigated aspects of spectral and temporal processing during echolocation and in relation to passive listening, as well as echo-acoustic object recognition for navigation.

In the first project (chapter 2), spectral processing during passive and active hearing was compared in the echolocating bat Phyllostomus discolor. Sounds are used ubiquitously for many vital behaviours, such as communication, predator and prey detection, or echolocation. The frequency content of a sound is one major carrier of the transmitted information, but it is distorted while travelling from the sound source to the receiver. To determine the frequency content of an acoustic signal correctly, the receiver needs to compensate for these distortions. We first investigated whether P. discolor compensates for distortions of the spectral shape of transmitted sounds during passive listening. Bats were trained to discriminate lowpass-filtered from highpass-filtered acoustic impulses while hearing a continuous white-noise background with a flat spectral shape. We then assessed their spontaneous classification of acoustic impulses with varying spectral content depending on the background's spectral shape (flat or lowpass filtered). A lowpass-filtered noise background increased the proportion of highpass classifications of the same filtered impulses, compared to a white-noise background. Like humans, the bats thus compensated for the background's spectral shape. In an active-acoustic version of the same experiment, the bats had to classify filtered playbacks of their emitted echolocation calls instead of passively presented impulses. During echolocation, the classification of the filtered echoes was independent of the spectral shape of the passively presented background noise. Likewise, call structure did not change to compensate for the background's spectral shape. Hence, auditory processing differs between passive and active hearing, with echolocation representing an independent mode with its own rules of auditory spectral analysis.

The second project (chapter 3) was concerned with the accurate measurement of the time of occurrence of auditory signals and, in echolocation, of distance. In addition, the importance of passive listening relative to echolocation turned out to be an unexpected factor in this study. To measure the distance to objects, called ranging, bats measure the time delay between an outgoing call and its returning echo. Ranging accuracy has received considerable interest in echolocation research for several reasons: (i) behaviourally, it is important for the bat's ability to locate objects and navigate its surroundings; (ii) physiologically, the neuronal implementation of precise measurements of very short time intervals is a challenge; and (iii) the conjectured echo-acoustic receiver of bats is of interest for signal processing. Here, I trained the nectarivorous bat Glossophaga soricina to detect a jittering real target and found a biologically plausible distance accuracy of 4–7 mm, corresponding to a temporal accuracy of 20–40 μs. However, the bats presumably did not use the jittering echo delay as the first and most prominent cue, but relied on passive acoustic listening first, which could only be prevented by the playback of masking noise. This shows that even a non-gleaning bat relies heavily on passive acoustic cues and that measuring short time intervals is difficult. This result calls into question other studies reporting sub-microsecond jitter thresholds.

The third project (chapter 4) linked the perception of echo-acoustic stimuli to the appropriate behavioural reactions, namely evasive flight manoeuvres around virtual objects presented in the flight paths of wild, untrained bats. Echolocating bats are able to orient in complete darkness solely by analysing the echoes of their emitted calls. They detect, recognize and classify objects based on the spectro-temporal reflection pattern received at the two ears. Auditory object analysis, however, is inevitably more complicated than visual object analysis, because the one-dimensional acoustic time signal transmits only range information, i.e., the object's distance and its longitudinal extent. All other object dimensions, like width and height, have to be inferred from comparative analysis of the signals at both ears and over time. The purpose of this study was to measure perceived object dimensions in wild, experimentally naïve bats by video-recording and analysing the bats' evasive flight manoeuvres in response to the presentation of virtual echo-acoustic objects with independently manipulated acoustic parameters. Flight manoeuvres were analysed by extracting the flight paths of all passing bats. As a control for our method, we also recorded the flight paths of bats in response to a real object. Bats avoided the real object by flying around it. However, we did not find any flight-path changes in response to the presentation of several virtual objects. We assume that the missing spatial extent of the virtual echo-acoustic objects, due to playback from only one loudspeaker, was the main reason for the failure to evoke evasive flight manoeuvres. This study therefore emphasises for the first time the importance of the spatial dimension of virtual objects, which has up to now been neglected in virtual object presentations.
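The correspondence quoted in chapter 3 between distance accuracy and temporal accuracy follows directly from the two-way travel of the echo. A quick check, assuming a speed of sound in air of roughly 343 m/s (a standard value; the abstract does not state it):

```latex
% Echo ranging: a change \Delta d in target distance changes the
% round-trip echo delay by twice the one-way travel time.
\[
  \Delta t = \frac{2\,\Delta d}{c}, \qquad c \approx 343~\mathrm{m\,s^{-1}}
\]
\[
  \Delta d = 4~\mathrm{mm} \;\Rightarrow\; \Delta t \approx 23~\mu\mathrm{s},
  \qquad
  \Delta d = 7~\mathrm{mm} \;\Rightarrow\; \Delta t \approx 41~\mu\mathrm{s}
\]
% consistent with the reported 20--40 microsecond temporal accuracy.
```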

    Deep Learning Methods for Instrument Separation and Recognition

This thesis explores deep learning methods for timbral information processing in polyphonic music analysis. It encompasses two primary tasks, Music Source Separation (MSS) and Instrument Recognition, with a focus on applying domain knowledge and utilising dense arrangements of skip-connections in the frameworks in order to reduce the number of trainable parameters and create more efficient models. Musically motivated Convolutional Neural Network (CNN) architectures are introduced, emphasising kernels with vertical, square, and horizontal shapes. This design choice allows for the extraction of essential harmonic and percussive features, which enhances the discrimination of different instruments. Notably, this methodology proves valuable for Harmonic-Percussive Source Separation (HPSS) and instrument recognition tasks. A significant challenge in MSS is generalising to new instrument types and music styles. To address this, a versatile framework for adversarial unsupervised domain adaptation for source separation is proposed, which is particularly beneficial when labelled data for specific instruments is unavailable. The curation of the Tap & Fiddle dataset is another contribution of the research, offering mixed and isolated stem recordings of traditional Scandinavian fiddle tunes, along with foot-tapping accompaniments, fostering research in source separation and metrical expression analysis within these musical styles. Since our perception of timbre is affected in different ways by transient and stationary parts of sound, the research investigates the potential of Transient Stationary-Noise Decomposition (TSND) as a preprocessing step for frame-level recognition. A method that performs TSND of spectrograms and feeds the decomposed spectrograms to a neural classifier is proposed. Furthermore, this thesis introduces a novel deep learning-based approach for pitch streaming, treating the task as note-level instrument classification. This approach is modular, meaning that it can stream not only labelled ground-truth note events but also predicted note events to the corresponding instruments. The proposed pitch-streaming method therefore enables third-party multi-pitch estimation algorithms to perform multi-instrument Automatic Music Transcription (AMT).
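To make the kernel-shape idea concrete, here is a minimal sketch (assuming PyTorch) of a block with vertical, square, and horizontal convolution branches over a spectrogram. The channel counts and kernel sizes are illustrative guesses, not the thesis's configuration.

```python
# Sketch of "musically motivated" convolution branches over a
# (frequency x time) spectrogram: horizontal kernels span time and respond to
# sustained harmonic partials; vertical kernels span frequency and respond to
# broadband percussive transients; square kernels are generic. All sizes here
# are illustrative assumptions.
import torch
import torch.nn as nn

class MusicallyMotivatedBlock(nn.Module):
    def __init__(self, in_ch=1, out_ch=8):
        super().__init__()
        # Input shape: (batch, in_ch, freq_bins, time_frames)
        self.vertical = nn.Conv2d(in_ch, out_ch, kernel_size=(17, 1), padding=(8, 0))
        self.square = nn.Conv2d(in_ch, out_ch, kernel_size=(3, 3), padding=1)
        self.horizontal = nn.Conv2d(in_ch, out_ch, kernel_size=(1, 17), padding=(0, 8))
        self.act = nn.ReLU()

    def forward(self, x):
        # Concatenate the three views along the channel axis; a dense use of
        # skip-connections would additionally pass x itself forward.
        return torch.cat([self.act(self.vertical(x)),
                          self.act(self.square(x)),
                          self.act(self.horizontal(x))], dim=1)

# Example: an 80-bin mel spectrogram, 128 frames.
block = MusicallyMotivatedBlock()
y = block(torch.randn(1, 1, 80, 128))
print(y.shape)  # torch.Size([1, 24, 80, 128])
```

Because each branch is narrow in one dimension, the three branches together use far fewer parameters than a single large square kernel covering the same receptive field, which is in the spirit of the parameter-efficiency goal stated above.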

    RADIC Voice Authentication: Replay Attack Detection using Image Classification for Voice Authentication Systems

Systems like Google Home, Alexa, and Siri that use voice-based authentication to verify their users' identities are vulnerable to voice replay attacks. These attacks gain unauthorized access to voice-controlled devices or systems by replaying recordings of passphrases and voice commands, which shows the necessity of developing more resilient voice-based authentication systems that can detect them. This thesis implements a system that detects voice-based replay attacks by using deep learning and image classification of voice spectrograms to differentiate between live and recorded speech. Tests of this system indicate that the approach is a promising direction for detecting voice-based replay attacks.
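The core idea, treating replay detection as binary image classification over spectrograms, can be sketched as below (assuming PyTorch). The architecture and layer sizes are illustrative assumptions, not the thesis's actual model.

```python
# Sketch of the RADIC idea: classify voice spectrograms as live vs. replayed.
# Replayed audio has passed through an extra loudspeaker-microphone channel,
# leaving artefacts (band-limiting, added reverberation) visible in the
# spectrogram. Architecture below is an illustrative assumption.
import torch
import torch.nn as nn

class ReplayDetector(nn.Module):
    """Binary classifier: live (0) vs. replayed (1) speech spectrograms."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.AdaptiveAvgPool2d(1),   # global pooling: any input size works
        )
        self.classifier = nn.Linear(32, 2)

    def forward(self, spec):           # spec: (batch, 1, freq, time)
        h = self.features(spec).flatten(1)
        return self.classifier(h)      # logits over {live, replayed}

# Example forward pass on a batch of four 128x256 spectrogram "images".
model = ReplayDetector()
logits = model(torch.randn(4, 1, 128, 256))
print(logits.shape)  # torch.Size([4, 2])
```

In deployment, such a detector would sit in front of the speaker-verification step: a passphrase is only passed on for identity matching if it is first classified as live speech.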