Perceptually Motivated Wavelet Packet Transform for Bioacoustic Signal Enhancement

Author: Cohen I.
Deller J. R.
Fu Q.
Jidong Tao
Michael T. Johnson
Osiejuk T. S.
Seyfarth R. M.
Shao Y.
Yao Ren
Publication venue: e-Publications@Marquette
Publication date: 01/07/2008

A significant and often unavoidable problem in bioacoustic signal processing is the presence of background noise due to an adverse recording environment. This paper proposes a new bioacoustic signal enhancement technique which can be used on a wide range of species. The technique is based on a perceptually scaled wavelet packet decomposition using a species-specific Greenwood scale function. Spectral estimation techniques, similar to those used for human speech enhancement, are used for estimation of clean signal wavelet coefficients under an additive noise model. The new approach is compared to several other techniques, including basic bandpass filtering as well as classical speech enhancement methods such as spectral subtraction, Wiener filtering, and Ephraim–Malah filtering. Vocalizations recorded from several species are used for evaluation, including the ortolan bunting (Emberiza hortulana), rhesus monkey (Macaca mulatta), and humpback whale (Megaptera novaeangliae), with both additive white Gaussian noise and environment recording noise added across a range of signal-to-noise ratios (SNRs). Results, measured by both SNR and segmental SNR of the enhanced waveforms, indicate that the proposed method outperforms other approaches for a wide range of noise conditions.
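
The abstract above reports results as SNR and segmental SNR of the enhanced waveform. As a rough illustration of how these two measures are commonly computed, here is a minimal Python sketch. It is not the authors' evaluation code: the function names, the frame length, and the clamping range of -10 dB to 35 dB are assumptions chosen for illustration.

```python
import math


def snr_db(clean, enhanced):
    """Global SNR in dB: clean-signal energy over residual-error energy."""
    signal_energy = sum(c * c for c in clean)
    noise_energy = sum((c - e) ** 2 for c, e in zip(clean, enhanced))
    if noise_energy == 0:
        return math.inf  # enhanced signal matches the clean reference exactly
    return 10.0 * math.log10(signal_energy / noise_energy)


def segmental_snr_db(clean, enhanced, frame_len=256, floor_db=-10.0, ceil_db=35.0):
    """Mean of per-frame SNRs, each clamped to [floor_db, ceil_db].

    frame_len, floor_db and ceil_db are illustrative assumptions,
    not values taken from the paper.
    """
    frame_snrs = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        c = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        if sum(x * x for x in c) == 0:
            continue  # skip silent frames, where the ratio is undefined
        frame_snrs.append(min(max(snr_db(c, e), floor_db), ceil_db))
    if not frame_snrs:
        raise ValueError("no non-silent frames to evaluate")
    return sum(frame_snrs) / len(frame_snrs)


if __name__ == "__main__":
    clean = [1.0, -1.0, 1.0, -1.0]
    enhanced = [0.9, -0.9, 0.9, -0.9]
    print(snr_db(clean, enhanced))
    print(segmental_snr_db(clean, enhanced, frame_len=4))
```

Segmental SNR averages per-frame values, so a short burst of residual noise affects it less than it affects the global figure, which is one reason both measures are often reported together.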

epublications@Marquette

Crossref

A Novel Combined System of Direction Estimation and Sound Zooming of Multiple Speakers

Author: Khaddour H.
Rund F.
Schimmel J.
Publication venue: Brno University of Technology
Publication date: 01/06/2015

This article presents a new system for estimating the direction of multiple speakers and zooming the sound of one of them at a time. The proposed system is a combination of two levels, namely sound source direction estimation and acoustic zooming. The sound source direction estimation uses the so-called energetic analysis method to estimate the direction of multiple speakers, whereas the acoustic zooming is based on modifying the parameters of directional audio coding (DirAC) in order to zoom the sound of a selected speaker among the others. Both listening tests and objective assessments are performed to evaluate this system using different time-frequency transforms.

Directory of Open Access Journals

Digital library of Brno University of Technology

Scalable and perceptual audio compression

Author: Raad Mohammed
Publication venue: School of Electrical, Computer and Telecommunications Engineering
Publication date: 01/01/2003

This thesis deals with scalable perceptual audio compression. Two scalable perceptual solutions as well as a scalable-to-lossless solution are proposed and investigated. One of the scalable perceptual solutions is built around sinusoidal modelling of the audio signal whilst the other is built on a transform coding paradigm. The scalable coders are shown to scale both in a waveform-matching manner and in a psychoacoustic manner. In order to measure the psychoacoustic scalability of the systems investigated in this thesis, the original signal's psychoacoustic parameters are compared with those of the synthesized signal. The psychoacoustic parameters used are loudness, sharpness, tonality and roughness. This analysis technique is a novel method used in this thesis and it allows an insight into the perceptual distortion that has been introduced by any coder analyzed in this manner.

Research Online

A novel steganography approach for audio files

Author: Abdulrazzaq Sazeen T
Rodrigues Marcos
Siddeq Mohammed M
Publication venue: Springer Science and Business Media LLC
Publication date: 24/03/2020

We present a novel robust and secure steganography technique to hide images in audio files, aiming at increasing the carrier medium capacity. The audio files are in the standard WAV format and hiding is based on the LSB algorithm, while images are compressed by the GMPR technique, which is based on the Discrete Cosine Transform (DCT) and a high-frequency minimization encoding algorithm. The method involves compression-encryption of an image file by the GMPR technique followed by hiding it in audio data by appropriate bit substitution. The maximum number of bits without significant effect on the audio signal for LSB audio steganography is 6 LSBs. The encrypted image bits are hidden in variable and multiple LSB layers in the proposed method. Experimental results from listening tests show that there is no significant difference between the stego audio reconstructed from the novel technique and the original signal. A performance evaluation has been carried out according to the quality measurement criteria of Signal-to-Noise Ratio (SNR) and Peak Signal-to-Noise Ratio (PSNR).
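
The bit substitution step mentioned in the abstract above can be illustrated with a minimal Python sketch of plain fixed-depth LSB embedding. This is not the authors' method: it omits GMPR compression-encryption and the variable, multi-layer scheme, and the function names and the choice of unsigned integer samples are assumptions made for illustration. The cap of 6 bits per sample follows the limit stated in the abstract.

```python
def embed_bits(samples, payload_bits, n_lsb=1):
    """Write payload bits into the n_lsb lowest bits of each sample.

    samples are assumed to be non-negative integers (e.g. PCM values
    offset to an unsigned range); this is a hypothetical helper.
    """
    if not 1 <= n_lsb <= 6:
        raise ValueError("n_lsb must be between 1 and 6")
    needed = -(-len(payload_bits) // n_lsb)  # ceiling division
    if needed > len(samples):
        raise ValueError("payload does not fit in the carrier")
    mask = (1 << n_lsb) - 1
    stego = list(samples)
    for idx in range(needed):
        chunk = payload_bits[idx * n_lsb:(idx + 1) * n_lsb]
        chunk = chunk + [0] * (n_lsb - len(chunk))  # zero-pad the last chunk
        value = 0
        for bit in chunk:
            value = (value << 1) | bit
        stego[idx] = (stego[idx] & ~mask) | value  # replace the low bits
    return stego


def extract_bits(stego, n_bits, n_lsb=1):
    """Read back the first n_bits payload bits from the low bits of stego."""
    mask = (1 << n_lsb) - 1
    bits = []
    for sample in stego:
        value = sample & mask
        for shift in range(n_lsb - 1, -1, -1):
            bits.append((value >> shift) & 1)
        if len(bits) >= n_bits:
            break
    return bits[:n_bits]


if __name__ == "__main__":
    carrier = [1000, 1001, 1002, 1003]
    secret = [1, 0, 1, 1, 0]
    hidden = embed_bits(carrier, secret, n_lsb=2)
    print(hidden)
    print(extract_bits(hidden, len(secret), n_lsb=2))
```

With `n_lsb` low-order bits replaced, each sample changes by at most `2**n_lsb - 1`, which is why deeper embedding trades audio fidelity for capacity.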

Sheffield Hallam University Research Archive

BigEAR: Inferring the Ambient and Emotional Correlates from Smartphone-based Acoustic Big Data

Author: Dubey Harishchandra
Mankodiya Kunal
Mehl Matthias R.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2016

This paper presents a novel BigEAR big data framework that employs a psychological audio processing chain (PAPC) to process smartphone-based acoustic big data collected when the user performs social conversations in naturalistic scenarios. The overarching goal of BigEAR is to identify moods of the wearer from various activities such as laughing, singing, crying, arguing, and sighing. These annotations are based on ground truth relevant for psychologists who intend to monitor/infer the social context of individuals coping with breast cancer. We pursued a case study on couples coping with breast cancer to learn how the conversations affect emotional and social well-being. In the state-of-the-art methods, psychologists and their team have to listen to the audio recordings to make these inferences by subjective evaluations that are not only time-consuming and costly, but also demand manual data coding for thousands of audio files. The BigEAR framework automates the audio analysis. We computed the accuracy of BigEAR with respect to the ground truth obtained from a human rater. Our approach yielded an overall average accuracy of 88.76% on real-world data from couples coping with breast cancer.
Comment: 6 pages, 10 equations, 1 Table, 5 Figures, IEEE International Workshop on Big Data Analytics for Smart and Connected Health 2016, June 27, 2016, Washington DC, US

arXiv.org e-Print Archive

DigitalCommons@URI

Fog Computing in Medical Internet-of-Things: Architecture, Implementation, and Applications

Author: Abtahi Mohammadreza
Akbar Umer
Borthakur Debanjan
Constant Nicholas
Dubey Harishchandra
Mahler Leslie
Mankodiya Kunal
Monteiro Admir
Sun Yan
Yang Qing
Publication venue: Springer Science and Business Media LLC
Publication date: 24/06/2017

In the era when the market segment of Internet of Things (IoT) tops the chart in various business reports, it is envisioned that the field of medicine expects to gain a large benefit from the explosion of wearables and internet-connected sensors that surround us to acquire and communicate unprecedented data on symptoms, medication, food intake, and daily-life activities impacting one's health and wellness. However, IoT-driven healthcare would have to overcome many barriers, such as: 1) There is an increasing demand for data storage on cloud servers where the analysis of the medical big data becomes increasingly complex, 2) The data, when communicated, are vulnerable to security and privacy issues, 3) The communication of the continuously collected data is not only costly but also energy hungry, 4) Operating and maintaining the sensors directly from the cloud servers are non-trivial tasks. This book chapter defines Fog Computing in the context of medical IoT. Conceptually, Fog Computing is a service-oriented intermediate layer in IoT, providing the interfaces between the sensors and cloud servers for facilitating connectivity, data transfer, and a queryable local database. The centerpiece of Fog Computing is a low-power, intelligent, wireless, embedded computing node that carries out signal conditioning and data analytics on raw data collected from wearables or other medical sensors and offers efficient means to serve telehealth interventions. We implemented and tested a fog computing system using the Intel Edison and Raspberry Pi that allows acquisition, computing, storage and communication of various medical data such as pathological speech data of individuals with speech disorders, Phonocardiogram (PCG) signals for heart rate estimation, and Electrocardiogram (ECG)-based Q, R, S detection.
Comment: 29 pages, 30 figures, 5 tables.
Keywords: Big Data, Body Area Network, Body Sensor Network, Edge Computing, Fog Computing, Medical Cyberphysical Systems, Medical Internet-of-Things, Telecare, Tele-treatment, Wearable Devices, Chapter in Handbook of Large-Scale Distributed Computing in Smart Healthcare (2017), Springe

arXiv.org e-Print Archive

Crossref