Visually Indicated Sounds

Author: Adelson Edward H.
Freeman William T.
Isola Phillip
McDermott Josh
Owens Andrew
Torralba Antonio
Publication date: 29/04/2016

Objects make distinctive sounds when they are hit or scratched. These sounds reveal aspects of an object's material properties, as well as the actions that produced them. In this paper, we propose the task of predicting what sound an object makes when struck as a way of studying physical interactions within a visual scene. We present an algorithm that synthesizes sound from silent videos of people hitting and scratching objects with a drumstick. This algorithm uses a recurrent neural network to predict sound features from videos and then produces a waveform from these features with an example-based synthesis procedure. We show that the sounds predicted by our model are realistic enough to fool participants in a "real or fake" psychophysical experiment, and that they convey significant information about material properties and physical interactions

arXiv.org e-Print Archive

DSpace@MIT

Crossref

A speech envelope landmark for syllable encoding in human superior temporal gyrus.

Author: Chang Edward F
Oganian Yulia
Publication venue: eScholarship, University of California
Publication date: 01/11/2019

The most salient acoustic features in speech are the modulations in its intensity, captured by the amplitude envelope. Perceptually, the envelope is necessary for speech comprehension. Yet, the neural computations that represent the envelope and their linguistic implications are heavily debated. We used high-density intracranial recordings, while participants listened to speech, to determine how the envelope is represented in human speech cortical areas on the superior temporal gyrus (STG). We found that a well-defined zone in middle STG detects acoustic onset edges (local maxima in the envelope rate of change). Acoustic analyses demonstrated that timing of acoustic onset edges cues syllabic nucleus onsets, while their slope cues syllabic stress. Synthesized amplitude-modulated tone stimuli showed that steeper slopes elicited greater responses, confirming cortical encoding of amplitude change, not absolute amplitude. Overall, STG encoding of the timing and magnitude of acoustic onset edges underlies the perception of speech temporal structure
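
The abstract above defines acoustic onset edges as local maxima in the envelope's rate of change, with their timing and slope carrying the linguistic cues. As a rough illustration only, and not the authors' analysis pipeline, the sketch below finds such edges in a sampled envelope. The function name `onset_edges`, the forward-difference derivative, and the `min_slope` threshold are all assumptions made for this example.

```python
def onset_edges(envelope, dt=1.0, min_slope=0.0):
    """Return (time, slope) pairs at local maxima of the envelope's rate of change.

    Hypothetical helper for illustration: `envelope` is a list of amplitude
    samples spaced `dt` apart; only rising edges steeper than `min_slope` count.
    """
    # Rate of change between consecutive samples (forward difference).
    rate = [(envelope[i + 1] - envelope[i]) / dt for i in range(len(envelope) - 1)]

    edges = []
    for i in range(1, len(rate) - 1):
        is_rising = rate[i] > min_slope
        # Strictly above the previous value and not below the next one,
        # so a flat peak is reported once, at its first sample.
        is_peak = rate[i] > rate[i - 1] and rate[i] >= rate[i + 1]
        if is_rising and is_peak:
            # The time is the start of the interval whose slope is rate[i].
            edges.append((i * dt, rate[i]))
    return edges


# Two amplitude rises; the second one is steeper.
envelope = [0, 0, 1, 5, 9, 10, 8, 4, 2, 2, 3, 9, 10]
print(onset_edges(envelope))  # [(2.0, 4.0), (10.0, 6.0)]
```

In this toy envelope the second edge has the larger slope, which under the abstract's account would correspond to the more strongly stressed syllable.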

eScholarship - University of California

Ultra-high-frequency piecewise-linear chaos using delayed feedback loops

Author: Chua L. O.
Damien Rontani
Daniel J. Gauthier
Seth D. Cohen
Publication venue: AIP Publishing
Publication date: 14/08/2012

We report on an ultra-high-frequency (> 1 GHz), piecewise-linear chaotic system designed from low-cost, commercially available electronic components. The system is composed of two electronic time-delayed feedback loops: A primary analog loop with a variable gain that produces multi-mode oscillations centered around 2 GHz and a secondary loop that switches the variable gain between two different values by means of a digital-like signal. We demonstrate experimentally and numerically that such an approach allows for the simultaneous generation of analog and digital chaos, where the digital chaos can be used to partition the system's attractor, forming the foundation for a symbolic dynamics with potential applications in noise-resilient communications and radar

arXiv.org e-Print Archive

Crossref

Fog Computing in Medical Internet-of-Things: Architecture, Implementation, and Applications

Author: Abtahi Mohammadreza
Akbar Umer
Borthakur Debanjan
Constant Nicholas
Dubey Harishchandra
Mahler Leslie
Mankodiya Kunal
Monteiro Admir
Sun Yan
Yang Qing
Publication venue: Springer Science and Business Media LLC
Publication date: 24/06/2017

In the era when the market segment of Internet of Things (IoT) tops the chart in various business reports, it is apparently envisioned that the field of medicine expects to gain a large benefit from the explosion of wearables and internet-connected sensors that surround us to acquire and communicate unprecedented data on symptoms, medication, food intake, and daily-life activities impacting one's health and wellness. However, IoT-driven healthcare would have to overcome many barriers, such as: 1) There is an increasing demand for data storage on cloud servers where the analysis of the medical big data becomes increasingly complex, 2) The data, when communicated, are vulnerable to security and privacy issues, 3) The communication of the continuously collected data is not only costly but also energy hungry, 4) Operating and maintaining the sensors directly from the cloud servers are non-trivial tasks. This book chapter defines Fog Computing in the context of medical IoT. Conceptually, Fog Computing is a service-oriented intermediate layer in IoT, providing the interfaces between the sensors and cloud servers for facilitating connectivity, data transfer, and queryable local database. The centerpiece of Fog Computing is a low-power, intelligent, wireless, embedded computing node that carries out signal conditioning and data analytics on raw data collected from wearables or other medical sensors and offers efficient means to serve telehealth interventions. We implemented and tested a fog computing system using the Intel Edison and Raspberry Pi that allows acquisition, computing, storage and communication of the various medical data such as pathological speech data of individuals with speech disorders, Phonocardiogram (PCG) signal for heart rate estimation, and Electrocardiogram (ECG)-based Q, R, S detection. Comment: 29 pages, 30 figures, 5 tables.
Keywords: Big Data, Body Area Network, Body Sensor Network, Edge Computing, Fog Computing, Medical Cyberphysical Systems, Medical Internet-of-Things, Telecare, Tele-treatment, Wearable Devices, Chapter in Handbook of Large-Scale Distributed Computing in Smart Healthcare (2017), Springe

arXiv.org e-Print Archive

Crossref

Analysis of variations in diesel engine idle vibration

Author: Ajovalasit M
Giacomin J
Publication venue: SAGE Publications
Publication date: 01/01/2003

The variations in diesel engine idle vibration caused by fuels of different composition and their contributions to the variations in steering wheel vibrations were assessed. The time-varying covariance method (TV-AutoCov) and time-frequency continuous wavelet transform (CWT) techniques were used to obtain the cyclic and instantaneous characteristics of the vibration data acquired from two turbocharged four-cylinder, four-stroke diesel engine vehicles at idle under 12 different fuel conditions. The analysis revealed that TV-AutoCov analysis was the most effective for detecting changes in cycle-to-cycle combustion energy (22.61 per cent), whereas changes in the instantaneous values of the combustion peaks were best measured using the CWT method (2.47 per cent). On the other hand, both methods showed that diesel idle vibration was more affected by amplitude modulation (12.54 per cent) than frequency modulation (4.46 per cent). The results of this work suggest the use of amplitude modulated signals for studying the human subjective response to diesel idle vibration at the steering wheel in passenger cars

Archivio istituzionale della ricerca - Politecnico di Milano

Brunel University Research Archive

Spectral analysis of blood velocity in the human fetus

Author: Gallagher Francis J.
Publication venue: RIT Scholar Works
Publication date: 01/05/1995

RIT Scholar Works

Reconstructing Speech from Human Auditory Cortex

Direct brain recordings from neurosurgical patients listening to speech reveal that the acoustic speech signals can be reconstructed from neural activity in auditory cortex

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

FigShare

Time and Frequency Independent Manipulation of Audio in Real Time

Author: Yost Christian Albert
Publication venue: Bard Digital Commons
Publication date: 01/01/2018

Analog audio implies time-frequency dependence. With digitally sampled audio, this time-frequency dependence can be broken and either variable can be manipulated independently of the other, in real time. This paper will mostly focus on the frequency domain algorithm called the Phase Vocoder which breaks this time-frequency dependence. We will start by looking at Fourier Theory and the effect of discrete sampling. Then we will look at the Phase Vocoder's theory of operation, as well as improvements made by Puckette, Laroche, and Dolson, to name a few. Through all of this, simple examples will be presented in order to gain intuition into the principles at hand. Towards the end, a time domain approach for time-frequency independence called Granular Synthesis will be explored. We will compare it to the Phase Vocoder, and see how our understanding of one changes how we think and make decisions for the other. Finally we will propose some ideas for further improvement to real-time time-frequency independent manipulation of audio
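
To give a feel for the time-domain approach mentioned in the abstract above, here is a minimal sketch of granular time-stretching by overlap-add. It is not taken from the paper: the function name `granular_stretch`, the Hann window, the 50% output overlap, and the default grain size are assumptions chosen for illustration. Grains are read from the input at one hop size and written to the output at another, so duration changes while the content of each grain (and hence its pitch) is left alone.

```python
import math


def granular_stretch(signal, stretch, grain=64):
    """Time-stretch `signal` by overlap-adding Hann-windowed grains.

    Hypothetical helper for illustration. `stretch` > 1 lengthens the sound,
    `stretch` < 1 shortens it. Requires len(signal) >= grain and an even grain.
    """
    hop_out = grain // 2        # grains overlap by 50% in the output
    hop_in = hop_out / stretch  # the read position advances more slowly when stretch > 1

    # Periodic Hann window: copies shifted by grain/2 sum to exactly 1.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / grain) for n in range(grain)]

    n_grains = int((len(signal) - grain) / hop_in) + 1
    out = [0.0] * (hop_out * (n_grains - 1) + grain)

    for g in range(n_grains):
        start_in = int(g * hop_in)  # where this grain is read from
        start_out = g * hop_out     # where this grain is written to
        for n in range(grain):
            out[start_out + n] += window[n] * signal[start_in + n]
    return out


tone = [math.sin(2 * math.pi * n / 16) for n in range(256)]
print(len(granular_stretch(tone, 2.0)))  # 448
```

With `stretch=1.0` the read and write hops coincide and the interior of the output reproduces the input. For other values, neighbouring grains are no longer phase-aligned, which is one source of the artifacts that frequency-domain methods such as the Phase Vocoder try to address.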

Bard College

Automatic Drum Transcription and Source Separation

Author: Fitzgerald Derry
Publication venue: Dublin Institute of Technology
Publication date: 01/06/2004

While research has been carried out on automated polyphonic music transcription, to date the problem of automated polyphonic percussion transcription has not received the same degree of attention. A related problem is that of sound source separation, which attempts to separate a mixture signal into its constituent sources. This thesis focuses on the task of polyphonic percussion transcription and sound source separation of a limited set of drum instruments, namely the drums found in the standard rock/pop drum kit. As there was little previous research on polyphonic percussion transcription, a broad review of music information retrieval methods, including previous polyphonic percussion systems, was also carried out to determine if there were any methods which were of potential use in the area of polyphonic drum transcription. Following on from this, a review was conducted of general source separation and redundancy reduction techniques, such as Independent Component Analysis and Independent Subspace Analysis, as these techniques have shown potential in separating mixtures of sources. Upon completion of the review it was decided that a combination of the blind separation approach, Independent Subspace Analysis (ISA), with the use of prior knowledge as used in music information retrieval methods, was the best approach to tackling the problem of polyphonic percussion transcription as well as that of sound source separation. A number of new algorithms which combine the use of prior knowledge with the source separation abilities of techniques such as ISA are presented. These include sub-band ISA, Prior Subspace Analysis (PSA), and an automatic modelling and grouping technique which is used in conjunction with PSA to perform polyphonic percussion transcription. These approaches are demonstrated to be effective in the task of polyphonic percussion transcription, and PSA is also demonstrated to be capable of transcribing drums in the presence of pitched instruments

Arrow@TUDublin