Investigation into the Perceptually Informed Data for Environmental Sound Recognition
Environmental sound is a rich source of information that can be used to infer context. With the rise of ubiquitous computing, demand for environmental sound recognition is growing rapidly. This research aims to recognize environmental sounds using perceptually informed data. The initial study concentrates on understanding the current state-of-the-art techniques in environmental sound recognition, which are then evaluated through a critical review of the literature. The study extracts three sets of features: Mel Frequency Cepstral Coefficients, Mel-spectrograms, and sound texture statistics. Two kinds of machine learning algorithms are paired with appropriate sound features. The models are compared with a low-level baseline model, and the performance of each model is also compared with that of human listeners as a high-level reference. The sound texture statistics model achieved the best classification, with 45.1% accuracy using a support vector machine with a radial basis function kernel. A Mel-spectrogram model based on a Convolutional Neural Network also provided satisfactory results, with predictive performance above the benchmark test.
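The radial basis function kernel named in the abstract scores the similarity of two feature vectors as exp(-gamma * ||x - y||^2). The following is a minimal sketch of that formula only; the function name `rbf_kernel`, the `gamma` value, and the toy vectors are assumptions for illustration and are not taken from the study.

```python
import math


def rbf_kernel(x, y, gamma=0.5):
    """Return exp(-gamma * ||x - y||^2) for two equal-length vectors.

    gamma is a hypothetical default; the abstract does not state the value used.
    """
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    # Squared Euclidean distance between the two feature vectors.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Toy feature vectors (made up): identical vectors give the maximum
# similarity of 1.0, and the value decays as the vectors move apart.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0])
apart = rbf_kernel([0.0, 0.0], [1.0, 1.0])
```

A support vector machine with this kernel compares each input to its support vectors through such similarity values, so a larger `gamma` makes the decision boundary more sensitive to nearby training examples.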
Teaching a Robotic Child - Machine Learning Strategies for a Humanoid Robot from Social Interactions
Visually Indicated Sounds
Objects make distinctive sounds when they are hit or scratched. These sounds
reveal aspects of an object's material properties, as well as the actions that
produced them. In this paper, we propose the task of predicting what sound an
object makes when struck as a way of studying physical interactions within a
visual scene. We present an algorithm that synthesizes sound from silent videos
of people hitting and scratching objects with a drumstick. This algorithm uses
a recurrent neural network to predict sound features from videos and then
produces a waveform from these features with an example-based synthesis
procedure. We show that the sounds predicted by our model are realistic enough
to fool participants in a "real or fake" psychophysical experiment, and that
they convey significant information about material properties and physical
interactions.
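One simple way to picture an example-based synthesis step is as a nearest-neighbor lookup: given predicted sound features, return the stored waveform whose features are closest. The sketch below assumes this reading; the abstract does not specify the procedure, and the function `nearest_example`, the `library` structure, and the toy data are hypothetical.

```python
def nearest_example(predicted, library):
    """Return the waveform paired with the feature vector closest to `predicted`.

    `library` is a hypothetical list of (features, waveform) pairs; the
    abstract does not describe the actual data layout or distance measure.
    """
    if not library:
        raise ValueError("library must contain at least one example")

    def sq_dist(u, v):
        # Squared Euclidean distance between two feature vectors.
        return sum((a - b) ** 2 for a, b in zip(u, v))

    _, waveform = min(library, key=lambda item: sq_dist(predicted, item[0]))
    return waveform


# Toy library (made up): two examples with 2-D features and short waveforms.
library = [
    ([0.0, 0.0], [0.1, 0.2]),
    ([1.0, 1.0], [0.9, 0.8]),
]
chosen = nearest_example([0.9, 1.1], library)
```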