Search CORE

2,186 research outputs found

Musical notes classification with Neuromorphic Auditory System using FPGA and a Convolutional Spiking Network

Author: Cerezuela Escudero Elena
Domínguez Morales Manuel Jesús
Jiménez Fernández Ángel Francisco
Jiménez Moreno Gabriel
Linares Barranco Alejandro
Paz Vicente Rafael
Publication venue: IEEE Computer Society
Publication date: 01/01/2015
Field of study

In this paper, we explore the capabilities of a sound classification system that combines both a novel FPGA cochlear model implementation and a bio-inspired technique based on a trained convolutional spiking network. The neuromorphic auditory system that is used in this work produces a form of representation that is analogous to the spike outputs of the biological cochlea. The auditory system has been developed using a set of spike-based processing building blocks in the frequency domain. They form a set of band pass filters in the spike-domain that splits the audio information in 128 frequency channels, 64 for each of two audio sources. Address Event Representation (AER) is used to communicate the auditory system with the convolutional spiking network. A layer of convolutional spiking network is developed and trained on a computer with the ability to detect two kinds of sound: artificial pure tones in the presence of white noise and electronic musical notes. After the training process, the presented system is able to distinguish the different sounds in real-time, even in the presence of white noise.Ministerio de Economía y Competitividad TEC2012-37868-C04-0

idUS. Depósito de Investigación Universidad de Sevilla

Recognition of Emotions using Energy Based Bimodal Information Fusion and Correlation

Author: Asawa Krishna
Manchanda Priyanka
Publication venue: 'Universidad Internacional de La Rioja'
Publication date: 05/02/2020
Field of study

Multi-sensor information fusion is a rapidly developing research area which forms the backbone of numerous essential technologies such as intelligent robotic control, sensor networks, video and image processing and many more. In this paper, we have developed a novel technique to analyze and correlate human emotions expressed in voice tone & facial expression. Audio and video streams captured to populate audio and video bimodal data sets to sense the expressed emotions in voice tone and facial expression respectively. An energy based mapping is being done to overcome the inherent heterogeneity of the recorded bi-modal signal. The fusion process uses sampled and mapped energy signal of both modalities’s data stream and further recognize the overall emotional component using Support Vector Machine (SVM) classifier with the accuracy 93.06%

Re-UNIR

Visual Voice Activity Detection in the Wild

Author: Iosifidis Alexandros
Nikolaidis Nikolaos
Patrona Foteini
Pitas Ioannis
Tefas Anastasios
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 26/02/2016
Field of study

Explore Bristol Research

RGB-D datasets using microsoft kinect or similar sensors: a survey

Author: Galili
Guan
Hu
Kolner
Mulvad
Nakazawa
Palushani
Palushani
Publication venue: Springer
Publication date: 01/01/2015
Field of study

RGB-D data has turned out to be a very useful representation of an indoor scene for solving fundamental computer vision problems. It takes the advantages of the color image that provides appearance information of an object and also the depth image that is immune to the variations in color, illumination, rotation angle and scale. With the invention of the low-cost Microsoft Kinect sensor, which was initially used for gaming and later became a popular device for computer vision, high quality RGB-D data can be acquired easily. In recent years, more and more RGB-D image/video datasets dedicated to various applications have become available, which are of great importance to benchmark the state-of-the-art. In this paper, we systematically survey popular RGB-D datasets for different applications including object recognition, scene classification, hand gesture recognition, 3D-simultaneous localization and mapping, and pose estimation. We provide the insights into the characteristics of each important dataset, and compare the popularity and the difficulty of those datasets. Overall, the main goal of this survey is to give a comprehensive description about the available RGB-D datasets and thus to guide researchers in the selection of suitable datasets for evaluating their algorithms

Northumbria Research Link

Crossref

Springer - Publisher Connector

Online Research Database In Technology