3 research outputs found

    A flexible bio-inspired hierarchical model for analyzing musical timbre

    A flexible and multipurpose bio-inspired hierarchical model for analyzing musical timbre is presented in this paper. Inspired by findings in neuroscience, computational neuroscience, and psychoacoustics, the model not only extracts the spectral and temporal characteristics of a signal but also analyzes amplitude modulations on different timescales. It uses a cochlear filter bank to resolve the spectral components of a sound, lateral inhibition to enhance spectral resolution, and a modulation filter bank to extract the sound's global temporal envelope and roughness from its amplitude modulations. The model was evaluated in three applications. First, it was used to simulate subjective data from two roughness experiments. Second, it was used for musical instrument classification with the k-NN algorithm and a Bayesian network. Third, it was applied to find the features that characterize sounds whose timbres were labeled in an audiovisual experiment. The successful application of the proposed model in these diverse tasks demonstrates its potential for capturing timbral information.
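    The three stages the abstract names (cochlear filter bank, lateral inhibition, modulation filter bank) can be sketched in a few lines. This is not the paper's model: plain Butterworth band-pass filters stand in for a true cochlear filter bank, and every centre frequency, bandwidth, and modulation band below is an illustrative assumption.

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def cochlear_bank(x, fs, centres=(500.0, 1000.0, 2000.0), q=4.0):
    """Stage 1: resolve spectral components with a bank of band-pass filters."""
    out = []
    for fc in centres:
        bw = fc / q  # crude constant-Q bandwidth
        sos = butter(2, [fc - bw / 2, fc + bw / 2],
                     btype="band", fs=fs, output="sos")
        out.append(sosfilt(sos, x))
    return np.array(out)  # shape: (n_channels, n_samples)

def lateral_inhibition(env, strength=0.5):
    """Stage 2: sharpen spectral contrast by subtracting neighbouring channels."""
    neighbours = np.zeros_like(env)
    neighbours[1:] += env[:-1]
    neighbours[:-1] += env[1:]
    return np.maximum(env - strength * neighbours / 2.0, 0.0)

def modulation_bank(env, fs):
    """Stage 3: split the amplitude envelope into a slow 'global' component
    and a faster band commonly associated with roughness (~30-150 Hz)."""
    slow = sosfilt(butter(2, 20.0, btype="low", fs=fs, output="sos"),
                   env, axis=-1)
    rough = sosfilt(butter(2, [30.0, 150.0], btype="band", fs=fs, output="sos"),
                    env, axis=-1)
    return slow, rough

fs = 16000
t = np.arange(fs) / fs  # 1 second of audio
# 1 kHz tone amplitude-modulated at 70 Hz, i.e. inside the roughness band
am_tone = np.sin(2 * np.pi * 1000 * t) * (1 + 0.5 * np.sin(2 * np.pi * 70 * t))

channels = cochlear_bank(am_tone, fs)
env = lateral_inhibition(np.abs(hilbert(channels, axis=-1)))
global_env, roughness = modulation_bank(env, fs)
```

    Because the 70 Hz modulation falls inside the band-pass modulation channel, `roughness` carries far more energy for this tone than it would for an unmodulated sine, which is the kind of cue the roughness experiments exploit.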

    Implementation of an FPGA-based speech recognition system as a human-machine interface for robotics applications

    Speech recognition applications demand large amounts of resources and high processing speed, characteristics that are not always available in sequential, software-based processing systems such as computing systems built on conventional processors. For this reason, the scientific community has turned to devices whose parallel architecture makes them more efficient at processing this type of signal, such as systems based on digital signal processors (DSPs). One example is the application presented by Yu-Hung Kao (Yu-Hung & Rajasekaran, 2000), which develops a dynamic-vocabulary speech recognition system. In (Prevedello, Ledbetter, Farkas, & Khorasani, 2014), a study is presented that evaluates the impact of software speech recognition systems (SRS) on the turnaround times of radiology reports.

    Digital neuromorphic auditory systems

    This dissertation presents several digital neuromorphic auditory systems. Neuromorphic systems are capable of running in real-time at a smaller computing cost, and consume less power, than widely available general-purpose computers. These auditory systems are considered neuromorphic as they are modelled after computational models of the mammalian auditory pathway and are capable of running on digital hardware, more specifically on a field-programmable gate array (FPGA). The models introduced are categorised into three parts: a cochlear model, an auditory pitch model, and a functional primary auditory cortical (A1) model. The cochlear model is the primary interface for an input sound signal and transmits the 2D time-frequency representation of the sound to the pitch model as well as to the A1 model. In the pitch model, pitch information is extracted from the sound signal in the form of a fundamental frequency. From the A1 model, timbre information is extracted in the form of the time-frequency envelope of the sound signal. Since the computational auditory models mentioned above must be implemented on FPGAs, which possess fewer computational resources than general-purpose computers, the algorithms in the models are optimised so that they fit on a single FPGA. The optimisation includes using simplified hardware-implementable signal processing algorithms. Computational resource information for each model on the FPGA is extracted to understand the minimum computational resources required to run it, including the quantity of logic modules, the number of registers utilised, and the power consumption. Similarity comparisons are also made between the output responses of the computational auditory models on software and hardware, using pure tones, chirp signals, frequency-modulated signals, moving ripple signals, and musical signals as input.
    The limitation of the models' responses to musical signals at multiple intensity levels is also presented, along with the use of an automatic gain control algorithm to alleviate such limitations. With real-world musical signals as their inputs, the responses of the models are also tested using classifiers: the response of the auditory pitch model is used for the classification of monophonic musical notes, and the response of the A1 model is used for the classification of musical instruments from their respective monophonic signals. Classification accuracy results are shown for model output responses on both software and hardware. With the hardware-implementable auditory pitch model, the classification score stands at 100% accuracy for musical notes from the 4th and 5th octaves, covering 24 classes of notes. With the hardware-implementable auditory timbre model, the classification score is 92% accuracy for 12 classes of musical instruments. Also presented is the difference in memory requirements of the model output responses on software and hardware: the pitch and timbre responses used for the classification exercises occupy 24 and 2 times less memory space, respectively, on hardware than on software.
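    The abstract mentions an automatic gain control algorithm without specifying it. A common choice for keeping model responses stable across intensity levels is a feed-forward envelope-follower AGC; the sketch below is one such design under assumed parameter values (target level, attack and release times), none of which come from the dissertation.

```python
import numpy as np

def agc(x, fs, target=0.1, attack_ms=5.0, release_ms=50.0, eps=1e-8):
    """Feed-forward AGC: track the rectified signal with fast-attack /
    slow-release one-pole smoothing, then scale each sample so the
    tracked envelope sits near the target level."""
    attack = np.exp(-1.0 / (fs * attack_ms / 1000.0))
    release = np.exp(-1.0 / (fs * release_ms / 1000.0))
    env = 0.0
    y = np.empty_like(x, dtype=float)
    for n, a in enumerate(np.abs(x)):
        coeff = attack if a > env else release  # rise fast, fall slowly
        env = coeff * env + (1.0 - coeff) * a   # one-pole envelope follower
        y[n] = x[n] * (target / (env + eps))    # normalise toward target
    return y

fs = 16000
t = np.arange(fs) / fs
loud = 1.0 * np.sin(2 * np.pi * 440 * t)   # 0 dBFS-ish input
quiet = 0.01 * np.sin(2 * np.pi * 440 * t)  # 40 dB quieter input
```

    Once the follower settles, `agc(loud, fs)` and `agc(quiet, fs)` come out at roughly the same level, which is the behaviour needed before feeding signals of differing intensity into the auditory models.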