1,841 research outputs found

An information theoretic characterisation of auditory encoding.

Author: Carlyon RP
Cusack R
Griffiths TD
Grube M
Kumar S
Overath T
von Kriegstein K
Warren JD
Publication date: 01/01/2007

The entropy metric derived from information theory provides a means to quantify the amount of information transmitted in acoustic streams like speech or music. By systematically varying the entropy of pitch sequences, we sought brain areas where neural activity and energetic demands increase as a function of entropy. Such a relationship is predicted to occur in an efficient encoding mechanism that uses less computational resource when less information is present in the signal: we specifically tested the hypothesis that such a relationship is present in the planum temporale (PT). In two convergent functional MRI studies, we demonstrated this relationship in PT for encoding, while furthermore showing that a distributed fronto-parietal network for retrieval of acoustic information is independent of entropy. The results establish PT as an efficient neural engine that demands less computational resource to encode redundant signals than those with high information content.

CiteSeerX

Directory of Open Access Journals

Reduced structural connectivity between left auditory thalamus and the motion-sensitive planum temporale in developmental dyslexia

Author: Blank Helen
Diaz Begona
Ruisinger Anja
Tschentscher Nadja
von Kriegstein Katharina
Publication date: 28/11/2018

Developmental dyslexia is characterized by the inability to acquire typical reading and writing skills. Dyslexia has been frequently linked to cerebral cortex alterations; however, recent evidence also points towards sensory thalamus dysfunctions: dyslexics showed reduced responses in the left auditory thalamus (medial geniculate body, MGB) during speech processing in contrast to neurotypical readers. In addition, in the visual modality, dyslexics have reduced structural connectivity between the left visual thalamus (lateral geniculate nucleus, LGN) and V5/MT, a cerebral cortex region involved in visual movement processing. Higher LGN-V5/MT connectivity in dyslexics was associated with faster rapid naming of letters and numbers (RANln), a measure that is highly correlated with reading proficiency. We here tested two hypotheses that were directly derived from these previous findings. First, we tested the hypothesis that dyslexics have reduced structural connectivity between the left MGB and the auditory motion-sensitive part of the left planum temporale (mPT). Second, we hypothesized that the amount of left mPT-MGB connectivity correlates with dyslexics' RANln scores. Using diffusion tensor imaging-based probabilistic tracking, we show that male adults with developmental dyslexia have reduced structural connectivity between the left MGB and the left mPT, confirming the first hypothesis. Stronger left mPT-MGB connectivity was associated with faster RANln scores in neurotypical readers, but not in dyslexics. Our findings provide the first evidence that reduced cortico-thalamic connectivity in the auditory modality is a feature of developmental dyslexia, and that it may also impact reading-related cognitive abilities in neurotypical readers.

arXiv.org e-Print Archive

Face Video Competition

Author: Alba Castro JL
Chan CH
Costen N
Fang H
Kittler J
Marcel S
Mc Cool C
Paredes Palacios Roberto
Pavesic N
Poh N
Rua EA
Salah AA
Struc V
Villegas Mauricio
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2009

The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-642-01793-3_73

Person recognition using facial features, e.g., mug-shot images, has long been used in identity documents. However, due to the widespread use of web-cams and mobile devices embedded with a camera, it is now possible to realise facial video recognition, rather than resorting to just still images. In fact, facial video recognition offers many advantages over still image recognition; these include the potential of boosting the system accuracy and deterring spoof attacks. This paper presents the first known benchmarking effort of person identity verification using facial video data. The evaluation involves 18 systems submitted by seven academic institutes.

The work of NPoh is supported by the advanced researcher fellowship PA0022121477 of the Swiss NSF; NPoh, CHC and JK by the EU-funded Mobio project grant IST-214324; NPC and HF by the EPSRC grants EP/D056942 and EP/D054818; VS and NP by the Slovenian national research program P2-0250(C) Metrology and Biometric System, the COST Action 2101 and FP7-217762 HIDE; and AAS by the Dutch BRICKS/BSIK project.

Poh, N.; Chan, C.; Kittler, J.; Marcel, S.; Mc Cool, C.; Rua, E.; Alba Castro, J.... (2009). Face Video Competition. In Advances in Biometrics: Third International Conference, ICB 2009, Alghero, Italy, June 2-5, 2009. Proceedings. 715-724. https://doi.org/10.1007/978-3-642-01793-3_73

RiuNet

An auditory classifier employing a wavelet neural network implemented in a digital design

Author: Hughes Jonathan
Publication venue: RIT Scholar Works
Publication date: 03/10/2006

This thesis addresses the problem of classifying audio as either voice or music. The goal was to solve this problem by means of a digital logic circuit capable of performing the classification in real time. Since digital audio is essentially a discrete non-periodic time series, it was necessary to extract features from the audio which are suitable for classification. The discrete wavelet transform combined with a feature extraction method was found to produce such features. The task of classifying these features was found to be best performed by an artificial neural network. Collectively known as a wavelet neural network, the digital logic design implementation of this architecture was effective in correctly identifying the test data sets. The wavelet neural network was first implemented as a software model, to develop the network architecture and parameters, and to determine ideal results. The unconstrained software simulation was capable of correctly classifying test data sets with greater than 90% accuracy. This model was not feasible as a digital logic design, however, as the size of the implementation would have been prohibitive. The size of the resulting hardware model was constrained by reducing the widths of the data paths and storage registers. The hardware implementation of the wavelet processor consisted of a novel pipelined design with a novel data-flow control structure. The neural network training was performed entirely in software by way of a novel training algorithm, and the resulting weights were made available for upload to the hardware model. The digital design of the wavelet neural network was modeled in VHDL and was synthesized with Synplicity Synplify, using Actel ProASICPlus APA600 synthesized library cells with a target clock frequency of 11.025 kHz, to match the sampling rate of the digital audio. The results of the synthesis indicated that the design could operate at 15.6 MHz and required 96,265 logic cells. The resulting constrained wavelet neural network processor was capable of correctly classifying test data sets with greater than 70% accuracy. Additional modeling showed that with a reasonable increase in hardware size, greater than 86% accuracy is attainable. This thesis focused on classifying audio as either voice or music, and future research could readily extend this work to the problems of speaker recognition and multimedia indexing.

RIT Scholar Works