
    Automatic voice disorder detection using self-supervised representations

    Many speech features and models, including Deep Neural Networks (DNN), are used for classification tasks between healthy and pathological speech with the Saarbruecken Voice Database (SVD). However, accuracy values of 80.71% for phrases or 82.8% for vowels /aiu/ are the highest reported for audio samples in SVD when the evaluation includes the wide range of pathologies in the database, instead of a selection of some pathologies. This paper targets this top performance in state-of-the-art Automatic Voice Disorder Detection (AVDD) systems. Within the framework of a DNN-based AVDD system, we study the capability of Self-Supervised (SS) representation learning to describe discriminative cues between healthy and pathological speech. The system processes the SS temporal sequence of features with a single feed-forward layer and a Class-Token (CT) Transformer to classify speech as healthy or pathological. Furthermore, a suitable extension of the training set with out-of-domain data is evaluated to deal with the low availability of data for using DNN-based models in voice pathology detection. Experimental results using audio samples corresponding to phrases in the SVD dataset, including all pathologies available, show classification accuracy values of up to 93.36%. This means that the proposed AVDD system achieved accuracy improvements of 4.1% without the training data extension, and 15.62% after the training data extension, compared to the baseline system. Beyond the novelty of using SS representations for AVDD, obtaining accuracies over 90% in these conditions while using the whole set of pathologies in the SVD is a milestone for voice disorder-related research. Furthermore, the study of how the amount of in-domain data in the training set relates to system performance provides guidance for the data preparation stage. Lessons learned in this work suggest guidelines for taking advantage of DNN to boost performance in developing automatic systems for diagnosis, treatment, and monitoring of voice pathologies.

    Mobile television: the state of the art (La televisión móvil: el estado de la cuestión)

    This article reviews the state of the art of research on mobile television. Although mobile television is currently an over-researched topic, and that research has certainly shed some light, we still know little about it. In particular, we do not know whether mobile television is more than a small handheld device, whether it could open up a new market for content specifically tailored to it, to what extent it could unleash new socio-cultural habits around the experience of watching television on the move, or how it could help promote citizenship among its users. These questions would merit a research programme.

    Character-Based Handwritten Text Recognition of Multilingual Documents

    An effective approach to transcribing handwritten text documents is to follow a sequential interactive approach. During the supervision phase, user corrections are incorporated into the system through an ongoing retraining process. In the case of multilingual documents with a high percentage of out-of-vocabulary (OOV) words, two principal issues arise. On the one hand, a minor yet important matter for this interactive approach is to identify the language of the current text line image to be transcribed, as a language-dependent recogniser typically performs better than a monolingual recogniser. On the other hand, word-based language models suffer from data scarcity in the presence of a large number of OOV words, which degrades their estimation and affects the performance of the transcription system. In this paper, we successfully tackle both issues by deploying character-based language models combined with language identification techniques on an entire 764-page multilingual document. The results obtained significantly reduce the transcription error previously reported on the same task, but show that a language-dependent approach is not effective on top of character-based recognition of similar languages.
    The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 287755. Also supported by the Spanish Government (MIPRCV "Consolider Ingenio 2010", iTrans2 TIN2009-14511, MITTRAL TIN2009-14633-C03-01 and FPU AP2007-0286) and the Generalitat Valenciana (Prometeo/2009/014).
    Del Agua Teba, MA.; Serrano Martinez Santos, N.; Civera Saiz, J.; Juan Císcar, A. (2012). Character-Based Handwritten Text Recognition of Multilingual Documents. Communications in Computer and Information Science. 328:187-196. https://doi.org/10.1007/978-3-642-35292-8_20
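    The abstract above argues that character-based language models cope better with out-of-vocabulary words than word-based ones. A minimal sketch of that idea follows, assuming a character bigram model with add-one smoothing; the toy training words, the function names and the smoothing choice are illustrative assumptions and are not taken from the paper, whose actual models are not described here.

```python
import math
from collections import Counter

BOS, EOS = "<s>", "</s>"  # word-boundary markers (illustrative choice)


def train_char_bigram(words):
    """Count character bigrams, including word boundaries, over the training words."""
    bigrams, contexts, alphabet = Counter(), Counter(), {EOS}
    for word in words:
        symbols = [BOS] + list(word) + [EOS]
        alphabet.update(word)
        for prev, cur in zip(symbols, symbols[1:]):
            bigrams[(prev, cur)] += 1
            contexts[prev] += 1
    return bigrams, contexts, alphabet


def word_log_prob(word, model):
    """Add-one smoothed log-probability of a word under the character bigram model."""
    bigrams, contexts, alphabet = model
    v = len(alphabet)  # symbols that can be predicted: seen characters plus end-of-word
    symbols = [BOS] + list(word) + [EOS]
    return sum(
        math.log((bigrams[(prev, cur)] + 1) / (contexts[prev] + v))
        for prev, cur in zip(symbols, symbols[1:])
    )


# Hypothetical toy vocabulary; "cama" never appears in it.
training_words = ["casa", "cosa", "mesa", "masa"]
model = train_char_bigram(training_words)
word_counts = Counter(training_words)

print("word-level count of 'cama':", word_counts["cama"])
print("character-level log-prob of 'cama':", word_log_prob("cama", model))
```

    A word-level model has no evidence at all for the unseen word, whereas the character-level model still assigns it a finite score built from character transitions it has seen. Characters that never occur in training would need a more careful treatment than this sketch provides.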

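    The quality assurance unit (UGC) in the new official postgraduate studies in Europe: first experiences in a master's programme at the University of Córdoba, Spain (La unidad de garantía de calidad (UGC) en los nuevos estudios oficiales de postgrado en Europa: primeras experiencias en un máster en la Universidad de Córdoba, España)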

    The aim of this work is to present the provisional results obtained by the new Quality Assurance Units (Unidades de Garantía de Calidad, UGC) set up within European postgraduate studies (master's degrees) based on the Bologna Declaration of 1999. This declaration, which lays the foundations of the European Higher Education Area (EHEA), seeks to unify the adaptation and evaluation of all undergraduate and postgraduate university studies in the European Union. These units, which become fully operational in the 2010/2011 academic year, are responsible for ensuring compliance with the quality standards set out in the various study programmes across European universities. We present the experience gained in the first year of work of this UGC in the Master in Animal Medicine, Health and Improvement (Máster de Medicina, Sanidad y Mejora Animal) of the Faculty of Veterinary Medicine of the University of Córdoba, Spain. The evaluation processes for each of the groups that make up the structure of this master's programme are detailed, together with the actions proposed to improve its quality.
    Thematic axis 3: Evaluation and accreditation processes; b - The evaluation of teaching practices, teachers and institutions. Facultad de Ciencias Agrarias y Forestales.

    Analysis of the concentration of emissions from the Spanish fleet of tugboats

    At present, society is increasingly sensitive to emissions in commercial maritime ports, as reflected in the large number of studies on emission control in them, perhaps because the most important commercial ports are located in cities with high population density. The objective of this work was to determine the greenhouse gas emissions caused by the activity of the Spanish tugboat fleet, studying the tugboat fleets of the eleven coastal autonomous communities of Spain from 2004 to 2017 and their impact on the carbon footprint of the country's shipping sector. To do this, the methodology used by the International Maritime Organization for merchant ships was formalized to estimate the emissions of a tugboat fleet, and the Gini concentration index methodology was applied to the concentration of emissions from this fleet. This made it possible to obtain results on the distribution of the concentration of emissions from Spanish ports by region, age, and size, as well as to establish the profile of the tugboat port that pollutes the most and its carbon footprint. One of the results is that, in the period analyzed, the concentration of emissions from the Spanish tugboat fleet increased when viewed by region, and decreased when viewed by age and size. This is because tugboat activity differed widely between regions, whereas the age and size characteristics of the tugboats evolved more homogeneously.
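    As an illustration of the concentration measure mentioned in this abstract, the sketch below computes a Gini index from a list of per-group emission totals. It uses the standard sorted-rank formula; the function name and the example figures are hypothetical and are not data or code from the study, whose exact formulation is not given here.

```python
def gini(values):
    """Gini concentration index of non-negative values.

    Returns 0 when every group has the same share and approaches 1 as the
    total concentrates in a single group.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total <= 0:
        raise ValueError("need at least one value and a positive total")
    if xs[0] < 0:
        raise ValueError("values must be non-negative")
    # G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i = 1..n over sorted x
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical emission totals for four regions (arbitrary units).
evenly_spread = [25.0, 25.0, 25.0, 25.0]
concentrated = [0.0, 0.0, 0.0, 100.0]

print(gini(evenly_spread))
print(gini(concentrated))
```

    Computing the index separately for groupings by region, by age class and by size class would allow the kind of comparison the abstract describes, where concentration can rise under one grouping and fall under another.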

    Estimating the risk of exposure to suspended particles in the Toluca Valley (Estimación del riesgo a la exposición de partículas suspendidas en el Valle de Toluca)

    PM10 particles suspended in the atmosphere affect the human respiratory system and also pose a potential carcinogenic risk owing to the large amount of hydrocarbons burned in the atmosphere. Exposure to suspended PM10 particles in the Toluca Valley and its surroundings was estimated using 8 years of data provided by the Red Automática Monitoreo de la Zona Metropolitana del Valle de Toluca (rama t), the automatic monitoring network of the Toluca Valley metropolitan area. The risk estimate was based on the Gumbel type 1 extreme value distribution, using different return periods and the probability of occurrence over time intervals of 1, 5, 10, 12.5, 15, 17.5 and 20 years. A high degree of health risk was statistically inferred from the magnitude of the mean concentration of these particles, and it is predicted that, if current conditions persist, this statistical relationship will remain unchanged over the next 20 years.
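    The abstract does not give its formulas, so the following is only a sketch of how a Gumbel type 1 analysis with return periods is commonly set up. It assumes method-of-moments parameter estimates and the textbook relations for return level and probability of occurrence; the function names and the annual maxima are hypothetical and are not the study's data.

```python
import math
import statistics

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def fit_gumbel_moments(annual_maxima):
    """Method-of-moments estimates (location mu, scale beta) of a Gumbel distribution."""
    mean = statistics.fmean(annual_maxima)
    stdev = statistics.stdev(annual_maxima)
    beta = stdev * math.sqrt(6) / math.pi
    mu = mean - EULER_GAMMA * beta
    return mu, beta


def return_level(mu, beta, period_years):
    """Level exceeded on average once every `period_years` years (period > 1)."""
    if period_years <= 1:
        raise ValueError("return period must be greater than 1 year")
    return mu - beta * math.log(-math.log(1.0 - 1.0 / period_years))


def prob_exceedance_within(period_years, interval_years):
    """Probability of at least one exceedance of the T-year level within the interval."""
    return 1.0 - (1.0 - 1.0 / period_years) ** interval_years


# Hypothetical annual maximum PM10 concentrations for 8 years (micrograms per cubic metre).
maxima = [182.0, 201.5, 175.3, 220.1, 198.7, 190.4, 210.9, 186.2]
mu, beta = fit_gumbel_moments(maxima)

for interval in (1, 5, 10, 12.5, 15, 17.5, 20):
    p = prob_exceedance_within(period_years=20, interval_years=interval)
    print(f"P(exceed 20-year level within {interval} years) = {p:.3f}")
print("20-year return level:", return_level(mu, beta, 20))
```

    Method of moments is used here only because it needs no external library; maximum-likelihood fitting is a common alternative and may give different parameter values on short records such as 8 years.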

    Language identification for interactive handwriting transcription of multilingual documents

    An effective approach to handwriting transcription of (old) documents is to follow a sequential, line-by-line transcription of the whole document, in which a continuously retrained system interacts with the user. In the case of multilingual documents, however, a minor yet important issue for this interactive approach is to first identify the language of the current text line image to be transcribed. In this paper, we propose a probabilistic framework and three techniques for this purpose. Empirical results are reported on an entire 764-page multilingual document for which previous empirical tests were limited to its first 180 pages, written only in Spanish. © 2011 Springer-Verlag.
    Work supported by the EC (FEDER, FSE), the Spanish Government (MICINN, MITyC, "Plan E", under grants MIPRCV "Consolider Ingenio 2010", MITTRAL TIN2009-14633-C03-01 and FPU AP2007-02867), the Generalitat Valenciana (grants Prometeo/2009/014 and ACOMP/2010/051) and the UPV (grant 20080033).
    Del Agua Teba, MA.; Serrano Martinez Santos, N.; Juan Císcar, A. (2011). Language identification for interactive handwriting transcription of multilingual documents. In Pattern Recognition and Image Analysis. Springer Verlag (Germany). 6669:596-603. https://doi.org/10.1007/978-3-642-21257-4_74

    The TransLectures-UPV Toolkit

    The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-13623-3_28
    Over the past few years, online multimedia educational repositories have increased in number and popularity. The main aim of the transLectures project is to develop cost-effective solutions for producing accurate transcriptions and translations for large video lecture repositories, such as VideoLectures.NET or the Universitat Politècnica de València's repository, poliMedia. In this paper, we present the transLectures-UPV toolkit (TLK), which has been specifically designed to meet the requirements of the transLectures project, but can also be used as a conventional ASR toolkit. The main features of the current release include HMM training and decoding with speaker adaptation techniques (fCMLLR). TLK has been tested on the VideoLectures.NET and poliMedia repositories, yielding very competitive results. TLK has been released under the permissive open source Apache License v2.0 and can be directly downloaded from the transLectures website.
    The research leading to these results has received funding from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 287755 (transLectures) and ICT Policy Support Programme (ICT PSP/2007-2013) as part of the Competitiveness and Innovation Framework Programme (CIP) under grant agreement no. 621030 (EMMA), and the Spanish MINECO Active2Trans (TIN2012-31723) research project.
    Del Agua Teba, MA.; Giménez Pastor, A.; Serrano Martinez Santos, N.; Andrés Ferrer, J.; Civera Saiz, J.; Sanchis Navarro, JA.; Juan Císcar, A. (2014). The TransLectures-UPV Toolkit. In Advances in Speech and Language Technologies for Iberian Languages: Second International Conference, IberSPEECH 2014, Las Palmas de Gran Canaria, Spain, November 19-21, 2014. Proceedings. Springer International Publishing. 269-278. https://doi.org/10.1007/978-3-319-13623-3_28

    Versatile Graphene-Based Platform for Robust Nanobiohybrid Interfaces

    Technologically useful and robust graphene-based interfaces for devices require the introduction of highly selective, stable, and covalently bonded functionalities on the graphene surface, whilst essentially retaining the electronic properties of the pristine layer. This work demonstrates that highly controlled, ultrahigh vacuum covalent chemical functionalization of graphene sheets with a thiol-terminated molecule provides a robust and tunable platform for the development of hybrid nanostructures in different environments. We employ this facile strategy to covalently couple two representative systems of broad interest: metal nanoparticles, via S-metal bonds, and thiol-modified DNA aptamers, via disulfide bridges. Both systems, which have been characterized by a multi-technique approach, remain firmly anchored to the graphene surface even after several washing cycles. Atomic force microscopy images demonstrate that the conjugated aptamer retains the functionality required to recognize a target protein. This methodology opens a new route to the integration of high-quality graphene layers into diverse technological platforms, including plasmonics, optoelectronics, or biosensing. With respect to the latter, the viability of a thiol-functionalized chemical vapor deposition graphene-based solution-gated field-effect transistor array was assessed

    Pancreatic cancer escape variants that evade immunogene therapy through loss of sensitivity to IFNgamma-induced apoptosis

    Combined injections into experimental tumor nodules of adenoviruses encoding IL-12 and certain chemokines are capable of inducing immune-mediated complete regressions. In this study, we found that the combination of two adenoviruses, one encoding IL-12 and the other MIP3alpha (AdCMVIL-12+AdCMVMIP3alpha), was very successful in treating CT-26-derived colon carcinomas. However, in experimental tumors generated from the pancreatic carcinoma cell line Panc02, such combined treatment induces macroscopic complete regressions in 50% of cases, although local relapses within 1 week occur almost invariably. We derived cell lines from such relapsing tumors and found that experimental malignancies derived from their inoculum were in no case amenable to treatment with AdCMVIL-12+AdCMVMIP-3alpha. Importantly, relapsing cell lines were insensitive to in vitro induction of apoptosis by IFNgamma, in clear contrast with the original Panc02 cells. Comparative analyses by cDNA arrays of relapsing cell lines versus wild-type Panc02 revealed a substantial number of genes (383) whose expression levels were modified more than two-fold. These changes, grouped in certain gene ontology categories, should harbor the mechanistic explanations of the acquired selective resistance to IFNgamma.