52 research outputs found

    First experiments on an HMM based double layer framework for automatic continuous speech recognition

    No full text
    The usual approach to automatic continuous speech recognition is what can be called the acoustic-phonetic modelling approach. In this approach, voice is considered to carry two different kinds of information: acoustic and phonetic. Acoustic information is represented by some kind of feature extraction from the voice signal, and phonetic information is extracted from the vocabulary of the task by means of a lexicon or some other procedure. The main assumption in this approach is that models can be constructed that capture the correlation existing between both kinds of information. The main limitation of acoustic-phonetic modelling in speech recognition is its poor treatment of the variability present at both the phonetic and the acoustic level. In this paper, we propose a slightly modified framework in which the usual acoustic-phonetic modelling is divided into two different layers: one closer to the voice signal, and the other closer to the phonetics of the sentence. By doing so we expect an improvement in modelling accuracy, as well as better management of acoustic and phonetic variability. Experiments carried out so far, using a very simplified version of the proposed framework, show a significant improvement in the recognition of a large-vocabulary continuous speech task, and represent a promising starting point for future research.

    Multidialectal acoustic modeling: a comparative study

    No full text
    In this paper, multidialectal acoustic modeling based on sharing data across dialects is addressed. A comparative study of different methods of combining data based on decision tree clustering algorithms is presented. The approaches differ in how the similarity of sounds between dialects is evaluated and in the decision tree structure applied. The proposed systems are tested with Spanish dialects from Spain and Latin America. All the proposed multidialectal systems improve on monodialectal performance by using data from another dialect, but it is shown that the way data is shared is critical. The best combination of similarity measure and tree structure achieves an improvement of 7% over the results obtained with monodialectal systems.
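The similarity test at the heart of such data sharing can be sketched as follows. This is a minimal illustration, not the paper's system: each phone model is reduced to a single 1-D Gaussian, the Bhattacharyya distance is used as one plausible similarity measure (the paper compares several measures and tree structures), and the threshold and example dialect models are invented for illustration.

```python
import math

def bhattacharyya(m1, v1, m2, v2):
    """Bhattacharyya distance between two 1-D Gaussians (mean, variance)."""
    return (0.25 * (m1 - m2) ** 2 / (v1 + v2)
            + 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2))))

def share_data(model_a, model_b, threshold=0.05):
    """Pool training data across dialects when the two models are close."""
    return bhattacharyya(*model_a, *model_b) < threshold

# Hypothetical /e/ models estimated from two dialects' data:
castilian_e = (1.20, 0.30)
mexican_e   = (1.25, 0.32)
print(share_data(castilian_e, mexican_e))  # True: close enough to pool
```

In a full system this decision would be taken at each node of a decision tree, so that data is shared only for those contexts where the dialects genuinely behave alike.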

    Accurate Identification of ALK Positive Lung Carcinoma Patients: Novel FDA-Cleared Automated Fluorescence In Situ Hybridization Scanning System and Ultrasensitive Immunohistochemistry

    Full text link
    Background: Based on the excellent results of the clinical trials with ALK inhibitors, the importance of accurately identifying ALK-positive lung cancer has never been greater. However, an increasing number of recent publications address discordances between FISH and IHC. The controversy is further fuelled by the different regulatory approvals. This situation prompted us to investigate two ALK IHC antibodies (using a novel ultrasensitive detection-amplification kit) and an automated, FDA-cleared ALK FISH scanning system in a series of non-small cell lung cancer tumor samples. Methods: Forty-seven ALK FISH-positive and 56 ALK FISH-negative NSCLC samples were studied. All specimens were screened for ALK expression by two IHC antibodies (clone 5A4 from Novocastra and clone D5F3 from Ventana) and for ALK rearrangement by FISH (Vysis ALK FISH break-apart kit), which was automatically captured and scored using BioView's automated scanning system. Results: All cases positive with the IHC antibodies were FISH-positive. There was only one case negative with both antibodies that showed a FISH-positive result. The overall sensitivity and specificity of IHC in comparison with FISH were 98% and 100%, respectively. Conclusions: The specificity of these ultrasensitive IHC assays may obviate the need for FISH confirmation in IHC-positive cases. However, the likelihood of false negative IHC results strengthens the case for FISH testing, at least in some situations.
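The reported 98%/100% figures follow directly from the counts in the abstract. A quick check, assuming (as the abstract states) that the single IHC-negative/FISH-positive case is the only discordance:

```python
# Counts from the abstract: 47 FISH-positive and 56 FISH-negative samples,
# with FISH taken as the reference standard for the IHC assays.
fish_positive = 47
fish_negative = 56
false_negatives = 1   # the one IHC-negative case that was FISH-positive
false_positives = 0   # all IHC-positive cases were FISH-positive

true_positives = fish_positive - false_negatives   # 46
true_negatives = fish_negative - false_positives   # 56

sensitivity = true_positives / fish_positive
specificity = true_negatives / fish_negative

print(f"sensitivity = {sensitivity:.0%}")  # 98%
print(f"specificity = {specificity:.0%}")  # 100%
```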

    Silenci a la ciutat

    No full text
    Work by students of the Degree in Audiovisual Communication, Faculty of Information and Audiovisual Media, Universitat de Barcelona, "De la idea a la pantalla". Academic year: 2020-2021. Tutor: Josep Rovira. // Director: Marc Colomer Carbassé; Assistant director: Edgar Rodríguez Prados; Producer: Edgar Rodríguez Prados; Production assistant: Thor Barroso Galeote; Screenwriter: Marc Colomer Carbassé; Director of photography: Thor Barroso Galeote; Camera: Thor Barroso Galeote; Camera assistant: Edgar Rodríguez Prados; Lighting: Edgar Rodríguez Prados; Art direction: Marc Colomer Carbassé; Sound direction: Edgar Rodríguez Prados; Editing: Thor Barroso Galeote; Music: Marc Colomer Carbassé; Post-production: Thor Barroso Galeote. Almost all of the images that appear were shot by the team; the only footage not shot by them is that related to hospitals, PCR tests and other healthcare settings, taken from stock-image libraries. At the start of the report, from 00:14 to 00:39, a fragment of The Doors' song "The End" is used, which is protected by copyright. With the arrival of COVID-19, all live-music events, large and small, were postponed or cancelled. In 2020 the city of Barcelona was due to host major concerts by stars such as Elton John and Paul McCartney, as well as by emerging artists such as Nil Moliner and Khalid. Venues as iconic as the Palau Sant Jordi and the Estadi Olímpic Lluís Companys, used to hosting thousands of people, now stand empty, with their shutters down and even at risk of having to close for good. At the same time as these venues and their thousands of workers have been affected, so have the thousands of consumers of this kind of event.

    Joint training of codebooks and acoustic models in automatic speech recognition using semi-continuous HMMs

    No full text
    In this paper, three different techniques for building semi-continuous HMM based speech recognisers are compared: the classical one, using Euclidean-generated codebooks and independently trained acoustic models; jointly re-estimating the codebooks and models obtained with the classical method; and jointly creating codebooks and models, growing their size from one centroid to the desired number. The way this growth may be done is carefully addressed, focusing on the selection of the splitting direction and the way splitting is implemented. Results on a large-vocabulary task show the efficiency of the approach, with noticeable improvements in both accuracy and CPU consumption. Moreover, this scheme enables the use of concatenated features, avoiding the independence assumption usually needed in semi-continuous HMM modelling and leading to further improvements in accuracy and CPU usage.
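Growing a codebook from a single centroid can be sketched in the spirit of the classical LBG algorithm. This is a rough, self-contained illustration only: the paper's actual scheme grows codebooks jointly with the acoustic models and studies the splitting direction carefully, whereas this sketch uses a fixed perturbation and plain k-means refinement on toy data.

```python
def nearest(codebook, x):
    """Index of the centroid closest to x (squared Euclidean distance)."""
    return min(range(len(codebook)),
               key=lambda k: sum((a - b) ** 2 for a, b in zip(codebook[k], x)))

def grow_codebook(data, target_size, n_iter=10, eps=1e-3):
    dim = len(data[0])
    # Start from the global mean of the training vectors.
    mean = [sum(x[d] for x in data) / len(data) for d in range(dim)]
    codebook = [mean]
    while len(codebook) < target_size:
        # Split each centroid into a slightly perturbed +/- pair.
        codebook = ([[c + eps for c in cent] for cent in codebook]
                    + [[c - eps for c in cent] for cent in codebook])
        for _ in range(n_iter):  # refine with Lloyd/k-means iterations
            cells = {k: [] for k in range(len(codebook))}
            for x in data:
                cells[nearest(codebook, x)].append(x)
            for k, members in cells.items():
                if members:  # keep the old centroid if a cell is empty
                    codebook[k] = [sum(m[d] for m in members) / len(members)
                                   for d in range(dim)]
    return codebook[:target_size]

# Toy 2-D "feature vectors" standing in for acoustic features:
feats = [[float(i % 7), float(i % 3)] for i in range(60)]
cb = grow_codebook(feats, target_size=4)
print(len(cb), len(cb[0]))  # 4 2
```

Doubling and refining in this way lets the codebook and (in the paper's joint scheme) the models adapt at every size, rather than fixing the codebook once and training the models on top of it.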

    Multi-dialectal Spanish speech recognition

    No full text
    Spanish is a global language, spoken in a large number of countries with considerable dialectal variability. This paper deals with the suitability of using a single multi-dialectal acoustic modeling for all the Spanish variants spoken in Europe and Latin America. The objective is twofold. First, it allows all the available databases to be used to jointly train and improve the same system. Second, it allows a single system to be used for all Spanish speakers. The paper describes the rule-based phonetic transcription used for each dialectal variant, the selection of the shared and the dialect-specific phonemes to be modeled in a multi-dialectal recognition system, and the results of a multi-dialectal system dealing with dialects both in and out of the training set.
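Dialect-conditioned rule-based transcription of the kind the abstract mentions can be illustrated with a toy example. This is not the paper's rule set: it covers only the well-known seseo contrast (Castilian keeps a distinct /θ/ for "z" and "c" before "e"/"i", while most Latin American varieties merge it with /s/) plus a couple of shared rules, and "T" is an ad-hoc ASCII stand-in for /θ/.

```python
import re

def transcribe(word, dialect):
    """Tiny dialect-aware grapheme-to-phoneme sketch for Spanish."""
    th = "T" if dialect == "castilian" else "s"   # "T" stands in for /θ/
    word = word.lower()
    word = re.sub(r"c(?=[ei])", th, word)          # "ce"/"ci" onset
    word = word.replace("z", th)
    word = word.replace("qu", "k").replace("ll", "y")  # shared rules
    return word

print(transcribe("cielo", "castilian"))  # Tielo  (~ /θjelo/)
print(transcribe("cielo", "mexican"))    # sielo  (~ /sjelo/)
```

In a multi-dialectal system, phonemes produced identically by every variant would share one acoustic model, while merged-versus-distinct cases like this one motivate the shared/specific phoneme selection the paper describes.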
