52 research outputs found
First experiments on an HMM based double layer framework for automatic continuous speech recognition
The usual approach to automatic continuous speech recognition is what can be called the acoustic-phonetic modelling approach. In this approach, voice is considered to hold two different kinds of information: acoustic and phonetic. Acoustic information is represented by some kind of feature extraction from the voice signal, and phonetic information is extracted from the vocabulary of the task by means of a lexicon or some other procedure. The main assumption in this approach is that models can be constructed that capture the correlation existing between both kinds of information.
The main limitation of acoustic-phonetic modelling in speech recognition is its poor treatment of the variability present at both the phonetic level and the acoustic one. In this paper, we propose the use of a slightly modified framework where the usual acoustic-phonetic modelling is divided into two different layers: one closer to the voice signal, and the other closer to the phonetics of the sentence. By doing so we expect an improvement in modelling accuracy, as well as better management of acoustic and phonetic variability. Experiments carried out so far, using a very simplified version of the proposed framework, show a significant improvement in the recognition of a large vocabulary continuous speech task, and represent a promising starting point for future research. Peer Reviewed. Postprint (published version)
Multidialectal acoustic modeling: a comparative study
In this paper, multidialectal acoustic modeling based on sharing data across dialects is addressed. A comparative study of different methods of combining data based on decision tree clustering algorithms is presented. Approaches evolved differ in the way of evaluating the similarity of sounds between dialects, and the decision tree structure applied. Proposed systems are tested with Spanish dialects across Spain and Latin America. All multidialectal proposed systems improve monodialectal performance using data from another dialect but it is shown that the way to share data is critical. The best combination between similarity measure and tree structure achieves an improvement of 7% over the results obtained with monodialectal systems. Peer Reviewed. Postprint (published version)
Accurate Identification of ALK Positive Lung Carcinoma Patients: Novel FDA-Cleared Automated Fluorescence In Situ Hybridization Scanning System and Ultrasensitive Immunohistochemistry
Background: Based on the excellent results of the clinical trials with ALK inhibitors, the importance of accurately identifying ALK-positive lung cancer has never been greater. However, there is an increasing number of recent publications addressing discordances between FISH and IHC. The controversy is further fuelled by the different regulatory approvals. This situation prompted us to investigate two ALK IHC antibodies (using a novel ultrasensitive detection-amplification kit) and an automated ALK FISH scanning system (FDA-cleared) in a series of non-small cell lung cancer (NSCLC) tumor samples. Methods: Forty-seven ALK FISH-positive and 56 ALK FISH-negative NSCLC samples were studied. All specimens were screened for ALK expression by two IHC antibodies (clone 5A4 from Novocastra and clone D5F3 from Ventana) and for ALK rearrangement by FISH (Vysis ALK FISH break-apart kit), which was automatically captured and scored using Bioview's automated scanning system. Results: All cases positive with the IHC antibodies were FISH-positive. Only one case that was IHC-negative with both antibodies showed a FISH-positive result. The overall sensitivity and specificity of IHC in comparison with FISH were 98% and 100%, respectively. Conclusions: The specificity of these ultrasensitive IHC assays may obviate the need for FISH confirmation in positive IHC cases. However, the likelihood of false negative IHC results strengthens the case for FISH testing, at least in some situation
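The reported sensitivity and specificity can be checked against the counts given in the abstract. The sketch below is only an illustration: it assumes the two antibodies are treated as a single combined IHC call, that the one discordant case is the only false negative, and that there are no false positives (as the Results imply). The function and variable names are hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of reference-positive cases detected
    specificity = tn / (tn + fp)  # share of reference-negative cases cleared
    return sensitivity, specificity


# Counts taken from the abstract, with FISH as the reference test.
fish_positive = 47
fish_negative = 56
false_negatives = 1  # assumption: the single IHC-negative, FISH-positive case
false_positives = 0  # assumption: every IHC-positive case was FISH-positive

true_positives = fish_positive - false_negatives
true_negatives = fish_negative - false_positives

sens, spec = sensitivity_specificity(
    true_positives, false_negatives, true_negatives, false_positives
)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
# prints "sensitivity = 97.9%, specificity = 100.0%"
```

Under these assumptions, 46 of 47 FISH-positive cases are detected, which rounds to the 98% quoted in the abstract.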
Silenci a la ciutat
Student work from the Bachelor's Degree in Audiovisual Communication,
Faculty of Information and Audiovisual Media, Universitat de Barcelona,
course "De la idea a la pantalla" (From Idea to Screen).
Academic year: 2020-2021, Tutor: Josep Rovira. // Director: Marc Colomer Carbassé; Assistant director: Edgar Rodríguez Prados;
Producer: Edgar Rodríguez Prados; Production assistant: Thor Barroso Galeote; Screenwriter: Marc Colomer Carbassé;
Director of photography: Thor Barroso Galeote; Camera: Thor Barroso Galeote; Camera assistant: Edgar Rodríguez Prados;
Lighting: Edgar Rodríguez Prados; Art direction: Marc Colomer Carbassé; Sound direction: Edgar Rodríguez Prados;
Editing: Thor Barroso Galeote; Music: Marc Colomer Carbassé; Post-production: Thor Barroso Galeote.
Practically all of the images that appear are original recordings. The only images not recorded are those related to hospitals, PCR tests and other healthcare settings. These were taken from stock footage libraries.
At the start of the report, from 00:14 to 00:39, a fragment of the song "The End" by The Doors is used, which is protected by copyright.
With the arrival of COVID-19, all live music events, small and large, were postponed or cancelled. In 2020, the city of Barcelona was due to host major concerts by stars such as Elton John and Paul McCartney, but also by emerging stars such as Nil Moliner and Khalid. Venues as iconic as the Palau Sant Jordi or the Estadi Olímpic Lluís Companys, used to hosting thousands of people, now stand empty, with their shutters down and even at risk of closing permanently. Just as these venues and their thousands of workers have been affected, so too have the thousands of consumers of this kind of event
Joint training of codebooks and acoustic models in automatic speech recognition using semi-continuous HMMs
In this paper, three different techniques for building semi-continuous HMM-based speech recognisers are compared: the classical one, using Euclidean generated codebooks and independently trained acoustic models; jointly reestimating the codebooks and models obtained with the classical method; and jointly creating codebooks and models growing their size from one centroid to the desired number of them. The way this growth may be done is carefully addressed, focusing on the selection of the splitting direction and the way splitting is implemented. Results in a large vocabulary task show the efficiency of the approach, with noticeable improvements both in accuracy and CPU consumption. Moreover, this scheme enables the use of the concatenation of features, avoiding the independence assumption usually needed in semi-continuous HMM modelling, and leading to further improvements in accuracy and CPU. Peer Reviewed. Postprint (published version)
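Growing a codebook from a single centroid can be pictured with the generic binary-splitting (LBG-style) procedure sketched below. This is not the paper's joint codebook-and-model training, and it does not reproduce the splitting direction the authors selected: here each centroid is simply perturbed by a fixed offset `epsilon` along every dimension and then refined with plain Euclidean Lloyd iterations. All names and parameters are hypothetical, and the target size is assumed to be a power of two.

```python
def nearest(point, codebook):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, codebook[k])),
    )


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(points) for column in zip(*points)]


def refine(data, codebook, iterations=10):
    """Lloyd iterations: assign each vector to a cell, then re-estimate centroids."""
    for _ in range(iterations):
        cells = [[] for _ in codebook]
        for x in data:
            cells[nearest(x, codebook)].append(x)
        # An empty cell keeps its previous centroid.
        codebook = [mean(cell) if cell else c for cell, c in zip(cells, codebook)]
    return codebook


def grow_codebook(data, target_size, epsilon=0.01):
    """Double the codebook by splitting every centroid until target_size is reached."""
    codebook = [mean(data)]  # start from one centroid
    while len(codebook) < target_size:
        split = []
        for c in codebook:
            split.append([v + epsilon for v in c])
            split.append([v - epsilon for v in c])
        codebook = refine(data, split)
    return codebook


# Toy feature vectors forming two well-separated groups.
frames = [[0, 0], [0, 1], [1, 0], [1, 1], [10, 10], [10, 11], [11, 10], [11, 11]]
print(sorted(grow_codebook(frames, 2)))
# prints [[0.5, 0.5], [10.5, 10.5]]
```

The fixed-offset split is the simplest possible choice; the abstract indicates that the direction and implementation of the split matter, so a real system would replace that step.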
Multi-dialectal Spanish speech recognition
Spanish is a global language, spoken in a large number of countries with great dialectal variability. This paper deals with the suitability of using a single multi-dialectal acoustic model for all the Spanish variants spoken in Europe and Latin America. The objective is twofold. First, it allows all the available databases to be used to jointly train and improve the same system. Second, it allows a single system to be used for all Spanish speakers. The paper describes the rule-based phonetic transcription used for each dialectal variant, the selection of the shared and the specific phonemes to be modeled in a multi-dialectal recognition system, and the results of a multi-dialectal system dealing with dialects in and out of the training set. Peer Reviewed