644 research outputs found

    Joint model-based recognition and localization of overlapped acoustic events using a set of distributed small microphone arrays

    In the analysis of acoustic scenes, the occurring sounds often have to be detected in time, recognized, and localized in space. Usually, each of these tasks is done separately. In this paper, a model-based approach to carry them out jointly for the case of multiple simultaneous sources is presented and tested. The recognized event classes and their respective room positions are obtained with a single system that maximizes the combination of a large set of scores, each one resulting from a different acoustic event model and a different beamformer output signal, which comes from one of several arbitrarily located small microphone arrays. Experimental work using a two-step method is reported for a specific scenario consisting of meeting-room acoustic events, either isolated or overlapped with speech. Tests carried out with two datasets show the advantage of the proposed approach over some usual techniques, and show that the inclusion of estimated priors brings a further performance improvement. Comment: Computational acoustic scene analysis, microphone array signal processing, acoustic event detection

    On parameter filtering in continuous subword-unit-based speech recognition

    Simple IIR or FIR filters have been widely used in isolated or connected word recognition tasks to filter the time sequence of speech spectral parameters, since, despite their simplicity, they significantly improve recognition performance. Those filters, when applied to continuous speech recognition, where phoneme-sized modelling units are used, induce spectral transition spreading and a cross-boundary effect. The authors show how the use of context-dependent units reduces the side effects of the filters and may result in improved recognition performance. When dynamic parameters are not used, filtering seems to be especially useful, even for clean speech, and when they are, filters do well under unmatched training and testing conditions. Peer Reviewed. Postprint (published version)

    Genomic determinants of chronic lymphocytic leukemia progression: from individual drivers to a heterogeneous genetic makeup

    Chronic lymphocytic leukemia (CLL) is the most common form of adult leukemia in Western countries. Although the disease might follow an indolent course, in a fraction of cases it progresses rapidly, becomes resistant to treatment, and eventually transforms into a more aggressive B-cell lymphoma, known as Richter syndrome. The mechanisms underlying these distinct clinical courses are not fully understood. In this Thesis, we aimed to elucidate the genomic determinants of CLL progression, to provide tools to characterize these tumors from next-generation sequencing data, and to extract key biological findings that could improve the management of the patients. In the first chapter (Studies 1 and 2), we characterized a noncoding mutation affecting the small nuclear RNA U1, a component of the spliceosome involved in 5' splice site recognition via base pairing. Mutations in this gene altered the splicing and expression of multiple genes, were found in CLL tumors lacking clinically relevant genomic alterations, and were independently associated with patients' outcome. In the next chapter (Studies 3, 4, and 5), we aimed to delve deeper into the subclonal architecture of CLL. We identified mutations present in small subpopulations associated with disease progression, recognized common evolutionary trajectories, and showed that the integration of the whole tumor architecture into prognostic models could improve the stratification of the patients. In the third chapter (Study 6), we analyzed the whole genome of CLL patients undergoing Richter syndrome and observed that this transformation was accompanied by increased mutational and genomic complexity. We identified a unifying mutational process that could orchestrate this genomic chaos. In the fourth chapter (Studies 7 and 8), we developed a bioinformatic algorithm aimed at reconstructing the immunoglobulin gene rearrangements in lymphoid neoplasms from whole-genome sequencing, which might facilitate the use of this methodology in future clinical practice. By applying this algorithm, we studied a recurrent mutation in the IGLV3-21 gene associated with an aggressive disease, with a strong influence on the current and future risk stratification of CLL patients. Altogether, this Thesis has contributed to understanding the genomic determinants of CLL progression through the analysis of its dynamic and heterogeneous genetic makeup.

    Frequency averaging: a useful multiwindow spectral analysis approach

    The multiwindow approach is a meaningful framework for nonparametric spectral estimation. It also encompasses several conventional methods, such as WOSA and the frequency-averaged periodogram. Recently, some authors claimed that the Slepian windows of Thomson's method and other related optimal sets of windows show better performance in terms of resolution, variance and leakage. In this paper, that claim is discussed by means of some simulation examples and by applying the various methods to speech recognition. In conclusion, frequency averaging of the periodogram is a computationally simple method that offers great flexibility for band specification and shows comparatively good performance. In fact, it is the spectral analysis technique most extensively employed for speech recognition. Peer Reviewed. Postprint (published version)

    Wavelet transforms for non-uniform speech recognition

    An algorithm for nonuniform speech segmentation and its application in speech recognition systems is presented. A method based on the Modulated Gaussian Wavelet Transform based Speech Analyser (MGWTSA) and the subsequent parametrization block is used to transform a uniform signal into a set of nonuniformly separated frames, with the accurate information being fed into a speech recognition system. The algorithm places frames characterizing the signal only where necessary, aiming to reduce the number of frames per signal as much as possible without an appreciable reduction in the recognition rate of the system. Peer Reviewed. Postprint (published version)

    Design of a phonetic corpus for speech recognition in Catalan

    In this paper, we present the design of a corpus for speech recognition to be used for the recording of a speech database in Catalan. A previous database in Spanish was the reference in setting the specifications for the characteristics of the sentences and the minimum number of units required. An analysis of unit frequencies was carried out in order to know which units were relevant for training and to compare the results with the figures from the designed corpus. Three different sub-corpora were generated, one for training, ... Peer Reviewed. Postprint (published version)

    notes

    Monolingual and bilingual Spanish-Catalan speech recognizers developed from SpeechDat databases

    Under the SpeechDat specifications, the Spanish member of the SpeechDat consortium has recorded a Catalan database that includes one thousand speakers. This communication describes some experimental work that has been carried out using both the Spanish and the Catalan speech material. A speech recognition system has been trained for the Spanish language using a selection of the phonetically balanced utterances from the 4500 SpeechDat training sessions. Utterances with mispronounced or incomplete words and with intermittent noise were discarded. A set of 26 allophones was selected to account for the Spanish sounds, and clustered demiphones have been used as context-dependent sub-lexical units. Following the same methodology, a recognition system was trained from the Catalan SpeechDat database. Catalan sounds were described with 32 allophones. Additionally, a bilingual recognition system was built for both the Spanish and Catalan languages. By means of clustering techniques, a suitable set of allophones covering both languages simultaneously was determined. Thus, 33 allophones were selected. The training material consisted of the whole Catalan training material and the Spanish material coming from the Eastern region of Spain (the region where Catalan is spoken). The performance of the Spanish, Catalan and bilingual systems was assessed under the same framework. The Spanish system exhibits significantly better performance than the other systems due to its better training. The bilingual system provides performance equivalent to that afforded by both language-specific systems trained with the Eastern Spanish material or the Catalan SpeechDat corpus. Peer Reviewed. Postprint (published version)

    A fast one-pass-training feature selection technique for GMM-based acoustic event detection with audio-visual data

    Acoustic event detection becomes a difficult task, even for a small number of events, in scenarios where events are produced rather spontaneously and often overlap in time. In this work, we aim to improve the detection rate by means of feature selection. Using a one-against-all detection approach, a new fast one-pass-training algorithm and an associated highly precise metric are developed. Choosing a different subset of multimodal features for each acoustic event class, the results obtained from audiovisual data collected in the UPC multimodal room show an improvement in average detection rate with respect to using the whole set of features. Peer Reviewed. Preprint