15,741 research outputs found

    Deep semi-supervised segmentation with weight-averaged consistency targets

    Recently proposed techniques for semi-supervised learning such as Temporal Ensembling and Mean Teacher have achieved state-of-the-art results in many important classification benchmarks. In this work, we expand the Mean Teacher approach to segmentation tasks and show that it can bring important improvements in a realistic small data regime using a publicly available multi-center dataset from the Magnetic Resonance Imaging (MRI) domain. We also devise a method to solve the problems that arise when using traditional data augmentation strategies for segmentation tasks on our new training scheme. Comment: 8 pages, 1 figure, accepted for DLMIA/MICCA
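As a rough illustration of what "weight-averaged consistency targets" refers to: in the Mean Teacher approach, the teacher network's weights are an exponential moving average of the student's weights, and a consistency loss penalizes disagreement between student and teacher predictions. The sketch below uses plain Python lists in place of network parameters and per-pixel probabilities. The names `ema_update` and `consistency_loss`, the smoothing factor `alpha`, and the choice of mean squared error are illustrative assumptions, not details taken from the paper.

```python
def ema_update(teacher, student, alpha=0.99):
    """Return new teacher weights as an exponential moving average
    of the student weights: t <- alpha * t + (1 - alpha) * s."""
    return [alpha * t + (1.0 - alpha) * s for t, s in zip(teacher, student)]


def consistency_loss(student_probs, teacher_probs):
    """Mean squared difference between student and teacher predictions
    (here a flat list standing in for per-pixel probabilities)."""
    n = len(student_probs)
    return sum((s - t) ** 2 for s, t in zip(student_probs, teacher_probs)) / n


# Hypothetical toy values: two "weights" and two "pixels".
teacher = [0.0, 0.0]
student = [1.0, -1.0]
teacher = ema_update(teacher, student, alpha=0.9)   # teacher moves 10% toward student
loss = consistency_loss([0.8, 0.2], [0.6, 0.4])     # penalizes disagreement
```

In a real training loop the EMA update would typically run after each optimizer step on the student, and the consistency term can be computed on unlabeled images as well as labeled ones, since it needs no ground-truth masks.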

    Prosody-Based Automatic Segmentation of Speech into Sentences and Topics

    A crucial step in processing speech audio data for information extraction, topic detection, or browsing/playback is to segment the input into sentence and topic units. Speech segmentation is challenging, since the cues typically present for segmenting text (headers, paragraphs, punctuation) are absent in spoken language. We investigate the use of prosody (information gleaned from the timing and melody of speech) for these tasks. Using decision tree and hidden Markov modeling techniques, we combine prosodic cues with word-based approaches, and evaluate performance on two speech corpora, Broadcast News and Switchboard. Results show that the prosodic model alone performs on par with, or better than, word-based statistical language models -- for both true and automatically recognized words in news speech. The prosodic model achieves comparable performance with significantly less training data, and requires no hand-labeling of prosodic events. Across tasks and corpora, we obtain a significant improvement over word-only models using a probabilistic combination of prosodic and lexical information. Inspection reveals that the prosodic models capture language-independent boundary indicators described in the literature. Finally, cue usage is task and corpus dependent. For example, pause and pitch features are highly informative for segmenting news speech, whereas pause, duration and word-based cues dominate for natural conversation. Comment: 30 pages, 9 figures. To appear in Speech Communication 32(1-2), Special Issue on Accessing Information in Spoken Audio, September 200

    Integrating Prosodic and Lexical Cues for Automatic Topic Segmentation

    We present a probabilistic model that uses both prosodic and lexical cues for the automatic segmentation of speech into topically coherent units. We propose two methods for combining lexical and prosodic information using hidden Markov models and decision trees. Lexical information is obtained from a speech recognizer, and prosodic features are extracted automatically from speech waveforms. We evaluate our approach on the Broadcast News corpus, using the DARPA-TDT evaluation metrics. Results show that the prosodic model alone is competitive with word-based segmentation methods. Furthermore, we achieve a significant reduction in error by combining the prosodic and word-based knowledge sources. Comment: 27 pages, 8 figure

    A framework for quantification and physical modeling of cell mixing applied to oscillator synchronization in vertebrate somitogenesis

    In development and disease, cells move as they exchange signals. One example is found in vertebrate development, during which the timing of segment formation is set by a ‘segmentation clock’, in which oscillating gene expression is synchronized across a population of cells by Delta-Notch signaling. Delta-Notch signaling requires local cell-cell contact, but in the zebrafish embryonic tailbud, oscillating cells move rapidly, exchanging neighbors. Previous theoretical studies proposed that this relative movement or cell mixing might alter signaling and thereby enhance synchronization. However, it remains unclear whether the mixing timescale in the tissue is in the right range for this effect, because a framework to reliably measure the mixing timescale and compare it with signaling timescale is lacking. Here, we develop such a framework using a quantitative description of cell mixing without the need for an external reference frame and constructing a physical model of cell movement based on the data. Numerical simulations show that mixing with experimentally observed statistics enhances synchronization of coupled phase oscillators, suggesting that mixing in the tailbud is fast enough to affect the coherence of rhythmic gene expression. Our approach will find general application in analyzing the relative movements of communicating cells during development and disease. Affiliation: Uriu, Koichiro. Kanazawa University; Japan. Affiliation: Bhavna, Rajasekaran. Max Planck Institute of Molecular Cell Biology and Genetics; Germany. Max Planck Institute for the Physics of Complex Systems; Germany. Affiliation: Oates, Andrew C. Francis Crick Institute; United Kingdom. University College London; United Kingdom. Affiliation: Morelli, Luis Guillermo. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Investigación en Biomedicina de Buenos Aires - Instituto Partner de la Sociedad Max Planck; Argentina. Max Planck Institute for Molecular Physiology; Germany. Universidad de Buenos Aires. Facultad de Ciencias Exactas y Naturales. Departamento de Física; Argentin
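To make "synchronization of coupled phase oscillators" concrete, here is a minimal sketch of a Kuramoto-style model with a changeable neighbor list. It is not the paper's simulation: the sine coupling, the explicit Euler integration, the function names, and the parameters `omega`, `kappa`, and `dt` are all assumptions chosen for illustration. Cell mixing would correspond to updating `neighbors` between steps, which is left out here.

```python
import cmath
import math


def order_parameter(phases):
    """Kuramoto order parameter: 1.0 when all phases coincide,
    close to 0 when phases are spread evenly around the circle."""
    return abs(sum(cmath.exp(1j * p) for p in phases) / len(phases))


def euler_step(phases, neighbors, omega, kappa, dt):
    """One explicit Euler step of
    d(theta_i)/dt = omega + (kappa / n_i) * sum_j sin(theta_j - theta_i),
    where j runs over the current neighbors of oscillator i."""
    new_phases = []
    for i, theta in enumerate(phases):
        nbrs = neighbors[i]
        if nbrs:
            coupling = sum(math.sin(phases[j] - theta) for j in nbrs) / len(nbrs)
        else:
            coupling = 0.0  # an isolated oscillator just runs at omega
        new_phases.append(theta + dt * (omega + kappa * coupling))
    return new_phases


# Hypothetical toy system: two mutually coupled oscillators.
phases = [0.0, 1.0]
neighbors = [[1], [0]]
for _ in range(2000):
    phases = euler_step(phases, neighbors, omega=1.0, kappa=1.0, dt=0.01)
synchrony = order_parameter(phases)  # approaches 1 as the phases lock
```

Comparing how quickly `order_parameter` rises with a fixed neighbor list versus one that is reshuffled at a chosen rate is one simple way to explore the effect of a mixing timescale relative to the coupling timescale.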