15,749 research outputs found

    A summary of the 2012 JHU CLSP Workshop on Zero Resource Speech Technologies and Models of Early Language Acquisition

    Get PDF
    We summarize the accomplishments of a multi-disciplinary workshop exploring the computational and scientific issues surrounding zero resource (unsupervised) speech technologies and related models of early language acquisition. Centered around the tasks of phonetic and lexical discovery, we consider unified evaluation metrics, present two new approaches for improving speaker independence in the absence of supervision, and evaluate the application of Bayesian word segmentation algorithms to automatic subword unit tokenizations. Finally, we present two strategies for integrating zero resource techniques into supervised settings, demonstrating the potential of unsupervised methods to improve mainstream technologies.5 page(s

    Multivariate time series classification with temporal abstractions

    Get PDF
    The increase in the number of complex temporal datasets collected today has prompted the development of methods that extend classical machine learning and data mining methods to time-series data. This work focuses on methods for multivariate time-series classification. Time series classification is a challenging problem mostly because the number of temporal features that describe the data and are potentially useful for classification is enormous. We study and develop a temporal abstraction framework for generating multivariate time series features suitable for classification tasks. We propose the STF-Mine algorithm that automatically mines discriminative temporal abstraction patterns from the time series data and uses them to learn a classification model. Our experimental evaluations, carried out on both synthetic and real world medical data, demonstrate the benefit of our approach in learning accurate classifiers for time-series datasets. Copyright © 2009, Assocation for the Advancement of ArtdicaI Intelligence (www.aaai.org). All rights reserved

    Analysis of Three-Dimensional Protein Images

    Full text link
    A fundamental goal of research in molecular biology is to understand protein structure. Protein crystallography is currently the most successful method for determining the three-dimensional (3D) conformation of a protein, yet it remains labor intensive and relies on an expert's ability to derive and evaluate a protein scene model. In this paper, the problem of protein structure determination is formulated as an exercise in scene analysis. A computational methodology is presented in which a 3D image of a protein is segmented into a graph of critical points. Bayesian and certainty factor approaches are described and used to analyze critical point graphs and identify meaningful substructures, such as alpha-helices and beta-sheets. Results of applying the methodologies to protein images at low and medium resolution are reported. The research is related to approaches to representation, segmentation and classification in vision, as well as to top-down approaches to protein structure prediction.Comment: See http://www.jair.org/ for any accompanying file

    Bayesian Models for Unit Discovery on a Very Low Resource Language

    Get PDF
    Developing speech technologies for low-resource languages has become a very active research field over the last decade. Among others, Bayesian models have shown some promising results on artificial examples but still lack of in situ experiments. Our work applies state-of-the-art Bayesian models to unsupervised Acoustic Unit Discovery (AUD) in a real low-resource language scenario. We also show that Bayesian models can naturally integrate information from other resourceful languages by means of informative prior leading to more consistent discovered units. Finally, discovered acoustic units are used, either as the 1-best sequence or as a lattice, to perform word segmentation. Word segmentation results show that this Bayesian approach clearly outperforms a Segmental-DTW baseline on the same corpus.Comment: Accepted to ICASSP 201
    corecore