15,749 research outputs found

A summary of the 2012 JHU CLSP Workshop on Zero Resource Speech Technologies and Models of Early Language Acquisition

Author: Bennett Erin
Borschinger Benjamin
Chiu Justin
Church Kenneth
Clark Pascal
Dunbar Ewan
Dupoux Emmanuel
Feldman Naomi
Fourtassi Abdallah
Goldwater Sharon
Harwath David
Hermansky Hynek
Jansen Aren
Johnson Mark
Khudanpur Sanjeev
Lee Chia-ying
Levin Keith
McGraw Ian
Metze Florian
Norouzian Atta
Peddinti Vijay
Richardson Rachel
Rose Richard
Schatz Thomas
Seltzer Mike
Thomas Samuel
Varadarajan Balakrishnan
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2013

We summarize the accomplishments of a multi-disciplinary workshop exploring the computational and scientific issues surrounding zero resource (unsupervised) speech technologies and related models of early language acquisition. Centered around the tasks of phonetic and lexical discovery, we consider unified evaluation metrics, present two new approaches for improving speaker independence in the absence of supervision, and evaluate the application of Bayesian word segmentation algorithms to automatic subword unit tokenizations. Finally, we present two strategies for integrating zero resource techniques into supervised settings, demonstrating the potential of unsupervised methods to improve mainstream technologies.

Edinburgh Research Explorer

Macquarie University ResearchOnline

Multivariate time series classification with temporal abstractions

Author: Batal L
Bellazzi R
Hauskrecht M
Sacchi L
Publication date: 01/01/2009

The increase in the number of complex temporal datasets collected today has prompted the development of methods that extend classical machine learning and data mining methods to time-series data. This work focuses on methods for multivariate time-series classification. Time-series classification is a challenging problem, mostly because the number of temporal features that describe the data and are potentially useful for classification is enormous. We study and develop a temporal abstraction framework for generating multivariate time-series features suitable for classification tasks. We propose the STF-Mine algorithm, which automatically mines discriminative temporal abstraction patterns from the time-series data and uses them to learn a classification model. Our experimental evaluations, carried out on both synthetic and real-world medical data, demonstrate the benefit of our approach in learning accurate classifiers for time-series datasets. Copyright © 2009, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
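
The abstract above does not say which abstractions STF-Mine uses, so the following is only a minimal sketch of one common kind of temporal abstraction: collapsing a numeric series into labelled trend intervals. The function name `trend_abstraction`, the three labels, and the `tolerance` parameter are illustrative assumptions, not part of the paper.

```python
def trend_abstraction(values, tolerance=0.0):
    """Collapse a numeric series into (label, start_index, end_index) intervals.

    Hypothetical helper: labels each step as increasing, decreasing, or steady
    and merges consecutive steps that share a label into one interval.
    """
    intervals = []
    for i in range(1, len(values)):
        delta = values[i] - values[i - 1]
        if delta > tolerance:
            label = "increasing"
        elif delta < -tolerance:
            label = "decreasing"
        else:
            label = "steady"

        if intervals and intervals[-1][0] == label:
            # Extend the current interval to cover this step.
            intervals[-1] = (label, intervals[-1][1], i)
        else:
            # Start a new interval spanning this single step.
            intervals.append((label, i - 1, i))
    return intervals


if __name__ == "__main__":
    print(trend_abstraction([1, 2, 3, 3, 2]))
```

Interval sequences of this kind are far more compact than the raw samples, which is one plausible reason an abstraction step helps when the space of candidate temporal features is very large.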

CiteSeerX

D-Scholarship@Pitt

University of Pittsburgh ETD Submission Page

Analysis of Three-Dimensional Protein Images

Author: Baxter K.
Fortier S.
Glasgow J.
Leherte L.
Steeg E.
Publication date: 01/01/1997

A fundamental goal of research in molecular biology is to understand protein structure. Protein crystallography is currently the most successful method for determining the three-dimensional (3D) conformation of a protein, yet it remains labor intensive and relies on an expert's ability to derive and evaluate a protein scene model. In this paper, the problem of protein structure determination is formulated as an exercise in scene analysis. A computational methodology is presented in which a 3D image of a protein is segmented into a graph of critical points. Bayesian and certainty factor approaches are described and used to analyze critical point graphs and identify meaningful substructures, such as alpha-helices and beta-sheets. Results of applying the methodologies to protein images at low and medium resolution are reported. The research is related to approaches to representation, segmentation and classification in vision, as well as to top-down approaches to protein structure prediction. Comment: See http://www.jair.org/ for any accompanying file

arXiv.org e-Print Archive

CiteSeerX

Repository of the University of Namur

Bayesian Models for Unit Discovery on a Very Low Resource Language

Author: Besacier Laurent
Burget Lukas
Dupoux Emmanuel
Godard Pierre
Hasegawa-Johnson Mark
Khudanpur Sanjeev
Larsen Elin
Ondel Lucas
Scharenborg Odette
Yvon François
Publication date: 20/02/2018

Developing speech technologies for low-resource languages has become a very active research field over the last decade. Bayesian models, among other approaches, have shown promising results on artificial examples but still lack in situ experiments. Our work applies state-of-the-art Bayesian models to unsupervised Acoustic Unit Discovery (AUD) in a real low-resource language scenario. We also show that Bayesian models can naturally integrate information from other, better-resourced languages by means of an informative prior, leading to more consistent discovered units. Finally, discovered acoustic units are used, either as the 1-best sequence or as a lattice, to perform word segmentation. Word segmentation results show that this Bayesian approach clearly outperforms a Segmental-DTW baseline on the same corpus. Comment: Accepted to ICASSP 201
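
The baseline named in the abstract above, Segmental-DTW, builds on dynamic time warping (DTW). As a rough illustration of that underlying alignment only, here is a minimal sketch of plain DTW between two sequences. It is not the segmental variant, which looks for locally matching sub-sequences, and it assumes scalar features with absolute difference as the local cost, whereas speech systems align vectors of acoustic features. The function name `dtw_distance` is an illustrative choice.

```python
def dtw_distance(a, b):
    """Return the minimal cumulative alignment cost between sequences a and b.

    Plain DTW with |x - y| as the local cost (an assumption for illustration).
    """
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            # Allowed moves: advance in a only, in b only, or in both.
            cost[i][j] = local + min(
                cost[i - 1][j],
                cost[i][j - 1],
                cost[i - 1][j - 1],
            )
    return cost[n][m]


if __name__ == "__main__":
    # The second sequence repeats one element, so warping absorbs it at no cost.
    print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Because the warping path may repeat elements of either sequence, two renditions of the same pattern at different speeds can still align cheaply, which is what makes DTW usable for matching speech segments without labels.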

arXiv.org e-Print Archive

Crossref

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Covariate-assisted ranking and screening for large-scale two-sample inference

Author: Cai T. Tony
Sun Wenguang
Wang Weinan
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Two-sample multiple testing has a wide range of applications. The conventional practice first reduces the original observations to a vector of p-values and then chooses a cutoff to adjust for multiplicity. However, this data reduction step could cause significant loss of information and thus lead to suboptimal testing procedures. We introduce a new framework for two-sample multiple testing by incorporating a carefully constructed auxiliary variable in inference to improve the power. A data-driven multiple-testing procedure is developed by employing a covariate-assisted ranking and screening (CARS) approach that optimally combines the information from both the primary and the auxiliary variables. The proposed CARS procedure is shown to be asymptotically valid and optimal for false discovery rate control. The procedure is implemented in the R package CARS. Numerical results confirm the effectiveness of CARS in false discovery rate control and show that it achieves substantial power gain over existing methods. CARS is also illustrated through an application to the analysis of a satellite imaging data set for supernova detection.
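
To make the "conventional practice" in the abstract above concrete, here is a minimal sketch of choosing a cutoff on a vector of p-values with the Benjamini-Hochberg step-up rule. Benjamini-Hochberg is used here as a familiar stand-in for false discovery rate control; it is not the CARS procedure, and the abstract does not name a specific conventional method. The function name `benjamini_hochberg` and the default `alpha` are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices of hypotheses rejected at FDR level alpha.

    Step-up rule: find the largest rank k (1-based, in ascending p-value order)
    with p_(k) <= k * alpha / m, then reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff_rank = rank  # keep the largest rank that passes

    return sorted(order[:cutoff_rank])


if __name__ == "__main__":
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(p, alpha=0.05))
```

Note that the rule sees only the p-values. Any information in the original two-sample observations that the p-values do not carry is unavailable to it, which is the loss the abstract argues an auxiliary variable can recover.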

eScholarship - University of California