Reconstructing DNA copy number by joint segmentation of multiple sequences
The variation in DNA copy number carries information on the modalities of
genome evolution and misregulation of DNA replication in cancer cells; its
study can be helpful to localize tumor suppressor genes, distinguish different
populations of cancerous cells, as well as identify genomic variations responsible
for disease phenotypes. A number of different high throughput technologies can
be used to identify copy number variable sites, and the literature documents
multiple effective algorithms. We focus here on the specific problem of
detecting regions where variation in copy number is relatively common in the
sample at hand: this encompasses the cases of copy number polymorphisms,
related samples, technical replicates, and cancerous sub-populations from the
same individual. We present an algorithm based on regularization approaches
with significant computational advantages and competitive accuracy. We
illustrate its applicability with simulated and real data sets. Comment: 54 pages, 5 figures
Robust ASR using Support Vector Machines
The improved theoretical properties of Support Vector Machines with respect to other machine learning alternatives due to their max-margin training paradigm have led us to suggest them as a good technique for robust speech recognition. However, important shortcomings have had to be circumvented, the most important being the normalisation of the time duration of different realisations of the acoustic speech units.
In this paper, we have compared two approaches in noisy environments: first, a hybrid HMM–SVM solution where a fixed number of frames is selected by means of an HMM segmentation and second, a normalisation kernel called Dynamic Time Alignment Kernel (DTAK) first introduced in Shimodaira et al. [Shimodaira, H., Noma, K., Nakai, M., Sagayama, S., 2001. Support vector machine with dynamic time-alignment kernel for speech recognition. In: Proc. Eurospeech, Aalborg, Denmark, pp. 1841–1844] and based on DTW (Dynamic Time Warping). Special attention has been paid to the adaptation of both alternatives to noisy environments, comparing two types of parameterisations and performing suitable feature normalisation operations. The results show that the DTA Kernel provides important advantages over the baseline HMM system in medium to bad noise conditions, also outperforming the results of the hybrid system.
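The abstract above notes that DTAK builds on Dynamic Time Warping to compare utterances of different durations. As a rough illustration of that underlying idea only, here is a minimal sketch of plain DTW between two sequences of feature vectors. It is not the DTAK kernel from Shimodaira et al., and the function name `dtw_distance` and the use of Euclidean frame distance are assumptions made for this example.

```python
import math


def dtw_distance(a, b):
    """Plain DTW cost between two sequences of equal-dimension feature vectors.

    Hypothetical helper for illustration; uses Euclidean distance per frame
    and the standard three-way recursion (insert, delete, match).
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best accumulated cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # frame of a repeated against b[j-1]
                cost[i][j - 1],      # frame of b repeated against a[i-1]
                cost[i - 1][j - 1],  # both sequences advance
            )
    return cost[n][m]


if __name__ == "__main__":
    short = [[0.0], [1.0], [2.0]]
    stretched = [[0.0], [1.0], [1.0], [2.0]]
    print(dtw_distance(short, stretched))
```

Because the warping path may repeat frames, a sequence and a time-stretched copy of it align at zero cost, which is the property that makes DTW attractive for duration normalisation.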
Inferring Latent States and Refining Force Estimates via Hierarchical Dirichlet Process Modeling in Single Particle Tracking Experiments
Optical microscopy provides rich spatio-temporal information characterizing
in vivo molecular motion. However, effective forces and other parameters used
to summarize molecular motion change over time in live cells due to latent
state changes, e.g., changes induced by dynamic micro-environments,
photobleaching, and other heterogeneity inherent in biological processes. This
study focuses on techniques for analyzing Single Particle Tracking (SPT) data
experiencing abrupt state changes. We demonstrate the approach on GFP tagged
chromatids experiencing metaphase in yeast cells and probe the effective forces
resulting from dynamic interactions that reflect the sum of a number of
physical phenomena. State changes are induced by factors such as microtubule
dynamics exerting force through the centromere, thermal polymer fluctuations,
etc. Simulations are used to demonstrate the relevance of the approach in more
general SPT data analyses. Refined force estimates are obtained by adopting and
modifying a nonparametric Bayesian modeling technique, the Hierarchical
Dirichlet Process Switching Linear Dynamical System (HDP-SLDS), for SPT
applications. The HDP-SLDS method shows promise in systematically identifying
dynamical regime changes induced by unobserved state changes when the number of
underlying states is unknown in advance (a common problem in SPT applications).
We expand on the relevance of the HDP-SLDS approach, review the relevant
background of Hierarchical Dirichlet Processes, show how to map discrete time
HDP-SLDS models to classic SPT models, and discuss limitations of the approach.
In addition, we demonstrate new computational techniques for tuning
hyperparameters and for checking the statistical consistency of model
assumptions directly against individual experimental trajectories; the
techniques circumvent the need for "ground-truth" and subjective information. Comment: 25 pages, 6 figures. Differs only typographically from PLoS One
publication available freely as an open-access article at
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.013763
SVMs for Automatic Speech Recognition: a Survey
Hidden Markov Models (HMMs) are, undoubtedly, the most employed core technique for Automatic Speech Recognition (ASR). Nevertheless, we are still far from achieving high-performance ASR systems. Some alternative approaches, most of them based on Artificial Neural Networks (ANNs), were proposed during the late eighties and early nineties. Some of them tackled the ASR problem using predictive ANNs, while others proposed hybrid HMM/ANN systems. However, despite some achievements, nowadays, the preponderance of Markov Models is a fact.
During the last decade, however, a new tool appeared in the field of machine learning that has proved to be able to cope with hard classification problems in several fields of application: the Support Vector Machines (SVMs). The SVMs are effective discriminative classifiers with several outstanding characteristics, namely: their solution is that with maximum margin; they are capable of dealing with samples of very high dimensionality; and their convergence to the minimum of the associated cost function is guaranteed.
These characteristics have made SVMs very popular and successful. In this chapter we discuss their strengths and weaknesses in the ASR context and review the current state-of-the-art techniques. We organize the contributions in two parts: isolated-word recognition and continuous speech recognition. Within the first part we review several techniques to produce the fixed-dimension vectors needed for original SVMs. Afterwards we explore more sophisticated techniques based on the use of kernels capable of dealing with sequences of different lengths. Among them is the DTAK kernel, simple and effective, which revives an old technique of speech recognition: Dynamic Time Warping (DTW). Within the second part, we describe some recent approaches to tackle more complex tasks like connected digit recognition or continuous speech recognition using SVMs. Finally we draw some conclusions and outline several ongoing lines of research.
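The survey abstract above mentions techniques that turn variable-length utterances into the fixed-dimension vectors a standard SVM needs. One simple way to picture this is uniform frame resampling, sketched below. This is only an assumed illustration, not one of the specific methods reviewed in the chapter, and the name `resample_frames` is hypothetical.

```python
def resample_frames(frames, target_len):
    """Pick target_len frames at evenly spaced positions along the sequence.

    Hypothetical helper for illustration: short sequences repeat frames,
    long sequences skip frames, so every output has the same length.
    """
    if not frames or target_len <= 0:
        raise ValueError("need a non-empty sequence and a positive target length")
    n = len(frames)
    if target_len == 1:
        return [frames[n // 2]]  # keep the middle frame only
    # Map output position k to the nearest input index in [0, n - 1].
    indices = [round(k * (n - 1) / (target_len - 1)) for k in range(target_len)]
    return [frames[i] for i in indices]


if __name__ == "__main__":
    long_utterance = list(range(10))  # stand-in for 10 feature frames
    short_utterance = [5, 6]          # stand-in for 2 feature frames
    print(resample_frames(long_utterance, 4))
    print(resample_frames(short_utterance, 4))
```

Concatenating the selected frames would give a vector of constant dimension regardless of the original utterance duration, at the cost of discarding or duplicating some temporal detail.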
Using Unmanned Aerial Systems for Deriving Forest Stand Characteristics in Mixed Hardwoods of West Virginia
Forest inventory information is a principal driver for forest management decisions. Information gathered through these inventories provides a summary of the condition of forested stands. The method by which remote sensing aids land managers is changing rapidly. Imagery produced from unmanned aerial systems (UAS) offers high temporal and spatial resolutions to small-scale forest management. UAS imagery is less expensive and easier to coordinate to meet project needs compared to traditional manned aerial imagery. This study focused on producing an efficient and approachable workflow for producing forest stand board volume estimates from UAS imagery in mixed hardwood stands of West Virginia. A supplementary aim of this project was to evaluate which season was best to collect imagery for forest inventory. True color imagery was collected with a DJI Phantom 3 Professional UAS and was processed in Agisoft Photoscan Professional. Automated tree crown segmentation was performed with Trimble eCognition Developer’s multi-resolution segmentation function with manual optimization of parameters through an iterative process. Individual tree volume metrics were derived from field data relationships and volume estimates were processed in EZ CRUZ forest inventory software. The software, at best, correctly segmented 43% of the individual tree crowns. No correlation between season of imagery acquisition and quality of segmentation was shown. Volume and other stand characteristics were not accurately estimated because of the poor segmentation. However, the imagery was able to capture gaps consistently and provide a visualization of forest health. Difficulties, successes and time required for these procedures were thoroughly noted.