85 research outputs found
Novel Detection and Analysis using Deep Variational Autoencoders
This paper presents a Novel Identification System that uses generative modeling techniques and Gaussian Mixture Models (GMMs) to identify the main process variables involved in a novel event from multivariate data. Features are generated and then dimensionally reduced by a Variational Autoencoder (VAE) supplemented with a denoising criterion and a β-disentangling method. The GMM parameters are learned with the Expectation-Maximization (EM) algorithm on features collected from normal operating conditions only. One-class classification is achieved by thresholding the likelihoods at a statistically derived value. The novelty identification method is verified as a detection method on existing Radio Frequency (RF) generators and on standard classification datasets. The RF dataset contains 2 different models of generators with almost 100 unique units tested. Novelty detection on these generators achieved an average testing true positive rate of 97.31% with an overall target class accuracy of 98.16%. A second application has the network evaluate the process variables of the RF generators when a novel event is detected. This is achieved by using the VAE decoding layers to map the GMM parameters back to a space equivalent to the original input, yielding a direct estimate of the process variables' fitness.
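As a sketch of the one-class pipeline described above (features scored under a GMM trained only on normal data, then thresholded), the following Python snippet fits a small diagonal-covariance GMM by EM and flags low-likelihood points as novel. The 2-D synthetic features, two components and 1st-percentile threshold are illustrative stand-ins for the paper's VAE features and statistically derived threshold.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_gmm(X, K=2, iters=50):
    """Diagonal-covariance GMM fitted with EM, initialized from random points."""
    n, d = X.shape
    mu = X[rng.choice(n, K, replace=False)]
    var = np.ones((K, d))
    pi = np.full(K, 1.0 / K)
    for _ in range(iters):
        # E-step: responsibilities from per-component Gaussian log densities
        logp = (-0.5 * (((X[:, None, :] - mu) ** 2) / var
                        + np.log(2 * np.pi * var)).sum(-1) + np.log(pi))
        logp -= logp.max(1, keepdims=True)
        r = np.exp(logp)
        r /= r.sum(1, keepdims=True)
        # M-step: update weights, means and variances
        nk = r.sum(0)
        pi = nk / n
        mu = (r.T @ X) / nk[:, None]
        var = (r.T @ X**2) / nk[:, None] - mu**2 + 1e-6
    return pi, mu, var

def score(X, pi, mu, var):
    """Per-sample GMM log-likelihood via log-sum-exp over components."""
    logp = (-0.5 * (((X[:, None, :] - mu) ** 2) / var
                    + np.log(2 * np.pi * var)).sum(-1) + np.log(pi))
    m = logp.max(1, keepdims=True)
    return (m + np.log(np.exp(logp - m).sum(1, keepdims=True))).ravel()

# "Normal" features: two clusters standing in for VAE latent codes
normal = np.vstack([rng.normal(0.0, 0.3, (200, 2)),
                    rng.normal(3.0, 0.3, (200, 2))])
pi, mu, var = fit_gmm(normal)

# Threshold at a low percentile of the normal-data likelihoods
thresh = np.percentile(score(normal, pi, mu, var), 1)
is_novel = score(np.array([[10.0, 10.0]]), pi, mu, var) < thresh
```

A point far from all normal-condition clusters scores well below the threshold and is flagged as novel.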
Estimating a potential without the agony of the partition function
Estimating a Gibbs density function given a sample is an important problem in computational statistics and statistical learning. Although the well-established maximum likelihood method is commonly used, it requires the computation of the partition function (i.e., the normalization of the density). This function can be computed easily for simple low-dimensional problems, but its computation is difficult or even intractable for general densities and high-dimensional problems. In this paper we propose an alternative approach based on Maximum A-Posteriori (MAP) estimators, which we name Maximum Recovery MAP (MR-MAP), to derive estimators that do not require the computation of the partition function, and we reformulate the problem as an optimization problem. We further propose a least-action-type potential that allows us to quickly solve the optimization problem with a feed-forward hyperbolic neural network. We demonstrate the effectiveness of our methods on some standard data sets.
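The partition-function bottleneck the abstract refers to can be made concrete in a few lines of Python. This is not the paper's MR-MAP estimator, only the problem setup: for a toy quadratic potential the 1-D normalizer is cheap quadrature, but tensor-product quadrature scales exponentially with dimension.

```python
import numpy as np

# Unnormalized Gibbs density p(x) ∝ exp(-U(x)) with a toy quadratic potential
def U(x):
    return 0.5 * x ** 2

# In 1-D the partition function Z = ∫ exp(-U(x)) dx is cheap quadrature
xs = np.linspace(-10.0, 10.0, 2001)
ys = np.exp(-U(xs))
Z = np.sum((ys[1:] + ys[:-1]) * np.diff(xs)) / 2  # trapezoid rule, ≈ sqrt(2π)

# ...but a tensor-product grid needs n**d nodes in d dimensions, which is why
# maximum likelihood becomes intractable for general high-dimensional densities
nodes_needed = 2001 ** 10  # d = 10 already requires ~1e33 evaluations
```

For this potential the exact value is Z = sqrt(2π), so the quadrature can be checked directly; for general densities no such closed form exists.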
Machine learning in orthopedics: a literature review
In this paper we present the findings of a systematic literature review covering the articles published in the last two decades in which the authors described the application of a machine learning technique or method to an orthopedic problem or purpose. By searching both the Scopus and Medline databases, we retrieved, screened and analyzed the content of 70 journal articles, and coded these resources following an iterative method within a Grounded Theory approach. We report the survey findings by outlining the articles' content in terms of the main machine learning techniques mentioned therein, the orthopedic application domains, the source data and the quality of their predictive performance.
Compressed sensing with approximate message passing: measurement matrix and algorithm design
Compressed sensing (CS) is an emerging technique that exploits the properties of a sparse or
compressible signal to efficiently and faithfully capture it with a sampling rate far below the
Nyquist rate. The primary goal of compressed sensing is to achieve the best signal recovery
with the least number of samples. To this end, two research directions have been receiving
increasing attention: customizing the measurement matrix to the signal of interest and optimizing
the reconstruction algorithm. In this thesis, contributions in both directions are made
in the Bayesian setting for compressed sensing. The work presented in this thesis focuses on
the approximate message passing (AMP) schemes, a class of recovery algorithms that take
advantage of the statistical properties of the CS problem.
First of all, a complete sample distortion (SD) framework is presented to fundamentally quantify
the reconstruction performance for a certain pair of measurement matrix and recovery
scheme. In the SD setting, the non-optimality region of the homogeneous Gaussian matrix
is identified, and a novel zeroing matrix with improved performance is proposed. With the
SD framework, the optimal sample allocation strategy for the block diagonal measurement matrix
is derived for the wavelet representation of natural images. Extensive simulations validate
the optimality of the proposed measurement matrix design.
Motivated by the zeroing matrix, we extend the seeded matrix design in the CS literature to
the novel modulated matrix structure. The major advantage of the modulated matrix over the
seeded matrix lies in the simplicity of its state evolution dynamics. Together with the AMP
based algorithm, the modulated matrix possesses a 1-D performance prediction system, with
which we can optimize the matrix configuration. We then focus on a special modulated matrix
form, designated as the two block matrix, which can also be seen as a generalization of the
zeroing matrix. The effectiveness of the two block matrix is demonstrated through both sparse
and compressible signals. The underlying reason for the improved performance is presented
through the analysis of the state evolution dynamics.
The final contribution of the thesis explores improving the reconstruction algorithm. By taking
the signal prior into account, the Bayesian optimal AMP (BAMP) algorithm is demonstrated
to dramatically improve the reconstruction quality. The key insight for its success is that it
utilizes the minimum mean square error (MMSE) estimator for the CS denoising. However, the
prerequisite of prior information often makes it impractical. A novel SURE-AMP algorithm
is proposed to address this dilemma. The critical feature of SURE-AMP is that Stein's
unbiased risk estimate (SURE) based parametric least squares estimator replaces the
MMSE estimator. Since the optimization of the SURE estimator involves only the noisy data,
it eliminates the need for the signal prior and can thus accommodate more general sparse models.
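The BAMP and SURE-AMP variants above build on the basic AMP iteration, which can be sketched in Python as below: a soft-threshold denoiser plus the Onsager correction term. The problem sizes, the i.i.d. Gaussian measurement matrix and the threshold rule (a fixed multiple of the estimated residual level) are illustrative assumptions, not the thesis's configurations.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: M measurements of an N-dimensional k-sparse signal
N, M, k = 256, 128, 10
A = rng.standard_normal((M, N)) / np.sqrt(M)   # i.i.d. Gaussian matrix
x_true = np.zeros(N)
x_true[rng.choice(N, k, replace=False)] = rng.standard_normal(k)
y = A @ x_true                                 # noiseless measurements

def soft(u, t):
    """Soft-threshold denoiser eta(u; t)."""
    return np.sign(u) * np.maximum(np.abs(u) - t, 0.0)

x, z = np.zeros(N), y.copy()
for _ in range(30):
    r = x + A.T @ z                        # effective (pseudo-)data
    tau = np.linalg.norm(z) / np.sqrt(M)   # residual-based noise estimate
    x_new = soft(r, 1.5 * tau)             # denoising step
    # Onsager correction: residual reuse weighted by the denoiser's average
    # derivative, here the fraction of coefficients surviving the threshold
    z = y - A @ x_new + z * (np.count_nonzero(x_new) / M)
    x = x_new
```

Replacing `soft` with an MMSE estimator under a known signal prior gives the BAMP flavor discussed above; SURE-AMP instead tunes a parametric least squares denoiser from the noisy data alone.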
Hidden Markov Models
Hidden Markov Models (HMMs), although known for decades, have gained great popularity in recent years and are still under active development. This book presents theoretical issues and a variety of HMM applications in speech recognition and synthesis, medicine, neurosciences, computational biology, bioinformatics, seismology, environmental protection and engineering. I hope that readers will find this book useful and helpful for their own research.
On the application of Bayesian statistics to protein structure calculation from nuclear magnetic resonance data
In the present work, we use concepts of Bayesian statistics to infer the three-dimensional structures of proteins from experimental data. We thus build upon the method of inferential structure determination (ISD) as introduced by Rieping et al. (2005). In line with their probabilistic approach, we factor the probability of a three-dimensional protein structure given the experimental data into a prior distribution that captures the protein-likeness of a structure and a likelihood that describes how likely the experimental data are to have been generated from a given three-dimensional structure. In this Bayesian framework, we attempt to develop structure calculation from NMR experiments into a highly accurate, objective and parameter-free process.
We start by focusing on integrating new types of data, as ISD currently does not entail a mechanism to incorporate chemical shifts in the calculation process. To alleviate this shortcoming, we propose a hidden Markov Model that captures the relationship between protein structures and chemical shifts. Based on our probabilistic model, we are able to predict the secondary structure and dihedral angles of a protein from chemical shifts.
Another means to high-quality structures involves improving the potential functions that form the core of ISD's prior distributions. Although potential functions are designed to approximate physical forces, there are still parameters, such as force constants and temperatures, that are set on an ad hoc basis and can bias the structure calculation. As an alternative, we propose an algorithm based on Bayesian model comparison to determine these parameters from the data. Further, we demonstrate that optimal data-dependent parameters lead to improved accuracy and quality of the final structure, especially with sparse and noisy data. These findings dismiss the notion of a single universal parameter and advocate estimating free parameters from the experimental data instead.
Third, we focus on the estimation of new potential functions to include even more prior information in the structure calculation process. Currently, only a few methods allow the estimation of potential functions from a database of known structures. Our method provides a sound mathematical solution to this problem, which is also known as the inverse problem of statistical mechanics, by generalizing the concept of the configurational temperature. We demonstrate the effectiveness of our approach on the examples of simple fluids and a coarse-grained protein model.
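The parameter-selection idea in the second part can be illustrated outside the ISD setting. In the Python sketch below, a prior precision plays the role of a free force-constant-like parameter, and candidate values are compared by their marginal likelihoods (Bayesian model comparison); the Gaussian model, the data and the candidate grid are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: observations y_i = theta + noise; the prior precision lam plays
# the role of a free "force constant" to be chosen by model comparison
sigma2 = 1.0
y = 2.0 + rng.normal(0.0, np.sqrt(sigma2), 50)

def log_evidence(y, lam, sigma2=1.0):
    """log p(y | lam) with theta ~ N(0, 1/lam) marginalized analytically."""
    n = len(y)
    # Marginal covariance of y after integrating out theta
    C = sigma2 * np.eye(n) + np.ones((n, n)) / lam
    sign, logdet = np.linalg.slogdet(C)
    return -0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(C, y))

# Compare candidate parameter values by their marginal likelihood
lams = np.array([0.01, 0.1, 0.25, 1.0, 10.0])
evs = np.array([log_evidence(y, lam) for lam in lams])
best = lams[np.argmax(evs)]
```

An over-restrictive parameter (large `lam`, i.e. a very tight prior) is penalized automatically by the evidence, which is the mechanism by which data-dependent parameters replace a single universal choice.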
Distributional semantics and machine learning for statistical machine translation
In this work, we explore the use of distributional semantics and machine learning to
improve statistical machine translation. For that purpose, we propose the use of a logistic regression based machine learning model for dynamic phrase translation probability modeling. We prove that the proposed model can be seen as a generalization of the standard translation probabilities used in statistical machine translation, and use it to incorporate context and distributional semantic information through lexical, word cluster and word embedding features. Apart from that, we explore the use of word embeddings for phrase translation probability scoring as an alternative approach to incorporate distributional semantic knowledge into statistical machine translation. Our experiments show the effectiveness of the proposed models, achieving promising results over a strong baseline. At the same time, our work makes important contributions in relation to bilingual word embedding mappings and word embedding based phrase similarity measures, which go beyond machine translation and have an intrinsic value in the field of distributional semantics.
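One of the components above, word embedding based phrase similarity, can be sketched in a few lines of Python. The tiny hand-made embedding table is purely illustrative (real systems use embeddings trained on large corpora, and the thesis additionally maps bilingual embeddings into a shared space); a phrase is represented here as the mean of its word vectors.

```python
import numpy as np

# Toy word vectors; real systems would learn these from large corpora
emb = {
    "big":   np.array([0.9, 0.1, 0.0]),
    "large": np.array([0.8, 0.2, 0.1]),
    "house": np.array([0.1, 0.9, 0.3]),
    "home":  np.array([0.2, 0.8, 0.4]),
    "red":   np.array([0.0, 0.1, 0.9]),
}

def phrase_vec(phrase):
    """Represent a phrase as the mean of its word vectors."""
    return np.mean([emb[w] for w in phrase.split()], axis=0)

def cosine(u, v):
    """Cosine similarity between two phrase vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

sim_close = cosine(phrase_vec("big house"), phrase_vec("large home"))
sim_far = cosine(phrase_vec("big house"), phrase_vec("red"))
```

Near-synonymous phrases score much higher than unrelated ones, which is what makes such similarity scores usable as translation probability features.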
Document Meta-Information as Weak Supervision for Machine Translation
Data-driven machine translation has advanced considerably since the first pioneering work
in the 1990s, with recent systems claiming human parity on sentence translation for high-resource tasks. However, performance degrades for low-resource domains with no available
sentence-parallel training data. Machine translation systems also rarely incorporate the
document context beyond the sentence level, ignoring knowledge which is essential for
some situations. In this thesis, we aim to address the two issues mentioned above by
examining ways to incorporate document-level meta-information into data-driven machine
translation. Examples of document meta-information include document authorship and
categorization information, as well as cross-lingual correspondences between documents,
such as hyperlinks or citations. As this meta-information is much more
coarse-grained than reference translations, it constitutes a source of weak supervision for
machine translation. We present four cumulatively conducted case studies where we devise
and evaluate methods to exploit these sources of weak supervision both in low-resource
scenarios where no task-appropriate supervision from parallel data exists, and in a full
supervision scenario where weak supervision from document meta-information is used to
supplement supervision from sentence-level reference translations. All case studies show
improved translation quality when incorporating document meta-information.
Methods for Addressing Data Diversity in Automatic Speech Recognition
The performance of speech recognition systems is known to degrade in mismatched conditions, where the acoustic environment and the speaker population significantly differ between the training and target test data. Performance degradation due to the mismatch is widely reported in the literature, particularly for diverse datasets.
This thesis approaches the mismatch problem in diverse datasets with various strategies including data refinement, variability modelling and speech recognition model adaptation. These strategies are realised in six novel contributions.
The first contribution is a data subset selection technique using likelihood ratio derived from a target test set quantifying mismatch. The second contribution is a multi-style training method using data augmentation.
The existing training data is augmented using a distribution of variabilities learnt from a target dataset, resulting in a matched set.
The third contribution is a new approach for genre identification in diverse media data with the aim of reducing the mismatch in an adaptation framework.
The fourth contribution is a novel method which performs an unsupervised domain discovery using latent Dirichlet allocation. Since the latent domains have a high correlation with some subjective meta-data tags, such as genre labels of media data, features derived from the latent domains are successfully applied to the genre and broadcast show identification tasks.
The fifth contribution extends the latent modelling technique for acoustic model adaptation, where latent-domain specific models are adapted from a base model. As the sixth contribution, an alternative adaptation approach is proposed where subspace adaptation of deep neural network acoustic models is performed using the proposed latent-domain aware training procedure.
All of the proposed techniques for mismatch reduction are verified using diverse datasets.
Using data selection, data augmentation and latent-domain model adaptation methods, the mismatch between the training and testing conditions of diverse ASR systems is reduced, resulting in more robust speech recognition systems.
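The first contribution, likelihood-ratio-based data selection, can be sketched as follows. The diagonal Gaussian models and the 50% selection fraction are illustrative simplifications; real systems would score utterance-level features under trained acoustic models of the target and general conditions.

```python
import numpy as np

rng = np.random.default_rng(1)

def gauss_logpdf(X, mu, var):
    """Per-sample log density under a diagonal Gaussian."""
    return (-0.5 * (np.log(2 * np.pi * var) + (X - mu) ** 2 / var)).sum(1)

# Pool: a mix of two acoustic conditions; target: only one of them
pool = np.vstack([rng.normal(0.0, 1.0, (500, 4)),
                  rng.normal(4.0, 1.0, (500, 4))])
target = rng.normal(4.0, 1.0, (200, 4))

# Moment-matched diagonal Gaussians for the target and the pool
mu_t, var_t = target.mean(0), target.var(0)
mu_p, var_p = pool.mean(0), pool.var(0)

# Likelihood ratio of each pool sample under the target vs. pool model
llr = gauss_logpdf(pool, mu_t, var_t) - gauss_logpdf(pool, mu_p, var_p)

# Keep the half of the pool that best matches the target conditions
selected = pool[np.argsort(llr)[-len(pool) // 2:]]
```

The selected subset is concentrated on the target-like condition, which is the matched-training effect the thesis quantifies on diverse datasets.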
- …