Conditional Random Field Autoencoders for Unsupervised Structured Prediction
We introduce a framework for unsupervised learning of structured predictors
with overlapping, global features. Each input's latent representation is
predicted conditional on the observable data using a feature-rich conditional
random field. Then a reconstruction of the input is (re)generated, conditional
on the latent structure, using models for which maximum likelihood estimation
has a closed-form solution. Our autoencoder formulation enables efficient learning
without making unrealistic independence assumptions or restricting the kinds of
features that can be used. We illustrate insightful connections to traditional
autoencoders, posterior regularization and multi-view learning. We show
competitive results with instantiations of the model for two canonical NLP
tasks: part-of-speech induction and bitext word alignment, and show that
training our model can be substantially more efficient than comparable
feature-rich baselines.
Representation Learning for Words and Entities
This thesis presents new methods for unsupervised learning of distributed
representations of words and entities from text and knowledge bases. The first
algorithm presented in the thesis is a multi-view algorithm for learning
representations of words called Multiview Latent Semantic Analysis (MVLSA). By
incorporating up to 46 different types of co-occurrence statistics for the same
vocabulary of English words, I show that MVLSA outperforms other
state-of-the-art word embedding models. Next, I focus on learning entity
representations for search and recommendation and present the second method of
this thesis, Neural Variational Set Expansion (NVSE). NVSE is also an
unsupervised learning method, but it is based on the Variational Autoencoder
framework. Evaluations with human annotators show that NVSE can facilitate
better search and recommendation of information gathered from noisy, automatic
annotation of unstructured natural language corpora. Finally, I move from
unstructured data and focus on structured knowledge graphs. I present novel
approaches for learning embeddings of vertices and edges in a knowledge graph
that obey logical constraints.
Comment: PhD thesis, Machine Learning, Natural Language Processing,
Representation Learning, Knowledge Graphs, Entities, Word Embeddings, Entity
Embedding
Machine learning methods for histopathological image analysis
Abundant accumulation of digital histopathological images has led to the
increased demand for their analysis, such as computer-aided diagnosis using
machine learning techniques. However, digital pathological images and related
tasks have some issues to be considered. In this mini-review, we introduce the
application of digital pathological image analysis using machine learning
algorithms, address some problems specific to such analysis, and propose
possible solutions.
Comment: 23 pages, 4 figures
Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing
Linguistic typology aims to capture structural and semantic variation across
the world's languages. A large-scale typology could provide excellent guidance
for multilingual Natural Language Processing (NLP), particularly for languages
that suffer from a lack of human-labeled resources. We present an extensive
literature survey on the use of typological information in the development of
NLP techniques. Our survey demonstrates that to date, the use of information in
existing typological databases has resulted in consistent but modest
improvements in system performance. We show that this is due to both intrinsic
limitations of databases (in terms of coverage and feature granularity) and
under-employment of the typological features included in them. We advocate for
a new approach that adapts the broad and discrete nature of typological
categories to the contextual and continuous nature of machine learning
algorithms used in contemporary NLP. In particular, we suggest that such an
approach could be facilitated by recent developments in data-driven induction
of typological knowledge.
A Survey on Deep Learning in Medical Image Analysis
Deep learning algorithms, in particular convolutional networks, have rapidly
become a methodology of choice for analyzing medical images. This paper reviews
the major deep learning concepts pertinent to medical image analysis and
summarizes over 300 contributions to the field, most of which appeared in the
last year. We survey the use of deep learning for image classification, object
detection, segmentation, registration, and other tasks and provide concise
overviews of studies per application area. Open challenges and directions for
future research are discussed.
Comment: Revised survey includes expanded discussion section and reworked
introductory section on common deep architectures. Added missed papers from
before Feb 1st 201