High-Dimensional Gaussian Graphical Model Selection: Walk Summability and Local Separation Criterion

Author: Anandkumar Animashree
Tan Vincent Y. F.
Willsky Alan. S.
Publication date: 01/06/2011

We consider the problem of high-dimensional Gaussian graphical model selection. We identify a set of graphs for which an efficient estimation algorithm exists, and this algorithm is based on thresholding of empirical conditional covariances. Under a set of transparent conditions, we establish structural consistency (or sparsistency) for the proposed algorithm, when the number of samples $n = \omega(J_{\min}^{-2} \log p)$, where $p$ is the number of variables and $J_{\min}$ is the minimum (absolute) edge potential of the graphical model. The sufficient conditions for sparsistency are based on the notion of walk-summability of the model and the presence of sparse local vertex separators in the underlying graph. We also derive novel non-asymptotic necessary conditions on the number of samples required for sparsistency.
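The thresholding idea in this abstract can be illustrated with a small sketch. It assumes one plausible reading of the test: an edge is declared between two variables when the empirical conditional covariance stays above a threshold for every candidate separator set up to a given size. The function names, the threshold value, the separator size bound, and the three-variable chain used as toy data are all assumptions made for illustration, not taken from the paper.

```python
import random
from itertools import combinations

def empirical_covariance(samples):
    """Unbiased sample covariance matrix of a list of equal-length rows."""
    n, p = len(samples), len(samples[0])
    means = [sum(row[a] for row in samples) / n for a in range(p)]
    return [[sum((row[a] - means[a]) * (row[b] - means[b]) for row in samples) / (n - 1)
             for b in range(p)] for a in range(p)]

def conditional_covariance(cov, i, j, separator):
    """Covariance of variables i and j given the variables in `separator`.

    Conditions on one separator variable at a time (Schur complement update).
    """
    p = len(cov)
    for k in separator:
        pivot = cov[k][k]
        cov = [[cov[a][b] - cov[a][k] * cov[k][b] / pivot for b in range(p)]
               for a in range(p)]
    return cov[i][j]

def select_edges(samples, max_separator_size, threshold):
    """Keep edge (i, j) if no small separator drives the conditional covariance below the threshold."""
    cov = empirical_covariance(samples)
    p = len(cov)
    edges = set()
    for i, j in combinations(range(p), 2):
        others = [k for k in range(p) if k != i and k != j]
        smallest = min(
            abs(conditional_covariance(cov, i, j, sep))
            for size in range(max_separator_size + 1)
            for sep in combinations(others, size)
        )
        if smallest > threshold:
            edges.add((i, j))
    return edges

def sample_chain(n, seed=0):
    """Toy data (hypothetical): a Gaussian Markov chain x0 -> x1 -> x2."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x0 = rng.gauss(0, 1)
        x1 = 0.8 * x0 + rng.gauss(0, 1)
        x2 = 0.8 * x1 + rng.gauss(0, 1)
        rows.append([x0, x1, x2])
    return rows

edges = select_edges(sample_chain(5000), max_separator_size=1, threshold=0.2)
print(sorted(edges))
```

On the toy chain, x0 and x2 are correlated marginally but nearly uncorrelated once x1 is conditioned on, so the pair (0, 2) should be dropped while the two chain edges are kept. The exhaustive search over separators is exponential in the separator size, which is why a bound on that size matters.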

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

Towards Molecule Generation with Heterogeneous States via Reinforcement Learning

Author: Shi Fangzhou
Publication venue: Faculty of Engineering, School of Computer Science
Publication date: 01/01/2020

De novo molecular design and generation are frequently studied in chemistry and biology because they play a critical role in sustaining the chemical industry and advancing drug discovery. Many significant problems in this field come down to designing molecular structures with specific desired properties. This research matters to both medicine and AI, since it can benefit the discovery of novel drugs for some diseases. However, it remains a challenging task due to the large size of chemical space. In recent years, reinforcement learning-based methods have used graphs to represent molecules and have treated molecule generation as a decision-making process. However, this vanilla graph representation may neglect the intrinsic context information within molecules, which limits generation performance. In this paper, we propose to augment the original graph states with SMILES context vectors. SMILES representations are easily processed by a simple language model, so the general semantic features of a molecule can be extracted, while the graph representations are better at handling the topological relationships between atoms. Moreover, we propose a framework that combines supervised learning and reinforcement learning to take both of these heterogeneous state representations of a molecule into account; it fuses the information from both and extracts more comprehensive features, so that the policy network can make more sophisticated decisions. Our model also introduces two attention mechanisms, action-attention and graph-attention, to further improve performance. We conduct our experiments on a practical dataset, ZINC, and the experimental results demonstrate that our framework outperforms other baselines in learning performance for molecule generation and chemical property optimization.

Sydney eScholarship

Kernel methods in machine learning

Author: Hofmann Thomas
Schölkopf Bernhard
Smola Alexander J.
Publication venue: 'Institute of Mathematical Statistics'
Publication date: 01/01/2008

We review machine learning methods employing positive definite kernels. These methods formulate learning and estimation problems in a reproducing kernel Hilbert space (RKHS) of functions defined on the data domain, expanded in terms of a kernel. Working in linear spaces of functions has the benefit of facilitating the construction and analysis of learning algorithms while at the same time allowing large classes of functions. The latter include nonlinear functions as well as functions defined on nonvectorial data. We cover a wide range of methods, ranging from binary classifiers to sophisticated methods for estimation with structured data. Comment: Published at http://dx.doi.org/10.1214/009053607000000677 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org
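As a concrete instance of a function "expanded in terms of a kernel", the sketch below trains a kernel perceptron, a simple binary classifier whose decision function is a weighted sum of kernel evaluations at the training points. The choice of the perceptron, the Gaussian (RBF) kernel, its `gamma` value, and the XOR toy data are assumptions made for illustration; the survey itself may not present this particular algorithm.

```python
import math

def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel, a standard positive definite kernel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

def decision_value(x, points, labels, alphas, kernel):
    """Kernel expansion f(x) = sum_i alpha_i * y_i * k(x_i, x)."""
    return sum(a * y * kernel(p, x) for a, y, p in zip(alphas, labels, points))

def train_kernel_perceptron(points, labels, kernel, max_epochs=100):
    """Increase alpha_i by one each time training point i is misclassified."""
    alphas = [0] * len(points)
    for _ in range(max_epochs):
        mistakes = 0
        for idx, (x, y) in enumerate(zip(points, labels)):
            if y * decision_value(x, points, labels, alphas, kernel) <= 0:
                alphas[idx] += 1
                mistakes += 1
        if mistakes == 0:  # every training point is classified correctly
            break
    return alphas

# XOR labels: not linearly separable in the input space.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, 1, 1, -1]
alphas = train_kernel_perceptron(points, labels, rbf_kernel)
predictions = [
    1 if decision_value(x, points, labels, alphas, rbf_kernel) > 0 else -1
    for x in points
]
print(alphas, predictions)
```

The learner stays linear in the coefficients `alphas`, yet the resulting classifier is nonlinear in the inputs, which is the trade-off the abstract describes.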

arXiv.org e-Print Archive

CiteSeerX

The Australian National University

MPG.PuRe

Bethe Projections for Non-Local Inference

Author: Belanger David
McCallum Andrew
Sheldon Daniel
Vilnis Luke
Publication date: 28/11/2016

Many inference problems in structured prediction are naturally solved by augmenting a tractable dependency structure with complex, non-local auxiliary objectives. This includes the mean field family of variational inference algorithms, soft- or hard-constrained inference using Lagrangian relaxation or linear programming, collective graphical models, and forms of semi-supervised learning such as posterior regularization. We present a method to discriminatively learn broad families of inference objectives, capturing powerful non-local statistics of the latent variables, while maintaining tractable and provably fast inference using non-Euclidean projected gradient descent with a distance-generating function given by the Bethe entropy. We demonstrate the performance and flexibility of our method by (1) extracting structured citations from research papers by learning soft global constraints, (2) achieving state-of-the-art results on a widely-used handwriting recognition task using a novel learned non-convex inference procedure, and (3) providing a fast and highly scalable algorithm for the challenging problem of inference in a collective graphical model applied to bird migration. Comment: minor bug fix to appendix. appeared in UAI 201
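To give a feel for "non-Euclidean projected gradient descent" with an entropy as the distance-generating function, here is a deliberately simplified sketch. It swaps the Bethe entropy over structured marginals for the plain Shannon entropy on a probability simplex, which turns each step into a multiplicative update followed by renormalisation. That substitution, the function name, the step size, and the linear toy objective are all assumptions for illustration and are not the paper's procedure.

```python
import math

def entropic_mirror_descent(gradient, dim, step_size, num_steps):
    """Mirror descent on the probability simplex with negative Shannon entropy.

    Each step multiplies the current point by exp(-step_size * gradient)
    and renormalises, so the iterate stays a valid distribution.
    """
    x = [1.0 / dim] * dim  # start from the uniform distribution
    for _ in range(num_steps):
        g = gradient(x)
        weights = [xi * math.exp(-step_size * gi) for xi, gi in zip(x, g)]
        total = sum(weights)
        x = [w / total for w in weights]
    return x

# Toy objective (hypothetical): minimise the linear cost <costs, x> over the simplex.
costs = [3.0, 1.0, 2.0]
solution = entropic_mirror_descent(lambda x: costs, dim=3, step_size=0.5, num_steps=50)
print(solution)
```

For a linear cost the minimiser is the vertex with the smallest cost, so the mass should concentrate on the second coordinate. No explicit projection step is needed because the entropy geometry keeps the iterates inside the simplex.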

arXiv.org e-Print Archive

CiteSeerX

A review on probabilistic graphical models in evolutionary computation

Author: Concha Bielza
Hossein Karshenas
Pedro Larrañaga
Roberto Santana
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Thanks to their inherent properties, probabilistic graphical models are one of the prime candidates for machine learning and decision-making tasks, especially in uncertain domains. Their capabilities, such as representation, inference and learning, can, if used effectively, greatly help to build intelligent systems that are able to act appropriately in different problem domains. Evolutionary computation is one discipline that has employed probabilistic graphical models to improve the search for optimal solutions in complex problems. This paper shows how probabilistic graphical models have been used in evolutionary algorithms to improve their performance in solving complex problems. Specifically, we give a survey of probabilistic model building-based evolutionary algorithms, called estimation of distribution algorithms, and compare different methods for probabilistic modeling in these algorithms.
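A minimal sketch of an estimation of distribution algorithm may help make the idea concrete. It uses the simplest possible probabilistic model, a product of independent Bernoulli marginals (often called UMDA), instead of the richer graphical models the survey compares. The OneMax fitness function, the population sizes, the probability clamps, and the seed are assumptions chosen for illustration.

```python
import random

def onemax(bits):
    """Toy fitness: number of ones in the bit string."""
    return sum(bits)

def umda(fitness, num_bits, pop_size=100, num_selected=50, generations=40, seed=0):
    """Univariate EDA: sample from the model, select, re-estimate the model."""
    rng = random.Random(seed)
    probs = [0.5] * num_bits  # model: independent probability of a 1 per bit
    best = None
    for _ in range(generations):
        # Sample a new population from the current probabilistic model.
        population = [[1 if rng.random() < p else 0 for p in probs]
                      for _ in range(pop_size)]
        population.sort(key=fitness, reverse=True)
        if best is None or fitness(population[0]) > fitness(best):
            best = population[0]
        # Truncation selection, then re-estimate the marginals from the survivors.
        selected = population[:num_selected]
        probs = [sum(ind[b] for ind in selected) / num_selected
                 for b in range(num_bits)]
        # Clamp so that no bit value is ever lost completely.
        probs = [min(0.95, max(0.05, p)) for p in probs]
    return best, probs

best, probs = umda(onemax, num_bits=20)
print(onemax(best), probs)
```

Replacing the independent marginals with a learned Bayesian or Markov network is what distinguishes the more advanced algorithms in this family; the sample, select, and re-estimate loop stays the same.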

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM

Modeling Word Forms Using Latent Underlying Morphs and Phonology

Author: Cotterell Ryan
Eisner Jason
Peng Nanyun
Publication venue: Transactions of the Association for Computational Linguistics (TACL) 2015
Publication date: 01/12/2015

The observed pronunciations or spellings of words are often explained as arising from the “underlying forms” of their morphemes. These forms are latent strings that linguists try to reconstruct by hand. We propose to reconstruct them automatically at scale, enabling generalization to new words. Given some surface word types of a concatenative language along with the abstract morpheme sequences that they express, we show how to recover consistent underlying forms for these morphemes, together with the (stochastic) phonology that maps each concatenation of underlying forms to a surface form. Our technique involves loopy belief propagation in a natural directed graphical model whose variables are unknown strings and whose conditional distributions are encoded as finite-state machines with trainable weights. We define training and evaluation paradigms for the task of surface word prediction, and report results on subsets of 7 languages.

Apollo (Cambridge)

Learning the Structure of Variable-Order CRFs: a finite-state perspective

Author: Lavergne Thomas
Yvon François
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017

The computational complexity of linear-chain Conditional Random Fields (CRFs) makes it difficult to deal with very large label sets and long-range dependencies. Such situations are not rare and arise when dealing with morphologically rich languages or joint labelling tasks. Here we extend recent proposals to consider variable-order CRFs. Using an effective finite-state representation of variable-length dependencies, we propose new ways to perform feature selection at large scale and report experimental results where we outperform strong baselines on a tagging task.

Crossref

Using generative models for handwritten digit recognition

Author: Hinton G. E.
Revow M.
Williams C. K. I.
Publication date: 01/01/1996

We describe a method of recognizing handwritten digits by fitting generative models that are built from deformable B-splines with Gaussian "ink generators" spaced along the length of the spline. The splines are adjusted using a novel elastic matching procedure based on the Expectation Maximization (EM) algorithm that maximizes the likelihood of the model generating the data. This approach has many advantages. (1) After identifying the model most likely to have generated the data, the system not only produces a classification of the digit but also a rich description of the instantiation parameters, which can yield information such as the writing style. (2) During the process of explaining the image, generative models can perform recognition-driven segmentation. (3) The method involves a relatively small number of parameters and hence training is relatively easy and fast. (4) Unlike many other recognition schemes, it does not rely on some form of pre-normalization of input images, but can handle arbitrary scalings, translations and a limited degree of image rotation. We have demonstrated that our method of fitting models to images does not get trapped in poor local minima. The main disadvantage of the method is that it requires much more computation than more standard OCR techniques.

CiteSeerX

Aston Publications Explorer

Character-Aware Neural Language Models

Author: Jernite Yacine
Kim Yoon
Rush Alexander M.
Sontag David
Publication date: 01/12/2015

We describe a simple neural language model that relies only on character-level inputs. Predictions are still made at the word-level. Our model employs a convolutional neural network (CNN) and a highway network over characters, whose output is given to a long short-term memory (LSTM) recurrent neural network language model (RNN-LM). On the English Penn Treebank the model is on par with the existing state-of-the-art despite having 60% fewer parameters. On languages with rich morphology (Arabic, Czech, French, German, Spanish, Russian), the model outperforms word-level/morpheme-level LSTM baselines, again with fewer parameters. The results suggest that on many languages, character inputs are sufficient for language modeling. Analysis of word representations obtained from the character composition part of the model reveals that the model is able to encode, from characters only, both semantic and orthographic information. Comment: AAAI 201

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications