191 research outputs found
More data means less inference: A pseudo-max approach to structured learning
The problem of learning to predict structured labels is of key importance in many applications. However, for general graph structures, both learning and inference in this setting are intractable. Here we show that it is possible to circumvent this difficulty when the input distribution is rich enough, via a method similar in spirit to pseudo-likelihood. We show how our new method achieves consistency, and illustrate empirically that it indeed performs as well as exact methods when sufficiently large training sets are used. United States-Israel Binational Science Foundation (Grant 2008303); Google (Firm) (Research Grant); Google (Firm) (PhD Fellowship)
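One reading of the pseudo-max idea is that, instead of enforcing a margin against every full joint labeling (exponentially many), the margin is enforced only against labelings that differ from the gold one in a single coordinate. The sketch below illustrates that reading; `phi`, `labels`, and the unit margin are assumptions for illustration, not the paper's code.

```python
import numpy as np

def pseudomax_loss(w, phi, x, y, labels):
    """Hinge-style pseudo-max loss for one example (x, y) -- a minimal sketch.

    phi(x, y): joint feature vector (np.ndarray); labels: the label set per variable.
    Only single-coordinate perturbations of the gold labeling y are penalized,
    which sidesteps full (intractable) structured inference during learning.
    """
    score_gold = w @ phi(x, y)
    loss = 0.0
    for i in range(len(y)):
        for alt in labels:
            if alt == y[i]:
                continue
            y_alt = list(y)
            y_alt[i] = alt  # perturb exactly one coordinate of the gold labeling
            loss += max(0.0, 1.0 + w @ phi(x, y_alt) - score_gold)
    return loss
```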
Learning Bayesian network structure using LP relaxations
We propose to solve the combinatorial problem of finding the highest scoring Bayesian network structure from data. This structure learning problem can be viewed as an inference problem where the variables specify the choice of parents for each node in the graph. The key combinatorial difficulty arises from the global constraint that the graph structure has to be acyclic. We cast the structure learning problem as a linear program over the polytope defined by valid acyclic structures. In relaxing this problem, we maintain an outer bound approximation to the polytope and iteratively tighten it by searching over a new class of valid constraints. If an integral solution is found, it is guaranteed to be the optimal Bayesian network. When the relaxation is not tight, the fast dual algorithms we develop remain useful in combination with a branch and bound method. Empirical results suggest that the method is competitive with or faster than alternative exact methods based on dynamic programming.
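One standard way to write such an LP over parent-set indicators is sketched below; the variables x_{i,Pa}, scores s(i,Pa), and cluster-based acyclicity constraints are offered as an illustrative reading, not a quotation of the paper.

```latex
\max_{x \ge 0} \; \sum_{i} \sum_{Pa} s(i, Pa)\, x_{i,Pa}
\quad \text{s.t.} \quad
\sum_{Pa} x_{i,Pa} = 1 \;\; \forall i,
\qquad
\sum_{i \in C} \; \sum_{Pa \,:\, Pa \cap C = \emptyset} x_{i,Pa} \;\ge\; 1
\quad \forall\, C \subseteq V,\; C \neq \emptyset .
```

The cluster constraints require every nonempty set C of nodes to contain at least one node whose parents all lie outside C, which rules out directed cycles; an outer bound keeps only a subset of these constraints and is tightened by searching for violated ones.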
Learning efficiently with approximate inference via dual losses
Many structured prediction tasks involve complex models where inference is computationally intractable, but where it can be well approximated using a linear programming relaxation. Previous approaches for learning for structured prediction (e.g., cutting-plane, subgradient methods, perceptron) repeatedly make predictions for some of the data points. These approaches are computationally demanding because each prediction involves solving a linear program to optimality. We present a scalable algorithm for learning for structured prediction. The main idea is to instead solve the dual of the structured prediction loss. We formulate the learning task as a convex minimization over both the weights and the dual variables corresponding to each data point. As a result, we can begin to optimize the weights even before completely solving any of the individual prediction problems. We show how the dual variables can be efficiently optimized using coordinate descent. Our algorithm is competitive with state-of-the-art methods such as stochastic subgradient and cutting-plane.
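As a rough illustration of interleaving dual updates with weight updates, consider the skeleton below; the callables, their signatures, and the update schedule are placeholders standing in for the LP-relaxation machinery, not the paper's algorithm.

```python
import numpy as np

def train_with_dual_losses(examples, n_features, dual_step, dual_subgrad,
                           epochs=10, eta=0.1):
    """Skeleton of joint optimization over weights and per-example dual variables.

    dual_step(w, x, y, delta)    -> improved dual variables for that example
                                    (e.g. one pass of coordinate descent;
                                     expected to initialize delta when it is None)
    dual_subgrad(w, x, y, delta) -> subgradient of the dual loss with respect to w
    Both callables are assumed to be supplied by the user.
    """
    w = np.zeros(n_features)
    duals = [None] * len(examples)
    for _ in range(epochs):
        for m, (x, y) in enumerate(examples):
            duals[m] = dual_step(w, x, y, duals[m])       # cheap partial inference
            w -= eta * dual_subgrad(w, x, y, duals[m])    # update w without waiting
    return w
```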
Steps to Excellence: Simple Inference with Refined Scoring of Dependency Trees
Much of the recent work on dependency parsing has been focused on solving inherent combinatorial problems associated with rich scoring functions. In contrast, we demonstrate that highly expressive scoring functions can be used with substantially simpler inference procedures. Specifically, we introduce a sampling-based parser that can easily handle arbitrary global features. Inspired by SampleRank, we learn to take guided stochastic steps towards a high-scoring parse. We introduce two samplers for traversing the space of trees: Gibbs and Metropolis-Hastings with random walk. The model outperforms state-of-the-art results when evaluated on 14 languages of non-projective CoNLL datasets. Our sampling-based approach naturally extends to joint prediction scenarios, such as joint parsing and POS correction. The resulting method outperforms the best reported results on the CATiB dataset, approaching the performance of parsing with gold tags. United States. Multidisciplinary University Research Initiative (W911NF-10-1-0533); United States. Defense Advanced Research Projects Agency, Broad Operational Language Translation; United States-Israel Binational Science Foundation (Grant 2012330)
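The guided stochastic search can be pictured with a bare Metropolis-Hastings loop over trees; `score` and `propose` below are user-supplied stand-ins (a symmetric random-walk proposal is assumed), and none of this is the authors' implementation.

```python
import math
import random

def mh_search(score, propose, init_tree, n_steps=1000):
    """Metropolis-Hastings over dependency trees -- an illustrative sketch.

    score(tree)   : arbitrary (possibly global-feature) scoring function
    propose(tree) : returns a random neighbouring tree, e.g. re-attach one word;
                    assumed symmetric, so the acceptance test only compares scores.
    """
    tree = best = init_tree
    for _ in range(n_steps):
        cand = propose(tree)
        # accept with probability min(1, exp(score(cand) - score(tree)))
        accept = math.exp(min(0.0, score(cand) - score(tree)))
        if random.random() < accept:
            tree = cand
            if score(tree) > score(best):
                best = tree
    return best
```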
MHC-Linked Syngeneic Developmental Preference in Thymic Lobes Colonized with Bone Marrow Cells: A Mathematical Model
Reconstitution of the T-cell compartment after bone marrow transplantation depends on successful colonization of the thymus by bone-marrow-derived progenitor cells. Recent studies compared the development of syngeneic and allogeneic bone-marrow-derived cells in cocultures with lymphoid-depleted fetal thymus explants, leading to the discovery of MHC-linked syngeneic developmental preference (SDP) in the thymus. To determine the nature of cell interactions among the bone marrow and thymic elements that might underlie SDP, we analyzed this phenomenon by mathematical modeling. The results indicate that syngeneic mature T cells, responsible for inducing this preference, probably interfere both with the seeding of allogeneic bone-marrow-derived thymocyte progenitors in the thymic stroma and with their subsequent proliferation. In addition, the possibility of augmented death among the developing allogeneic thymocytes cannot be ruled out.
Markov entropy decomposition: a variational dual for quantum belief propagation
We present a lower bound for the free energy of a quantum many-body system at finite temperature. This lower bound is expressed as a convex optimization problem with linear constraints, and is derived using strong subadditivity of von Neumann entropy and a relaxation of the consistency condition of local density operators. The dual to this minimization problem leads to a set of quantum belief propagation equations, thus providing a firm theoretical foundation to that approach. The minimization problem is numerically tractable, and we find good agreement with quantum Monte Carlo for the spin-half Heisenberg anti-ferromagnet in two dimensions. This lower bound complements other variational upper bounds. We discuss applications to Hamiltonian complexity theory and give a generalization of the structure theorem of Hayden, Jozsa, Petz and Winter to trees in an appendix.
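In symbols, the bound has roughly the following shape; the notation is an assumption for illustration (h_i are local Hamiltonian terms, M_i a local "Markov shield", and \mathcal{L} the linearly constrained set of local density operators), not a quotation of the paper.

```latex
F \;\ge\; \min_{\{\rho_{i M_i}\} \in \mathcal{L}}
  \sum_i \operatorname{Tr}\!\left(h_i\, \rho_{i M_i}\right)
  \;-\; T \sum_i S(i \mid M_i)_{\rho_{i M_i}} .
```

Strong subadditivity makes the sum of conditional entropies an upper bound on the global entropy, and relaxing global consistency to local linear constraints keeps the minimization convex and tractable.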
Estrogen-Receptor Expression and Function in Thymocytes in Relation to Gender and Age
The expression of estrogen receptor (ER) in thymocytes was studied in young, middle-aged, and old (2, 12, and 24 months, respectively) female and male C57BL/6J mice. Western immunoblots prepared from the thymocytes of females of all age groups showed the presence of a 67-kD protein band, which has been associated with the apparent MW of denatured ER. Flow cytometry analysis of cells stained with a monoclonal anti-ER antibody (clone 13H2) disclosed ER expression in both females and males of all age groups. In vivo treatment with estradiol (E2) led to an increase in the specific activity of thymic creatine kinase (CK) in the female mice, whereas the male thymocytes responded with an increase in CK activity only on treatment with dihydrotestosterone (DHT). The data show no differences in ER expression between males and females, but the receptor appears not to be functional in males. Interestingly, when estradiol was applied to co-cultures of lymphoid-depleted fetal thymus (FT) explants and bone-marrow cells, or thymocytes, from young and old females, it resulted in increased cellularity of cultures containing cells of the young, and not those of the old. The proportion of CD4/CD8 phenotypes of the developing cells in these cultures was not affected by E2 treatment. These observations provide a new insight into ER expression and function in T-cell development in relation to gender and age.
Robustness and Generalization
We derive generalization bounds for learning algorithms based on their robustness: the property that if a testing sample is "similar" to a training sample, then the testing error is close to the training error. This provides a novel approach, different from the complexity or stability arguments, to study generalization of learning algorithms. We further show that a weak notion of robustness is both sufficient and necessary for generalizability, which implies that robustness is a fundamental property for learning algorithms to work.
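The flavour of bound that such a robustness argument yields can be sketched as follows; the constants and notation are illustrative rather than quoted from the paper. If the algorithm trained on an n-sample s is (K, \epsilon(\cdot))-robust and the loss is bounded by M, then with probability at least 1-\delta,

```latex
\bigl|\, \mathcal{L}(\mathcal{A}_s) - \ell_{\mathrm{emp}}(\mathcal{A}_s) \,\bigr|
  \;\le\; \epsilon(s) \;+\; M \sqrt{\frac{2K \ln 2 + 2\ln(1/\delta)}{n}} .
```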
- …