Discovery of Linguistic Relations Using Lexical Attraction
This work has been motivated by two long-term goals: to understand how humans
learn language and to build programs that can understand language. Using a
representation that makes the relevant features explicit is a prerequisite for
successful learning and understanding. Therefore, I chose to represent
relations between individual words explicitly in my model. Lexical attraction
is defined as the likelihood of such relations. I introduce a new class of
probabilistic language models, named lexical attraction models, which can
represent long-distance relations between words, and I formalize this new
class of models using information theory.
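The abstract leaves the information-theoretic definition at a high level; one common reading of lexical attraction between two words is their (pointwise) mutual information estimated from co-occurrence counts. The following is a minimal sketch under that assumption; the windowed counting scheme and function names are illustrative choices, not the dissertation's exact estimator.

```python
import math
from collections import Counter

def lexical_attraction_table(sentences, window=5):
    """Estimate a PMI-style lexical attraction score for word pairs.

    Sketch: scores a pair (x, y) by pointwise mutual information
    log2( p(x, y) / (p(x) * p(y)) ), counting co-occurrences of words
    that appear within `window` positions of each other.
    """
    unigrams = Counter()
    pairs = Counter()
    total_words = 0
    total_pairs = 0
    for sent in sentences:
        words = sent.split()
        unigrams.update(words)
        total_words += len(words)
        for i, w in enumerate(words):
            for v in words[i + 1 : i + 1 + window]:
                pairs[(w, v)] += 1
                total_pairs += 1

    def pmi(x, y):
        pxy = pairs[(x, y)] / total_pairs
        px = unigrams[x] / total_words
        py = unigrams[y] / total_words
        return math.log2(pxy / (px * py)) if pxy > 0 else float("-inf")

    return pmi
```

Calling the returned function on a pair such as ("Statue", "Liberty") yields a score that is positive when the words co-occur more often than chance would predict.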
Within the framework of lexical attraction, I developed an unsupervised
language acquisition program that learns to identify linguistic relations in a
given sentence. The only explicitly represented linguistic knowledge in the
program is lexical attraction. There is no initial grammar or lexicon built in
and the only input is raw text. Learning and processing are interdigitated. The
processor uses the regularities detected by the learner to impose structure on
the input. This structure enables the learner to detect higher level
regularities. Using this bootstrapping procedure, the program was trained on
100 million words of Associated Press material and was able to achieve 60%
precision and 50% recall in finding relations between content words. Using
knowledge of lexical attraction, the program can identify the correct relations
in syntactically ambiguous sentences such as "I saw the Statue of Liberty
flying over New York."

Comment: dissertation, 56 pages
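The abstract does not spell out how the processor imposes structure on the input. One plausible scheme, sketched below under stated assumptions, is to accept links between word pairs in order of decreasing lexical attraction, rejecting any link that would cross an already accepted one (a planarity constraint). The `score` function (for instance, the PMI sketch above) and the acceptance threshold are illustrative assumptions, not the dissertation's exact parser.

```python
def greedy_links(words, score, threshold=0.0):
    """Sketch of a processor imposing structure via lexical attraction.

    Considers word pairs in order of decreasing attraction score and
    accepts a link unless it would cross a link already accepted.
    """
    candidates = sorted(
        ((score(words[i], words[j]), i, j)
         for i in range(len(words))
         for j in range(i + 1, len(words))),
        reverse=True,
    )
    links = []

    def crosses(i, j):
        # Two links (a, b) and (i, j) cross iff exactly one endpoint
        # of one lies strictly inside the other's span.
        return any(a < i < b < j or i < a < j < b for a, b in links)

    for s, i, j in candidates:
        if s > threshold and not crosses(i, j):
            links.append((i, j))
    return links
```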
Making Indefinite Kernel Learning Practical
In this paper we embed evolutionary computation into statistical learning theory. First, we outline the connection between large margin optimization and statistical learning and see why this paradigm is successful for many pattern recognition problems. We then embed evolutionary computation into the most prominent representative of this class of learning methods, namely Support Vector Machines (SVMs). In contrast to former applications of evolutionary algorithms to SVMs, we do not merely optimize the method or kernel parameters. Rather, we use evolution strategies to directly solve the posed constrained optimization problem. Transforming the problem into the Wolfe dual reduces the total runtime and allows the use of kernel functions just as for traditional SVMs. We show that evolutionary SVMs are at least as accurate as their quadratic programming counterparts on eight real-world benchmark data sets in terms of generalization performance. They always outperform traditional approaches in terms of the original optimization problem. Additionally, the proposed algorithm is more generic than existing traditional solutions, since it also works for non-positive-semidefinite or indefinite kernel functions. The evolutionary SVM variants frequently outperform their quadratic programming competitors in cases where such an indefinite kernel function is used.
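As a rough illustration of the idea, the sketch below maximizes the SVM Wolfe dual objective with a simple (1+1) evolution strategy instead of quadratic programming, so the kernel matrix need not be positive semidefinite. The omission of the bias term (and hence of the equality constraint sum(a_i * y_i) = 0), the mutation scheme, and all parameter choices are simplifying assumptions of this sketch, not the paper's exact algorithm.

```python
import numpy as np

def es_svm_dual(K, y, C=1.0, generations=500, sigma=0.1, seed=0):
    """(1+1)-ES sketch for the SVM Wolfe dual with an arbitrary kernel matrix K.

    Maximizes W(a) = sum(a) - 0.5 * a^T (yy^T * K) a  subject to 0 <= a <= C.
    Unlike quadratic programming, nothing here requires K to be positive
    semidefinite. The bias term is omitted, which drops the usual equality
    constraint sum(a * y) = 0; that simplification is an assumption.
    """
    rng = np.random.default_rng(seed)
    Q = (y[:, None] * y[None, :]) * K          # dual quadratic form

    def dual(a):
        return a.sum() - 0.5 * a @ Q @ a

    a = rng.uniform(0.0, C, size=len(y))       # feasible starting point
    best = dual(a)
    for _ in range(generations):
        # Gaussian mutation, clipped back into the box constraints.
        child = np.clip(a + rng.normal(0.0, sigma, size=a.shape), 0.0, C)
        f = dual(child)
        if f >= best:                          # plus-selection: keep improvements
            a, best = child, f
    return a
```

A trained model then classifies a new point x via sign(sum_i a_i * y_i * K(x_i, x)), evaluating the kernel directly even when it is indefinite.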
Zero-shot causal learning
Predicting how different interventions will causally affect a specific
individual is important in a variety of domains such as personalized medicine,
public policy, and online marketing. There are a large number of methods to
predict the effect of an existing intervention based on historical data from
individuals who received it. However, in many settings it is important to
predict the effects of novel interventions (e.g., a newly invented
drug), which these methods do not address. Here, we consider zero-shot causal
learning: predicting the personalized effects of a novel intervention. We
propose CaML, a causal meta-learning framework which formulates the
personalized prediction of each intervention's effect as a task. CaML trains a
single meta-model across thousands of tasks, each constructed by sampling an
intervention, along with its recipients and nonrecipients. By leveraging both
intervention information (e.g., a drug's attributes) and individual
features (e.g., a patient's history), CaML is able to predict the
personalized effects of novel interventions that do not exist at the time of
training. Experimental results on real world datasets in large-scale medical
claims and cell-line perturbations demonstrate the effectiveness of our
approach. Most strikingly, CaML's zero-shot predictions outperform even strong
baselines trained directly on data from the test interventions.
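To make the task construction concrete, here is a hedged sketch of a CaML-style setup: each task pairs an intervention's attribute vector with its recipients, and one meta-model is trained across all tasks on pseudo-effect labels. The naive pseudo-label used here (a recipient's outcome minus the mean outcome of nonrecipients), the dictionary keys, and the choice of regressor are illustrative stand-ins for the paper's actual CATE estimation, not its method.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

def train_meta_model(tasks):
    """Train one regressor across many intervention tasks.

    tasks: list of dicts with keys (all names are assumptions):
      'w'       - intervention attribute vector (e.g., drug features)
      'X1'      - feature rows of individuals who received the intervention
      'y1'      - their outcomes
      'y0_mean' - mean outcome among nonrecipients (crude control estimate)
    """
    X, t = [], []
    for task in tasks:
        for x, y in zip(task["X1"], task["y1"]):
            X.append(np.concatenate([task["w"], x]))  # intervention + individual
            t.append(y - task["y0_mean"])             # naive pseudo treatment effect
    return GradientBoostingRegressor().fit(np.array(X), np.array(t))

def predict_zero_shot(model, w_new, X_individuals):
    """Predict personalized effects of an intervention unseen during training."""
    inputs = [np.concatenate([w_new, x]) for x in X_individuals]
    return model.predict(np.array(inputs))
```

Because the intervention's attributes are part of every input row, the trained model can be queried with a brand-new attribute vector `w_new`, which is what enables the zero-shot prediction the abstract describes.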