5,694 research outputs found

    Discovery of Linguistic Relations Using Lexical Attraction

    Full text link
    This work has been motivated by two long term goals: to understand how humans learn language and to build programs that can understand language. Using a representation that makes the relevant features explicit is a prerequisite for successful learning and understanding. Therefore, I chose to represent relations between individual words explicitly in my model. Lexical attraction is defined as the likelihood of such relations. I introduce a new class of probabilistic language models named lexical attraction models which can represent long distance relations between words and I formalize this new class of models using information theory. Within the framework of lexical attraction, I developed an unsupervised language acquisition program that learns to identify linguistic relations in a given sentence. The only explicitly represented linguistic knowledge in the program is lexical attraction. There is no initial grammar or lexicon built in and the only input is raw text. Learning and processing are interdigitated. The processor uses the regularities detected by the learner to impose structure on the input. This structure enables the learner to detect higher level regularities. Using this bootstrapping procedure, the program was trained on 100 million words of Associated Press material and was able to achieve 60% precision and 50% recall in finding relations between content-words. Using knowledge of lexical attraction, the program can identify the correct relations in syntactically ambiguous sentences such as ``I saw the Statue of Liberty flying over New York.''Comment: dissertation, 56 page

    Making Indefinite Kernel Learning Practical

    Get PDF
    In this paper we embed evolutionary computation into statistical learning theory. First, we outline the connection between large margin optimization and statistical learning and see why this paradigm is successful for many pattern recognition problems. We then embed evolutionary computation into the most prominent representative of this class of learning methods, namely into Support Vector Machines (SVM). In contrast to former applications of evolutionary algorithms to SVM we do not only optimize the method or kernel parameters. We rather use evolution strategies in order to directly solve the posed constrained optimization problem. Transforming the problem into the Wolfe dual reduces the total runtime and allows the usage of kernel functions just as for traditional SVM. We will show that evolutionary SVM are at least as accurate as their quadratic programming counterparts on eight real-world benchmark data sets in terms of generalization performance. They always outperform traditional approaches in terms of the original optimization problem. Additionally, the proposed algorithm is more generic than existing traditional solutions since it will also work for non-positive semidefinite or indefinite kernel functions. The evolutionary SVM variants frequently outperform their quadratic programming competitors in cases where such an indefinite Kernel function is used. --

    Zero-shot causal learning

    Full text link
    Predicting how different interventions will causally affect a specific individual is important in a variety of domains such as personalized medicine, public policy, and online marketing. There are a large number of methods to predict the effect of an existing intervention based on historical data from individuals who received it. However, in many settings it is important to predict the effects of novel interventions (\emph{e.g.}, a newly invented drug), which these methods do not address. Here, we consider zero-shot causal learning: predicting the personalized effects of a novel intervention. We propose CaML, a causal meta-learning framework which formulates the personalized prediction of each intervention's effect as a task. CaML trains a single meta-model across thousands of tasks, each constructed by sampling an intervention, along with its recipients and nonrecipients. By leveraging both intervention information (\emph{e.g.}, a drug's attributes) and individual features~(\emph{e.g.}, a patient's history), CaML is able to predict the personalized effects of novel interventions that do not exist at the time of training. Experimental results on real world datasets in large-scale medical claims and cell-line perturbations demonstrate the effectiveness of our approach. Most strikingly, CaML's zero-shot predictions outperform even strong baselines trained directly on data from the test interventions
    corecore