Robust domain adaptation for relation extraction via clustering consistency
We propose a two-phase framework to adapt existing relation extraction classifiers to extract relations for new target domains. We address two challenges: negative transfer when knowledge in source domains is used without considering the differences in relation distributions; and lack of adequate labeled samples for rarer relations in the new domain, due to a small labeled data set and imbalanced relation distributions. Our framework leverages both labeled and unlabeled data in the target domain. First, we determine the relevance of each source domain to the target domain for each relation type, using the consistency between the clustering given by the target domain labels and the clustering given by the predictors trained for the source domain. To overcome the lack of labeled samples for rarer relations, these clusterings operate on both the labeled and unlabeled data in the target domain. Second, we trade off between using relevance-weighted source-domain predictors and the labeled target data. Again, to overcome the imbalanced distribution, the source-domain predictors operate on the unlabeled target data. Our method outperforms numerous baselines and a weakly-supervised relation extraction method on ACE 2004 and YAGO. © 2014 Association for Computational Linguistics
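The first phase described above (scoring each source domain by how consistent its predictor's clustering is with the clustering induced by target labels) can be illustrated with a minimal sketch. It is not the authors' method: the Rand index is swapped in as a stand-in consistency measure, only labeled target points are used, and the names `rand_index`, `relevance_weights`, `news`, and `web` are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree.

    Two clusterings agree on a pair when both put the pair in the same
    cluster or both put it in different clusters.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("clusterings must cover the same items")
    pairs = list(combinations(range(len(labels_a)), 2))
    if not pairs:
        return 1.0
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def relevance_weights(target_labels, source_predictions):
    """Turn per-source consistency scores into weights that sum to 1.

    Assumption: relevance is taken to be proportional to the Rand index;
    the paper's actual consistency measure and weighting may differ.
    """
    scores = {
        name: rand_index(target_labels, preds)
        for name, preds in source_predictions.items()
    }
    total = sum(scores.values())
    if total == 0:
        # No source agrees with the target at all: fall back to uniform.
        return {name: 1.0 / len(scores) for name in scores}
    return {name: score / total for name, score in scores.items()}


# Toy data for one relation type (1 = relation holds, 0 = it does not).
target = [1, 1, 0, 0]
predictions = {
    "news": [1, 1, 0, 0],  # hypothetical source whose predictor matches
    "web": [1, 0, 1, 0],   # hypothetical source whose predictor does not
}
weights = relevance_weights(target, predictions)
```

In this toy case the source whose predictions reproduce the target grouping receives the larger weight, which is the intuition behind down-weighting sources that would otherwise cause negative transfer.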
Gaussian Process Pseudo-Likelihood Models for Sequence Labeling
Several machine learning problems arising in natural language processing can
be modeled as sequence labeling problems. We provide Gaussian process models
based on pseudo-likelihood approximation to perform sequence labeling. Gaussian
processes (GPs) provide a Bayesian approach to learning in a kernel-based
framework. The pseudo-likelihood model enables one to capture long-range
dependencies among the output components of the sequence without becoming
computationally intractable. We use an efficient variational Gaussian
approximation method to perform inference in the proposed model. We also
provide an iterative algorithm which can effectively make use of the
information from the neighboring labels to perform prediction. The ability to
capture long-range dependencies makes the proposed approach useful for a wide
range of sequence labeling problems. Numerical experiments on some sequence
labeling data sets demonstrate the usefulness of the proposed approach.
Comment: 18 pages, 5 figures
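For readers unfamiliar with the approximation named in this abstract, the generic pseudo-likelihood factorization can be written as follows. The notation is an assumption (an input sequence $\mathbf{x}$, labels $y_1, \dots, y_T$, and $\mathbf{y}_{\setminus t}$ for all labels except $y_t$); the paper's exact model, for example which labels each factor conditions on, may differ.

```latex
p(\mathbf{y} \mid \mathbf{x})
  \;\approx\;
  \prod_{t=1}^{T} p\bigl(y_t \mid \mathbf{y}_{\setminus t}, \mathbf{x}\bigr)
```

Each factor normalizes over a single label only, so the joint normalization over all label sequences is avoided, while every factor can still condition on labels far from position $t$.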
Current and prospective pharmacological targets in relation to antimigraine action
Migraine is a recurrent incapacitating neurovascular disorder characterized by unilateral and throbbing headaches associated with photophobia, phonophobia, nausea, and vomiting. Current specific drugs used in the acute treatment of migraine interact with vascular receptors, a fact that has raised concerns about their cardiovascular safety. In the past, α-adrenoceptor agonists (ergotamine, dihydroergotamine, isometheptene) were used. The last two decades have witnessed the advent of 5-HT1B/1D receptor agonists (sumatriptan and second-generation triptans), which have a well-established efficacy in the acute treatment of migraine. Moreover, current prophylactic treatments of migraine include 5-HT2 receptor antagonists, Ca2+ channel blockers, and β-adrenoceptor antagonists. Despite the progress in migraine research and in view of its complex etiology, this disease still remains underdiagnosed, and available therapies are underused. In this review, we have discussed pharmacological targets in migraine, with special emphasis on compounds acting on 5-HT (5-HT1-7), adrenergic (α1, α2, and β), calcitonin gene-related peptide (CGRP1 and CGRP2), adenosine (A1, A2, and A3), glutamate (NMDA, AMPA, kainate, and metabotropic), dopamine, endothelin, and female hormone (estrogen and progesterone) receptors. In addition, we have considered some other targets, including gamma-aminobutyric acid, angiotensin, bradykinin, histamine, and ionotropic receptors, in relation to antimigraine therapy. Finally, the cardiovascular safety of current and prospective antimigraine therapies is touched upon.
Active learning for probabilistic hypotheses using the maximum Gibbs error criterion
Advances in Neural Information Processing Systems
Domain adaptation for coreference resolution: An adaptive ensemble approach
We propose an adaptive ensemble method to adapt coreference resolution across domains. This method has three features: (1) it can optimize for any user-specified objective measure; (2) it can make document-specific predictions rather than rely on a fixed base model or a fixed set of base models; (3) it can automatically adjust the active ensemble members during prediction. With simplification, this method can be used in the traditional within-domain case, while still retaining the above features. To the best of our knowledge, this work is the first to both (i) develop a domain adaptation algorithm for the coreference resolution problem and (ii) have the above features as an ensemble method. Empirically, we show the benefits of (i) on the six domains of the ACE 2005 data set in the domain adaptation setting, and of (ii) on both the MUC-6 and the ACE 2005 data sets in the within-domain setting. © 2012 Association for Computational Linguistics
A split-merge framework for comparing clusterings
Clustering evaluation measures are frequently used to evaluate the performance of algorithms. However, most measures are not properly normalized and ignore some information in the inherent structure of clusterings. We model the relation between two clusterings as a bipartite graph and propose a general component-based decomposition formula based on the components of the graph. Most existing measures are examples of this formula. In order to satisfy consistency in the component, we further propose a split-merge framework for comparing clusterings of different data sets. Our framework gives measures that are conditionally normalized, and it can make use of data point information, such as feature vectors and pairwise distances. We use an entropy-based instance of the framework and a coreference resolution data set to demonstrate empirically the utility of our framework over other measures. Copyright 2012 by the author(s)/owner(s)
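As a concrete point of reference for entropy-based comparison of clusterings, the sketch below computes the variation of information, a standard measure built from the joint counts of two clusterings. It is not the split-merge measure proposed in the paper, and the function name `variation_of_information` is hypothetical.

```python
import math
from collections import Counter


def variation_of_information(labels_a, labels_b):
    """Return H(A|B) + H(B|A) in nats for two clusterings of the same items.

    The value is 0 when the clusterings are identical up to relabeling
    and grows as they diverge.
    """
    n = len(labels_a)
    if n != len(labels_b):
        raise ValueError("clusterings must cover the same items")
    if n == 0:
        return 0.0
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    # Joint counts: how many items fall in cluster a of A and cluster b of B.
    joint = Counter(zip(labels_a, labels_b))
    vi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        # Contribution of this cell to H(B|A) + H(A|B).
        vi -= p_ab * (math.log(p_ab / p_a) + math.log(p_ab / p_b))
    return vi


same = variation_of_information([0, 0, 1, 1], ["x", "x", "y", "y"])
crossed = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
```

Here `same` compares two clusterings that differ only in cluster names, and `crossed` compares two clusterings that cut the four items in orthogonal ways. Note that this measure is unnormalized, which is the kind of limitation the abstract raises about existing measures.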