Domain adaptation for sequence labeling using hidden Markov models
Most natural language processing systems based on machine learning are not
robust to domain shift. For example, a state-of-the-art syntactic dependency
parser trained on Wall Street Journal sentences has an absolute drop in
performance of more than ten points when tested on textual data from the Web.
An efficient solution to make these methods more robust to domain shift is to
first learn a word representation using large amounts of unlabeled data from
both domains, and then use this representation as features in a supervised
learning algorithm. In this paper, we propose to use hidden Markov models to
learn word representations for part-of-speech tagging. In particular, we study
the influence of using data from the source, the target or both domains to
learn the representation and the different ways to represent words using an
HMM. Comment: New Directions in Transfer and Multi-Task: Learning Across Domains
and Tasks (NIPS Workshop) (2013)
Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure
It has been established that incorporating word cluster features derived from large unlabeled corpora can significantly improve prediction of linguistic structure. While previous work has focused primarily on English, we extend these results to other languages along two dimensions. First, we show that these results hold true for a number of languages across families. Second, and more interestingly, we provide an algorithm for inducing cross-lingual clusters and we show that features derived from these clusters significantly improve the accuracy of cross-lingual structure prediction. Specifically, we show that by augmenting direct-transfer systems with cross-lingual cluster features, the relative error of delexicalized dependency parsers, trained on English treebanks and transferred to foreign languages, can be reduced by up to 13%. When applying the same method to direct transfer of named-entity recognizers, we observe relative improvements of up to 26%.
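The idea of augmenting a delexicalized model with cross-lingual cluster features can be illustrated with a minimal sketch. This is not the paper's cluster-induction algorithm or its feature set: the cluster map, the feature names, and the function `token_features` are all hypothetical and chosen only to show why shared cluster IDs let a model trained on one language apply to another.

```python
from typing import Dict, List

# Hypothetical cross-lingual cluster map (an assumption for illustration):
# translation-equivalent words from English and Spanish share a cluster ID.
CROSS_LINGUAL_CLUSTERS: Dict[str, int] = {
    "dog": 3, "perro": 3,
    "runs": 7, "corre": 7,
}

UNKNOWN_CLUSTER = -1  # ID used for words missing from the cluster map


def token_features(words: List[str], pos_tags: List[str], i: int,
                   clusters: Dict[str, int]) -> Dict[str, str]:
    """Build delexicalized features for token i.

    Only POS tags and cluster IDs are used, never the word forms themselves,
    so the same feature space applies to the source and target languages.
    """
    def cluster_of(word: str) -> str:
        return str(clusters.get(word.lower(), UNKNOWN_CLUSTER))

    feats = {"pos": pos_tags[i], "cluster": cluster_of(words[i])}
    if i > 0:
        # Context features from the preceding token.
        feats["prev_pos"] = pos_tags[i - 1]
        feats["prev_cluster"] = cluster_of(words[i - 1])
    return feats


english = token_features(["dog", "runs"], ["NOUN", "VERB"], 1,
                         CROSS_LINGUAL_CLUSTERS)
spanish = token_features(["perro", "corre"], ["NOUN", "VERB"], 1,
                         CROSS_LINGUAL_CLUSTERS)
```

In this toy setting the English and Spanish tokens receive identical feature dictionaries, which is what allows a classifier trained on English annotations to score Spanish tokens directly.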
One Sense per Context Cluster: Improving Word Sense Disambiguation Using Web-Scale Phrase Clustering
Abstract: The performance of the word sense disambiguation task is still limited by lexical context matching because of the data sparsity problem. In this paper we present a simple but effective method that incorporates web-scale phrase clustering results for context matching. This method is able to capture some semantic relations that are not in WordNet. Without using any additional labeled data, this new approach obtained 2.11%-6.92% higher accuracy than a typical supervised classifier.