25 research outputs found

    Domain adaptation for sequence labeling using hidden Markov models

    Most natural language processing systems based on machine learning are not robust to domain shift. For example, a state-of-the-art syntactic dependency parser trained on Wall Street Journal sentences has an absolute drop in performance of more than ten points when tested on textual data from the Web. An efficient solution to make these methods more robust to domain shift is to first learn a word representation using large amounts of unlabeled data from both domains, and then use this representation as features in a supervised learning algorithm. In this paper, we propose to use hidden Markov models to learn word representations for part-of-speech tagging. In particular, we study the influence of using data from the source, the target or both domains to learn the representation and the different ways to represent words using an HMM.
    Comment: New Directions in Transfer and Multi-Task: Learning Across Domains and Tasks (NIPS Workshop) (2013)

    Cross-lingual Word Clusters for Direct Transfer of Linguistic Structure

    It has been established that incorporating word cluster features derived from large unlabeled corpora can significantly improve prediction of linguistic structure. While previous work has focused primarily on English, we extend these results to other languages along two dimensions. First, we show that these results hold true for a number of languages across families. Second, and more interestingly, we provide an algorithm for inducing cross-lingual clusters and we show that features derived from these clusters significantly improve the accuracy of cross-lingual structure prediction. Specifically, we show that by augmenting direct-transfer systems with cross-lingual cluster features, the relative error of delexicalized dependency parsers, trained on English treebanks and transferred to foreign languages, can be reduced by up to 13%. When applying the same method to direct transfer of named-entity recognizers, we observe relative improvements of up to 26%.

    One Sense per Context Cluster: Improving Word Sense Disambiguation Using Web-Scale Phrase Clustering

    The performance of word sense disambiguation is still limited by lexical context matching because of data sparsity. In this paper, we present a simple but effective method that incorporates web-scale phrase clustering results for context matching. This method is able to capture some semantic relations that are not in WordNet. Without using any additional labeled data, this new approach obtained 2.11%-6.92% higher accuracy than a typical supervised classifier.