11,106 research outputs found

    Latent sentiment model for weakly-supervised cross-lingual sentiment classification

    In this paper, we present a novel weakly-supervised method for cross-lingual sentiment analysis. Specifically, we propose a latent sentiment model (LSM) based on latent Dirichlet allocation in which sentiment labels are treated as topics. Prior information extracted from English sentiment lexicons through machine translation is incorporated into LSM model learning, where preferences on the expected sentiment labels of lexicon words are expressed using generalized expectation criteria. An efficient parameter estimation procedure using variational Bayes is presented. Experimental results on Chinese product reviews show that the weakly-supervised LSM performs comparably to supervised classifiers such as Support Vector Machines, with an average accuracy of 81% over a total of 5,484 review documents. Moreover, starting with a generic sentiment lexicon, the LSM is able to extract highly domain-specific polarity words from text.
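    The abstract gives no implementation details, but the central idea of treating sentiment labels as topics and nudging lexicon words toward their known polarity can be shown with a minimal sketch. The code below is not the authors' LSM: it uses collapsed Gibbs sampling instead of their variational Bayes procedure and encodes the lexicon prior as an asymmetric Dirichlet rather than generalized expectation criteria; all names and parameters are illustrative.

```python
# Minimal sketch (assumption, not the paper's LSM): an LDA-style model with
# K=2 "topics" interpreted as sentiment labels. Lexicon words get an
# asymmetric prior biasing them toward their polarity; inference uses
# collapsed Gibbs sampling rather than the paper's variational Bayes.
import numpy as np

def train_lsm_sketch(docs, vocab, pos_words, neg_words,
                     n_iter=200, alpha=0.5, beta=0.01, bias=1.0, seed=0):
    """docs: list of lists of token ids; vocab: list of word strings.
    pos_words / neg_words: sets of lexicon words (e.g. translated from English)."""
    rng = np.random.default_rng(seed)
    K, V = 2, len(vocab)                      # topic 0 = positive, 1 = negative
    # Asymmetric word prior: boost lexicon words under their own polarity.
    beta_kv = np.full((K, V), beta)
    for v, w in enumerate(vocab):
        if w in pos_words:
            beta_kv[0, v] += bias
        elif w in neg_words:
            beta_kv[1, v] += bias

    n_kv = np.zeros((K, V))                   # topic-word counts
    n_dk = np.zeros((len(docs), K))           # document-topic counts
    z = [rng.integers(K, size=len(d)) for d in docs]
    for d, doc in enumerate(docs):
        for i, v in enumerate(doc):
            n_kv[z[d][i], v] += 1
            n_dk[d, z[d][i]] += 1

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, v in enumerate(doc):
                k = z[d][i]
                n_kv[k, v] -= 1
                n_dk[d, k] -= 1
                # Collapsed Gibbs update: P(z=k) ∝ (n_dk + α)(n_kv + β_kv)/(Σ_v n_kv + Σ_v β_kv)
                p = (n_dk[d] + alpha) * (n_kv[:, v] + beta_kv[:, v]) \
                    / (n_kv.sum(axis=1) + beta_kv.sum(axis=1))
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k
                n_kv[k, v] += 1
                n_dk[d, k] += 1

    theta = (n_dk + alpha) / (n_dk + alpha).sum(axis=1, keepdims=True)
    return theta                              # per-document sentiment proportions
```

    Classifying a review under this sketch then amounts to taking the argmax over its row of theta (positive vs. negative).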

    Using foreign inclusion detection to improve parsing performance

    Inclusions from other languages can be a significant source of errors for monolingual parsers. We show this for English inclusions, which are sufficiently frequent to present a problem when parsing German. We describe an annotation-free approach for accurately detecting such inclusions, and develop two methods for interfacing this approach with a state-of-the-art parser for German. An evaluation on the TIGER corpus shows that our inclusion entity model achieves a performance gain of 4.3 points in F-score over a baseline of no inclusion detection, and even outperforms a parser with access to gold standard part-of-speech tags.
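    As a rough illustration of the inclusion-detection task only (not the annotation-free approach described above), a simple word-list lookup baseline for flagging English tokens in German text could look like the sketch below; the word-list paths and the example sentence are hypothetical placeholders.

```python
# Illustrative baseline (assumption, not the paper's detector): mark a token
# as a candidate English inclusion if it appears in an English word list but
# not in a German one. Word-list file paths are hypothetical.
def load_wordlist(path):
    with open(path, encoding="utf-8") as f:
        return {line.strip().lower() for line in f if line.strip()}

def detect_english_inclusions(tokens, english_vocab, german_vocab):
    """Return indices of tokens that look like English inclusions."""
    hits = []
    for i, tok in enumerate(tokens):
        w = tok.lower()
        if w.isalpha() and w in english_vocab and w not in german_vocab:
            hits.append(i)
    return hits

if __name__ == "__main__":
    english_vocab = load_wordlist("english_words.txt")   # hypothetical path
    german_vocab = load_wordlist("german_words.txt")      # hypothetical path
    sentence = "Das neue Release des Software Updates erscheint morgen".split()
    print(detect_english_inclusions(sentence, english_vocab, german_vocab))
```

    The detected spans could then be passed to a parser, for example by constraining them to be analysed as single (foreign-material) constituents, which is the kind of interfacing the abstract alludes to.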