158 research outputs found

    Instance Weighting for Domain Adaptation in NLP

    Domain adaptation is an important problem in natural language processing (NLP) due to the lack of labeled data in novel domains. In this paper, we study the domain adaptation problem from the instance weighting perspective. We formally analyze and characterize the domain adaptation problem from a distributional view, and show that there are two distinct needs for adaptation, corresponding to the different distributions of instances and classification functions in the source and the target domains. We then propose a general instance weighting framework for domain adaptation. Our empirical results on three NLP tasks show that incorporating and exploiting more information from the target domain through instance weighting is effective.
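
    The weighting idea above can be sketched in a few lines of code. The following is a minimal, hypothetical illustration, not the paper's actual method: a domain discriminator estimates how target-like each labeled source instance is, and those estimates become per-instance weights in the task classifier's loss. The tiny datasets and all names are made up for the example.

```python
# Minimal, hypothetical sketch of instance weighting for domain adaptation,
# assuming scikit-learn. A domain discriminator scores how target-like each
# source instance is; the scores become sample weights in the task loss.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

source_texts = ["cheap flights to boston", "book me a hotel room"]   # labeled source domain
source_labels = np.array([1, 0])                                     # made-up task labels
target_texts = ["play some jazz music", "skip to the next song"]     # unlabeled target domain

vec = TfidfVectorizer().fit(source_texts + target_texts)
Xs = vec.transform(source_texts).toarray()
Xt = vec.transform(target_texts).toarray()

# Train a source-vs-target discriminator; P(target | x) on a source
# instance approximates a density-ratio-style importance weight.
domain_clf = LogisticRegression().fit(
    np.vstack([Xs, Xt]),
    np.array([0] * len(Xs) + [1] * len(Xt)),
)
weights = domain_clf.predict_proba(Xs)[:, 1]

# Fit the task classifier with the per-instance weights in its loss.
task_clf = LogisticRegression().fit(Xs, source_labels, sample_weight=weights)
```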

    A fuzzy domain adaptation method based on self-constructing fuzzy neural network

    Domain adaptation addresses the problem of how to utilize a model trained in the source domain to make predictions for the target domain when the distributions of the two domains differ substantially and labeled data in the target domain are costly to collect for retraining. Existing studies are unable to handle the issue of information granularity; in this paper, we propose a new fuzzy domain adaptation method based on a self-constructing fuzzy neural network. This approach models the transferred knowledge that supports the development of the current models granularly, in the form of fuzzy sets, and adapts that knowledge using a fuzzy similarity measure to reduce prediction error in the target domain.
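
    A brief sketch may help make the granular idea concrete. The example below is an assumption-laden illustration rather than the paper's method: source and target knowledge are represented as Gaussian fuzzy sets, and a Jaccard-style fuzzy similarity measure is used to judge how transferable a source granule is; both the membership functions and the specific measure are illustrative choices.

```python
# Assumption-laden illustration of granular knowledge transfer: source and
# target knowledge are modelled as Gaussian fuzzy sets, and a Jaccard-style
# fuzzy similarity measure decides how transferable a source granule is.
import numpy as np

def gaussian_membership(x, center, sigma):
    """Membership degree of x in a Gaussian fuzzy set."""
    return np.exp(-((x - center) ** 2) / (2 * sigma ** 2))

def fuzzy_similarity(mu_a, mu_b):
    """|A intersect B| / |A union B|, with min as intersection and max as
    union, evaluated on a shared grid of points."""
    return np.minimum(mu_a, mu_b).sum() / np.maximum(mu_a, mu_b).sum()

grid = np.linspace(-5.0, 5.0, 200)
source_granule = gaussian_membership(grid, center=0.0, sigma=1.0)  # source-domain fuzzy set
target_granule = gaussian_membership(grid, center=0.8, sigma=1.2)  # target-domain fuzzy set

sim = fuzzy_similarity(source_granule, target_granule)
# A higher similarity suggests the source granule transfers well; the score
# could gate how strongly the transferred knowledge is adapted.
print(f"fuzzy similarity = {sim:.3f}")
```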

    Automatically extracting polarity-bearing topics for cross-domain sentiment classification

    The joint sentiment-topic (JST) model was previously proposed to detect sentiment and topics simultaneously from text. The only supervision required for JST model learning is domain-independent polarity word priors. In this paper, we modify the JST model by incorporating word polarity priors through modifying the topic-word Dirichlet priors. We study the polarity-bearing topics extracted by JST and show that by augmenting the original feature space with polarity-bearing topics, in-domain supervised classifiers learned from the augmented feature representation achieve state-of-the-art performance of 95% on the movie review data and an average of 90% on the multi-domain sentiment dataset. Furthermore, using feature augmentation and selection according to the information gain criterion for cross-domain sentiment classification, our proposed approach performs better than or comparably to previous approaches. Moreover, our approach is much simpler and does not require difficult parameter tuning.
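
    As a rough illustration of how polarity word priors might be folded into the topic-word Dirichlet priors, the sketch below builds an asymmetric prior tensor in which lexicon words receive extra prior mass under topics of the matching sentiment label and near-zero mass under the opposite one. The lexicon, hyperparameter values, and tensor layout are invented for the example and are not taken from the paper.

```python
# Invented illustration of polarity-bearing Dirichlet priors: lexicon words
# get extra prior mass under topics of the matching sentiment label and
# near-zero mass under the opposite label. Values are not from the paper.
import numpy as np

vocab = ["excellent", "terrible", "plot", "actor"]
positive_words, negative_words = {"excellent"}, {"terrible"}

n_sentiments, n_topics = 2, 3          # sentiment 0 = positive, 1 = negative
base_beta, boost, epsilon = 0.01, 0.9, 1e-6

# beta[s, k, w] is the Dirichlet prior for word w under sentiment s, topic k.
beta = np.full((n_sentiments, n_topics, len(vocab)), base_beta)
for w, word in enumerate(vocab):
    if word in positive_words:
        beta[0, :, w] += boost         # encourage under positive topics
        beta[1, :, w] = epsilon        # suppress under negative topics
    elif word in negative_words:
        beta[1, :, w] += boost
        beta[0, :, w] = epsilon

# A JST-style sampler would then draw each topic-word distribution as
# phi[s, k] ~ Dirichlet(beta[s, k]), steering topics toward their polarity.
```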

    Not All Dialogues are Created Equal: Instance Weighting for Neural Conversational Models

    Neural conversational models require substantial amounts of dialogue data for their parameter estimation and are therefore usually learned on large corpora such as chat forums or movie subtitles. These corpora are, however, often challenging to work with, notably due to their frequent lack of turn segmentation and the presence of multiple references external to the dialogue itself. This paper shows that these challenges can be mitigated by adding a weighting model into the architecture. The weighting model, which is itself estimated from dialogue data, associates each training example with a numerical weight that reflects its intrinsic quality for dialogue modelling. At training time, these sample weights are incorporated into the empirical loss to be minimised. Evaluation results on retrieval-based models trained on movie and TV subtitles demonstrate that the inclusion of such a weighting model improves the model performance on unsupervised metrics.
    Comment: Accepted to SIGDIAL 201
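
    The weighted empirical loss described here is straightforward to express in code. Below is a minimal sketch, assuming PyTorch and placeholder linear models in place of the paper's retrieval architecture and weighting model; only the loss-weighting pattern itself reflects the abstract.

```python
# Minimal sketch of a weighted empirical loss, assuming PyTorch. The linear
# "models" are placeholders for the paper's retrieval and weighting models;
# only the loss-weighting pattern itself reflects the abstract.
import torch
import torch.nn as nn

response_model = nn.Linear(128, 1)                              # stand-in response scorer
weight_model = nn.Sequential(nn.Linear(128, 1), nn.Sigmoid())   # quality weight in (0, 1)

features = torch.randn(32, 128)                 # encoded (context, response) pairs
labels = torch.randint(0, 2, (32, 1)).float()   # 1 = true response, 0 = distractor

logits = response_model(features)
per_example_loss = nn.functional.binary_cross_entropy_with_logits(
    logits, labels, reduction="none")

with torch.no_grad():                           # weights held fixed for this step
    w = weight_model(features)

loss = (w * per_example_loss).mean()            # weighted empirical loss to minimise
loss.backward()
```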