2 research outputs found
Domain Adaptation for Statistical Classifiers
The most basic assumption used in statistical learning theory is that
training data and test data are drawn from the same underlying distribution.
Unfortunately, in many applications, the "in-domain" test data is drawn from a
distribution that is related, but not identical, to the "out-of-domain"
distribution of the training data. We consider the common case in which labeled
out-of-domain data is plentiful, but labeled in-domain data is scarce. We
introduce a statistical formulation of this problem in terms of a simple
mixture model and present an instantiation of this framework to maximum entropy
classifiers and their linear chain counterparts. We present efficient inference
algorithms for this special case based on the technique of conditional
expectation maximization. Our experimental results show that our approach leads
to improved performance on three real world tasks on four different data sets
from the natural language processing domain
Evaluation Of Large-Scale Optimization Problems On Vector And Parallel Architectures
. We examine the importance of problem formulation for the solution of large-scale optimization problems on high-performance architectures. We use limited memory variable metric methods to illustrate performance issues. We show that the performance of these algorithms is drastically affected by application implementation. Model applications are drawn from the MINPACK-2 test problem collection, with numerical results from a super-scalar architecture (IBM RS6000/370), a vector architecture (CRAY-2), and a massively parallel architecture (Intel DELTA). Key words. optimization, large-scale, limited memory, variable metric, performance evaluation, vector architecture, parallel architecture. AMS subject classifications. 65Y05, 65Y20, 65K05, 65K10, 90C06, 90C30 1. Introduction. Our aim is to explore performance issues associated with the solution of large-scale optimization problems on high-performance architectures. The solution of these problems, where the number of variables ranges betwe..