Sparse Model Identification and Learning for Ultra-high-dimensional Additive Partially Linear Models
The additive partially linear model (APLM) combines the flexibility of
nonparametric regression with the parsimony of regression models, and has been
widely used as a popular tool in multivariate nonparametric regression to
alleviate the "curse of dimensionality". A natural question raised in practice
is the choice of structure in the nonparametric part, that is, whether the
continuous covariates enter into the model in linear or nonparametric form. In
this paper, we present a comprehensive framework for simultaneous sparse model
identification and learning for ultra-high-dimensional APLMs in which the numbers
of linear and nonparametric components may both exceed the sample size.
We propose a fast and efficient two-stage procedure. In the first stage, we
decompose the nonparametric functions into a linear part and a nonlinear part.
The nonlinear functions are approximated by constant spline bases, and a triple
penalization procedure is proposed to select nonzero components using adaptive
group LASSO. In the second stage, we refit data with selected covariates using
higher order polynomial splines, and apply spline-backfitted local-linear
smoothing to obtain asymptotic normality for the estimators. The procedure is
shown to be consistent for model structure identification. It can identify
zero, linear, and nonlinear components correctly and efficiently. Inference can
be made on both linear coefficients and nonparametric functions. We conduct
simulation studies to evaluate the performance of the method and apply the
proposed method to a dataset on the Shoot Apical Meristem (SAM) of maize
genotypes for illustration.
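The first-stage selection described above can be illustrated with the group soft-thresholding step that underlies group LASSO. This is a minimal sketch, not the authors' procedure: the function names, the adaptive weight formula, and the three-way classification rule are assumptions made for illustration, and the abstract does not specify the optimization algorithm.

```python
import math


def group_soft_threshold(coefs, threshold):
    """Shrink a whole coefficient group toward zero (group LASSO proximal step).

    The group is set to zero if its Euclidean norm is at most `threshold`;
    otherwise every coefficient is scaled by the same factor.
    """
    norm = math.sqrt(sum(c * c for c in coefs))
    if norm <= threshold:
        return [0.0] * len(coefs)
    scale = 1.0 - threshold / norm
    return [scale * c for c in coefs]


def adaptive_weight(initial_coefs, eps=1e-8):
    """Hypothetical adaptive weight: groups with small initial estimates get larger penalties."""
    norm = math.sqrt(sum(c * c for c in initial_coefs))
    return 1.0 / (norm + eps)


def classify_component(linear_coef, nonlinear_coefs):
    """Label a covariate as 'zero', 'linear', or 'nonlinear' after penalization.

    Assumes each covariate has one linear coefficient and one group of
    spline coefficients for its nonlinear part.
    """
    if any(c != 0.0 for c in nonlinear_coefs):
        return "nonlinear"
    if linear_coef != 0.0:
        return "linear"
    return "zero"
```

For example, `group_soft_threshold([3.0, 4.0], 2.5)` halves the group, while a threshold of 5.0 or more removes it entirely. A covariate whose spline group is removed but whose linear coefficient survives would be labeled linear.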
Active Sampling for Large-scale Information Retrieval Evaluation
Evaluation is crucial in Information Retrieval. The development of models,
tools and methods has significantly benefited from the availability of reusable
test collections formed through a standardized and thoroughly tested
methodology, known as the Cranfield paradigm. Constructing these collections
requires obtaining relevance judgments for a pool of documents, retrieved by
systems participating in an evaluation task, and thus involves immense human labor.
To alleviate this effort, different methods for constructing collections have
been proposed in the literature, falling under two broad categories: (a)
sampling, and (b) active selection of documents. The former devises a smart
sampling strategy by choosing only a subset of documents to be assessed and
inferring evaluation measures on the basis of the obtained sample; the sampling
distribution is fixed at the beginning of the process. The latter
recognizes that systems contributing documents to be judged vary in quality,
and actively selects documents from good systems. The quality of systems is
measured every time a new document is judged. In this paper, we seek to
solve the problem of large-scale retrieval evaluation combining the two
approaches. We devise an active sampling method that avoids the bias of the
active selection methods towards good systems, and at the same time reduces the
variance of the current sampling approaches by placing a distribution over
systems, which varies as judgments become available. We validate the proposed
method using TREC data and demonstrate the advantages of this new method
compared to past approaches.
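The idea of sampling documents through a distribution over systems that changes as judgments arrive can be sketched as follows. This is an illustrative sketch only: the uniform distribution within each system's ranking, the "add one per relevant document" weight update, and the Hansen-Hurwitz estimator are assumptions chosen for simplicity, and the abstract does not state which distributions or estimators the paper uses.

```python
import random


def document_distribution(system_weights, system_rankings):
    """Mix per-system document distributions into one sampling distribution.

    Assumes each system spreads its share uniformly over its retrieved documents.
    """
    total = sum(system_weights.values())
    probs = {}
    for system, ranking in system_rankings.items():
        share = system_weights[system] / total
        for doc in ranking:
            probs[doc] = probs.get(doc, 0.0) + share / len(ranking)
    return probs


def estimate_relevant_count(samples):
    """Hansen-Hurwitz estimate of the number of relevant documents.

    `samples` holds (relevance, sampling probability at draw time) pairs.
    """
    return sum(rel / p for rel, p in samples) / len(samples)


def active_sample(system_rankings, judge, n_draws, rng):
    """Draw documents with replacement, updating system weights after each judgment."""
    weights = {system: 1.0 for system in system_rankings}
    samples = []
    for _ in range(n_draws):
        probs = document_distribution(weights, system_rankings)
        docs = list(probs)
        doc = rng.choices(docs, weights=[probs[d] for d in docs], k=1)[0]
        rel = judge(doc)
        # Record the probability in force when the document was drawn.
        samples.append((rel, probs[doc]))
        if rel:
            # Hypothetical update: reward every system that retrieved a relevant document.
            for system, ranking in system_rankings.items():
                if doc in ranking:
                    weights[system] += 1.0
    return estimate_relevant_count(samples)
```

Dividing each judgment by the probability with which the document was drawn is what keeps the estimate from favoring the systems that receive more weight; the weight update only affects which documents are likely to be judged next.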
Do Multi-Sense Embeddings Improve Natural Language Understanding?
Learning a distinct representation for each sense of an ambiguous word could
lead to more powerful and fine-grained models of vector-space representations.
Yet while 'multi-sense' methods have been proposed and tested on artificial
word-similarity tasks, we don't know if they improve real natural language
understanding tasks. In this paper we introduce a multi-sense embedding model
based on Chinese Restaurant Processes that achieves state-of-the-art
performance on matching human word similarity judgments, and propose a
pipelined architecture for incorporating multi-sense embeddings into language
understanding.
We then test the performance of our model on part-of-speech tagging, named
entity recognition, sentiment analysis, semantic relation identification and
semantic relatedness, controlling for embedding dimensionality. We find that
multi-sense embeddings do improve performance on some tasks (part-of-speech
tagging, semantic relation identification, semantic relatedness) but not on
others (named entity recognition, various forms of sentiment analysis). We
discuss how these differences may be caused by the different role of word sense
information in each of the tasks. The results highlight the importance of
testing embedding models in real applications.
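The Chinese Restaurant Process mentioned in the abstract above can be illustrated by the rule that decides whether a word occurrence joins an existing sense or opens a new one. This is a rough sketch under assumptions: `similarities` is a hypothetical non-negative score of how well each existing sense fits the current context, `gamma` is the usual CRP concentration parameter, and the exact form used in the paper is not given in the abstract.

```python
def crp_sense_probabilities(sense_counts, similarities, gamma):
    """Return assignment probabilities for each existing sense plus a new sense.

    Existing sense k scores count_k * similarity_k; a new sense scores gamma.
    The last entry of the result is the probability of creating a new sense.
    """
    scores = [n * s for n, s in zip(sense_counts, similarities)]
    scores.append(gamma)
    total = sum(scores)
    return [score / total for score in scores]
```

With counts `[3, 1]`, equal similarities of 0.5, and `gamma = 0.5`, the frequently used sense receives most of the probability mass, and the rare sense and a brand-new sense are equally likely. Larger values of `gamma` make new senses more likely.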