1,123 research outputs found
Carving model-free inference
In many large-scale experiments, the investigator begins with pilot data to
look for promising findings. When fresh data becomes available at a later
point in time, or from a different source, she is left with the question of
how to use the full data to draw valid inferences for the selected findings.
By compensating for the overoptimism introduced by selection, carving permits
the reuse of pilot data for valid
inference. The principles of carving are quite appealing in practice: instead
of throwing away the pilot samples, carving simply discards the information
consumed at the time of selection. However, the theoretical justification for
carving is strongly tied to parametric models, an example being the ubiquitous
Gaussian model. In this paper we develop asymptotic guarantees that substantiate
the use of carving beyond Gaussian generating models. In simulations and in an
application to gene expression data, we find that carving delivers valid and
tight confidence intervals in model-free settings. (Comment: 50 pages, 2 figures, 7 tables)
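The selection bias that carving corrects for can be seen in a toy simulation (a sketch of the setup only, not the paper's method): picking the most promising effect on pilot data inflates its estimate, while fresh data gives an honest reading of the same selected effect.

```python
import numpy as np

rng = np.random.default_rng(0)
n_effects, n_pilot, n_fresh = 50, 30, 30

# All true effects are zero, so any "promising finding" is pure noise.
pilot = rng.normal(0.0, 1.0, size=(n_effects, n_pilot))
fresh = rng.normal(0.0, 1.0, size=(n_effects, n_fresh))

pilot_means = pilot.mean(axis=1)
winner = int(np.argmax(pilot_means))  # effect selected on pilot data

# The pilot estimate of the winner is biased upward by selection;
# the fresh-data estimate of the same effect is unbiased.
print("pilot estimate of winner:", pilot_means[winner])
print("fresh estimate of winner:", fresh[winner].mean())
```

Naive inference that reuses the pilot estimate inherits this upward bias; carving, by conditioning on the selection event, aims to reuse the pilot samples without it.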
Approximate selective inference via maximum likelihood
This article considers a conditional approach to selective inference via
approximate maximum likelihood for data described by Gaussian models. There are
two important considerations in adopting a post-selection inferential
perspective. While one of them concerns the effective use of information in
data, the other aspect deals with the computational cost of adjusting for
selection. Our approximate proposal serves both purposes: (i) it exploits
randomness for efficient utilization of the left-over information from
selection; (ii) it enables us to bypass potentially expensive MCMC sampling from
conditional distributions. At the core of our method is the solution to a
convex optimization problem which assumes a separable form across multiple
selection queries. This allows us to address the problem of tractable and
efficient inference in many practical scenarios, where more than one learning
query is conducted to define and perhaps redefine models and their
corresponding parameters. Through an in-depth analysis, we illustrate the
potential of our proposal and provide extensive comparisons with other
post-selective schemes in both randomized and non-randomized paradigms of
inference.
Development of a Hindi Lemmatizer
We live in a translingual society: to communicate with people from different parts of the world, we need expertise in their respective languages. Learning all of these languages is not feasible; therefore we need a mechanism that can do this task for us. Machine translators have emerged as tools that can perform it. To develop a machine translator, we need to develop several different rules. The very first module in the machine translation pipeline is morphological analysis; stemming and lemmatization come under morphological analysis. In this paper we present a lemmatizer that generates rules for removing affixes, along with rules for restoring a proper root word.
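A rule-based suffix-stripping lemmatizer of the kind described can be sketched as follows; the suffix rules below are illustrative toy examples, not the rule set developed in the paper.

```python
# Toy rule-based lemmatizer: strip a known suffix, then append the
# replacement that restores a proper root word. Rules are illustrative.
SUFFIX_RULES = [
    ("ियाँ", "ी"),   # e.g. लड़कियाँ -> लड़की ("girls" -> "girl")
    ("ियों", "ी"),   # oblique plural of -ी nouns
    ("ाएँ", "ा"),    # e.g. कथाएँ -> कथा
    ("ों", "ा"),     # e.g. लड़कों -> लड़का
    ("ो", "ा"),      # vocative plural
]

def lemmatize(word: str) -> str:
    """Apply the longest matching suffix rule; return the word unchanged
    if no rule applies (it is treated as already being a root)."""
    for suffix, repl in sorted(SUFFIX_RULES, key=lambda r: -len(r[0])):
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)] + repl
    return word

print(lemmatize("लड़कियाँ"))  # -> लड़की
```

A full system would add exception lists and rules for prefixes and infixes, but the strip-and-repair loop above is the core mechanism.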
Ask, and shall you receive?: Understanding Desire Fulfillment in Natural Language Text
The ability to comprehend wishes or desires and their fulfillment is
important to Natural Language Understanding. This paper introduces the task of
identifying whether a desire expressed by a subject in a given short piece of text
was fulfilled. We propose various unstructured and structured models that
capture fulfillment cues such as the subject's emotional state and actions. Our
experiments with two different datasets demonstrate the importance of
understanding the narrative and discourse structure to address this task.