Guaranteed validity for empirical approaches to adaptive data analysis
We design a general framework for answering
adaptive statistical queries that focuses on
providing explicit confidence intervals along
with point estimates. Prior work in this area
has either focused on providing tight confidence
intervals for specific analyses, or providing
general worst-case bounds for point estimates.
Unfortunately, as we observe, these
worst-case bounds are loose in many settings
— often not even beating simple baselines like
sample splitting. Our main contribution is
to design a framework for providing valid,
instance-specific confidence intervals for point
estimates that can be generated by heuristics.
When paired with good heuristics, this
method gives guarantees that are orders of
magnitude better than the best worst-case
bounds. We provide a Python library implementing
our method.
http://proceedings.mlr.press/v108/rogers20a.htm
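The sample-splitting baseline that the abstract compares against can be sketched as follows. This is a minimal illustration, not the authors' library: the function name, the Hoeffding-style interval, and the assumption that each query maps a data point into [0, 1] are all mine. The idea is simply that each adaptive query is answered on a fresh, disjoint split, so a standard non-adaptive confidence interval remains valid per query.

```python
import math
import random

def sample_splitting(data, queries, alpha=0.05):
    """Answer each query on its own disjoint split of the data.

    Because every query sees an independent sample, an ordinary
    (non-adaptive) confidence interval is valid for each answer.
    Each query is a function mapping one data point into [0, 1].
    Returns a list of (point_estimate, lower, upper) triples.
    """
    k = len(queries)
    fold = len(data) // k  # each query gets n/k points
    answers = []
    for i, q in enumerate(queries):
        chunk = data[i * fold:(i + 1) * fold]
        mean = sum(q(x) for x in chunk) / len(chunk)
        # Hoeffding interval for a [0, 1]-bounded statistic
        width = math.sqrt(math.log(2 / alpha) / (2 * len(chunk)))
        answers.append((mean, mean - width, mean + width))
    return answers

random.seed(0)
data = [random.random() for _ in range(1000)]
answers = sample_splitting(data, [lambda x: x, lambda x: x * x])
```

The cost of this baseline is visible in the interval width: each query only sees n/k points, so the intervals shrink as the number of queries grows. The abstract's point is that worst-case adaptive bounds often fail to beat even this.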
Target Contrastive Pessimistic Discriminant Analysis
Domain-adaptive classifiers learn from a source domain and aim to generalize
to a target domain. If the classifier's assumptions on the relationship between
domains (e.g. covariate shift) are valid, then it will usually outperform a
non-adaptive source classifier. Unfortunately, it can perform substantially
worse when its assumptions are invalid. Validating these assumptions requires
labeled target samples, which are usually not available. We argue that, in
order to make domain-adaptive classifiers more practical, it is necessary to
focus on robust methods; robust in the sense that the model still achieves a
particular level of performance without making strong assumptions on the
relationship between domains. With this objective in mind, we formulate a
conservative parameter estimator that only deviates from the source classifier
when a lower or equal risk is guaranteed for all possible labellings of the
given target samples. We derive the corresponding estimator for a discriminant
analysis model, and show that its risk is actually strictly smaller than that
of the source classifier. Experiments indicate that our classifier outperforms
state-of-the-art classifiers for geographically biased samples.
Comment: 9 pages, no figures, 2 tables. arXiv admin note: substantial text
overlap with arXiv:1706.0808
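The selection rule described in the abstract — deviate from the source classifier only when a lower or equal risk is guaranteed for all possible labellings of the target sample — can be sketched in miniature. This toy version is not the paper's discriminant-analysis estimator: it uses 0/1 loss, hard labellings, and a finite candidate set (all my assumptions), and it brute-forces the worst case, which only scales to a handful of target points.

```python
from itertools import product

def tcp_select(candidates, source, target_x):
    """Return a candidate classifier only if its risk is no higher
    than the source's under EVERY labelling of the unlabelled target
    points; otherwise fall back to the source classifier."""
    def risk(clf, labels):
        # empirical 0/1 risk on the target sample under one labelling
        return sum(clf(x) != y for x, y in zip(target_x, labels)) / len(target_x)

    def worst_contrast(cand):
        # worst-case risk difference over all 2^n hard labellings
        return max(risk(cand, ys) - risk(source, ys)
                   for ys in product([0, 1], repeat=len(target_x)))

    best = min(candidates, key=worst_contrast)
    return best if worst_contrast(best) <= 0 else source
```

Note the degeneracy this toy exposes: under 0/1 loss with hard labellings, any candidate that disagrees with the source anywhere on the target sample has a positive worst-case contrast, so only source-identical classifiers pass. This is one reason the paper works with a discriminant-analysis likelihood risk and soft labellings instead.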
A Study of the Factors That Influence Consumer Attitudes Toward Beef Products Using the Conjoint Market Analysis Tool
This study utilizes an analysis technique commonly used in marketing, the conjoint analysis method, to examine the relative utilities of a set of beef steak characteristics considered by a national sample of 1,432 US consumers, as well as additional localized samples representing undergraduate students at a business college and in an animal science department. The analyses indicate that among all respondents, region of origin is by far the most important characteristic; this is followed by animal breed, traceability, animal feed, and beef quality. Alternatively, the cost of cut, farm ownership, the use (or nonuse) of growth promoters, and whether the product is guaranteed tender were the least important factors. Results for animal science undergraduates are similar to the aggregate results, except that these students emphasized beef quality at the expense of traceability and the nonuse of growth promoters. Business students also emphasized region of origin but then emphasized traceability and cost. The ideal steak for the national sample is from a locally produced, choice Angus fed a mixture of grain and grass that is traceable to the farm of origin. If the product was not produced locally, respondents indicated that their preferred production states are, in order from most to least preferred, Iowa, Texas, Nebraska, and Kansas.
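The core computation behind a conjoint study like this one can be sketched briefly. For a balanced full-profile design, part-worth utilities reduce to level means centered on the grand mean, and an attribute's relative importance is the range of its part-worths normalized across attributes. The function names, the tiny two-attribute example, and the balanced-design shortcut (rather than the regression formulation practitioners often use) are my assumptions.

```python
def part_worths(profiles, ratings):
    """Part-worth utility of each attribute level: the mean rating of
    profiles carrying that level, minus the grand mean. Valid as-is
    only for balanced designs."""
    grand = sum(ratings) / len(ratings)
    worths = {}
    for attr in profiles[0]:
        by_level = {}
        for prof, r in zip(profiles, ratings):
            by_level.setdefault(prof[attr], []).append(r)
        worths[attr] = {lvl: sum(rs) / len(rs) - grand
                        for lvl, rs in by_level.items()}
    return worths

def importance(worths):
    """Relative attribute importance: the range of an attribute's
    part-worths, normalized so importances sum to one."""
    ranges = {a: max(w.values()) - min(w.values()) for a, w in worths.items()}
    total = sum(ranges.values())
    return {a: r / total for a, r in ranges.items()}

# Hypothetical 2x2 example: origin dominates quality grade.
profiles = [{"origin": "local", "quality": "choice"},
            {"origin": "local", "quality": "select"},
            {"origin": "import", "quality": "choice"},
            {"origin": "import", "quality": "select"}]
ratings = [9, 7, 3, 1]
imp = importance(part_worths(profiles, ratings))
```

An importance ranking like the one reported in the abstract (region of origin first, then breed, traceability, and so on) is exactly the output of this normalization step, computed over the study's full attribute set.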
Compositional Verification for Autonomous Systems with Deep Learning Components
As autonomy becomes prevalent in many applications, ranging from
recommendation systems to fully autonomous vehicles, there is an increased need
to provide safety guarantees for such systems. The problem is difficult, as
these are large, complex systems which operate in uncertain environments,
requiring data-driven machine-learning components. However, learning techniques
such as Deep Neural Networks, widely used today, are inherently unpredictable
and lack the theoretical foundations to provide strong assurance guarantees. We
present a compositional approach for the scalable, formal verification of
autonomous systems that contain Deep Neural Network components. The approach
uses assume-guarantee reasoning whereby contracts, encoding the
input-output behavior of individual components, allow the designer to model and
incorporate the behavior of the learning-enabled components working
side-by-side with the other components. We illustrate the approach on an
example taken from the autonomous vehicles domain
- …
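The assume-guarantee idea in this abstract can be sketched as follows. A contract pairs an assumption on a component's inputs with a guarantee on its outputs, and composing a pipeline amounts to checking that each stage's guarantee discharges the next stage's assumption. This is an illustration only: the class and function names are mine, and where the paper's verification tools would prove the implication formally, this sketch merely tests it on sample points.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Contract:
    """An assume-guarantee contract for one component: whenever the
    input satisfies `assume`, the output satisfies `guarantee`."""
    assume: Callable[[float], bool]
    guarantee: Callable[[float], bool]

def check_chain(contracts: List[Contract], samples: List[float]) -> bool:
    """Compositional check for a pipeline of components: each stage's
    guarantee must imply the next stage's assumption. Here the
    implication is only tested on `samples`, not proved."""
    for up, down in zip(contracts, contracts[1:]):
        for v in samples:
            if up.guarantee(v) and not down.assume(v):
                return False  # guarantee fails to discharge assumption
    return True
```

For example, a perception component contracted to emit values in [0, 1] composes with a controller that assumes inputs in [0, 1]; if the controller instead assumed inputs at most 0.5, the check would flag the mismatch. The appeal for learning-enabled systems is that the DNN component only needs to be characterized by its contract, not verified monolithically with the rest of the system.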