A Simple Baseline for Travel Time Estimation using Large-Scale Trip Data
The increased availability of large-scale trajectory data around the world
provides rich information for the study of urban dynamics. For example, the New
York City Taxi and Limousine Commission regularly releases source-destination
information about trips in the taxis they regulate. Taxi data provide
information about traffic patterns, and thus enable the study of urban flow --
what will traffic between two locations look like at a certain date and time in
the future? Existing big data methods try to outdo each other in terms of
complexity and algorithmic sophistication. In the spirit of "big data beats
algorithms", we present a very simple baseline which outperforms
state-of-the-art approaches, including Bing Maps and Baidu Maps (whose APIs
permit large scale experimentation). Such a travel time estimation baseline has
several important uses, such as navigation (fast travel time estimates can
serve as approximate heuristics for A* search variants for path finding) and
trip planning (which uses operating hours for popular destinations along with
travel time estimates to create an itinerary). Comment: 12 pages
The extended Ville's inequality for nonintegrable nonnegative supermartingales
Following initial work by Robbins, we rigorously present an extended theory
of nonnegative supermartingales, requiring neither integrability nor
finiteness. In particular, we derive a key maximal inequality foreshadowed by
Robbins, which we call the extended Ville's inequality, that strengthens the
classical Ville's inequality (for integrable nonnegative supermartingales), and
also applies to our nonintegrable setting. We derive an extension of the method
of mixtures, which applies to σ-finite mixtures of our extended
nonnegative supermartingales. We present some implications of our theory for
sequential statistics, such as the use of improper mixtures (priors) in
deriving nonparametric confidence sequences and (extended) e-processes.
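For reference, the classical Ville's inequality that the abstract says is strengthened can be written as follows. This is the standard textbook statement for the integrable case only, not the paper's extended result; the symbols $M_t$ and $x$ are notation chosen here.

```latex
% Classical Ville's inequality: (M_t)_{t \ge 0} is a nonnegative
% supermartingale with E[M_0] < \infty, and x > 0 is any threshold.
\[
  \mathbb{P}\left( \sup_{t \ge 0} M_t \ge x \right)
  \;\le\; \frac{\mathbb{E}[M_0]}{x}.
\]
```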
Huber-Robust Confidence Sequences
Confidence sequences are confidence intervals that can be sequentially
tracked, and are valid at arbitrary data-dependent stopping times. This paper
presents confidence sequences for a univariate mean of an unknown distribution
with a known upper bound on the p-th central moment (p > 1), but allowing for
(at most) an ε fraction of arbitrary distribution corruption, as in
Huber's contamination model. We do this by designing new robust exponential
supermartingales, and show that the resulting confidence sequences attain the
optimal width achieved in the nonsequential setting. Perhaps surprisingly, the
constant margin between our sequential result and the lower bound is smaller
than even fixed-time robust confidence intervals based on the trimmed mean, for
example. Since confidence sequences are a common tool used within A/B/n testing
and bandits, these results open the door to sequential experimentation that is
robust to outliers and adversarial corruptions. Comment: 26th International Conference on Artificial Intelligence and
Statistics (AISTATS 2023)
Time-Uniform Confidence Spheres for Means of Random Vectors
We derive and study time-uniform confidence spheres - termed confidence
sphere sequences (CSSs) - which contain the mean of random vectors with high
probability simultaneously across all sample sizes. Inspired by the original
work of Catoni and Giulini, we unify and extend their analysis both to cover
the sequential setting and to handle a variety of distributional assumptions.
More concretely, our results include an empirical-Bernstein CSS for bounded
random vectors (resulting in a novel empirical-Bernstein confidence interval),
a CSS for sub-ψ random vectors, and a CSS for heavy-tailed random vectors
based on a sequentially valid Catoni-Giulini estimator. Finally, we provide a
version of our empirical-Bernstein CSS that is robust to contamination by Huber
noise. Comment: 36 pages, 3 figures
Comparative Study on Static Term Structure of Interest Rates
The term structure of interest rates has long been an active topic in the financial sector. As interest rate liberalization accelerates, finding a representative benchmark interest rate for the market is the basis for pricing fixed-income products. This paper fits the Nelson-Siegel-Svensson model and a polynomial spline model to bond transaction data from the Shanghai Stock Exchange in China and compares the two models in order to choose an appropriate method for fitting the term structure of interest rates.
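As a rough illustration of the Nelson-Siegel-Svensson model named in this abstract, the sketch below evaluates the standard NSS zero-coupon yield formula at a few maturities. The function name `nss_yield` and all parameter values are assumptions chosen for illustration; they are not estimates from the paper or from Shanghai Stock Exchange data, and no fitting step is shown.

```python
import math


def nss_yield(tau, beta0, beta1, beta2, beta3, lambda1, lambda2):
    """Nelson-Siegel-Svensson zero-coupon yield at maturity tau > 0 (years)."""
    if tau <= 0:
        raise ValueError("maturity tau must be positive")
    x1 = tau / lambda1
    x2 = tau / lambda2
    # Slope loading: equals 1 at tau -> 0 and decays to 0 at long maturities.
    slope = (1.0 - math.exp(-x1)) / x1
    # Two curvature ("hump") loadings with separate decay parameters.
    hump1 = slope - math.exp(-x1)
    hump2 = (1.0 - math.exp(-x2)) / x2 - math.exp(-x2)
    return beta0 + beta1 * slope + beta2 * hump1 + beta3 * hump2


# Hypothetical parameters, for illustration only (not fitted to any data).
params = dict(beta0=0.04, beta1=-0.02, beta2=0.01, beta3=0.005,
              lambda1=1.5, lambda2=8.0)

# Evaluate the curve at a few maturities (in years).
curve = {tau: nss_yield(tau, **params) for tau in (0.25, 1, 5, 10, 30)}
```

In this parameterization the long end of the curve tends to `beta0` and the short end tends to `beta0 + beta1`, which gives a quick sanity check on any fitted parameter set.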