ROC-Based Model Estimation for Forecasting Large Changes in Demand
Forecasting for large changes in demand should benefit from different estimation than that used for estimating mean behavior. We develop a multivariate forecasting model designed for detecting the largest changes across many time series. The model is fit with a penalty function that maximizes true positive rates along a relevant false positive rate range, and it can be used by managers wishing to take action on the small percentage of products likely to change the most in the next time period. We apply the model to a crime dataset and compare results against OLS as a baseline, as well as against models that are promising for exceptional demand forecasting, such as quantile regression, synthetic data from a Bayesian model, and a power loss model. Using the Partial Area Under the Curve (PAUC) metric, our results show a statistically significant 35 percent improvement over OLS and at least a 20 percent improvement over the competing methods. We suggest that managers with a growing number of products use our method for forecasting large changes in conjunction with typical magnitude-based methods for forecasting expected demand.
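To illustrate the Partial Area Under the Curve (PAUC) metric mentioned in the abstract, here is a minimal sketch that restricts the ROC integral to a low false-positive-rate range; the toy labels, scores, and the 0.2 cutoff are illustrative assumptions, not the paper's crime data or penalty function.

```python
import numpy as np

def partial_auc(y_true, scores, max_fpr=0.2):
    """AUC restricted to FPR in [0, max_fpr], normalized to [0, 1]."""
    y = np.asarray(y_true)[np.argsort(-np.asarray(scores))]
    pos, neg = y.sum(), len(y) - y.sum()
    tpr = np.concatenate(([0.0], np.cumsum(y) / pos))
    fpr = np.concatenate(([0.0], np.cumsum(1 - y) / neg))
    # Keep the curve up to max_fpr, adding an interpolated endpoint.
    keep = fpr <= max_fpr
    fpr_c = np.concatenate((fpr[keep], [max_fpr]))
    tpr_c = np.concatenate((tpr[keep], [np.interp(max_fpr, fpr, tpr)]))
    # Trapezoidal rule, normalized by the width of the FPR range.
    area = np.sum(np.diff(fpr_c) * (tpr_c[:-1] + tpr_c[1:]) / 2.0)
    return area / max_fpr

# Items scored higher should be the ones that truly changed the most.
y = [1, 1, 0, 1, 0, 0, 0, 0]
s = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
print(round(partial_auc(y, s, max_fpr=0.2), 3))
```

A scorer is only rewarded here for ranking true large-change items ahead of others within the operationally relevant FPR window, which is the behavior the paper's penalty function targets during fitting.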
Robust Estimation of Mahalanobis Distance in Hyperspectral Images
This dissertation develops new estimation methods that fit Johnson distributions and generalized Pareto distributions to hyperspectral Mahalanobis distances. The Johnson distribution fit is optimized using a new method which monitors the second derivative behavior of exceedance probability to mitigate potential outlier effects. This univariate distribution is then used to derive an elliptically contoured multivariate density model for the pixel data. The generalized Pareto distribution models are optimized by a new two-pass method that estimates the tail-index parameter. This method minimizes the mean squared fitting error by correcting parameter values using data distance information from an initial pass. A unique method for estimating the posterior density of the tail-index parameter for generalized Pareto models is also developed. Both the Johnson and Pareto distribution models are shown to reduce fitting error and to increase computational efficiency compared to previous models.
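As a rough illustration of the peaks-over-threshold idea behind such generalized Pareto tail models, the sketch below fits a generalized Pareto distribution to the upper tail of squared Mahalanobis distances from synthetic Gaussian data. The sample (non-robust) mean/covariance, the 95th-percentile threshold, and the data are assumptions for illustration, not the dissertation's Johnson-fit or two-pass estimators.

```python
import numpy as np
from scipy.stats import genpareto

rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=np.zeros(3), cov=np.eye(3), size=5000)

# Squared Mahalanobis distances under the sample mean and covariance.
mu = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
d2 = np.einsum('ij,jk,ik->i', X - mu, cov_inv, X - mu)

# Peaks-over-threshold: fit a generalized Pareto distribution to the
# exceedances above the 95th percentile of the distances.
u = np.quantile(d2, 0.95)
exc = d2[d2 > u] - u
xi, loc, sigma = genpareto.fit(exc, floc=0.0)  # location fixed at 0

def tail_prob(t):
    """Tail exceedance probability P(d2 > t) for t above the threshold."""
    return 0.05 * genpareto.sf(t - u, xi, loc=0.0, scale=sigma)

print(float(tail_prob(u)))  # sf(0) = 1, so this is exactly 0.05
```

Hyperspectral anomaly detectors threshold exactly this kind of tail probability, which is why an accurate tail fit matters more than a good fit to the bulk of the distance distribution.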
Real space tests of the statistical isotropy and Gaussianity of the WMAP CMB data
ABRIDGED: We introduce and analyze a method for testing statistical isotropy
and Gaussianity and apply it to the WMAP CMB foreground reduced, temperature
maps, and cross-channel difference maps. We divide the sky into regions of
varying size and shape and measure the first four moments of the one-point
distribution within these regions, and using their simulated spatial
distributions we test the statistical isotropy and Gaussianity hypotheses. By
randomly varying orientations of these regions, we sample the underlying CMB
field in a new manner that offers a richer exploration of the data content,
and avoids possible biasing due to a single choice of sky division. The
statistical significance is assessed via comparison with realistic Monte-Carlo
simulations.
We find the three-year WMAP maps to agree well with the isotropic, Gaussian
random field simulations as probed by regions corresponding to the angular
scales ranging from 6 deg to 30 deg at 68% confidence level. We report a
strong, anomalous (99.8% CL) dipole "excess" in the V band of the three-year
WMAP data and also in the V band of the WMAP five-year data (99.3% CL). We
notice the large scale hemispherical power asymmetry, and find that it is not
highly statistically significant in the WMAP three-year data (<~ 97%) at scales
l <= 40. The significance is even smaller if multipoles up to l=1024 are
considered (~90% CL). We give constraints on the amplitude of the
previously-proposed CMB dipole modulation field parameter. We easily detect the
residual foregrounds in cross-band difference maps at rms level <~ 7 \mu K (at
scales >~ 6 deg) and limit the systematic uncertainties to <~ 1.7 \mu K (at
scales >~ 30 deg).
Comment: 20 pages, 20 figures; more tests added; updated to match the version to be published in JCA
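The region-level measurement at the heart of this test can be sketched as follows: compute the first four one-point moments of the pixel values in a region and compare them against Gaussian expectations. The toy Gaussian "map" and the 70 muK rms are stand-ins, not WMAP data.

```python
import numpy as np

def region_moments(values):
    """First four one-point moments of the pixel values in a region:
    mean, variance, skewness, and excess kurtosis."""
    v = np.asarray(values, dtype=float)
    m = v.mean()
    c = v - m
    var = np.mean(c**2)
    skew = np.mean(c**3) / var**1.5
    kurt = np.mean(c**4) / var**2 - 3.0
    return m, var, skew, kurt

# Toy "region": an isotropic Gaussian field has skewness ~ 0 and excess
# kurtosis ~ 0, so large deviations relative to simulations flag
# non-Gaussianity or anisotropy.
rng = np.random.default_rng(1)
pix = rng.normal(0.0, 70e-6, size=100_000)  # ~70 muK rms, in kelvin
m, var, skew, kurt = region_moments(pix)
print(f"skew={skew:.3f}, excess kurtosis={kurt:.3f}")
```

In the paper's scheme these moments are computed in many randomly oriented sky regions and their joint spatial distribution is calibrated against Monte-Carlo simulations, rather than judged against the asymptotic Gaussian values used in this toy check.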
An energy-based model approach to rare event probability estimation
The estimation of rare event probabilities plays a pivotal role in diverse
fields. Our aim is to determine the probability of a hazard or system failure
occurring when a quantity of interest exceeds a critical value. In our
approach, the distribution of the quantity of interest is represented by an
energy density, characterized by a free energy function. To efficiently
estimate the free energy, a bias potential is introduced. Using concepts from
energy-based models (EBM), this bias potential is optimized such that the
corresponding probability density function approximates a pre-defined
distribution targeting the failure region of interest. Given the optimal bias
potential, the free energy function and the rare event probability of interest
can be determined. The approach is applicable not just in traditional rare
event settings, where the variable on which the quantity of interest depends
has a known distribution, but also in inversion settings where the variable
follows a posterior distribution. By combining the EBM approach with a Stein
discrepancy-based stopping criterion, we aim for a balanced accuracy-efficiency
trade-off. Furthermore, we explore both parametric and non-parametric
approaches for the bias potential, with the latter eliminating the need for
choosing a particular parameterization, but depending strongly on the accuracy
of the kernel density estimate used in the optimization process. Through three
illustrative test cases encompassing both traditional and inversion settings,
we show that the proposed EBM approach, when properly configured, (i) allows
stable and efficient estimation of rare event probabilities and (ii) compares
favorably against subset sampling approaches.
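The core idea of sampling from a biased density that targets the failure region and then reweighting can be shown with a minimal importance-sampling sketch. Here a fixed linear tilt of a standard Gaussian plays the role of the bias potential; this hand-picked tilt and the toy quantity of interest Q(x) = x are assumptions, not the paper's learned EBM bias.

```python
import numpy as np
from math import erfc, sqrt

# Failure probability P(Q(x) > q_crit) for x ~ N(0, 1) with Q(x) = x.
q_crit = 4.0
p_exact = 0.5 * erfc(q_crit / sqrt(2.0))  # Phi(-4), about 3.17e-5

rng = np.random.default_rng(2)
n = 100_000

# Plain Monte Carlo: almost no samples land in the failure region.
x = rng.normal(size=n)
p_mc = np.mean(x > q_crit)

# Biased sampling: draw from N(q_crit, 1), a linear "bias potential"
# shifting mass onto the failure region, then reweight by the density
# ratio p(x) / p_b(x) = exp(-q_crit * x + q_crit**2 / 2).
xb = rng.normal(loc=q_crit, size=n)
w = np.exp(-q_crit * xb + 0.5 * q_crit**2)
p_is = np.mean((xb > q_crit) * w)

print(p_exact, p_mc, p_is)
```

The reweighted estimate recovers the exact tail probability to within a fraction of a percent, while plain Monte Carlo sees only a handful of failure samples; the EBM approach replaces the hand-picked tilt with an optimized bias potential so the same mechanism works when the failure region is not known in closed form.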
Dark Matter Constraints from a Joint Analysis of Dwarf Spheroidal Galaxy Observations with VERITAS
We present constraints on the annihilation cross section of WIMP dark matter
based on the joint statistical analysis of four dwarf galaxies with VERITAS.
These results are derived from an optimized photon weighting statistical
technique that improves on standard imaging atmospheric Cherenkov telescope
(IACT) analyses by utilizing the spectral and spatial properties of individual
photon events. We report on the results of 230 hours of observations of
five dwarf galaxies and the joint statistical analysis of four of the dwarf
galaxies. We find no evidence of gamma-ray emission from any individual dwarf
nor in the joint analysis. The derived upper limits on the dark matter
annihilation cross section from the joint analysis at 1 TeV are reported for
the bottom quark, tau lepton, and gauge boson final states.
Comment: 14 pages, 9 figures, published in PRD. Ascii tables containing the
annihilation cross section limits are available for download as ancillary
files, with a readme.txt file description of the limits.
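A joint analysis of this kind can be caricatured as a profile-likelihood scan over a signal strength shared across targets. All counts, backgrounds, and exposures below are invented for illustration, and the actual analysis uses event-level photon weighting rather than this simple binned Poisson model.

```python
import numpy as np

# Hypothetical per-dwarf counts, expected backgrounds, and relative
# signal exposures; none of these are VERITAS numbers.
observed = np.array([12.0, 7.0, 9.0, 11.0])
background = np.array([11.0, 8.0, 9.5, 10.0])
exposure = np.array([1.0, 0.6, 0.8, 1.2])

def joint_lnl(s):
    """Joint Poisson log-likelihood (up to constants) for a common
    signal strength s scaled by each target's exposure."""
    mu = background + s * exposure
    return np.sum(observed * np.log(mu) - mu)

# Scan a grid of non-negative signal strengths.
grid = np.linspace(0.0, 30.0, 3001)
lnl = np.array([joint_lnl(s) for s in grid])
s_best = grid[np.argmax(lnl)]

# One-sided 95% CL upper limit where -2*Delta(lnL) crosses 2.71.
ts = -2.0 * (lnl - lnl.max())
upper = grid[(grid >= s_best) & (ts >= 2.71)][0]
print(f"best-fit s = {s_best:.2f}, 95% CL upper limit = {upper:.2f}")
```

Because the four dwarfs share one signal parameter, a mild deficit in one target pulls the joint limit down for all of them, which is the statistical benefit of stacking over quoting per-dwarf limits.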
Statistical analysis of probability density functions for photometric redshifts through the KiDS-ESO-DR3 galaxies
Despite the high accuracy of photometric redshifts (zphot) derived using
Machine Learning (ML) methods, the quantification of errors through reliable
and accurate Probability Density Functions (PDFs) is still an open problem:
first, because it is difficult to accurately assess the contributions from
different sources of error, namely those internal to the method itself and
those from the photometric features defining the available parameter space;
second, because defining a robust statistical method, always able to quantify
and qualify the validity of a PDF estimate, is still an open issue. We present a
comparison among PDFs obtained using three different methods on the same data
set: two ML techniques, METAPHOR (Machine-learning Estimation Tool for Accurate
PHOtometric Redshifts) and ANNz2, plus the spectral energy distribution
template fitting method, BPZ. The photometric data were extracted from the KiDS
(Kilo Degree Survey) ESO Data Release 3, while the spectroscopy was obtained
from the GAMA (Galaxy and Mass Assembly) Data Release 2. The statistical
evaluation of both individual and stacked PDFs was done through quantitative
and qualitative estimators, including a dummy PDF, useful to verify whether
different statistical estimators can correctly assess PDF quality. We conclude
that, in order to quantify the reliability and accuracy of any zphot PDF
method, a combined set of statistical estimators is required.
Comment: Accepted for publication by MNRAS, 20 pages, 14 figures
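One widely used quantitative estimator for individual photo-z PDFs is the Probability Integral Transform (PIT), sketched below on toy Gaussian PDFs; the grid, widths, and sample are invented, and this generic calibration check is not the METAPHOR/ANNz2/BPZ pipeline itself.

```python
import numpy as np

def pit_values(z_grid, pdfs, z_spec):
    """Probability Integral Transform: the CDF of each object's photo-z
    PDF evaluated at its spectroscopic redshift. Well-calibrated PDFs
    give PIT values uniform on [0, 1]."""
    pdfs = np.asarray(pdfs, dtype=float)
    # Normalize each PDF on the grid, then integrate up to z_spec.
    dz = z_grid[1] - z_grid[0]
    cdfs = np.cumsum(pdfs, axis=1) * dz
    cdfs /= cdfs[:, -1:]
    idx = np.clip(np.searchsorted(z_grid, z_spec), 0, len(z_grid) - 1)
    return cdfs[np.arange(len(z_spec)), idx]

# Toy check: Gaussian PDFs whose "spectroscopic" redshifts are drawn
# from those same Gaussians should yield a flat PIT histogram.
rng = np.random.default_rng(3)
z_grid = np.linspace(0.0, 2.0, 401)
mu = rng.uniform(0.3, 1.2, size=2000)
sig = 0.05
pdfs = np.exp(-0.5 * ((z_grid[None, :] - mu[:, None]) / sig) ** 2)
z_spec = rng.normal(mu, sig)
pit = pit_values(z_grid, pdfs, z_spec)
print(f"mean PIT = {pit.mean():.3f}")
```

A U-shaped PIT histogram signals overconfident PDFs and a dome-shaped one underconfident PDFs, which is why a single summary statistic is not enough and the paper argues for a combined set of estimators.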