Sequential Quantiles via Hermite Series Density Estimation
Sequential quantile estimation refers to incorporating observations into
quantile estimates in an incremental fashion, thus furnishing an online estimate
of one or more quantiles at any given point in time. Sequential quantile
estimation is also known as online quantile estimation. This area is relevant
to the analysis of data streams and to the one-pass analysis of massive data
sets. Applications include network traffic and latency analysis, real-time
fraud detection and high-frequency trading. We introduce new techniques for
online quantile estimation based on Hermite series estimators in the settings
of static quantile estimation and dynamic quantile estimation. In the static
quantile estimation setting we apply the existing Gauss-Hermite expansion in a
novel manner. In particular, we exploit the fact that Gauss-Hermite
coefficients can be updated in a sequential manner. To treat dynamic quantile
estimation we introduce a novel expansion with an exponentially weighted
estimator for the Gauss-Hermite coefficients which we term the Exponentially
Weighted Gauss-Hermite (EWGH) expansion. These algorithms go beyond existing
sequential quantile estimation algorithms in that they allow arbitrary
quantiles (as opposed to pre-specified quantiles) to be estimated at any point
in time. In doing so we provide a solution to online distribution function and
online quantile function estimation on data streams. In particular we derive an
analytical expression for the CDF and prove consistency results for the CDF
under certain conditions. In addition we analyse the associated quantile
estimator. Simulation studies and tests on real data reveal the Gauss-Hermite
based algorithms to be competitive with a leading existing algorithm.
Comment: 43 pages, 9 figures. Improved version incorporating referee comments,
as appears in Electronic Journal of Statistics
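The sequential coefficient update described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `SequentialHermite`, the parameters `order` and `weight`, the `max(weight, 1/n)` warm-up step, and the grid bounds are all assumptions made for illustration. The sketch also inverts the CDF by numerical integration on a grid, whereas the paper derives an analytical expression for the CDF. It assumes the data are roughly standardised (centred near 0, unit scale).

```python
import math
import random


def hermite_functions(x, order):
    """Return [h_0(x), ..., h_order(x)], the orthonormal Hermite functions."""
    h = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if order >= 1:
        h.append(math.sqrt(2.0) * x * h[0])
    # Three-term recurrence for the normalised functions.
    for k in range(1, order):
        h.append(math.sqrt(2.0 / (k + 1)) * x * h[k]
                 - math.sqrt(k / (k + 1)) * h[k - 1])
    return h


class SequentialHermite:
    """Hypothetical one-pass estimator of Hermite series coefficients."""

    def __init__(self, order=10, weight=None):
        self.order = order
        # weight=None: running mean (static setting).
        # weight in (0, 1]: exponentially weighted mean (dynamic setting).
        self.weight = weight
        self.n = 0
        self.coeffs = [0.0] * (order + 1)

    def update(self, x):
        """Fold one observation into every coefficient in O(order) time."""
        self.n += 1
        step = 1.0 / self.n
        if self.weight is not None:
            # Assumption: use the running mean until 1/n drops below weight.
            step = max(self.weight, step)
        for k, hk in enumerate(hermite_functions(x, self.order)):
            self.coeffs[k] += step * (hk - self.coeffs[k])

    def density(self, x):
        """Truncated series estimate of the density at x (may dip below 0)."""
        hs = hermite_functions(x, self.order)
        return sum(a * hk for a, hk in zip(self.coeffs, hs))

    def quantile(self, p, lo=-8.0, hi=8.0, steps=1600):
        """Invert a trapezoid-rule CDF on [lo, hi]; any p can be queried."""
        dx = (hi - lo) / steps
        cdf = 0.0
        prev = self.density(lo)
        for i in range(1, steps + 1):
            x = lo + i * dx
            cur = self.density(x)
            cdf += 0.5 * (prev + cur) * dx
            if cdf >= p:
                return x
            prev = cur
        return hi


if __name__ == "__main__":
    random.seed(0)
    est = SequentialHermite(order=10)
    for _ in range(4000):
        est.update(random.gauss(0.0, 1.0))
    # Arbitrary quantiles can be requested after the fact.
    print(est.quantile(0.5), est.quantile(0.9))
```

Because only the coefficient vector is stored, memory use is fixed by `order` and does not grow with the stream, and the quantile level need not be chosen before the data arrive.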
Open-cluster density profiles derived using a kernel estimator
Surface and spatial radial density profiles in open clusters are derived
using a kernel estimator method. Formulae are obtained for the contribution of
every star into the spatial density profile. The evaluation of spatial density
profiles is tested against open-cluster models from N-body experiments with N =
500. Surface density profiles are derived for seven open clusters (NGC 1502,
1960, 2287, 2516, 2682, 6819 and 6939) using Two-Micron All-Sky Survey data and
for different limiting magnitudes. The selection of an optimal kernel
half-width is discussed. It is shown that open-cluster radius estimates hardly
depend on the kernel half-width. Hints of stellar mass segregation and
structural features indicating cluster non-stationarity in the regular force
field are found. A comparison with other investigations shows that
open-cluster sizes are often underestimated. The existence of an extended
corona around the open cluster NGC 6939 was confirmed. A combined function
composed of the King density profile for the cluster core and the uniform
sphere for the cluster corona is shown to be a better approximation of the
surface radial density profile. The King function alone does not reproduce
surface density profiles of sample clusters properly. The number of stars, the
cluster masses and the tidal radii in the Galactic gravitational field for the
sample clusters are estimated. It is shown that NGC 6819 and 6939 are extended
beyond their tidal surfaces.
Comment: 17 pages, 15 figures
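As a rough illustration of a kernel estimate of a surface radial density profile, the Python sketch below sums a kernel contribution from every star. It is a sketch under stated assumptions, not the paper's formulae: the choice of a 2D quartic kernel, the azimuthal averaging, and the function names `quartic_kernel_2d`, `surface_density`, `radial_profile` and `king_profile` are all hypothetical. The King function is given in its usual empirical form, with a scale factor, a core radius and a tidal radius; the combined King plus uniform-sphere model from the abstract is not reproduced here.

```python
import math
import random


def quartic_kernel_2d(d, h):
    """2D quartic kernel of half-width h, normalised to unit integral."""
    if d >= h:
        return 0.0
    u = 1.0 - (d / h) ** 2
    return 3.0 / (math.pi * h * h) * u * u


def surface_density(x, y, stars, h):
    """Kernel estimate of stars per unit area at (x, y)."""
    return sum(quartic_kernel_2d(math.hypot(x - sx, y - sy), h)
               for sx, sy in stars)


def radial_profile(stars, radii, h, n_angles=36):
    """Azimuthally averaged surface density around the origin."""
    profile = []
    for r in radii:
        total = 0.0
        for j in range(n_angles):
            phi = 2.0 * math.pi * j / n_angles
            total += surface_density(r * math.cos(phi),
                                     r * math.sin(phi), stars, h)
        profile.append(total / n_angles)
    return profile


def king_profile(r, k, r_core, r_tidal):
    """King empirical surface density; zero at and beyond r_tidal."""
    if r >= r_tidal:
        return 0.0
    a = 1.0 / math.sqrt(1.0 + (r / r_core) ** 2)
    b = 1.0 / math.sqrt(1.0 + (r_tidal / r_core) ** 2)
    return k * (a - b) ** 2


if __name__ == "__main__":
    # Synthetic test field: stars spread uniformly over a disc of radius 10.
    random.seed(1)
    stars = []
    for _ in range(2000):
        r = 10.0 * math.sqrt(random.random())
        phi = 2.0 * math.pi * random.random()
        stars.append((r * math.cos(phi), r * math.sin(phi)))
    print(radial_profile(stars, [1.0, 3.0, 5.0], h=3.0, n_angles=8))
```

The half-width `h` trades noise against resolution: a larger value smooths the profile but blurs the core and the cluster edge, which is why the choice of an optimal half-width needs separate discussion.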