    Sequential Quantiles via Hermite Series Density Estimation

    Sequential quantile estimation refers to incorporating observations into quantile estimates incrementally, thus furnishing an online estimate of one or more quantiles at any given point in time. Sequential quantile estimation is also known as online quantile estimation. This area is relevant to the analysis of data streams and to the one-pass analysis of massive data sets. Applications include network traffic and latency analysis, real-time fraud detection and high-frequency trading. We introduce new techniques for online quantile estimation based on Hermite series estimators in the settings of static quantile estimation and dynamic quantile estimation. In the static quantile estimation setting we apply the existing Gauss-Hermite expansion in a novel manner. In particular, we exploit the fact that Gauss-Hermite coefficients can be updated sequentially. To treat dynamic quantile estimation we introduce a novel expansion with an exponentially weighted estimator for the Gauss-Hermite coefficients, which we term the Exponentially Weighted Gauss-Hermite (EWGH) expansion. These algorithms go beyond existing sequential quantile estimation algorithms in that they allow arbitrary quantiles (as opposed to pre-specified quantiles) to be estimated at any point in time. In doing so we provide a solution to online distribution function and online quantile function estimation on data streams. In particular, we derive an analytical expression for the CDF and prove consistency results for the CDF under certain conditions. In addition, we analyse the associated quantile estimator. Simulation studies and tests on real data reveal the Gauss-Hermite based algorithms to be competitive with a leading existing algorithm. Comment: 43 pages, 9 figures. Improved version incorporating referee comments, as appears in Electronic Journal of Statistics
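The sequential coefficient update described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `SequentialHermiteEstimator`, the parameters `order` and `weight`, and the integration grid are hypothetical choices made for illustration, and the CDF is obtained by numerical integration on a grid rather than from the analytical expression the paper derives. The sketch assumes roughly standardized data (mean near 0, unit scale).

```python
import math


def hermite_functions(x, order):
    """Orthonormal Hermite functions h_0..h_order at x (three-term recurrence)."""
    h = [math.pi ** -0.25 * math.exp(-0.5 * x * x)]
    if order >= 1:
        h.append(math.sqrt(2.0) * x * h[0])
    for k in range(1, order):
        h.append(math.sqrt(2.0 / (k + 1)) * x * h[k]
                 - math.sqrt(k / (k + 1)) * h[k - 1])
    return h


class SequentialHermiteEstimator:
    """Hypothetical helper: one-pass Hermite series density and quantile estimates."""

    def __init__(self, order=6, weight=None):
        self.order = order
        # weight=None: running mean of h_k(x) (static setting).
        # 0 < weight < 1: exponentially weighted mean (dynamic setting).
        self.weight = weight
        self.coeffs = [0.0] * (order + 1)
        self.count = 0

    def update(self, x):
        """Fold one observation into every coefficient in O(order) time."""
        self.count += 1
        h = hermite_functions(x, self.order)
        if self.weight is None or self.count == 1:
            step = 1.0 / self.count
        else:
            step = self.weight
        for k in range(self.order + 1):
            self.coeffs[k] += step * (h[k] - self.coeffs[k])

    def density(self, x):
        """Truncated series estimate: sum over k of a_k * h_k(x)."""
        h = hermite_functions(x, self.order)
        return sum(a * v for a, v in zip(self.coeffs, h))

    def quantile(self, p, lo=-8.0, hi=8.0, points=2001):
        """Invert a trapezoid-rule CDF on a grid (assumed range [lo, hi])."""
        step = (hi - lo) / (points - 1)
        xs = [lo + i * step for i in range(points)]
        # Clip negative density values, which a truncated series can produce.
        dens = [max(self.density(x), 0.0) for x in xs]
        cum = [0.0]
        for i in range(1, points):
            cum.append(cum[-1] + 0.5 * (dens[i - 1] + dens[i]) * step)
        target = p * cum[-1]
        for x, c in zip(xs, cum):
            if c >= target:
                return x
        return hi
```

Because only the `order + 1` coefficients are stored, any quantile can be requested at any time by calling `quantile(p)` after a sequence of `update(x)` calls, without deciding in advance which quantiles to track.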

    Open-cluster density profiles derived using a kernel estimator

    Surface and spatial radial density profiles in open clusters are derived using a kernel estimator method. Formulae are obtained for the contribution of every star to the spatial density profile. The evaluation of spatial density profiles is tested against open-cluster models from N-body experiments with N = 500. Surface density profiles are derived for seven open clusters (NGC 1502, 1960, 2287, 2516, 2682, 6819 and 6939) using Two-Micron All-Sky Survey data and for different limiting magnitudes. The selection of an optimal kernel half-width is discussed. It is shown that open-cluster radius estimates hardly depend on the kernel half-width. Hints of stellar mass segregation and structural features indicating cluster non-stationarity in the regular force field are found. A comparison with other investigations shows that open-cluster sizes are often underestimated. The existence of an extended corona around the open cluster NGC 6939 is confirmed. A combined function composed of the King density profile for the cluster core and the uniform sphere for the cluster corona is shown to be a better approximation of the surface radial density profile. The King function alone does not reproduce the surface density profiles of the sample clusters properly. The number of stars, the cluster masses and the tidal radii in the Galactic gravitational field are estimated for the sample clusters. It is shown that NGC 6819 and 6939 extend beyond their tidal surfaces. Comment: 17 pages, 15 figures