516 research outputs found

    Estimation of a distribution function by an indirect sample

    No full text
    The problem of estimating a distribution function is considered in the case where the observer has access only to a part of the indicator random values. Some basic asymptotic properties of the constructed estimators are studied. Limit theorems are proved for continuous functionals related to the estimator F_n(x) in the space C[a, 1 - a], 0 < a < 1/2.

    About Testing the Hypothesis of Equality of Two Bernoulli Regression Curves

    Get PDF
    The limiting distribution of the integrated squared deviation between two kernel-type estimators of Bernoulli regression functions is established for two independent samples. A test is constructed for both simple and composite hypotheses of equality of two Bernoulli regression functions. The consistency of the test is studied, and the asymptotic behavior of its power is investigated for some close alternatives. Keywords: Bernoulli Regression Function, Power of Test, Consistency, Composite Hypothesis
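The statistic described in the abstract above can be illustrated with a minimal sketch. It assumes a Gaussian kernel, a fixed bandwidth, and a Riemann-sum approximation of the integral; the function names, the two test curves, and the simulated data are hypothetical and do not come from the paper, and no limiting distribution or critical value is computed.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used here as the smoothing kernel (an assumption)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def bernoulli_regression(x0, xs, ys, bandwidth):
    """Kernel (Nadaraya-Watson) estimate of p(x0) = P(Y = 1 | X = x0)."""
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    return sum(w * y for w, y in zip(weights, ys)) / total


def integrated_squared_deviation(sample_a, sample_b, bandwidth, grid):
    """Riemann-sum approximation of the integral of (p_a_hat - p_b_hat)^2."""
    step = grid[1] - grid[0]
    total = 0.0
    for t in grid:
        diff = (bernoulli_regression(t, sample_a[0], sample_a[1], bandwidth)
                - bernoulli_regression(t, sample_b[0], sample_b[1], bandwidth))
        total += diff * diff
    return step * total


def draw_sample(p, n, rng):
    """Simulate n pairs (X, Y) with X uniform on [0, 1] and Y ~ Bernoulli(p(X))."""
    xs = [rng.random() for _ in range(n)]
    ys = [1 if rng.random() < p(x) else 0 for x in xs]
    return xs, ys


def rising(x):
    return 0.2 + 0.6 * x  # hypothetical regression curve


def falling(x):
    return 0.8 - 0.6 * x  # hypothetical alternative curve


rng = random.Random(0)
grid = [0.1 + 0.01 * k for k in range(81)]  # evaluation points on [0.1, 0.9]
sample_1 = draw_sample(rising, 500, rng)
sample_2 = draw_sample(rising, 500, rng)
sample_3 = draw_sample(falling, 500, rng)

stat_same = integrated_squared_deviation(sample_1, sample_2, 0.1, grid)
stat_diff = integrated_squared_deviation(sample_1, sample_3, 0.1, grid)
print(stat_same, stat_diff)
```

With two samples drawn from the same curve the statistic should be small, and it should be clearly larger when the curves differ; a test would compare it with a critical value taken from the limiting distribution, which this sketch does not derive.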

    Integral Functionals of the Gasser–Muller Regression Function

    No full text
    For integral functionals of the Gasser–Muller regression function and its derivatives, we consider the plug-in estimator. The consistency and asymptotic normality of the estimator are shown.

    Conditional stochastic dominance tests in dynamic settings

    Get PDF
    This paper proposes nonparametric consistent tests of conditional stochastic dominance of arbitrary order in a dynamic setting. The novelty of these tests lies in the nonparametric manner of incorporating the information set. The test allows for general forms of unknown serial and mutual dependence between random variables, and has an asymptotic distribution that can be easily approximated by simulation. This method has good finite-sample performance. These tests are applied to determine investment efficiency between US industry portfolios conditional on the dynamics of the market portfolio. The empirical analysis suggests that telecommunications dominates the other sectoral portfolios under risk aversion

    Local generalised method of moments: an application to point process-based rainfall models

    Get PDF
    Long series of simulated rainfall are required at point locations for a range of applications, including hydrological studies. Clustered point process-based rainfall models have been used for generating such simulations for many decades. These models suffer from a major limitation, however: their stationarity. Although seasonality can be allowed for by fitting separate models for each calendar month or season, the models are unsuitable in their basic form for climate impact studies. In this paper, we develop new methodology to address this limitation. We extend the current fitting approach by allowing the discrete covariate, calendar month, to be replaced or supplemented with continuous covariates that are more directly related to the incidence and nature of rainfall. The covariate-dependent model parameters are estimated for each time interval using a kernel-based nonparametric approach within a generalised method-of-moments framework. An empirical study demonstrates the new methodology using a time series of 5-min rainfall data. The study considers both local mean and local linear approaches. While asymptotic results are included, the focus is on developing useable methodology for a complex model that can only be solved numerically. Issues including the choice of weighting matrix, estimation of parameter uncertainty and bandwidth and model selection are considered from this perspective

    Non-Redundant Spectral Dimensionality Reduction

    Full text link
    Spectral dimensionality reduction algorithms are widely used in numerous domains, including for recognition, segmentation, tracking and visualization. However, despite their popularity, these algorithms suffer from a major limitation known as the "repeated Eigen-directions" phenomenon. That is, many of the embedding coordinates they produce typically capture the same direction along the data manifold. This leads to redundant and inefficient representations that do not reveal the true intrinsic dimensionality of the data. In this paper, we propose a general method for avoiding redundancy in spectral algorithms. Our approach relies on replacing the orthogonality constraints underlying those methods by unpredictability constraints. Specifically, we require that each embedding coordinate be unpredictable (in the statistical sense) from all previous ones. We prove that these constraints necessarily prevent redundancy, and provide a simple technique to incorporate them into existing methods. As we illustrate on challenging high-dimensional scenarios, our approach produces significantly more informative and compact representations, which improve visualization and classification tasks

    The distribution of exoplanet masses

    Full text link
    The present study derives the distribution of secondary masses M2 for the 67 exoplanets and very low-mass brown dwarf companions of solar-type stars, known as of April 4, 2001. This distribution is related to the distribution of M2 sin i through an integral equation of Abel's type. Although a formal solution exists for this equation, it is known to be ill-behaved, and thus very sensitive to the statistical noise present in the input M2 sin i distribution. To overcome that difficulty, we present two robust, independent approaches: (i) the formal solution of the integral equation is numerically computed after performing an optimal smoothing of the input distribution, (ii) the Lucy-Richardson algorithm is used to invert the integral equation. Both approaches give consistent results. The resulting statistical distribution of exoplanet true masses reveals that there is no reason to ascribe the transition between giant planets and brown dwarfs to the threshold mass for deuterium ignition (about 13 MJ). The M2 distribution shows instead that all the objects have M2 < 10 MJ, except the heavier candidates which cluster around 15 MJ. Comment: Accepted by Astronomy & Astrophysics (7 pages, 4 figures)
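The second approach named in the abstract above, Lucy-Richardson inversion, can be sketched on a discretised problem. This is an illustration under stated assumptions, not the paper's computation: the 5-bin toy kernel below (each true bin spreads uniformly over itself and all lower bins, mimicking the fact that M2 sin i never exceeds M2) stands in for the actual Abel-type kernel, and the "true" histogram is invented.

```python
def richardson_lucy(kernel, observed, iterations):
    """Iteratively invert observed = kernel @ truth for a non-negative truth.

    kernel[i][j] is the probability that a unit in true bin j is observed in
    bin i; each column is assumed to sum to 1.
    """
    n_obs, n_true = len(kernel), len(kernel[0])
    estimate = [sum(observed) / n_true] * n_true  # flat, strictly positive start
    for _ in range(iterations):
        predicted = [sum(kernel[i][j] * estimate[j] for j in range(n_true))
                     for i in range(n_obs)]
        ratio = [observed[i] / predicted[i] if predicted[i] > 0 else 0.0
                 for i in range(n_obs)]
        # Multiplicative update: back-project the data/prediction ratio.
        estimate = [estimate[j] * sum(kernel[i][j] * ratio[i] for i in range(n_obs))
                    for j in range(n_true)]
    return estimate


# Toy projection kernel (an assumption): true bin j spreads evenly over bins 0..j.
n_bins = 5
kernel = [[1.0 / (j + 1) if i <= j else 0.0 for j in range(n_bins)]
          for i in range(n_bins)]

truth = [4.0, 10.0, 30.0, 20.0, 6.0]  # hypothetical true histogram
observed = [sum(kernel[i][j] * truth[j] for j in range(n_bins))
            for i in range(n_bins)]   # noise-free projected histogram

flat_start = [sum(observed) / n_bins] * n_bins
recovered = richardson_lucy(kernel, observed, 2000)

error_start = sum(abs(a - b) for a, b in zip(flat_start, truth))
error_end = sum(abs(a - b) for a, b in zip(recovered, truth))
print(error_start, error_end)
```

Because every kernel column sums to 1, each update preserves the total count, and the estimate stays non-negative by construction, which is part of why the method is less sensitive to noise than the formal inverse.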

    Simultaneous interval regression for K-nearest neighbor

    Get PDF
    In some regression problems, it may be more reasonable to predict intervals rather than precise values. We are interested in finding intervals which, simultaneously for all input instances x ∈ X, contain a β proportion of the response values. We name this problem simultaneous interval regression. This is similar to simultaneous tolerance intervals for regression with a high confidence level γ ≈ 1, and several authors have already treated this problem for linear regression. Such intervals could be seen as a form of confidence envelope for the prediction variable given any value of the predictor variables in their domain. Tolerance intervals and simultaneous tolerance intervals have not yet been treated for the K-nearest neighbor (KNN) regression method. The goal of this paper is to consider the simultaneous interval regression problem for KNN, and this is done without the homoscedasticity assumption. In this scope, we propose a new interval regression method based on KNN which takes advantage of tolerance intervals in order to choose, for each instance, the value of the hyper-parameter K which will be a good trade-off between the precision and the uncertainty due to the limited sample size of the neighborhood around each instance. In the experiment part, our proposed interval construction method is compared with a more conventional interval approximation method on six benchmark regression data sets

    Non-linear regression models for Approximate Bayesian Computation

    Full text link
    Approximate Bayesian inference on the basis of summary statistics is well-suited to complex problems for which the likelihood is either mathematically or computationally intractable. However, the methods that use rejection suffer from the curse of dimensionality when the number of summary statistics is increased. Here we propose a machine-learning approach to the estimation of the posterior density by introducing two innovations. The new method fits a nonlinear conditional heteroscedastic regression of the parameter on the summary statistics, and then adaptively improves estimation using importance sampling. The new algorithm is compared to the state-of-the-art approximate Bayesian methods, and achieves considerable reduction of the computational burden in two examples of inference in statistical genetics and in a queueing model. Comment: 4 figures; version 3 minor changes; to appear in Statistics and Computing
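The idea of regressing the parameter on the summary statistics can be sketched on a toy problem. The code below is a simplified stand-in, not the method of the abstract above: it uses plain rejection followed by an ordinary linear, homoscedastic regression adjustment in place of the nonlinear heteroscedastic regression, and it omits the importance-sampling step. The model (normal data with unknown mean, uniform prior, sample mean as the single summary) and all names are assumptions made for illustration.

```python
import random
import statistics


def simulate_summary(theta, n_obs, rng):
    """Simulate n_obs draws from Normal(theta, 1) and return their sample mean."""
    return sum(rng.gauss(theta, 1.0) for _ in range(n_obs)) / n_obs


def abc_with_linear_adjustment(s_obs, n_sims, accept_fraction, n_obs, rng):
    """Rejection ABC followed by a linear regression adjustment.

    Returns (rejection_sample, adjusted_sample) for the parameter theta.
    """
    draws = []
    for _ in range(n_sims):
        theta = rng.uniform(-5.0, 5.0)  # hypothetical uniform prior
        draws.append((theta, simulate_summary(theta, n_obs, rng)))

    # Rejection step: keep the simulations whose summary is closest to s_obs.
    draws.sort(key=lambda pair: abs(pair[1] - s_obs))
    accepted = draws[: max(2, int(accept_fraction * n_sims))]

    thetas = [t for t, _ in accepted]
    summaries = [s for _, s in accepted]
    mean_t = sum(thetas) / len(thetas)
    mean_s = sum(summaries) / len(summaries)

    # Least-squares slope of theta on the summary among accepted draws.
    cov = sum((t - mean_t) * (s - mean_s) for t, s in accepted)
    var = sum((s - mean_s) ** 2 for s in summaries)
    slope = cov / var

    # Adjustment step: shift each theta as if its summary had equalled s_obs.
    adjusted = [t - slope * (s - s_obs) for t, s in accepted]
    return thetas, adjusted


rng = random.Random(0)
rejection_sample, adjusted_sample = abc_with_linear_adjustment(
    s_obs=1.0, n_sims=5000, accept_fraction=0.1, n_obs=20, rng=rng
)
print(statistics.stdev(rejection_sample), statistics.stdev(adjusted_sample))
```

In this toy setting the adjusted sample should be noticeably tighter than the plain rejection sample for the same number of simulations, which is the kind of computational saving a regression step is meant to deliver.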