85,862 research outputs found

    An overview of the goodness-of-fit test problem for copulas

    We review the main "omnibus procedures" for goodness-of-fit testing for copulas: tests based on the empirical copula process, on probability integral transformations, on Kendall's dependence function, etc., and some corresponding dimension-reduction techniques. The problems of finding asymptotically distribution-free test statistics and of calculating reliable p-values are discussed. Some particular cases, such as convenient tests for time-dependent copulas and for Archimedean or extreme-value copulas, are dealt with. Finally, the practical performance of the proposed approaches is briefly summarized.

    Building and using semiparametric tolerance regions for parametric multinomial models

    We introduce a semiparametric "tubular neighborhood" of a parametric model in the multinomial setting. It consists of all multinomial distributions lying in a distance-based neighborhood of the parametric model of interest. Fitting such a tubular model allows one to use a parametric model while treating it as an approximation to the true distribution. In this paper, the Kullback-Leibler distance is used to build the tubular region. Based on this idea, one can define the index of fit as the distance between the true multinomial distribution and the parametric model. The paper develops a likelihood ratio test (LRT) procedure for testing the magnitude of the index. A semiparametric bootstrap method is implemented to better approximate the distribution of the LRT statistic. The approximation permits more accurate construction of a lower confidence limit for the model fitting index. Comment: Published at http://dx.doi.org/10.1214/08-AOS603 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)
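
The idea of an index of fit defined as a Kullback-Leibler distance to a parametric family can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the parametric family is a hypothetical Binomial(m, theta) model, the divergence is taken in the direction KL(p || p_theta), and the minimisation is a plain grid search rather than the paper's procedure.

```python
import math


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) for discrete distributions.

    Cells with p_i == 0 contribute nothing to the sum.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def binomial_pmf(theta, m):
    """Hypothetical parametric model: cell probabilities of Binomial(m, theta)."""
    return [
        math.comb(m, k) * theta**k * (1 - theta) ** (m - k)
        for k in range(m + 1)
    ]


def index_of_fit(p, grid_size=999):
    """Smallest KL(p || p_theta) over an evenly spaced grid of theta in (0, 1).

    Returns the pair (distance, theta) at the grid minimum.
    """
    m = len(p) - 1
    candidates = (t / (grid_size + 1) for t in range(1, grid_size + 1))
    return min((kl_divergence(p, binomial_pmf(theta, m)), theta) for theta in candidates)
```

For a distribution that lies inside the assumed family, such as `binomial_pmf(0.5, 2)`, the index is zero. For `[0.5, 0.0, 0.5]`, which no binomial distribution matches, the index is strictly positive.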

    Tailor-made tests for goodness of fit to semiparametric hypotheses

    We introduce a new framework for constructing tests of general semiparametric hypotheses which have nontrivial power on the $n^{-1/2}$ scale in every direction, and can be tailored to put substantial power on alternatives of importance. The approach is based on combining test statistics based on stochastic processes of score statistics with bootstrap critical values. Comment: Published at http://dx.doi.org/10.1214/009053606000000137 in the Annals of Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    A simple and general test for white noise

    This article considers testing that a time series is uncorrelated when it possibly exhibits some form of dependence. In contrast to currently employed tests, which require arbitrary user-chosen numbers to compute the associated test statistics, we consider a test statistic that is very simple to use: it does not require any user-chosen number, and its asymptotic null distribution is standard under general weak dependence conditions, so asymptotic critical values are readily available. We consider the case of testing that the raw data is white noise, and also the case of applying the test to the residuals of an ARMA model. Finally, we study finite sample performance.

    Analyzing Network Traffic for Malicious Hacker Activity

    Since the Internet came into being in the 1970s, it has grown by more than 100% every year. Solutions for detecting network intrusion, on the other hand, have been far outpaced. The economic impact of malicious attacks in lost revenue to a single e-commerce company can vary from 66 thousand up to 53 million US dollars. At the same time, there is no effective mathematical model widely available to distinguish anomalous network behaviours, such as port scanning, system exploring, and virus and worm propagation, from normal traffic. PDS, proposed by Random Knowledge Inc., detects and localizes traffic patterns consistent with attacks hidden within large amounts of legitimate traffic. With the network's packet traffic stream as its input, PDS relies on high-fidelity models for normal traffic, from which it can critically judge the legitimacy of any substream of packet traffic. Because PDS relies on an accurate baseline model for normal network traffic, in this workshop we concentrate on modelling normal network traffic with a Poisson process.
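
As a rough illustration of what modelling traffic with a Poisson process involves, the sketch below simulates packet arrivals with exponential inter-arrival times and computes a dispersion index (variance of per-window counts divided by their mean), which should be close to 1 for a homogeneous Poisson process. The function names, the arrival rate, and the window width are all hypothetical choices for this example, and this is not a description of how PDS works.

```python
import random
from statistics import mean, pvariance


def simulate_arrivals(rate, horizon, seed=0):
    """Simulate arrival times of a homogeneous Poisson process on [0, horizon).

    Inter-arrival times are drawn as independent Exponential(rate) variables.
    """
    rng = random.Random(seed)
    t, arrivals = 0.0, []
    while True:
        t += rng.expovariate(rate)
        if t >= horizon:
            return arrivals
        arrivals.append(t)


def counts_per_window(arrivals, horizon, width=1.0):
    """Count arrivals in consecutive windows of the given width."""
    n_windows = int(horizon / width)
    counts = [0] * n_windows
    for t in arrivals:
        idx = int(t / width)
        if idx < n_windows:
            counts[idx] += 1
    return counts


def dispersion_index(counts):
    """Variance-to-mean ratio of the counts; near 1 under a Poisson model."""
    return pvariance(counts) / mean(counts)
```

Applied to observed packet timestamps instead of simulated ones, a dispersion index far above 1 would suggest burstier traffic than a homogeneous Poisson baseline predicts.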

    Model Assessment Tools for a Model False World

    A standard goal of model evaluation and selection is to find a model that approximates the truth well while being as parsimonious as possible. In this paper we emphasize the point of view that the models under consideration are almost always false, if viewed realistically, and so we should analyze model adequacy from that point of view. We investigate this issue in large samples by looking at a model credibility index, which is designed to serve as a one-number summary measure of model adequacy. We define the index to be the maximum sample size at which samples from the model and those from the true data generating mechanism are nearly indistinguishable. We use standard notions from hypothesis testing to make this definition precise. We use data subsampling to estimate the index. We show that the definition leads us to some new ways of viewing models as flawed but useful. The concept is an extension of the work of Davies [Statist. Neerlandica 49 (1995) 185-245]. Comment: Published at http://dx.doi.org/10.1214/09-STS302 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org)

    A Linear-Time Kernel Goodness-of-Fit Test

    We propose a novel adaptive test of goodness-of-fit, with computational cost linear in the number of samples. We learn the test features that best indicate the differences between observed samples and a reference model, by minimizing the false negative rate. These features are constructed via Stein's method, meaning that it is not necessary to compute the normalising constant of the model. We analyse the asymptotic Bahadur efficiency of the new test, and prove that under a mean-shift alternative, our test always has greater relative efficiency than a previous linear-time kernel test, regardless of the choice of parameters for that test. In experiments, the performance of our method exceeds that of the earlier linear-time test, and matches or exceeds the power of a quadratic-time kernel test. In high dimensions and where model structure may be exploited, our goodness-of-fit test performs far better than a quadratic-time two-sample test based on the Maximum Mean Discrepancy, with samples drawn from the model. Comment: Accepted to NIPS 201

    Markov models for fMRI correlation structure: is brain functional connectivity small world, or decomposable into networks?

    Correlations in the signal observed via functional Magnetic Resonance Imaging (fMRI) are expected to reveal the interactions in the underlying neural populations through the hemodynamic response. In particular, they highlight distributed sets of mutually correlated regions that correspond to brain networks related to different cognitive functions. Yet graph-theoretical studies of neural connections give a different picture: that of a highly integrated system with small-world properties, with local clustering but short pathways across the complete structure. We examine the conditional independence properties of the fMRI signal, i.e. its Markov structure, to find realistic assumptions on the connectivity structure that are required to explain the observed functional connectivity. In particular, we seek a decomposition of the Markov structure into segregated functional networks using decomposable graphs: a set of strongly-connected and partially overlapping cliques. We introduce a new method to efficiently extract such cliques on a large, strongly-connected graph. We compare methods that learn different graph structures from functional connectivity by testing the goodness of fit of the models they learn on new data. We find that summarizing the structure as strongly-connected networks can give a good description only for very large and overlapping networks. These results highlight that Markov models are good tools to identify the structure of brain connectivity from fMRI signals, but for this purpose they must reflect the small-world properties of the underlying neural systems.