4,075 research outputs found

    Model selection in High-Dimensions: A Quadratic-risk based approach

    In this article we propose a general class of risk measures that can be used for data-based evaluation of parametric models. The loss function is defined as a generalized quadratic distance between the true density and the proposed model. These distances are characterized by a simple quadratic-form structure that is adaptable through the choice of a nonnegative definite kernel and a bandwidth parameter. Using asymptotic results for the quadratic distances, we build a quick-to-compute approximation to the risk function. Its derivation is analogous to that of the Akaike Information Criterion (AIC), but unlike AIC, the quadratic risk is a global comparison tool. The method does not require resampling, a great advantage when point estimators are expensive to compute. The method is illustrated on the problem of selecting the number of components in a mixture model, where we show that, with an appropriate kernel, the method is computationally straightforward in arbitrarily high data dimensions. In this same context we show that the method has clear advantages over AIC and BIC. Comment: Updated with reviewer suggestions
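    The kernel-based quadratic distance described above can be sketched numerically. The snippet below is an illustrative approximation of ours, not the paper's estimator: it computes an empirical kernel quadratic distance between a sample and a candidate normal model, using the closed-form Gaussian convolution for the data-model and model-model terms (all function names and parameter values are assumptions for illustration):

```python
import numpy as np

def gauss_pdf(x, mu, var):
    """Normal density with mean mu and variance var."""
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def quadratic_distance(x, mu, sigma2, h2):
    """Empirical kernel quadratic distance between the sample x and a
    N(mu, sigma2) model, with a Gaussian kernel of bandwidth^2 = h2.
    Data-model and model-model terms use the Gaussian convolution formula."""
    diff = x[:, None] - x[None, :]
    term_dd = gauss_pdf(diff, 0.0, h2).mean()                # data-data
    term_dm = gauss_pdf(x, mu, sigma2 + h2).mean()           # data-model
    term_mm = 1.0 / np.sqrt(2 * np.pi * (2 * sigma2 + h2))   # model-model
    return term_dd - 2 * term_dm + term_mm

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 500)
d_good = quadratic_distance(x, 0.0, 1.0, 0.25)   # well-specified model
d_bad = quadratic_distance(x, 3.0, 1.0, 0.25)    # badly mis-specified model
```

    A mis-specified model (a normal centered at 3 instead of 0) yields a visibly larger distance, which is exactly what a risk-based model comparison exploits.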

    Prediction and Generation of Binary Markov Processes: Can a Finite-State Fox Catch a Markov Mouse?

    Understanding the generative mechanism of a natural system is a vital component of the scientific method. Here, we investigate one of the fundamental steps toward this goal by presenting the minimal generator of an arbitrary binary Markov process. This is a class of processes whose predictive model is well known. Surprisingly, the generative model requires three distinct topologies for different regions of parameter space. We show that a previously proposed generator for a particular set of binary Markov processes is, in fact, not minimal. Our results shed the first quantitative light on the relative (minimal) costs of prediction and generation. We find, for instance, that the difference between prediction and generation is maximized when the process is approximately independent and identically distributed. Comment: 12 pages, 12 figures; http://csc.ucdavis.edu/~cmg/compmech/pubs/gmc.ht
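    For concreteness, a binary Markov process of the kind discussed above is fully specified by two transition probabilities. The sketch below (our own illustration, unrelated to the paper's generator construction) simulates such a chain and checks its stationary distribution; the process approaches i.i.d. as the two transition probabilities sum to 1:

```python
import numpy as np

def simulate_binary_markov(p01, p10, n, seed=0):
    """Sample a binary Markov chain with p01 = P(next=1 | current=0)
    and p10 = P(next=0 | current=1)."""
    rng = np.random.default_rng(seed)
    x = np.empty(n, dtype=int)
    x[0] = 0
    for t in range(1, n):
        if x[t - 1] == 0:
            x[t] = rng.random() < p01      # leave state 0 with prob p01
        else:
            x[t] = rng.random() >= p10     # stay in state 1 with prob 1 - p10
    return x

x = simulate_binary_markov(0.3, 0.6, 100_000)
pi1 = x.mean()   # should approach p01 / (p01 + p10) = 1/3
```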

    Dynamic fluctuations in the superconductivity of NbN films from microwave conductivity measurements

    We have measured the frequency and temperature dependences of the complex ac conductivity, \sigma(\omega)=\sigma_1(\omega)-i\sigma_2(\omega), of NbN films in zero magnetic field between 0.1 and 10 GHz using a broadband microwave technique. In the vicinity of the superconducting critical temperature, Tc, both \sigma_1(\omega) and \sigma_2(\omega) showed a rapid increase in the low-frequency limit owing to superconducting fluctuations. For films thinner than 300 nm, the frequency and temperature dependences of the fluctuation conductivity, \sigma(\omega,T), were successfully collapsed onto a single scaling function, consistent with the Aslamazov-Larkin model for the two-dimensional (2D) case. For thicker films, the \sigma(\omega,T) data could not be scaled, but indicated that a dimensional crossover from three dimensions (3D) to 2D occurred as the temperature approached Tc from above. This provides a good reference for the ac fluctuation conductivity of more exotic superconductors of current interest. Comment: 8 pages, 7 Figures, 1 Table, Accepted for publication in PR
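    In the 2D limit, the Aslamazov-Larkin prediction for the dc fluctuation conductivity of a film has the simple closed form \sigma_AL = e^2 / (16 \hbar d \epsilon), with film thickness d and reduced temperature \epsilon = (T - Tc)/Tc. The snippet below just evaluates this textbook formula; the thickness and reduced temperatures are arbitrary illustrative values, not values from the paper:

```python
import math

E = 1.602176634e-19      # elementary charge [C]
HBAR = 1.054571817e-34   # reduced Planck constant [J s]

def sigma_al_2d(eps, d):
    """Aslamazov-Larkin dc fluctuation conductivity [S/m] of a 2D film
    of thickness d [m] at reduced temperature eps = (T - Tc)/Tc > 0."""
    return E ** 2 / (16 * HBAR * d * eps)

s_close = sigma_al_2d(0.01, 100e-9)   # close to Tc: large fluctuation term
s_far = sigma_al_2d(0.10, 100e-9)     # further above Tc: smaller term
```

    The 1/\epsilon divergence as T approaches Tc from above is what drives the rapid low-frequency increase of \sigma_1 described in the abstract.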

    Adaptive density estimation for stationary processes

    We propose an algorithm to estimate the common density s of a stationary process X_1, ..., X_n. We suppose that the process is either \beta-mixing or \tau-mixing. We provide a model selection procedure based on a generalization of Mallows' C_p, and we prove oracle inequalities for the selected estimator under a few prior assumptions on the collection of models and on the mixing coefficients. We prove that our estimator is adaptive over a class of Besov spaces; namely, we prove that it achieves the same rates of convergence as in the i.i.d. framework.
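    To make the Mallows-type selection concrete, here is a deliberately simplified i.i.d. sketch (the paper handles mixing processes and general model collections): regular histogram density estimators on [0, 1] compared through a penalized least-squares criterion. The penalty constant 2 and the grid of bin counts are illustrative choices of ours:

```python
import numpy as np

def cp_criterion(x, m):
    """Mallows-type penalized least-squares criterion for the regular
    m-bin histogram density estimator on [0, 1] (i.i.d. sketch):
    -||shat||^2 estimate plus a 2m/n penalty."""
    n = len(x)
    counts, _ = np.histogram(x, bins=m, range=(0.0, 1.0))
    phat = counts / n
    return -m * np.sum(phat ** 2) + 2 * m / n

rng = np.random.default_rng(1)
x = rng.random(2000) / 2          # density 2 on [0, 0.5], 0 elsewhere
grid = [1, 2, 4, 8, 16, 32]
best_m = min(grid, key=lambda m: cp_criterion(x, m))
```

    For data supported on half the interval, a single bin underfits badly, and the criterion correctly rejects m = 1 in favor of a finer partition.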

    Analyzing the House Fly's Exploratory Behavior with Autoregression Methods

    This paper presents a detailed characterization of the trajectory of a single housefly with free range in a square cage. The trajectory of the fly was recorded and transformed into a time series, which was analyzed using an autoregressive model; such a model describes a stationary time series as a linear regression on prior state values plus white noise. The main discovery was that the fly switched styles of motion from a low-dimensional regular pattern to a higher-dimensional disordered pattern. This exploratory behavior is, irrespective of the presence of food, characterized by anomalous diffusion. Comment: 20 pages, 9 figures, 1 table, full paper
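    An autoregressive model of the kind used above can be fitted by ordinary least squares on lagged values. This toy sketch (our own, not the paper's analysis pipeline) recovers the coefficients of a simulated AR(2) series:

```python
import numpy as np

def fit_ar(x, p):
    """Least-squares fit of x[t] = a_1 x[t-1] + ... + a_p x[t-p] + noise."""
    n = len(x)
    y = x[p:]
    # Column k holds the lag-(k+1) values aligned with y.
    X = np.column_stack([x[p - 1 - k : n - 1 - k] for k in range(p)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

rng = np.random.default_rng(2)
n = 10_000
x = np.zeros(n)
eps = rng.normal(size=n)                 # white-noise driving term
for t in range(2, n):
    x[t] = 0.5 * x[t - 1] - 0.3 * x[t - 2] + eps[t]

a = fit_ar(x, 2)   # should be close to (0.5, -0.3)
```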

    Uncovering predictability in the evolution of the WTI oil futures curve

    Accurately forecasting the price of oil, the world's most actively traded commodity, is of great importance to both academics and practitioners. We contribute by proposing a method based on functional time series to model and forecast oil futures curves. Our approach offers a number of theoretical and practical advantages, including effectively exploiting underlying process dynamics missed by classical discrete approaches. We evaluate the finite-sample performance against established benchmarks using a model confidence set test. A realistic out-of-sample exercise provides strong support for the adoption of our approach, which resides in the superior set of models in all considered instances. Comment: 28 pages, 4 figures, to appear in European Financial Management
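    A minimal version of the functional idea is: treat each day's futures curve as one observation, extract functional principal components, and forecast the component scores with a scalar time-series model. The sketch below does this on synthetic curves driven by a single AR(1) factor; the sizes, the sine-shaped factor, and the AR(1) choice are our illustrative assumptions, not the paper's specification:

```python
import numpy as np

rng = np.random.default_rng(3)
T, M, phi = 300, 20, 0.9
grid = np.linspace(0, 1, M)
basis = np.sin(np.pi * grid)              # single "shape" factor
scores = np.zeros(T)
for t in range(1, T):
    scores[t] = phi * scores[t - 1] + rng.normal()
curves = 1.0 + scores[:, None] * basis + 0.05 * rng.normal(size=(T, M))

# Functional PCA via SVD of the centered curve matrix
mean_curve = curves.mean(axis=0)
U, S, Vt = np.linalg.svd(curves - mean_curve, full_matrices=False)
pc = Vt[0]                                # leading functional component
s_hat = (curves - mean_curve) @ pc        # its score time series

# Forecast the scores with a fitted AR(1), then rebuild the next curve
phi_hat = np.dot(s_hat[1:], s_hat[:-1]) / np.dot(s_hat[:-1], s_hat[:-1])
next_curve = mean_curve + phi_hat * s_hat[-1] * pc
```

    The forecast of a whole curve reduces to forecasting a few scalar score series, which is the practical appeal of the functional approach.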

    Large-scale structure of time evolving citation networks

    In this paper we examine a number of methods for probing and understanding the large-scale structure of networks that evolve over time. We focus in particular on citation networks, networks of references between documents such as papers, patents, or court cases. We describe three different methods of analysis: one based on an expectation-maximization algorithm, one based on modularity optimization, and one based on eigenvector centrality. Using the network of citations between opinions of the United States Supreme Court as an example, we demonstrate how each of these methods can reveal significant structural divisions in the network, and how, ultimately, the combination of all three can help us develop a coherent overall picture of the network's shape. Comment: 10 pages, 6 figures; journal names for 4 references fixed
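    Of the three methods, eigenvector centrality is the easiest to illustrate. The sketch below runs power iteration on a tiny made-up citation graph, symmetrized so that the standard centrality is well defined (a real citation network is directed and acyclic, which is why more careful constructions are needed there):

```python
import numpy as np

# Toy citation graph: edge (i, j) means document i cites document j.
# Symmetrized adjacency so eigenvector centrality is well defined.
edges = [(1, 0), (2, 0), (3, 0), (4, 0), (3, 2)]
n = 5
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

v = np.ones(n)
for _ in range(200):          # power iteration toward the Perron vector
    v = A @ v
    v /= np.linalg.norm(v)

most_central = int(np.argmax(v))   # document 0 is cited by everyone
```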

    Detecting periodicity in experimental data using linear modeling techniques

    Fourier spectral estimates and, to a lesser extent, the autocorrelation function are the primary tools used to detect periodicities in experimental data in the physical and biological sciences. We propose a new method that is more reliable than traditional techniques and can clearly identify periodic behavior where traditional techniques cannot. The technique is based on an information-theoretic reduction of linear (autoregressive) models, so that only the essential features of an autoregressive model are retained. We call these models reduced autoregressive models (RARM). The essential features of reduced autoregressive models include any periodicity present in the data. We provide theoretical and numerical evidence, from both experimental and artificial data, that this technique will reliably detect periodicities if and only if they are present in the data. There are strong information-theoretic arguments to support the statement that RARM detects periodicities if they are present; surrogate data techniques are used to ensure the converse. Furthermore, our calculations demonstrate that RARM is more robust, more accurate, and more sensitive than traditional spectral techniques. Comment: 10 pages (revtex) and 6 figures. To appear in Phys Rev E. Modified style
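    The core of the idea, that the retained lags of a reduced autoregressive model reveal periodicity, can be seen in a stripped-down toy: scan one-lag AR models and let AIC pick the lag. For a period-7 signal the criterion locks onto lag 7. The signal, noise level, and lag range are our own illustrative choices; RARM itself searches over general lag subsets:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 2000
t = np.arange(n)
x = np.sin(2 * np.pi * t / 7) + 0.2 * rng.normal(size=n)  # period-7 signal

def aic_single_lag(x, k):
    """AIC of the one-lag model x[t] = b * x[t-k] + c + noise,
    fitted by least squares (2 parameters)."""
    y, z = x[k:], x[:-k]
    X = np.column_stack([z, np.ones_like(z)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    rss = np.sum((y - X @ coef) ** 2)
    return len(y) * np.log(rss / len(y)) + 2 * 2

best_lag = min(range(1, 11), key=lambda k: aic_single_lag(x, k))
```

    The lag-7 model fits almost perfectly (its residual is essentially the noise), so the information criterion singles it out; a spectral peak test would need an explicit significance threshold to say the same.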

    An approximate Bayesian marginal likelihood approach for estimating finite mixtures

    Estimation of finite mixture models when the support of the mixing distribution is unknown is an important problem. This paper gives a new approach based on a marginal likelihood for the unknown support. Motivated by a Bayesian Dirichlet prior model, a computationally efficient stochastic approximation version of the marginal likelihood is proposed and large-sample theory is presented. By restricting the support to a finite grid, a simulated annealing method is employed to maximize the marginal likelihood and estimate the support. Real and simulated data examples show that this novel stochastic approximation-simulated annealing procedure compares favorably to existing methods. Comment: 16 pages, 1 figure, 3 tables
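    The grid-restricted simulated annealing step can be sketched as follows. Note this toy maximizes a plain mixture log-likelihood rather than the paper's stochastic-approximation marginal likelihood, and the grid, cooling schedule, and two-point support are our illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)
data = np.concatenate([rng.normal(0, 1, 500), rng.normal(4, 1, 500)])

grid = np.arange(-2.0, 6.5, 0.5)          # candidate support points

def log_lik(support):
    """Log-likelihood of an equal-weight, unit-variance normal mixture."""
    d = data[:, None] - np.asarray(support)[None, :]
    comp = np.exp(-0.5 * d ** 2) / np.sqrt(2 * np.pi)
    return np.sum(np.log(comp.mean(axis=1)))

# Simulated annealing over two-point supports drawn from the grid
state = [grid[0], grid[1]]
best, best_ll = list(state), log_lik(state)
ll = best_ll
for it in range(2000):
    temp = max(0.01, 1.0 - it / 2000)      # linear cooling schedule
    cand = list(state)
    cand[rng.integers(2)] = grid[rng.integers(len(grid))]
    cand_ll = log_lik(cand)
    # Metropolis acceptance: always take improvements, sometimes worse moves
    if cand_ll > ll or rng.random() < np.exp((cand_ll - ll) / temp):
        state, ll = cand, cand_ll
        if ll > best_ll:
            best, best_ll = list(state), ll
best.sort()
```

    Tracking the best state visited guarantees the returned support is at least as good as the starting one; here it should land near the true component means 0 and 4.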