
    Information Splitting for Big Data Analytics

    Many statistical models require estimation of unknown (co)variance parameters. The estimate is usually obtained by maximizing a log-likelihood that involves log-determinant terms. In principle, one requires the \emph{observed information}---the negative Hessian matrix, i.e., the second derivative of the log-likelihood---to obtain an accurate maximum likelihood estimator via the Newton method. When one instead uses the \emph{Fisher information}, the expected value of the observed information, a simpler algorithm than the Newton method is obtained: the Fisher scoring algorithm. With advances in high-throughput technologies in the biological sciences, recommendation systems and social networks, the sizes of data sets---and the corresponding statistical models---have suddenly increased by several orders of magnitude. Neither the observed information nor the Fisher information is easy to obtain for these big data sets. This paper introduces an information splitting technique to simplify the computation. After splitting the mean of the observed information and the Fisher information, a simpler approximate Hessian matrix for the log-likelihood can be obtained. This approximate Hessian matrix significantly reduces computation and makes the linear mixed model applicable to big data sets. The splitting and the resulting simpler formulas depend heavily on matrix algebra transforms, and are applicable to large-scale breeding models and genome-wide association analysis. Comment: arXiv admin note: text overlap with arXiv:1605.0764
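    The Fisher scoring algorithm the abstract contrasts with Newton's method can be sketched in a few lines. This is a minimal, generic illustration (logistic regression, not the paper's information-splitting method): the Fisher information X^T W X replaces the observed Hessian in a Newton-style update.

```python
# Minimal Fisher scoring sketch for logistic regression (illustrative only).
import numpy as np

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(200), rng.normal(size=200)])
true_beta = np.array([0.5, -1.0])
p = 1.0 / (1.0 + np.exp(-X @ true_beta))
y = rng.binomial(1, p)

beta = np.zeros(2)
for _ in range(25):
    mu = 1.0 / (1.0 + np.exp(-X @ beta))
    score = X.T @ (y - mu)              # gradient of the log-likelihood
    w = mu * (1.0 - mu)                 # per-observation variance weights
    fisher = X.T @ (w[:, None] * X)     # Fisher information matrix
    beta = beta + np.linalg.solve(fisher, score)
```

For big data the bottleneck is forming and solving against `fisher`, which is exactly the cost the paper's splitting technique targets.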

    Dropout Training as Adaptive Regularization

    Dropout and other feature noising schemes control overfitting by artificially corrupting the training data. For generalized linear models, dropout performs a form of adaptive regularization. Using this viewpoint, we show that the dropout regularizer is first-order equivalent to an L2 regularizer applied after scaling the features by an estimate of the inverse diagonal Fisher information matrix. We also establish a connection to AdaGrad, an online learning algorithm, and find that a close relative of AdaGrad operates by repeatedly solving linear dropout-regularized problems. By casting dropout as regularization, we develop a natural semi-supervised algorithm that uses unlabeled data to create a better adaptive regularizer. We apply this idea to document classification tasks, and show that it consistently boosts the performance of dropout training, improving on state-of-the-art results on the IMDB reviews dataset.Comment: 11 pages. Advances in Neural Information Processing Systems (NIPS), 201
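    The first-order equivalence described above can be made concrete. Below is a hedged sketch (names are illustrative, not from the paper's code) of the adaptive penalty for a logistic model: an L2 penalty on the coefficients, weighted per-feature by the diagonal of the Fisher information X^T diag(mu(1-mu)) X.

```python
# Sketch of a dropout-style adaptive L2 penalty for logistic regression.
import numpy as np

def adaptive_l2_penalty(X, beta, delta=0.5):
    """Quadratic penalty in beta, weighted per-feature by an estimate of
    the diagonal Fisher information; delta is the dropout probability."""
    mu = 1.0 / (1.0 + np.exp(-X @ beta))
    w = mu * (1.0 - mu)
    fisher_diag = (w[:, None] * X**2).sum(axis=0)
    return 0.5 * delta / (1.0 - delta) * np.sum(fisher_diag * beta**2)
```

Features whose curvature (Fisher weight) is small are penalized less, which is the "adaptive" part of the regularizer.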

    Optimal inputs for system identification

    Identification criteria are presented for linear dynamic systems with and without process noise. With process noise, the state equations are replaced by the Kalman filter equations. If the identification performance index is expanded in a Taylor's series with respect to the parameters to be identified, then maximizing the weighting factor of the quadratic term with respect to the inputs will ensure that an identification algorithm converges more rapidly, and to a more accurate result, than with nonoptimal inputs. The expectation of this weighting factor is the Fisher information matrix, and its inverse is a lower bound for the covariance of the parameters. Direct and indirect methods of calculating the information matrix are presented for systems with and without process noise.
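    A toy scalar example, under simplifying assumptions not in the abstract, shows why input choice matters: for observations y_k = a*u_k + noise with variance s2, the Fisher information for a is sum(u_k^2)/s2, so the covariance lower bound tightens as input energy grows.

```python
# Toy input-design illustration: larger-energy inputs tighten the
# Cramer-Rao lower bound var(a_hat) >= s2 / sum(u_k^2).
import numpy as np

def fisher_info_scalar(u, s2):
    u = np.asarray(u, dtype=float)
    return np.sum(u**2) / s2

s2 = 0.25
weak = fisher_info_scalar([0.1] * 10, s2)     # low-energy input
strong = fisher_info_scalar([1.0] * 10, s2)   # high-energy input
crlb_weak, crlb_strong = 1.0 / weak, 1.0 / strong
```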

    On the solution of Stein's equation and Fisher information matrix of an ARMAX process

    The main goal of this paper consists in expressing the solution of a Stein equation in terms of the Fisher information matrix (FIM) of a scalar ARMAX process. A condition for expressing the FIM in terms of a solution to a Stein equation is also set forth. Such interconnections can be derived when a companion matrix with eigenvalues equal to the roots of an appropriate polynomial associated with the ARMAX process is inserted in the Stein equation. The case of algebraic multiplicity greater than or equal to one is studied. The FIM and the corresponding solution to Stein’s equation are presented as solutions to systems of linear equations. The interconnections are obtained by using the common particular solution of these systems. The kernels of the structured coefficient matrices are described as well as some right inverses. This enables us to find a solution to the newly obtained linear system of equations.
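    As an illustrative sketch (not the paper's construction), a Stein equation S - A S A^T = Q with A a companion matrix of a stable polynomial can be solved by vectorization; the paper develops how such solutions interconnect with the ARMAX FIM.

```python
# Solve the Stein equation S - A S A^T = Q for a companion matrix A
# built from the stable polynomial z^2 - 0.5 z - 0.2 (illustrative values).
import numpy as np

A = np.array([[0.5, 0.2],
              [1.0, 0.0]])   # companion matrix
Q = np.eye(2)

n = A.shape[0]
# vec(S - A S A^T) = (I - A kron A) vec(S)  (row-major flattening)
K = np.eye(n * n) - np.kron(A, A)
S = np.linalg.solve(K, Q.reshape(-1)).reshape(n, n)
```

The Kronecker solve is O(n^6) and only sensible for small companion matrices; it is used here purely to make the equation concrete.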

    Delineating Parameter Unidentifiabilities in Complex Models

    Scientists use mathematical modelling to understand and predict the properties of complex physical systems. In highly parameterised models there often exist relationships between parameters over which model predictions are identical, or nearly so. These are known as structural or practical unidentifiabilities, respectively. They are hard to diagnose and make reliable parameter estimation from data impossible. They furthermore imply the existence of an underlying model simplification. We describe a scalable method for detecting unidentifiabilities, and the functional relations defining them, for generic models. This allows for model simplification, and appreciation of which parameters (or functions thereof) cannot be estimated from data. Our algorithm can identify features such as redundant mechanisms and fast timescale subsystems, as well as the regimes in which such approximations are valid. We base our algorithm on a novel quantification of regional parametric sensitivity: multiscale sloppiness. Traditionally, the link between parametric sensitivity and the conditioning of the parameter estimation problem is made locally, through the Fisher Information Matrix. This is valid in the regime of infinitesimal measurement uncertainty. We demonstrate the duality between multiscale sloppiness and the geometry of confidence regions surrounding parameter estimates made where measurement uncertainty is non-negligible. Further theoretical relationships are provided linking multiscale sloppiness to the likelihood-ratio test. From this, we show that a local sensitivity analysis (as typically done) is insufficient for determining the reliability of parameter estimation, even with simple (non)linear systems. Our algorithm provides a tractable alternative. We finally apply our methods to a large-scale, benchmark Systems Biology model of NF-κB, uncovering previously unknown unidentifiabilities.
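    The traditional local analysis the abstract refers to can be sketched with a textbook sloppy model (this example is illustrative, not from the paper): for y(t) = exp(-k1 t) + exp(-k2 t), the FIM is J^T J / sigma^2, and at k1 = k2 one eigenvalue collapses to zero, flagging a locally unidentifiable parameter combination.

```python
# Local sensitivity via the FIM for a two-exponential decay model.
import numpy as np

def fim(k1, k2, t, sigma=1.0):
    # Jacobian of y(t) = exp(-k1 t) + exp(-k2 t) w.r.t. (k1, k2)
    J = np.column_stack([-t * np.exp(-k1 * t), -t * np.exp(-k2 * t)])
    return J.T @ J / sigma**2

t = np.linspace(0.1, 5.0, 50)
eig_degen = np.linalg.eigvalsh(fim(1.0, 1.0, t))   # k1 == k2: rank deficient
eig_ident = np.linalg.eigvalsh(fim(0.5, 2.0, t))   # distinct rates
```

The paper's point is that this eigenvalue picture is only trustworthy for infinitesimal measurement noise; multiscale sloppiness extends the diagnosis to finite uncertainty.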

    Quantum speed limits on operator flows and correlation functions

    Quantum speed limits (QSLs) identify fundamental time scales of physical processes by providing lower bounds on the rate of change of a quantum state or the expectation value of an observable. We introduce a generalization of QSL for unitary operator flows, which are ubiquitous in physics and relevant for applications in both the quantum and classical domains. We derive two types of QSLs and assess the existence of a crossover between them, that we illustrate with a qubit and a random matrix Hamiltonian, as canonical examples. We further apply our results to the time evolution of autocorrelation functions, obtaining computable constraints on the linear dynamical response of quantum systems out of equilibrium and the quantum Fisher information governing the precision in quantum parameter estimation.Comment: 14 pages, 4 figure
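    For orientation, the canonical bound of this type (the Mandelstam–Tamm construction, stated here for background; it is not the paper's generalization to operator flows) limits how fast an observable's expectation value can change:

```latex
% Heisenberg evolution of an observable A under Hamiltonian H:
%   \frac{\mathrm{d}\langle A\rangle}{\mathrm{d}t}
%     = \frac{i}{\hbar}\,\langle [H, A]\rangle .
% Robertson's uncertainty relation \Delta H\,\Delta A \ge \tfrac{1}{2}|\langle [H,A]\rangle|
% then yields the speed limit
\left| \frac{\mathrm{d}\langle A\rangle}{\mathrm{d}t} \right|
  = \frac{1}{\hbar}\,\bigl|\langle [H, A]\rangle\bigr|
  \le \frac{2}{\hbar}\,\Delta H\,\Delta A .
```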

    Fisher information matrix for single molecules with stochastic trajectories

    Tracking of objects in cellular environments has become a vital tool in molecular cell biology. A particularly important example is single molecule tracking, which enables the study of the motion of a molecule in cellular environments and provides quantitative information on the behavior of individual molecules that was not previously available through bulk studies. Here, we consider a dynamical system where the motion of an object is modeled by stochastic differential equations (SDEs), and measurements are the detected photons emitted by the moving fluorescently labeled object, which occur at discrete time points corresponding to the arrival times of a Poisson process, in contrast to the uniform time points commonly used in similar dynamical systems. The measurements are distributed according to optical diffraction theory, and are therefore modeled by different distributions, e.g., a Born and Wolf profile for an out-of-focus molecule. For some special circumstances, Gaussian image models have been proposed. In this paper, we introduce a stochastic framework in which we calculate the maximum likelihood estimates of the biophysical parameters of the molecular interactions, e.g., diffusion and drift coefficients. More importantly, we develop a general framework to calculate the Cram\'er-Rao lower bound (CRLB), given by the inverse of the Fisher information matrix, for the estimation of unknown parameters, and use it as a benchmark in the evaluation of the standard deviation of the estimates. There exists no established method, even for Gaussian measurements, to systematically calculate the CRLB for the general motion model that we consider in this paper. We apply the developed methodology to simulated data of a molecule with linear trajectories and show that the standard deviation of the estimates matches well with the square root of the CRLB.
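    The benchmark idea in the last sentence can be illustrated with a drastically simplified toy model (Gaussian noise at fixed times, not the paper's Poisson-arrival photon model): estimate the speed of a linearly moving object many times and compare the empirical spread of the estimates against sqrt(CRLB) = 1/sqrt(Fisher information).

```python
# Toy CRLB benchmark for a linear trajectory x(t) = v t observed with
# Gaussian noise; ML estimate of v is least squares through the origin.
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.1, 2.0, 20)
sigma, v_true = 0.3, 1.5
fisher = np.sum(t**2) / sigma**2    # Fisher information for v
crlb = 1.0 / fisher

estimates = []
for _ in range(2000):
    x = v_true * t + rng.normal(scale=sigma, size=t.size)
    estimates.append(np.sum(t * x) / np.sum(t**2))
emp_std = np.std(estimates)         # should match sqrt(crlb) closely
```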

    Limitations of the Empirical Fisher Approximation for Natural Gradient Descent

    Natural gradient descent, which preconditions a gradient descent update with the Fisher information matrix of the underlying statistical model, is a way to capture partial second-order information. Several highly visible works have advocated an approximation known as the empirical Fisher, drawing connections between approximate second-order methods and heuristics like Adam. We dispute this argument by showing that the empirical Fisher---unlike the Fisher---does not generally capture second-order information. We further argue that the conditions under which the empirical Fisher approaches the Fisher (and the Hessian) are unlikely to be met in practice, and that, even on simple optimization problems, the pathologies of the empirical Fisher can have undesirable effects.Comment: V3: Minor corrections (typographic errors
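    The distinction can be shown numerically with a logistic model (a minimal sketch in the spirit of the abstract, not the paper's experiments): the true Fisher is X^T diag(mu(1-mu)) X, while the empirical Fisher is the sum of outer products of per-example gradients. At a parameter the model fits confidently, the residuals, and hence the empirical Fisher, shrink toward zero while the true Fisher does not follow suit.

```python
# Empirical Fisher vs. true Fisher for logistic regression at a
# well-fitting, confident parameter.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(500, 3))
beta = np.array([3.0, -3.0, 0.5])      # a "confident" parameter
mu = 1.0 / (1.0 + np.exp(-X @ beta))
y = (mu > 0.5).astype(float)           # labels the model fits well

residual = y - mu
G = residual[:, None] * X              # per-example gradients
emp_fisher = G.T @ G                   # empirical Fisher
true_fisher = X.T @ ((mu * (1 - mu))[:, None] * X)
```

Preconditioning with `emp_fisher` near such a point would rescale the update very differently from natural gradient descent with `true_fisher`, which is the pathology the paper analyzes.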