
    Recent advances in directional statistics

    Mainstream statistical methodology is generally applicable to data observed in Euclidean space. There are, however, numerous contexts of considerable scientific interest in which the natural supports for the data under consideration are Riemannian manifolds such as the unit circle, torus, sphere and their extensions. Typically, such data can be represented using one or more directions, and directional statistics is the branch of statistics that deals with their analysis. In this paper we provide a review of the many recent developments in the field since the publication of Mardia and Jupp (1999), still the most comprehensive text on directional statistics. Many of those developments have been stimulated by interesting applications in fields as diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics, image analysis, text mining, environmetrics, and machine learning. We begin by considering developments for the exploratory analysis of directional data before progressing to distributional models, general approaches to inference, hypothesis testing, regression, nonparametric curve estimation, methods for dimension reduction, classification and clustering, and the modelling of time series, spatial and spatio-temporal data. An overview of currently available software for analysing directional data is also provided, and potential future developments discussed.
    Comment: 61 pages
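As a small illustration of why Euclidean summaries can mislead for directional data, the sketch below compares the arithmetic mean of two angles with their circular mean. The function name `circular_mean` and the sample angles are assumptions made for this example; they are not taken from the reviewed paper.

```python
import numpy as np

def circular_mean(angles_rad):
    """Return the mean direction in (-pi, pi] and the mean resultant length.

    Angles are mapped to unit vectors on the circle, the vectors are
    averaged, and the direction of the average vector is read back.
    """
    s = np.mean(np.sin(angles_rad))
    c = np.mean(np.cos(angles_rad))
    return np.arctan2(s, c), np.hypot(s, c)

# Two directions close to north, on either side of the 0/360 degree cut.
angles_deg = np.array([350.0, 10.0])

arithmetic_mean_deg = angles_deg.mean()  # 180.0, which points due south
mean_rad, resultant_length = circular_mean(np.deg2rad(angles_deg))
circular_mean_deg = np.rad2deg(mean_rad)  # numerically close to 0 degrees
```

The mean resultant length lies in [0, 1] and is close to 1 here because the two directions are tightly concentrated.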

    Hybrid PDE solver for data-driven problems and modern branching

    The numerical solution of large-scale PDEs, such as those occurring in data-driven applications, unavoidably requires powerful parallel computers and tailored parallel algorithms to make the best possible use of them. In fact, considerations about the parallelization and scalability of realistic problems are often critical enough to warrant acknowledgement in the modelling phase. The purpose of this paper is to spread awareness of the Probabilistic Domain Decomposition (PDD) method, a fresh approach to the parallelization of PDEs with excellent scalability properties. The idea exploits the stochastic representation of the PDE and its approximation via Monte Carlo in combination with deterministic high-performance PDE solvers. We describe the ingredients of PDD and its applicability in the scope of data science. In particular, we highlight recent advances in stochastic representations for nonlinear PDEs using branching diffusions, which have significantly broadened the scope of PDD. We envision this work as a dictionary giving large-scale PDE practitioners references on the very latest algorithms and techniques of a non-standard, yet highly parallelizable, methodology at the interface of deterministic and probabilistic numerical methods. We close this work with an invitation to the fully nonlinear case and open research questions.
    Comment: 23 pages, 7 figures; Final SMUR version; To appear in the European Journal of Applied Mathematics (EJAM)

    Autoregressive Kernels For Time Series

    We propose in this work a new family of kernels for variable-length time series. Our work builds upon the vector autoregressive (VAR) model for multivariate stochastic processes: given a multivariate time series x, we consider the likelihood function p_{\theta}(x) of different parameters \theta in the VAR model as features to describe x. To compare two time series x and x', we form the product of their features p_{\theta}(x) p_{\theta}(x'), which is integrated out w.r.t. \theta using a matrix normal-inverse Wishart prior. Among other properties, this kernel can be easily computed when the dimension d of the time series is much larger than the lengths of the considered time series x and x'. It can also be generalized to time series taking values in arbitrary state spaces, as long as the state space itself is endowed with a kernel \kappa. In that case, the kernel between x and x' is a function of the Gram matrices produced by \kappa on observations and subsequences of observations enumerated in x and x'. We describe a computationally efficient implementation of this generalization that uses low-rank matrix factorization techniques. These kernels are compared to other known kernels using a set of benchmark classification tasks carried out with support vector machines.
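The idea of using likelihoods as features and integrating their product over a prior can be sketched in a heavily simplified setting. The example below is not the kernel from the paper: it assumes a scalar AR(1) model with unit noise variance, a standard normal prior on the coefficient, and a Monte Carlo average in place of the closed-form integral under the matrix normal-inverse Wishart prior. All function names are hypothetical.

```python
import numpy as np

def ar1_loglik(x, thetas):
    """Conditional AR(1) log-likelihood of series x for each theta.

    Assumed model: x[t] = theta * x[t-1] + eps, eps ~ N(0, 1).
    Returns an array with one log-likelihood per entry of thetas.
    """
    resid = x[1:][None, :] - thetas[:, None] * x[:-1][None, :]
    n = len(x) - 1
    return -0.5 * np.sum(resid ** 2, axis=1) - 0.5 * n * np.log(2 * np.pi)

def mc_ar_kernel(x, y, thetas):
    """Monte Carlo estimate of the integral of p_theta(x) p_theta(y) over the prior.

    The same prior draws must be reused for every pair of series so that
    the resulting Gram matrix stays positive semidefinite.
    """
    return np.mean(np.exp(ar1_loglik(x, thetas) + ar1_loglik(y, thetas)))

rng = np.random.default_rng(0)
thetas = rng.standard_normal(2000)   # draws from the assumed N(0, 1) prior

# Two series of different lengths; the kernel handles this directly.
x = rng.standard_normal(12)
y = rng.standard_normal(20)

k_xy = mc_ar_kernel(x, y, thetas)
k_yx = mc_ar_kernel(y, x, thetas)
```

Because each series enters only through its vector of likelihood values over the shared draws, series of different lengths can be compared without alignment or padding.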

    The PDD method for solving linear, nonlinear, and fractional PDEs problems

    We review the Probabilistic Domain Decomposition (PDD) method for the numerical solution of linear and nonlinear Partial Differential Equation (PDE) problems. This Domain Decomposition (DD) method is based on a suitable probabilistic representation of the solution given in the form of an expectation which, in turn, involves the solution of a Stochastic Differential Equation (SDE). While the structure of the SDE depends only upon the corresponding PDE, the expectation also depends upon the boundary data of the problem. The method consists of three stages: (i) only a few values of the sought solution are computed by Monte Carlo or Quasi-Monte Carlo at some interfaces; (ii) a continuous approximation of the solution over these interfaces is obtained via interpolation; and (iii) prescribing the previous (partial) solutions as additional Dirichlet boundary conditions, a fully decoupled set of sub-problems is finally solved in parallel. For linear parabolic problems, this is based on the celebrated Feynman-Kac formula, while semilinear parabolic equations require a suitable generalization based on branching diffusion processes. In the case of semilinear transport equations and the Vlasov-Poisson system, a generalization of the probabilistic representation was also obtained in terms of the Method of Characteristics (characteristic curves). Finally, we present the latest progress towards the extension of the PDD method to nonlocal fractional operators. The algorithm notably improves the scalability of classical algorithms and is suited to massively parallel implementation, enjoying arbitrary scalability and fault tolerance properties. Numerical examples conducted in 1D and 2D, including some for the KPP equation and plasma physics, are given.
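The three stages described above can be sketched on a toy problem. The example below is a minimal illustration under stated assumptions, not the authors' implementation: it uses the 1D Laplace equation u'' = 0 on (0, 1) with u(0) = 0 and u(1) = 1 (an elliptic problem chosen for brevity), a single interface at x = 0.5, a plain Euler random walk for the Brownian paths, and a dense finite-difference solve per subdomain. The function names are hypothetical.

```python
import numpy as np

def mc_interface_value(x0, n_paths=20000, dt=1e-3, seed=0):
    """Stage (i): estimate u(x0) = E[g(B_tau)] by Monte Carlo.

    Brownian paths start at x0 and stop when they leave (0, 1); the
    boundary data are g(0) = 0 and g(1) = 1, so the estimate is the
    fraction of paths that exit through the right end.
    """
    rng = np.random.default_rng(seed)
    x = np.full(n_paths, x0)
    alive = np.ones(n_paths, dtype=bool)
    while alive.any():
        n_alive = int(alive.sum())
        x[alive] += np.sqrt(dt) * rng.standard_normal(n_alive)
        alive &= (x > 0.0) & (x < 1.0)
    return float(np.mean(x >= 1.0))

def solve_subdomain(a, b, ua, ub, n=50):
    """Stage (iii): solve u'' = 0 on [a, b] with Dirichlet data ua, ub."""
    xs = np.linspace(a, b, n + 2)
    A = (np.diag(-2.0 * np.ones(n))
         + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1))
    rhs = np.zeros(n)
    rhs[0] -= ua      # known boundary values move to the right-hand side
    rhs[-1] -= ub
    u = np.empty(n + 2)
    u[0], u[-1] = ua, ub
    u[1:-1] = np.linalg.solve(A, rhs)
    return xs, u

# Stage (i): one Monte Carlo value at the interface.
u_mid = mc_interface_value(0.5)

# Stage (ii): with a single interface point in 1D the interpolation is
# trivial; in 2D the interface values would be interpolated along a curve.

# Stage (iii): the two subproblems are now fully decoupled and could be
# handed to separate processes.
xs_left, u_left = solve_subdomain(0.0, 0.5, 0.0, u_mid)
xs_right, u_right = solve_subdomain(0.5, 1.0, u_mid, 1.0)
```

The exact solution of this toy problem is u(x) = x, so the error in each subdomain is driven by the Monte Carlo error at the interface. The two subdomain solves do not communicate, which is the property behind the scalability described in the abstract.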