Recent advances in directional statistics

Author: García-Portugués Eduardo
Pewsey Arthur
Publication date: 22/09/2020

Mainstream statistical methodology is generally applicable to data observed in Euclidean space. There are, however, numerous contexts of considerable scientific interest in which the natural supports for the data under consideration are Riemannian manifolds like the unit circle, torus, sphere and their extensions. Typically, such data can be represented using one or more directions, and directional statistics is the branch of statistics that deals with their analysis. In this paper we provide a review of the many recent developments in the field since the publication of Mardia and Jupp (1999), still the most comprehensive text on directional statistics. Many of those developments have been stimulated by interesting applications in fields as diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics, image analysis, text mining, environmetrics, and machine learning. We begin by considering developments for the exploratory analysis of directional data before progressing to distributional models, general approaches to inference, hypothesis testing, regression, nonparametric curve estimation, methods for dimension reduction, classification and clustering, and the modelling of time series, spatial and spatio-temporal data. An overview of currently available software for analysing directional data is also provided, and potential future developments are discussed. Comment: 61 pages

arXiv.org e-Print Archive

Crossref

Universidad Carlos III de Madrid e-Archivo

Hybrid PDE solver for data-driven problems and modern branching

Author: Brown André EX
Ch'ng Quee-Lim
Currie Michael
Grundy Laura J
Hokanson Jim
Javer Avelino
Kerr Rex
Lee Chee Wai
Li Chris
Li Kezhi
Schafer William R
Yemini Eviatar
Publication date: 10/05/2017

The numerical solution of large-scale PDEs, such as those occurring in data-driven applications, unavoidably requires powerful parallel computers and tailored parallel algorithms to make the best possible use of them. In fact, considerations about the parallelization and scalability of realistic problems are often critical enough to warrant acknowledgement in the modelling phase. The purpose of this paper is to spread awareness of the Probabilistic Domain Decomposition (PDD) method, a fresh approach to the parallelization of PDEs with excellent scalability properties. The idea exploits the stochastic representation of the PDE and its approximation via Monte Carlo in combination with deterministic high-performance PDE solvers. We describe the ingredients of PDD and its applicability in the scope of data science. In particular, we highlight recent advances in stochastic representations for nonlinear PDEs using branching diffusions, which have significantly broadened the scope of PDD. We envision this work as a dictionary giving large-scale PDE practitioners references on the very latest algorithms and techniques of a non-standard, yet highly parallelizable, methodology at the interface of deterministic and probabilistic numerical methods. We close this work with an invitation to the fully nonlinear case and open research questions. Comment: 23 pages, 7 figures; Final SMUR version; To appear in the European Journal of Applied Mathematics (EJAM)

arXiv.org e-Print Archive

ZENODO

FigShare

Autoregressive Kernels For Time Series

Author: Cuturi Marco
Doucet Arnaud
Publication date: 01/01/2011

We propose in this work a new family of kernels for variable-length time series. Our work builds upon the vector autoregressive (VAR) model for multivariate stochastic processes: given a multivariate time series x, we consider the likelihood function p_{\theta}(x) of different parameters \theta in the VAR model as features to describe x. To compare two time series x and x', we form the product of their features p_{\theta}(x) p_{\theta}(x'), which is integrated out with respect to \theta using a matrix normal-inverse Wishart prior. Among other properties, this kernel can be easily computed when the dimension d of the time series is much larger than the lengths of the considered time series x and x'. It can also be generalized to time series taking values in arbitrary state spaces, as long as the state space itself is endowed with a kernel \kappa. In that case, the kernel between x and x' is a function of the Gram matrices produced by \kappa on observations and subsequences of observations enumerated in x and x'. We describe a computationally efficient implementation of this generalization that uses low-rank matrix factorization techniques. These kernels are compared to other known kernels using a set of benchmark classification tasks carried out with support vector machines.
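
The construction described in the abstract above can be written as a single integral. This is only a restatement of that description, not the closed form derived in the paper; the symbol \omega for the matrix normal-inverse Wishart prior and the feature-map notation \phi are assumptions introduced here for illustration.

```latex
\[
  k(x, x') \;=\; \int p_{\theta}(x)\, p_{\theta}(x')\, \omega(d\theta)
  \;=\; \langle \phi(x), \phi(x') \rangle_{L^{2}(\omega)},
  \qquad \phi(x) : \theta \mapsto p_{\theta}(x).
\]
```

Written this way, k is an inner product of the feature maps \phi(x) and \phi(x'), so it is symmetric and positive semi-definite whenever the integral is finite.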

arXiv.org e-Print Archive

CiteSeerX

The PDD method for solving linear, nonlinear, and fractional PDEs problems

Author: Acebron J.A.
Rodriguez-Rozas A.
Spigler R.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

We review the Probabilistic Domain Decomposition (PDD) method for the numerical solution of linear and nonlinear Partial Differential Equation (PDE) problems. This Domain Decomposition (DD) method is based on a suitable probabilistic representation of the solution given in the form of an expectation which, in turn, involves the solution of a Stochastic Differential Equation (SDE). While the structure of the SDE depends only upon the corresponding PDE, the expectation also depends upon the boundary data of the problem. The method consists of three stages: (i) only a few values of the sought solution are computed by Monte Carlo or Quasi-Monte Carlo at some interfaces; (ii) a continuous approximation of the solution over these interfaces is obtained via interpolation; and (iii) prescribing the previous (partial) solutions as additional Dirichlet boundary conditions, a fully decoupled set of sub-problems is finally solved in parallel. For linear parabolic problems, this is based on the celebrated Feynman-Kac formula, while for semilinear parabolic equations it requires a suitable generalization based on branching diffusion processes. In the case of semilinear transport equations and the Vlasov-Poisson system, a generalization of the probabilistic representation was also obtained in terms of the Method of Characteristics (characteristic curves). Finally, we present the latest progress towards the extension of the PDD method for nonlocal fractional operators. The algorithm notably improves the scalability of classical algorithms and is suited to massively parallel implementation, enjoying arbitrary scalability and fault tolerance properties. Numerical examples conducted in 1D and 2D, including some for the KPP equation and Plasma Physics, are given.
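
The three stages listed in the abstract above can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes the 1D Laplace problem u'' = 0 on (0, 1) with Dirichlet data, replaces the SDE by a symmetric random walk on a grid, and uses a dense finite-difference solver for the sub-problems. All function names and parameters (`mc_interface_value`, `solve_dirichlet`, `pdd_1d`, `h`, `n_walks`) are hypothetical. In 1D the interface is a single point, so the interpolation stage (ii) is trivial here.

```python
import numpy as np


def mc_interface_value(x0, g0, g1, h=0.05, n_walks=4000, seed=0):
    """Stage (i): estimate u(x0) for u'' = 0 on (0, 1), u(0) = g0, u(1) = g1.

    Averages the boundary value hit by symmetric random walks started at x0
    (a grid stand-in for Brownian motion; an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    n_cells = round(1.0 / h)
    pos = np.full(n_walks, round(x0 / h), dtype=int)
    active = (pos > 0) & (pos < n_cells)
    while active.any():
        # Move every walk that has not yet reached a boundary by one cell.
        pos[active] += rng.choice([-1, 1], size=int(active.sum()))
        active = (pos > 0) & (pos < n_cells)
    return np.where(pos == n_cells, g1, g0).mean()


def solve_dirichlet(a, b, left, right, n=11):
    """Stage (iii): finite-difference solve of u'' = 0 on [a, b]."""
    x = np.linspace(a, b, n)
    A = np.zeros((n, n))
    rhs = np.zeros(n)
    A[0, 0] = A[-1, -1] = 1.0          # Dirichlet rows
    rhs[0], rhs[-1] = left, right
    for i in range(1, n - 1):
        A[i, i - 1:i + 2] = [1.0, -2.0, 1.0]   # second-difference stencil
    return x, np.linalg.solve(A, rhs)


def pdd_1d(g0=0.0, g1=1.0, interface=0.5):
    """Split (0, 1) at `interface` and solve the two halves independently."""
    u_mid = mc_interface_value(interface, g0, g1)
    # The two sub-problems share no unknowns, so they could run in parallel.
    left = solve_dirichlet(0.0, interface, g0, u_mid)
    right = solve_dirichlet(interface, 1.0, u_mid, g1)
    return u_mid, left, right


if __name__ == "__main__":
    u_mid, (xl, ul), (xr, ur) = pdd_1d()
    print("interface estimate:", u_mid)
    print("max error, left half:", np.max(np.abs(ul - xl)))
    print("max error, right half:", np.max(np.abs(ur - xr)))
```

With g0 = 0 and g1 = 1 the exact solution is u(x) = x, so the printed errors measure only the Monte Carlo error at the interface, which the deterministic sub-solves inherit through their boundary data.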

Repositório Institucional do ISCTE-IUL