Recent advances in directional statistics
Mainstream statistical methodology is generally applicable to data observed
in Euclidean space. There are, however, numerous contexts of considerable
scientific interest in which the natural supports for the data under
consideration are Riemannian manifolds like the unit circle, torus, sphere and
their extensions. Typically, such data can be represented using one or more
directions, and directional statistics is the branch of statistics that deals
with their analysis. In this paper we provide a review of the many recent
developments in the field since the publication of Mardia and Jupp (1999),
still the most comprehensive text on directional statistics. Many of those
developments have been stimulated by interesting applications in fields as
diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics,
image analysis, text mining, environmetrics, and machine learning. We begin by
considering developments for the exploratory analysis of directional data
before progressing to distributional models, general approaches to inference,
hypothesis testing, regression, nonparametric curve estimation, methods for
dimension reduction, classification and clustering, and the modelling of time
series, spatial and spatio-temporal data. An overview of currently available
software for analysing directional data is also provided, and potential future
developments discussed.
Comment: 61 pages
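To illustrate why Euclidean summaries can mislead on the circle, the following minimal sketch compares the arithmetic mean of two compass bearings with their circular mean (the direction of the averaged unit vectors). The function name `circular_mean_deg` and the sample bearings are assumptions made for illustration; they are not taken from the review.

```python
import numpy as np

def circular_mean_deg(angles_deg):
    """Circular mean of angles in degrees: direction of the mean unit vector."""
    theta = np.deg2rad(np.asarray(angles_deg, dtype=float))
    # Average the angles as points on the unit circle, then read off the direction.
    mean_angle = np.arctan2(np.sin(theta).mean(), np.cos(theta).mean())
    return np.rad2deg(mean_angle) % 360.0

bearings = [350.0, 20.0]                 # two directions either side of north
naive = float(np.mean(bearings))         # 185.0, which points almost due south
circular = circular_mean_deg(bearings)   # close to 5 degrees, between the two bearings
```

The arithmetic mean ignores that 350 and 20 degrees are only 30 degrees apart, while the circular mean respects the geometry of the support.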
Hybrid PDE solver for data-driven problems and modern branching
The numerical solution of large-scale PDEs, such as those occurring in
data-driven applications, unavoidably requires powerful parallel computers and
tailored parallel algorithms to make the best possible use of them. In fact,
considerations about the parallelization and scalability of realistic problems
are often critical enough to warrant acknowledgement in the modelling phase.
The purpose of this paper is to spread awareness of the Probabilistic Domain
Decomposition (PDD) method, a fresh approach to the parallelization of PDEs
with excellent scalability properties. The idea exploits the stochastic
representation of the PDE and its approximation via Monte Carlo in combination
with deterministic high-performance PDE solvers. We describe the ingredients of
PDD and its applicability in the scope of data science. In particular, we
highlight recent advances in stochastic representations for nonlinear PDEs
using branching diffusions, which have significantly broadened the scope of
PDD.
We envision this work as a dictionary giving large-scale PDE practitioners
references on the very latest algorithms and techniques of a non-standard, yet
highly parallelizable, methodology at the interface of deterministic and
probabilistic numerical methods. We close this work with an invitation to the
fully nonlinear case and open research questions.Comment: 23 pages, 7 figures; Final SMUR version; To appear in the European
Journal of Applied Mathematics (EJAM
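As a reminder of what a stochastic representation looks like in the simplest linear case, the block below states the standard result for the heat equation on the whole space together with its Monte Carlo approximation. The choice of this particular equation and the symbols f, W, Z_i and N are assumptions made for illustration; the abstract does not single out an equation.

```latex
% Heat equation on R^d with initial datum f:
%   \partial_t u = \tfrac{1}{2} \Delta u, \qquad u(x, 0) = f(x).
% Stochastic representation, with W a standard d-dimensional Brownian motion:
u(x, t) = \mathbb{E}\bigl[ f(x + W_t) \bigr]
% Monte Carlo approximation, with Z_1, \dots, Z_N i.i.d. standard normal vectors:
u(x, t) \approx \frac{1}{N} \sum_{i=1}^{N} f\bigl(x + \sqrt{t}\, Z_i\bigr)
```

Each sample is independent of the others, which is the property that makes pointwise evaluation of this kind straightforward to parallelize.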
Autoregressive Kernels For Time Series
We propose in this work a new family of kernels for variable-length time
series. Our work builds upon the vector autoregressive (VAR) model for
multivariate stochastic processes: given a multivariate time series x, we
consider the likelihood function p_{\theta}(x) of different parameters \theta
in the VAR model as features to describe x. To compare two time series x and
x', we form the product of their features p_{\theta}(x) p_{\theta}(x') which is
integrated out w.r.t. \theta using a matrix normal-inverse Wishart prior. Among
other properties, this kernel can be easily computed when the dimension d of
the time series is much larger than the lengths of the considered time series x
and x'. It can also be generalized to time series taking values in arbitrary
state spaces, as long as the state space itself is endowed with a kernel
\kappa. In that case, the kernel between x and x' is a function of the Gram
matrices produced by \kappa on observations and subsequences of observations
enumerated in x and x'. We describe a computationally efficient implementation
of this generalization that uses low-rank matrix factorization techniques.
These kernels are compared to other known kernels using a set of benchmark
classification tasks carried out with support vector machines.
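The construction of the kernel as a prior-weighted integral of likelihood products can be sketched numerically. The example below is a deliberately simplified stand-in, not the closed form described in the abstract: it assumes a scalar AR(1) model with known unit noise variance, places a Gaussian prior on the single coefficient, and replaces the analytic integration under a matrix normal-inverse Wishart prior with plain Monte Carlo. The names `ar1_loglik` and `mc_ar_kernel` and all parameter values are hypothetical.

```python
import numpy as np

def ar1_loglik(x, a, sigma=1.0):
    """Gaussian AR(1) log-likelihood of x[1:] given x[0], one value per coefficient in a."""
    x = np.asarray(x, dtype=float)
    a = np.atleast_1d(a)
    # Residuals x_t - a * x_{t-1}, for every sampled coefficient at once.
    resid = x[1:][None, :] - a[:, None] * x[:-1][None, :]
    n = x.size - 1
    return -0.5 * n * np.log(2 * np.pi * sigma**2) - 0.5 * (resid**2).sum(axis=1) / sigma**2

def mc_ar_kernel(x, y, n_samples=2000, prior_std=0.5, seed=0):
    """Monte Carlo estimate of k(x, y) = E_theta[p_theta(x) p_theta(y)] (assumed Gaussian prior)."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(0.0, prior_std, size=n_samples)   # draws from the prior
    return np.exp(ar1_loglik(x, theta) + ar1_loglik(y, theta)).mean()

# Series of different lengths can be compared directly.
rng = np.random.default_rng(1)
series = [rng.normal(size=n) for n in (8, 12, 10)]
K = np.array([[mc_ar_kernel(a, b) for b in series] for a in series])
```

Because the same prior draws are reused for every pair, the Gram matrix `K` is an average of outer products of feature vectors and is therefore symmetric positive semidefinite. For long series the likelihood values underflow, so a practical version would work in the log domain.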
The PDD method for solving linear, nonlinear, and fractional PDEs problems
We review the Probabilistic Domain Decomposition (PDD) method for the numerical solution of linear and nonlinear Partial Differential Equation (PDE) problems. This Domain Decomposition (DD) method is based on a suitable probabilistic representation of the solution given in the form of an expectation which, in turn, involves the solution of a Stochastic Differential Equation (SDE). While the structure of the SDE depends only upon the corresponding PDE, the expectation also depends upon the boundary data of the problem. The method consists of three stages: (i) only a few values of the sought solution are computed by Monte Carlo or Quasi-Monte Carlo at some interfaces; (ii) a continuous approximation of the solution over these interfaces is obtained via interpolation; and (iii) prescribing the previous (partial) solutions as additional Dirichlet boundary conditions, a fully decoupled set of sub-problems is finally solved in parallel. For linear parabolic problems, this is based on the celebrated Feynman-Kac formula, while semilinear parabolic equations require a suitable generalization based on branching diffusion processes. In the case of semilinear transport equations and the Vlasov-Poisson system, a generalization of the probabilistic representation was also obtained in terms of the Method of Characteristics (characteristic curves). Finally, we present the latest progress towards the extension of the PDD method for nonlocal fractional operators. The algorithm notably improves the scalability of classical algorithms and is suited to massively parallel implementation, enjoying arbitrary scalability and fault tolerance properties. Numerical examples conducted in 1D and 2D, including some for the KPP equation and plasma physics, are given.
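The three stages can be sketched on a toy problem. The example below is an illustration under strong simplifying assumptions, not the authors' implementation: it takes the 1D Laplace problem u'' = 0 on (0, 1) with u(0) = 0 and u(1) = 1 (exact solution u(x) = x), uses a symmetric random walk on a grid in place of a discretized SDE, and solves the two sub-problems one after the other rather than on separate processors. All function names and parameter values are hypothetical.

```python
import numpy as np

def mc_interface_value(x0, g_left, g_right, n_cells=20, n_walks=4000, seed=0):
    """Stage (i): estimate u(x0) as the expected boundary value at the exit point."""
    rng = np.random.default_rng(seed)
    pos = np.full(n_walks, int(round(x0 * n_cells)))      # grid index of each walker
    active = (pos > 0) & (pos < n_cells)
    while active.any():
        pos[active] += rng.choice((-1, 1), size=int(active.sum()))
        active = (pos > 0) & (pos < n_cells)
    return np.where(pos == 0, g_left, g_right).mean()

def solve_laplace_1d(a, b, ua, ub, n=11):
    """Stage (iii) sub-solver: finite differences for u'' = 0 on [a, b] with Dirichlet data."""
    x = np.linspace(a, b, n)
    m = n - 2                                              # number of interior unknowns
    A = -2.0 * np.eye(m) + np.eye(m, k=1) + np.eye(m, k=-1)
    rhs = np.zeros(m)
    rhs[0] -= ua                                           # move known boundary values
    rhs[-1] -= ub                                          # to the right-hand side
    u = np.empty(n)
    u[0], u[-1] = ua, ub
    u[1:-1] = np.linalg.solve(A, rhs)
    return x, u

def pdd_laplace_1d(g0=0.0, g1=1.0, interface=0.5):
    u_mid = mc_interface_value(interface, g0, g1)          # stage (i)
    # Stage (ii) is trivial in 1D: the interface is a single point, nothing to interpolate.
    xl, ul = solve_laplace_1d(0.0, interface, g0, u_mid)   # stage (iii): the two solves
    xr, ur = solve_laplace_1d(interface, 1.0, u_mid, g1)   # are independent of each other
    return np.concatenate([xl, xr[1:]]), np.concatenate([ul, ur[1:]]), u_mid

x, u, u_mid = pdd_laplace_1d()
max_err = np.abs(u - x).max()                              # error against u(x) = x
```

In this toy setting the only error comes from the Monte Carlo estimate at the interface, which then propagates into both sub-domains through the Dirichlet data. In two or more dimensions the interface is a curve or surface, so stage (ii) becomes a genuine interpolation step between the sampled points.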