202,970 research outputs found
Recent advances in directional statistics
Mainstream statistical methodology is generally applicable to data observed
in Euclidean space. There are, however, numerous contexts of considerable
scientific interest in which the natural supports for the data under
consideration are Riemannian manifolds like the unit circle, torus, sphere and
their extensions. Typically, such data can be represented using one or more
directions, and directional statistics is the branch of statistics that deals
with their analysis. In this paper we provide a review of the many recent
developments in the field since the publication of Mardia and Jupp (1999),
still the most comprehensive text on directional statistics. Many of those
developments have been stimulated by interesting applications in fields as
diverse as astronomy, medicine, genetics, neurology, aeronautics, acoustics,
image analysis, text mining, environmetrics, and machine learning. We begin by
considering developments for the exploratory analysis of directional data
before progressing to distributional models, general approaches to inference,
hypothesis testing, regression, nonparametric curve estimation, methods for
dimension reduction, classification and clustering, and the modelling of time
series, spatial and spatio-temporal data. An overview of currently available
software for analysing directional data is also provided, and potential future
developments discussed.Comment: 61 page
Data-driven modelling of biological multi-scale processes
Biological processes involve a variety of spatial and temporal scales. A
holistic understanding of many biological processes therefore requires
multi-scale models which capture the relevant properties on all these scales.
In this manuscript we review mathematical modelling approaches used to describe
the individual spatial scales and how they are integrated into holistic models.
We discuss the relation between spatial and temporal scales and the implication
of that on multi-scale modelling. Based upon this overview over
state-of-the-art modelling approaches, we formulate key challenges in
mathematical and computational modelling of biological multi-scale and
multi-physics processes. In particular, we considered the availability of
analysis tools for multi-scale models and model-based multi-scale data
integration. We provide a compact review of methods for model-based data
integration and model-based hypothesis testing. Furthermore, novel approaches
and recent trends are discussed, including computation time reduction using
reduced order and surrogate models, which contribute to the solution of
inference problems. We conclude the manuscript by providing a few ideas for the
development of tailored multi-scale inference methods.Comment: This manuscript will appear in the Journal of Coupled Systems and
Multiscale Dynamics (American Scientific Publishers
Data analysis of gravitational-wave signals from spinning neutron stars. III. Detection statistics and computational requirements
We develop the analytic and numerical tools for data analysis of the
gravitational-wave signals from spinning neutron stars for ground-based laser
interferometric detectors. We study in detail the statistical properties of the
optimum functional that need to be calculated in order to detect the
gravitational-wave signal from a spinning neutron star and estimate its
parameters. We derive formulae for false alarm and detection probabilities both
for the optimal and the suboptimal filters. We assess the computational
requirements needed to do the signal search. We compare a number of criteria to
build sufficiently accurate templates for our data analysis scheme. We verify
the validity of our concepts and formulae by means of the Monte Carlo
simulations. We present algorithms by which one can estimate the parameters of
the continuous signals accurately.Comment: LaTeX, 45 pages, 13 figures, submitted to Phys. Rev.
Graphs in machine learning: an introduction
Graphs are commonly used to characterise interactions between objects of
interest. Because they are based on a straightforward formalism, they are used
in many scientific fields from computer science to historical sciences. In this
paper, we give an introduction to some methods relying on graphs for learning.
This includes both unsupervised and supervised methods. Unsupervised learning
algorithms usually aim at visualising graphs in latent spaces and/or clustering
the nodes. Both focus on extracting knowledge from graph topologies. While most
existing techniques are only applicable to static graphs, where edges do not
evolve through time, recent developments have shown that they could be extended
to deal with evolving networks. In a supervised context, one generally aims at
inferring labels or numerical values attached to nodes using both the graph
and, when they are available, node characteristics. Balancing the two sources
of information can be challenging, especially as they can disagree locally or
globally. In both contexts, supervised and un-supervised, data can be
relational (augmented with one or several global graphs) as described above, or
graph valued. In this latter case, each object of interest is given as a full
graph (possibly completed by other characteristics). In this context, natural
tasks include graph clustering (as in producing clusters of graphs rather than
clusters of nodes in a single graph), graph classification, etc. 1 Real
networks One of the first practical studies on graphs can be dated back to the
original work of Moreno [51] in the 30s. Since then, there has been a growing
interest in graph analysis associated with strong developments in the modelling
and the processing of these data. Graphs are now used in many scientific
fields. In Biology [54, 2, 7], for instance, metabolic networks can describe
pathways of biochemical reactions [41], while in social sciences networks are
used to represent relation ties between actors [66, 56, 36, 34]. Other examples
include powergrids [71] and the web [75]. Recently, networks have also been
considered in other areas such as geography [22] and history [59, 39]. In
machine learning, networks are seen as powerful tools to model problems in
order to extract information from data and for prediction purposes. This is the
object of this paper. For more complete surveys, we refer to [28, 62, 49, 45].
In this section, we introduce notations and highlight properties shared by most
real networks. In Section 2, we then consider methods aiming at extracting
information from a unique network. We will particularly focus on clustering
methods where the goal is to find clusters of vertices. Finally, in Section 3,
techniques that take a series of networks into account, where each network i
- …