Robust Inference of Trees
This paper is concerned with the reliable inference of optimal
tree-approximations to the dependency structure of an unknown distribution
generating data. The traditional approach to the problem measures the
dependency strength between random variables by the index called mutual
information. In this paper reliability is achieved by Walley's imprecise
Dirichlet model, which generalizes Bayesian learning with Dirichlet priors.
Adopting the imprecise Dirichlet model results in posterior interval
expectation for mutual information, and in a set of plausible trees consistent
with the data. Reliable inference about the actual tree is achieved by focusing
on the substructure common to all the plausible trees. We develop an exact
algorithm that infers the substructure in time O(m^4), m being the number of
random variables. The new algorithm is applied to a set of data sampled from a
known distribution. The method is shown to reliably infer edges of the actual
tree even when the data are very scarce, unlike the traditional approach.
Finally, we provide lower and upper credibility limits for mutual information
under the imprecise Dirichlet model. These enable the previous developments to
be extended to a full inferential method for trees.
Comment: 26 pages, 7 figures
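For orientation, the traditional approach that the abstract contrasts itself with can be sketched as follows: estimate pairwise mutual information from counts, then keep the maximum-weight spanning tree. This is a minimal illustration using plug-in estimates and Kruskal's algorithm, not the imprecise Dirichlet algorithm of the paper; the function names `mutual_information` and `chow_liu_tree` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def mutual_information(xs, ys):
    """Plug-in (empirical) mutual information, in nats, of two discrete samples."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    return sum(
        (c / n) * math.log(c * n / (px[x] * py[y]))
        for (x, y), c in joint.items()
    )


def chow_liu_tree(columns):
    """Maximum-weight spanning tree over variables, weighted by empirical MI."""
    m = len(columns)
    edges = sorted(
        (
            (mutual_information(columns[i], columns[j]), i, j)
            for i, j in combinations(range(m), 2)
        ),
        reverse=True,
    )
    parent = list(range(m))  # union-find forest used by Kruskal's algorithm

    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    tree = []
    for _, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding the edge does not create a cycle
            parent[ri] = rj
            tree.append((i, j))
    return tree


# Toy data (illustrative): variables 0 and 1 are identical, variable 2 is independent.
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 0, 0, 1, 1, 1, 1]
c = [0, 1, 0, 1, 0, 1, 0, 1]
tree = chow_liu_tree([a, b, c])
```

With scarce data the point estimates of mutual information are noisy, so the selected tree can change from sample to sample; that sensitivity is the motivation the abstract gives for interval-valued estimates.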
Functional Bipartite Ranking: a Wavelet-Based Filtering Approach
It is the main goal of this article to address the bipartite ranking issue
from the perspective of functional data analysis (FDA). Given a training set of
independent realizations of a (possibly sampled) second-order random function
with a (locally) smooth autocorrelation structure and to which a binary label
is randomly assigned, the objective is to learn a scoring function s with
optimal ROC curve. Based on linear/nonlinear wavelet-based approximations, it
is shown how to select compact finite dimensional representations of the input
curves adaptively, in order to build accurate ranking rules, using recent
advances in the ranking problem for multivariate data with binary feedback.
Beyond theoretical considerations, the performance of the learning methods for
functional bipartite ranking proposed in this paper is illustrated by
numerical experiments.
A survey of max-type recursive distributional equations
In certain problems in a variety of applied probability settings (from
probabilistic analysis of algorithms to statistical physics), the central
requirement is to solve a recursive distributional equation of the form X =^d
g((\xi_i,X_i),i\geq 1). Here (\xi_i) and g(\cdot) are given and the X_i are
independent copies of the unknown distribution X. We survey this area,
emphasizing examples where the function g(\cdot) is essentially a ``maximum''
or ``minimum'' function. We draw attention to the theoretical question of
endogeny: in the associated recursive tree process X_i, are the X_i measurable
functions of the innovations process (\xi_i)?
Comment: Published at http://dx.doi.org/10.1214/105051605000000142 in the
Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute
of Mathematical Statistics (http://www.imstat.org
copulaedas: An R Package for Estimation of Distribution Algorithms Based on Copulas
The use of copula-based models in EDAs (estimation of distribution
algorithms) is currently an active area of research. In this context, the
copulaedas package for R provides a platform where EDAs based on copulas can be
implemented and studied. The package offers complete implementations of various
EDAs based on copulas and vines, a group of well-known optimization problems,
and utility functions to study the performance of the algorithms. Newly
developed EDAs can be easily integrated into the package by extending an S4
class with generic functions for their main components. This paper presents
copulaedas by providing an overview of EDAs based on copulas, a description of
the implementation of the package, and an illustration of its use through
examples. The examples include running the EDAs defined in the package,
implementing new algorithms, and performing an empirical study to compare the
behavior of different algorithms on benchmark functions and a real-world
problem.
Tools to Characterize Point Patterns: dbmss for R
The dbmss package for R provides an easy-to-use toolbox to characterize the spatial structure of point patterns. Our contribution presents the state of the art of distance-based methods that are employed in economic geography and also used in ecology. Topographic functions such as Ripley's K, absolute functions such as Duranton and Overman's Kd, and relative functions such as Marcon and Puech's M are implemented. Their confidence envelopes (including global ones) and tests against counterfactuals are included in the package.
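As a rough illustration of what a distance-based function measures, the sketch below computes a naive estimate of Ripley's K: the observation-window area times the fraction of ordered point pairs lying within distance r. It assumes no edge correction and is written in plain Python; it is not the dbmss API, and the name `ripley_k` is hypothetical.

```python
import math


def ripley_k(points, r, area):
    """Naive Ripley's K estimate at distance r (no edge correction)."""
    n = len(points)
    close_pairs = 0
    for i in range(n):
        for j in range(n):
            # Count ordered pairs of distinct points at most r apart.
            if i != j and math.dist(points[i], points[j]) <= r:
                close_pairs += 1
    return area * close_pairs / (n * (n - 1))


# Illustrative pattern: the four corners of a unit square.
corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
k_at_one = ripley_k(corners, 1.0, area=1.0)
```

Under complete spatial randomness K(r) is close to pi * r^2, so values above that benchmark suggest clustering at scale r and values below it suggest dispersion.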
Graph analysis of functional brain networks: practical issues in translational neuroscience
The brain can be regarded as a network: a connected system where nodes, or
units, represent different specialized regions and links, or connections,
represent communication pathways. From a functional perspective communication
is coded by temporal dependence between the activities of different brain
areas. In the last decade, the abstract representation of the brain as a graph
has made it possible to visualize functional brain networks and describe their
non-trivial topological properties in a compact and objective way. Nowadays,
the use of graph analysis in translational neuroscience has become essential to
quantify brain dysfunctions in terms of aberrant reconfiguration of functional
brain networks. Despite its evident impact, graph analysis of functional brain
networks is not a simple toolbox that can be blindly applied to brain signals.
On the one hand, it requires know-how of all the methodological steps of the
processing pipeline that manipulates the input brain signals and extracts the
functional network properties. On the other hand, knowledge of the neural
phenomenon under study is required to perform physiologically relevant analysis.
The aim of this review is to provide practical indications to make sense of
brain network analysis and contrast counterproductive attitudes.
Analytic urns
This article describes a purely analytic approach to urn models of the
generalized or extended P\'olya-Eggenberger type, in the case of two types of
balls and constant ``balance,'' that is, constant row sum. The treatment starts
from a quasilinear first-order partial differential equation associated with a
combinatorial renormalization of the model and bases itself on elementary
conformal mapping arguments coupled with singularity analysis techniques.
Probabilistic consequences in the case of ``subtractive'' urns are new
representations for the probability distribution of the urn's composition at
any time n, structural information on the shape of moments of all orders,
estimates of the speed of convergence to the Gaussian limit and an explicit
determination of the associated large deviation function. In the general case,
analytic solutions involve Abelian integrals over the Fermat curve x^h+y^h=1.
Several urn models, including a classical one associated with balanced trees
(2-3 trees and fringe-balanced search trees) and related to a previous study of
Panholzer and Prodinger, as well as all urns of balance 1 or 2 and a sporadic
urn of balance 3, are shown to admit of explicit representations in terms of
Weierstra\ss elliptic functions: these elliptic models appear precisely to
correspond to regular tessellations of the Euclidean plane.
Comment: Published at http://dx.doi.org/10.1214/009117905000000026 in the
Annals of Probability (http://www.imstat.org/aop/) by the Institute of
Mathematical Statistics (http://www.imstat.org
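To make the urn setting concrete, here is a small simulation sketch of a balanced two-type urn. Each step draws a ball uniformly at random and adds the row of the replacement matrix for the drawn type; negative diagonal entries give a "subtractive" urn. The matrix and starting composition below are chosen only for illustration, and `simulate_urn` is a hypothetical helper, not something taken from the article, whose treatment is analytic rather than simulation-based.

```python
import random


def simulate_urn(matrix, start, steps, seed=0):
    """Simulate a balanced two-type urn and return the final (x, y) composition."""
    rng = random.Random(seed)
    (a, b), (c, d) = matrix
    if a + b != c + d:
        raise ValueError("urn is not balanced: row sums differ")
    x, y = start
    for _ in range(steps):
        # Draw a ball uniformly at random: first type with probability x / (x + y).
        if rng.random() * (x + y) < x:
            x, y = x + a, y + b
        else:
            x, y = x + c, y + d
        if x < 0 or y < 0:
            raise ValueError("urn is not tenable from this starting composition")
    return x, y


# Illustrative subtractive urn of balance 1, started with two balls of the first type.
x, y = simulate_urn([[-2, 3], [4, -3]], start=(2, 0), steps=100)
```

Because every row sums to the same balance, the total number of balls after n steps is deterministic (initial total plus n times the balance); only the split between the two types is random.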