25,925 research outputs found
ClustGeo: an R package for hierarchical clustering with spatial constraints
In this paper, we propose a Ward-like hierarchical clustering algorithm
including spatial/geographical constraints. Two dissimilarity matrices
and are inputted, along with a mixing parameter . The
dissimilarities can be non-Euclidean and the weights of the observations can be
non-uniform. The first matrix gives the dissimilarities in the "feature space"
and the second matrix gives the dissimilarities in the "constraint space". The
criterion minimized at each stage is a convex combination of the homogeneity
criterion calculated with and the homogeneity criterion calculated with
. The idea is then to determine a value of which increases the
spatial contiguity without deteriorating too much the quality of the solution
based on the variables of interest i.e. those of the feature space. This
procedure is illustrated on a real dataset using the R package ClustGeo
On multi-view learning with additive models
In many scientific settings data can be naturally partitioned into variable
groupings called views. Common examples include environmental (1st view) and
genetic information (2nd view) in ecological applications, chemical (1st view)
and biological (2nd view) data in drug discovery. Multi-view data also occur in
text analysis and proteomics applications where one view consists of a graph
with observations as the vertices and a weighted measure of pairwise similarity
between observations as the edges. Further, in several of these applications
the observations can be partitioned into two sets, one where the response is
observed (labeled) and the other where the response is not (unlabeled). The
problem for simultaneously addressing viewed data and incorporating unlabeled
observations in training is referred to as multi-view transductive learning. In
this work we introduce and study a comprehensive generalized fixed point
additive modeling framework for multi-view transductive learning, where any
view is represented by a linear smoother. The problem of view selection is
discussed using a generalized Akaike Information Criterion, which provides an
approach for testing the contribution of each view. An efficient implementation
is provided for fitting these models with both backfitting and local-scoring
type algorithms adjusted to semi-supervised graph-based learning. The proposed
technique is assessed on both synthetic and real data sets and is shown to be
competitive to state-of-the-art co-training and graph-based techniques.Comment: Published in at http://dx.doi.org/10.1214/08-AOAS202 the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
A taxonomy framework for unsupervised outlier detection techniques for multi-type data sets
The term "outlier" can generally be defined as an observation that is significantly different from
the other values in a data set. The outliers may be instances of error or indicate events. The
task of outlier detection aims at identifying such outliers in order to improve the analysis of
data and further discover interesting and useful knowledge about unusual events within numerous
applications domains. In this paper, we report on contemporary unsupervised outlier detection
techniques for multiple types of data sets and provide a comprehensive taxonomy framework and
two decision trees to select the most suitable technique based on data set. Furthermore, we
highlight the advantages, disadvantages and performance issues of each class of outlier detection
techniques under this taxonomy framework
Evolutionary games on graphs
Game theory is one of the key paradigms behind many scientific disciplines
from biology to behavioral sciences to economics. In its evolutionary form and
especially when the interacting agents are linked in a specific social network
the underlying solution concepts and methods are very similar to those applied
in non-equilibrium statistical physics. This review gives a tutorial-type
overview of the field for physicists. The first three sections introduce the
necessary background in classical and evolutionary game theory from the basic
definitions to the most important results. The fourth section surveys the
topological complications implied by non-mean-field-type social network
structures in general. The last three sections discuss in detail the dynamic
behavior of three prominent classes of models: the Prisoner's Dilemma, the
Rock-Scissors-Paper game, and Competing Associations. The major theme of the
review is in what sense and how the graph structure of interactions can modify
and enrich the picture of long term behavioral patterns emerging in
evolutionary games.Comment: Review, final version, 133 pages, 65 figure
- …