A Survey of Neighbourhood Construction Models for Categorizing Data Points
Finding neighbourhood structures is very useful for extracting valuable
relationships among data samples. This paper presents a survey of recent
neighbourhood construction algorithms for clustering and classifying data
points. Applications such as detecting social network communities, bundling
related edges, and solving location and routing problems all illustrate the
usefulness of neighbourhood construction. In data mining and pattern
recognition, finding data point neighbourhoods should generally improve
knowledge extraction from databases. Several neighbourhood construction
algorithms have been proposed for this purpose; they are described and
discussed from different aspects in this paper. Finally, future challenges
in neighbourhood construction are outlined.
Generative Models for Functional Data using Phase and Amplitude Separation
Constructing generative models for functional observations is an important
task in statistical functional analysis. In general, functional data contains
both phase (or x or horizontal) and amplitude (or y or vertical) variability.
Traditional methods often ignore the phase variability and focus solely on
the amplitude variation, using cross-sectional techniques such as fPCA for
dimension reduction and data modeling. Ignoring phase variability leads to a
loss of structure in the data and inefficiency in data models. This paper
presents an approach that relies on separating the phase (x-axis) and amplitude
(y-axis), then modeling these components using joint distributions. This
separation, in turn, is performed using a technique called elastic shape
analysis of curves that involves a new mathematical representation of
functional data. Then, using individual fPCAs, one each for the phase and
amplitude components, and respecting the nonlinear geometry of the phase
representation space, we impose joint probability models on the principal
coefficients of these components. These ideas are demonstrated using random
sampling from models estimated from simulated and real datasets, and the
resulting models are shown to be superior to models that ignore
phase-amplitude separation. Furthermore, the generative models are applied to
classification of functional data and achieve high performance in
applications involving SONAR signals of underwater objects, handwritten
signatures, and periodic body movements recorded by smartphones.
Comment: 19 pages, accepted for publication in Computational Statistics and
Data Analysis (Dec 2012)
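The separation described above can be illustrated with a small numerical sketch. The abstract does not name its representation; the code below assumes the square-root slope function (SRSF) commonly used in elastic shape analysis, and the helper names `srsf` and `warp` are hypothetical. It shows only that a warping of the x-axis moves the peak (phase) while leaving the peak height (amplitude) unchanged; it is not the authors' pipeline.

```python
import numpy as np

def srsf(f, t):
    """Square-root slope function q = sign(f') * sqrt(|f'|) (assumed representation)."""
    df = np.gradient(f, t)
    return np.sign(df) * np.sqrt(np.abs(df))

def warp(f, t, gamma):
    """Evaluate f(gamma(t)) by linear interpolation; gamma maps [0, 1] onto [0, 1]."""
    return np.interp(gamma, t, f)

def integrate(y, t):
    """Trapezoidal rule on a possibly non-uniform grid."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2.0)

t = np.linspace(0.0, 1.0, 201)
f = np.sin(2.0 * np.pi * t)   # one period of a sine wave
gamma = t ** 2                # an illustrative warping function (phase change)
f_warped = warp(f, t, gamma)

# Phase changes: the peak moves from t = 0.25 to t = 0.5.
peak_before = t[np.argmax(f)]
peak_after = t[np.argmax(f_warped)]

# Amplitude does not change: the peak height is the same.
height_before = f.max()
height_after = f_warped.max()

# The squared L2 norm of the SRSF equals the total variation of f,
# which a warping of the x-axis leaves unchanged.
norm_before = integrate(srsf(f, t) ** 2, t)
norm_after = integrate(srsf(f_warped, t) ** 2, t)
```

Separate fPCA models for phase and amplitude, as the abstract describes, would then be fitted to the warping functions and to the aligned functions respectively; that step is not sketched here.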
Robust Temporally Coherent Laplacian Protrusion Segmentation of 3D Articulated Bodies
In motion analysis and understanding it is important to be able to fit a
suitable model or structure to the temporal series of observed data, in order
to describe motion patterns in a compact way, and to discriminate between them.
In an unsupervised context, i.e., no prior model of the moving object(s) is
available, such a structure has to be learned from the data in a bottom-up
fashion. In recent times, volumetric approaches in which the motion is captured
from a number of cameras and a voxel-set representation of the body is built
from the camera views, have gained ground due to attractive features such as
inherent view-invariance and robustness to occlusions. Automatic, unsupervised
segmentation of moving bodies along entire sequences, in a temporally-coherent
and robust way, has the potential to provide a means of constructing a
bottom-up model of the moving body and of tracking motion cues that may later
be exploited for motion classification. Spectral methods such as locally linear
embedding (LLE) can be useful in this context, as they preserve "protrusions",
i.e., high-curvature regions of the 3D volume, of articulated shapes, while
improving their separation in a lower dimensional space, making them in this
way easier to cluster. In this paper we therefore propose a spectral approach
to unsupervised and temporally-coherent body-protrusion segmentation along time
sequences. Volumetric shapes are clustered in an embedding space, clusters are
propagated in time to ensure coherence, and merged or split to accommodate
changes in the body's topology. Experiments on both synthetic and real
sequences of dense voxel-set data support the ability of the proposed method
to cluster body parts consistently over time in a totally unsupervised
fashion, its robustness to sampling density and shape quality, and its
potential for bottom-up model construction.
Comment: 31 pages, 26 figures
Comparison and validation of community structures in complex networks
The issue of partitioning a network into communities has attracted a great
deal of attention recently. Most authors seem to equate this issue with the one
of finding the maximum value of the modularity, as defined by Newman. Since the
problem formulated this way is NP-hard, most effort has gone into the
construction of search algorithms, and less into the question of other
measures of community structure, similarities between various partitionings,
and validation with respect to external information. Here we concentrate on a
class of computer-generated networks and on three well-studied real networks
which constitute a benchmark for network studies: the karate club, the US
college football teams, and a gene network of yeast. We utilize some standard ways of
clustering data (originally not designed for finding community structures in
networks) and show that these classical methods sometimes outperform the newer
ones. We discuss various measures of the strength of the modular structure, and
show by example their features and drawbacks. Further, we compare different
partitions by applying some graph-theoretic concepts of distance, which
indicate that one of the quality measures of the degree of modularity
corresponds quite well with the distance from the true partition. Finally, we
introduce a way to validate the partitionings with respect to external data
when the nodes are classified but the network structure is unknown. This is
possible here since we know everything about the computer-generated networks,
as well as the historical answer to how the karate club and the football teams
are partitioned in reality. The partitioning of the gene network is validated
by use of the Gene Ontology database, where we show that a community in general
corresponds to a biological process.
Comment: To appear in Physica A; 25 pages
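For reference, the modularity that the abstract attributes to Newman can be computed directly for a given partition. The sketch below uses the standard per-community form Q = sum over communities c of [L_c/m - (d_c/2m)^2] for an undirected, unweighted graph; the function name and the toy graph are illustrative assumptions and are not taken from the paper.

```python
from collections import defaultdict

def modularity(edges, community):
    """Newman modularity of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    m = len(edges)
    internal = defaultdict(int)    # L_c: edges with both ends in community c
    degree_sum = defaultdict(int)  # d_c: sum of node degrees in community c
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    return sum(internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)

# Toy graph (hypothetical): two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
split = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
single = {node: "a" for node in range(6)}

q_split = modularity(edges, split)    # the two triangles as communities
q_single = modularity(edges, single)  # everything in one community
```

Placing all nodes in one community gives Q = 0, while the two-triangle split gives a positive value, which is the sense in which modularity rewards partitions with more internal edges than expected by chance.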
Weak gravitational lensing in the standard Cold Dark Matter model, using an algorithm for three-dimensional shear
We investigate the effects of weak gravitational lensing in the standard Cold
Dark Matter cosmology, using an algorithm which evaluates the shear in three
dimensions. The algorithm has the advantage of variable softening for the
particles, and our method allows the appropriate angular diameter distances to
be applied to every evaluation location within each three-dimensional
simulation box. We investigate the importance of shear in the distance-redshift
relation, and find it to be very small. We also establish clearly defined
values for the smoothness parameter in the relation, finding its value to be at
least 0.88 at all redshifts in our simulations. From our results, obtained by
linking the simulation boxes back to source redshifts of 4, we are able to
observe the formation of structure in terms of the computed shear, and also
note that the major contributions to the shear come from a very broad range of
redshifts. We show the probability distributions for the magnification, source
ellipticity and convergence, and also describe the relationships amongst these
quantities for a range of source redshifts. We find a broad range of
magnifications and ellipticities; for sources at a redshift of 4, 97.5% of
all lines of sight show magnifications up to 1.3 and ellipticities up to 0.195.
There is clear evidence that the magnification is not linear in the
convergence, as might be expected for weak lensing, but contains contributions
from higher order terms in both the convergence and the shear.
Comment: 14 pages, LaTeX, 15 figures included
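The final finding is consistent with the standard lensing magnification formula, which the abstract does not quote and which is given here only as background. Writing the convergence as \kappa and the shear magnitude as \gamma (symbols assumed here), a second-order expansion shows where the higher order terms arise:

```latex
\mu \;=\; \frac{1}{(1-\kappa)^{2} - \gamma^{2}}
    \;\approx\; 1 + 2\kappa + 3\kappa^{2} + \gamma^{2} + \dots
```

The linear weak-lensing approximation keeps only the terms 1 + 2\kappa; the quadratic terms in \kappa and \gamma are the kind of contribution the abstract refers to.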
Geometrical congruence and efficient greedy navigability of complex networks
Hyperbolic networks are supposed to be congruent with their underlying latent
geometry, and following geodesics in the hyperbolic space is believed to be
equivalent to navigating along topological shortest paths (TSP). This
assumption of geometrical congruence is considered the reason for the nearly
maximally efficient greedy navigation of hyperbolic networks. Here, we propose
a complex network measure termed geometrical congruence (GC), and we show that
there might exist different TSPs whose projections (pTSP) in the hyperbolic
space largely diverge and significantly differ from the respective geodesics.
We discover that, contrary to current belief, hyperbolic networks do not in
general demonstrate geometrical congruence and efficient navigability, which,
in networks generated with the nPSO model, seem to emerge only for a power-law
exponent close to 2. We conclude by showing that the GC measure can also
impact the analysis of real networks: it changes significantly in structural
brain connectomes grouped by gender or age.
Hubble expansion as a curvature of space
By considering the expansion of space as an additional component of general
relativity, a model is described that adds a Hubble curvature term as a new
solution to the general equation. Correlation with the CDM model was
assessed using the extensive type Ia supernovae (SNe Ia) data with redshift
corrected to the CMB, and recent baryonic acoustic oscillation (BAO) measures.
For the SNe Ia data, the modified GR and CDM models differed by
mag. over , with overall weighted
RMS errors of and mag respectively. For the BAO
measures, the weighted RMS errors were and Mpc with
for the modified GR and for the CDM
models, over the range . The derived GR metric accurately
describes both the SNe Ia and the baryonic acoustic oscillation (BAO)
observations without requiring dark matter or -corrected dark energy while
allowing the spatial term to remain flat, suggesting that the standard metric
may accept an additional term for the curvature of space due to its Hubble
expansion.
Comment: 11 pages, 7 figures, submitted to Results in Physics
A Geometric Approach to Pairwise Bayesian Alignment of Functional Data Using Importance Sampling
We present a Bayesian model for pairwise nonlinear registration of functional
data. We use the Riemannian geometry of the space of warping functions to
define appropriate prior distributions and sample from the posterior using
importance sampling. A simple square-root transformation is used to simplify
the geometry of the space of warping functions, which allows for computation of
sample statistics, such as the mean and median, and a fast implementation of a
k-means clustering algorithm. These tools allow for efficient posterior
inference, where multiple modes of the posterior distribution corresponding to
multiple plausible alignments of the given functions are found. We also show
pointwise credible intervals to assess the uncertainty of the alignment
in different clusters. We validate this model using simulations and present
multiple examples on real data from different application domains including
biometrics and medicine.
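The square-root transformation mentioned above can be sketched numerically. The abstract does not spell out the formula; the code assumes the usual choice psi = sqrt(d gamma / dt), under which a warping function gamma on [0, 1] maps to a point on the unit sphere in L2, so that distances reduce to arc lengths. The helper names are hypothetical, and no sampling or clustering step is shown.

```python
import numpy as np

def integrate(y, t):
    """Trapezoidal rule on a possibly non-uniform grid."""
    return float(np.sum((y[1:] + y[:-1]) * np.diff(t)) / 2.0)

def sqrt_transform(gamma, t):
    """Map a warping function gamma to psi = sqrt(gamma') (assumed form)."""
    dgamma = np.gradient(gamma, t)
    return np.sqrt(np.clip(dgamma, 0.0, None))  # clip guards tiny negatives

t = np.linspace(0.0, 1.0, 1001)
gamma = t ** 2                    # illustrative warping: gamma(0)=0, gamma(1)=1
psi = sqrt_transform(gamma, t)

# psi has unit L2 norm because the integral of gamma' is gamma(1) - gamma(0) = 1.
norm_sq = integrate(psi ** 2, t)

# Arc-length distance on the sphere between gamma and the identity warping,
# whose square-root transform is the constant function 1.
inner = integrate(psi * np.ones_like(t), t)
dist_to_identity = float(np.arccos(np.clip(inner, -1.0, 1.0)))
```

With closed-form distances of this kind, sample statistics such as means and medians of warping functions become straightforward to compute, which is the simplification the abstract describes.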
NCARD: Improving Neighborhood Construction by Apollonius Region Algorithm based on Density
With the growing volume of information in the present era, local
identification of similar and related data points using neighborhood
construction algorithms is highly significant for processing information in
various sciences. Geometric methods are especially useful for their accuracy in
locating highly similar neighboring points using efficient geometric
structures. Such methods examine each individual point in the data set so that
groups of similar points are formed. However, these algorithms are not highly
accurate on high-dimensional data. To address this challenge, we use a
geometric method in which the Apollonius circle is used to achieve high local
accuracy on high-dimensional data. In this paper, we propose a neighborhood
construction algorithm, namely Neighborhood Construction by Apollonius Region
Density (NCARD). In this study, the neighbors of data points are determined
using not only geometric structures but also density information. The
Apollonius circle, one of the state-of-the-art proximity geometry methods, is
used for this purpose. For efficient clustering, our algorithm works better on
high-dimensional data than previous methods; it is also able to identify local
outliers. The proposed algorithm requires no prior information about the data.
Moreover, after locating similar data points with the Apollonius circle, we
extract the density and relationships among the points, and a unique and
accurate neighborhood is created in this way. The proposed algorithm is more
accurate than state-of-the-art and well-known algorithms by up to about 8-13%
on real and artificial data sets.
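As background for the geometric primitive named above, an Apollonius circle is the set of points whose distances to two fixed points a and b are in a constant ratio k (with k not equal to 1). The sketch below computes its center and radius in the plane; the function name is hypothetical, and how NCARD chooses the point pairs, the ratio, or the density term is not stated in the abstract and is not reproduced here.

```python
import numpy as np

def apollonius_circle(a, b, k):
    """Center and radius of the locus of points p with |p - a| / |p - b| = k."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if k <= 0 or np.isclose(k, 1.0):
        raise ValueError("k must be positive and different from 1")
    center = (a - k ** 2 * b) / (1.0 - k ** 2)
    radius = k * np.linalg.norm(a - b) / abs(1.0 - k ** 2)
    return center, radius

a = np.array([0.0, 0.0])
b = np.array([3.0, 0.0])
center, radius = apollonius_circle(a, b, k=2.0)

# Sample points on the circle and check the defining distance ratio.
angles = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
points = center + radius * np.column_stack([np.cos(angles), np.sin(angles)])
ratios = (np.linalg.norm(points - a, axis=1)
          / np.linalg.norm(points - b, axis=1))
```

For k = 2 the circle encloses b and excludes a, so the ratio controls which of the two points the region is drawn around.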
Multifractal Analysis of Packed Swiss Cheese Cosmologies
The multifractal spectra of various three-dimensional representations of
Packed Swiss Cheese (PSC) cosmologies in open, closed, and flat spaces are measured,
and it is determined that the curvature of the space does not alter the
associated fractal structure. These results are compared to observational data
and simulated models of large scale galaxy clustering, to assess the viability
of the PSC as a candidate for such structure formation. It is found that the
PSC dimension spectra do not match those of observation, and possible solutions
to this discrepancy are offered, including accounting for potential luminosity
biasing effects. Various random and uniform sets are also analyzed to provide
insight into the meaning of the multifractal spectrum as it relates to the
observed scaling behaviors.
Comment: 3 LaTeX files, 18 PS figures