1,288 research outputs found

    A Survey of Neighbourhood Construction Models for Categorizing Data Points

    Finding neighbourhood structures is very useful for extracting valuable relationships among data samples. This paper presents a survey of recent neighbourhood construction algorithms for pattern clustering and for classifying data points. Extracting neighbourhoods and connections among the points is extremely useful for clustering and classifying the data. Applications such as detecting social network communities, bundling related edges, and solving location and routing problems all indicate the usefulness of this problem. Finding data point neighbourhoods in data mining and pattern recognition should generally improve knowledge extraction from databases. Several algorithms for data point neighbourhood construction have been proposed to analyse data in this sense. They are described and discussed from different aspects in this paper. Finally, future challenges in neighbourhood construction for categorizing data points are outlined.

    Generative Models for Functional Data using Phase and Amplitude Separation

    Constructing generative models for functional observations is an important task in statistical functional analysis. In general, functional data contains both phase (or x or horizontal) and amplitude (or y or vertical) variability. Traditional methods often ignore the phase variability and focus solely on the amplitude variation, using cross-sectional techniques such as fPCA for dimension reduction and data modeling. Ignoring phase variability leads to a loss of structure in the data and inefficiency in data models. This paper presents an approach that relies on separating the phase (x-axis) and amplitude (y-axis), then modeling these components using joint distributions. This separation, in turn, is performed using a technique called elastic shape analysis of curves that involves a new mathematical representation of functional data. Then, using individual fPCAs, one each for the phase and amplitude components, while respecting the nonlinear geometry of the phase representation space, we impose joint probability models on the principal coefficients of these components. These ideas are demonstrated using random sampling, for models estimated from simulated and real datasets, and show their superiority over models that ignore phase-amplitude separation. Furthermore, the generative models are applied to classification of functional data and achieve high performance in applications involving SONAR signals of underwater objects, handwritten signatures, and periodic body movements recorded by smart phones.
    Comment: 19 pages, accepted for publication in Computational Statistics and Data Analysis (Dec 2012)

    Robust Temporally Coherent Laplacian Protrusion Segmentation of 3D Articulated Bodies

    In motion analysis and understanding it is important to be able to fit a suitable model or structure to the temporal series of observed data, in order to describe motion patterns in a compact way and to discriminate between them. In an unsupervised context, i.e., when no prior model of the moving object(s) is available, such a structure has to be learned from the data in a bottom-up fashion. Recently, volumetric approaches, in which the motion is captured from a number of cameras and a voxel-set representation of the body is built from the camera views, have gained ground due to attractive features such as inherent view-invariance and robustness to occlusions. Automatic, unsupervised segmentation of moving bodies along entire sequences, in a temporally coherent and robust way, has the potential to provide a means of constructing a bottom-up model of the moving body and of tracking motion cues that may later be exploited for motion classification. Spectral methods such as locally linear embedding (LLE) can be useful in this context, as they preserve "protrusions", i.e., high-curvature regions of the 3D volume, of articulated shapes, while improving their separation in a lower-dimensional space, which makes them easier to cluster. In this paper we therefore propose a spectral approach to unsupervised and temporally coherent body-protrusion segmentation along time sequences. Volumetric shapes are clustered in an embedding space, and clusters are propagated in time to ensure coherence and merged or split to accommodate changes in the body's topology. Experiments on both synthetic and real sequences of dense voxel-set data are shown. These support the ability of the proposed method to cluster body parts consistently over time in a totally unsupervised fashion, its robustness to sampling density and shape quality, and its potential for bottom-up model construction.
    Comment: 31 pages, 26 figures

    Comparison and validation of community structures in complex networks

    The issue of partitioning a network into communities has attracted a great deal of attention recently. Most authors seem to equate this issue with that of finding the maximum value of the modularity, as defined by Newman. Since the problem formulated this way is NP-hard, most effort has gone into the construction of search algorithms, and less into the question of other measures of community structure, similarities between various partitionings, and validation with respect to external information. Here we concentrate on a class of computer-generated networks and on three well-studied real networks which constitute a benchmark for network studies: the karate club, the US college football teams, and a gene network of yeast. We utilize some standard ways of clustering data (originally not designed for finding community structures in networks) and show that these classical methods sometimes outperform the newer ones. We discuss various measures of the strength of the modular structure, and show by examples their features and drawbacks. Further, we compare different partitions by applying some graph-theoretic concepts of distance, which indicate that one of the quality measures of the degree of modularity corresponds quite well with the distance from the true partition. Finally, we introduce a way to validate the partitionings with respect to external data when the nodes are classified but the network structure is unknown. This is possible here because the structure of the computer-generated networks is fully known, as is the historical answer to how the karate club and the football teams are partitioned in reality. The partitioning of the gene network is validated by use of the Gene Ontology database, where we show that a community in general corresponds to a biological process.
    Comment: To appear in Physica A; 25 pages
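    For orientation, the modularity mentioned in this abstract is commonly written in the following form for an undirected, unweighted network. This is a sketch of the widely quoted Newman-Girvan definition; the abstract does not say which variant the paper uses, and the symbols $A_{ij}$ (adjacency matrix), $k_i$ (degree of node $i$), $m$ (number of edges), and $c_i$ (community label of node $i$) are notation assumed here.

    ```latex
    Q = \frac{1}{2m} \sum_{i,j} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)
    ```

    The bracketed term compares each observed edge with the number expected if edges were placed at random while preserving node degrees, and $\delta(c_i, c_j)$ restricts the sum to pairs of nodes in the same community.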

    Weak gravitational lensing in the standard Cold Dark Matter model, using an algorithm for three-dimensional shear

    We investigate the effects of weak gravitational lensing in the standard Cold Dark Matter cosmology, using an algorithm which evaluates the shear in three dimensions. The algorithm has the advantage of variable softening for the particles, and our method allows the appropriate angular diameter distances to be applied to every evaluation location within each three-dimensional simulation box. We investigate the importance of shear in the distance-redshift relation, and find it to be very small. We also establish clearly defined values for the smoothness parameter in the relation, finding its value to be at least 0.88 at all redshifts in our simulations. From our results, obtained by linking the simulation boxes back to source redshifts of 4, we are able to observe the formation of structure in terms of the computed shear, and also note that the major contributions to the shear come from a very broad range of redshifts. We show the probability distributions for the magnification, source ellipticity and convergence, and also describe the relationships amongst these quantities for a range of source redshifts. We find a broad range of magnifications and ellipticities; for sources at a redshift of 4, 97.5% of all lines of sight show magnifications up to 1.3 and ellipticities up to 0.195. There is clear evidence that the magnification is not linear in the convergence, as might be expected for weak lensing, but contains contributions from higher order terms in both the convergence and the shear.
    Comment: 14 pages, LaTeX, 15 figures included
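    The closing remark about higher order terms can be read against the standard single-plane lensing relation between magnification $\mu$, convergence $\kappa$, and shear $\gamma$. The expression below is a textbook sketch under that single-plane assumption, not the paper's own three-dimensional formulation, and the symbols are notation assumed here.

    ```latex
    \mu = \frac{1}{(1-\kappa)^2 - |\gamma|^2}
        \approx 1 + 2\kappa + 3\kappa^2 + |\gamma|^2 + \dots
    ```

    To first order $\mu \approx 1 + 2\kappa$, which is the linear behaviour usually expected for weak lensing; the quadratic terms in $\kappa$ and $\gamma$ are the kind of higher order contributions the abstract refers to.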

    Geometrical congruence and efficient greedy navigability of complex networks

    Hyperbolic networks are supposed to be congruent with their underlying latent geometry, and following geodesics in the hyperbolic space is believed to be equivalent to navigating along topological shortest paths (TSP). This assumption of geometrical congruence is considered the reason for the nearly maximally efficient greedy navigation of hyperbolic networks. Here, we propose a complex network measure termed geometrical congruence (GC) and show that there might exist different TSP whose projections (pTSP) in the hyperbolic space largely diverge and significantly differ from the respective geodesics. We discover that, contrary to current belief, hyperbolic networks do not in general demonstrate geometrical congruence and efficient navigability, which, in networks generated with the nPSO model, seem to emerge only for power-law exponents close to 2. We conclude by showing that the GC measure can also impact the analysis of real networks: it changes significantly in structural brain connectomes grouped by gender or age.

    Hubble expansion as a curvature of space

    By considering the expansion of space as an additional component of general relativity, a model is described that adds a Hubble curvature term as a new solution to the general equation. Correlation with the $\Lambda$CDM model was assessed using the extensive type Ia supernovae (SNe Ia) data with redshift corrected to the CMB, and recent baryonic acoustic oscillation (BAO) measures. For the SNe Ia data, the modified GR and $\Lambda$CDM models differed by $^{+0.11}_{-0.15}$ $\mu_B$ mag over $z_{cmb}=0.01-1.3$, with overall weighted RMS errors of $\pm0.136$ and $\pm0.151$ $\mu_B$ mag respectively. For the BAO measures, the weighted RMS errors were $\pm0.034$ and $\pm0.085$ Mpc with $H_0=67.6\pm0.25$ for the modified GR and $70.0\pm0.25$ for the $\Lambda$CDM models, over the range $z=0.106-2.36$. The derived GR metric accurately describes both the SNe Ia and the baryonic acoustic oscillation (BAO) observations without requiring dark matter or $w$-corrected dark energy while allowing the spatial term to remain flat, suggesting that the standard metric may accept an additional term for the curvature of space due to its Hubble expansion.
    Comment: 11 pages, 7 figures, submitted to Results in Physics

    A Geometric Approach to Pairwise Bayesian Alignment of Functional Data Using Importance Sampling

    We present a Bayesian model for pairwise nonlinear registration of functional data. We use the Riemannian geometry of the space of warping functions to define appropriate prior distributions and sample from the posterior using importance sampling. A simple square-root transformation is used to simplify the geometry of the space of warping functions, which allows for computation of sample statistics, such as the mean and median, and a fast implementation of a $k$-means clustering algorithm. These tools allow for efficient posterior inference, where multiple modes of the posterior distribution corresponding to multiple plausible alignments of the given functions are found. We also show pointwise 95% credible intervals to assess the uncertainty of the alignment in different clusters. We validate this model using simulations and present multiple examples on real data from different application domains including biometrics and medicine.

    NCARD: Improving Neighborhood Construction by Apollonius Region Algorithm based on Density

    Due to the increased rate of information in the present era, local identification of similar and related data points by neighborhood construction algorithms is highly significant for processing information in various sciences. Geometric methods are especially useful for their accuracy in locating highly similar neighborhood points using efficient geometric structures. Geometric methods must be applied to each individual point in a data set so that similar groups can be formed. Those algorithms are not highly accurate for high-dimensional data. To address this challenge in data point analysis, we use a geometric method in which the Apollonius circle is used to achieve high local accuracy with high-dimensional data. In this paper, we propose a neighborhood construction algorithm, namely Neighborhood Construction by Apollonius Region Density (NCARD). In this study, the neighbors of data points are determined using not only the geometric structures but also the density information. The Apollonius circle, one of the state-of-the-art proximity geometry methods, is used for this purpose. For efficient clustering, our algorithm works better with high-dimensional data than previous methods; it is also able to identify local outliers. The proposed algorithm requires no prior information about the data. Moreover, after locating similar data points with the Apollonius circle, we extract the density of and relationships among the points, and a unique and accurate neighborhood is created in this way. The proposed algorithm is more accurate than state-of-the-art and well-known algorithms by roughly 8-13% on real and artificial data sets.

    Multifractal Analysis of Packed Swiss Cheese Cosmologies

    The multifractal spectra of various three-dimensional representations of Packed Swiss Cheese (PSC) cosmologies in open, closed, and flat spaces are measured, and it is determined that the curvature of the space does not alter the associated fractal structure. These results are compared to observational data and simulated models of large-scale galaxy clustering, to assess the viability of the PSC as a candidate for such structure formation. It is found that the PSC dimension spectra do not match those of observation, and possible solutions to this discrepancy are offered, including accounting for potential luminosity biasing effects. Various random and uniform sets are also analyzed to provide insight into the meaning of the multifractal spectrum as it relates to the observed scaling behaviors.
    Comment: 3 LaTeX files, 18 PS figures