90,179 research outputs found

    Representation Learning for Scale-free Networks

    Network embedding aims to learn low-dimensional representations of the vertexes in a network while preserving the structure and inherent properties of the network. Existing network embedding works primarily focus on preserving the microscopic structure, such as the first- and second-order proximity of vertexes, while the macroscopic scale-free property is largely ignored. The scale-free property describes the fact that vertex degrees follow a heavy-tailed distribution (i.e., only a few vertexes have high degrees), and it is a critical property of real-world networks such as social networks. In this paper, we study the problem of learning representations for scale-free networks. We first theoretically analyze the difficulty of embedding and reconstructing a scale-free network in Euclidean space by converting our problem to the sphere packing problem. Then, we propose the "degree penalty" principle for designing network embedding algorithms that preserve the scale-free property: punishing the proximity between high-degree vertexes. We introduce two implementations of our principle, using spectral techniques and a skip-gram model respectively. Extensive experiments on six datasets show that our algorithms not only reconstruct heavy-tailed degree distributions, but also outperform state-of-the-art embedding models in various network mining tasks, such as vertex classification and link prediction. Comment: 8 figures; accepted by AAAI 201

    Analysis of approximate nearest neighbor searching with clustered point sets

    We present an empirical analysis of data structures for approximate nearest neighbor searching. We compare the well-known optimized kd-tree splitting method against two alternative splitting methods. The first, called the sliding-midpoint method, attempts to balance the goals of producing subdivision cells of bounded aspect ratio and not producing any empty cells. The second, called the minimum-ambiguity method, is a query-based approach. In addition to the data points, it is also given a training set of query points for preprocessing. It employs a simple greedy algorithm to select the splitting plane that minimizes the average amount of ambiguity in the choice of the nearest neighbor for the training points. We provide an empirical analysis comparing these two methods against the optimized kd-tree construction for a number of synthetically generated data and query sets. We demonstrate that for clustered data and query sets, these algorithms can provide significant improvements over the standard kd-tree construction for approximate nearest neighbor searching. Comment: 20 pages, 8 figures. Presented at ALENEX '99, Baltimore, MD, Jan 15-16, 199
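
    The sliding-midpoint rule named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the commonly described form of the rule: cut the cell at the midpoint of its longest side, and if all points fall on one side, slide the cut to the nearest point so that neither child is empty. The function name and return values are hypothetical, and the sketch ignores tie-breaking among equally long sides.

```python
import numpy as np

def sliding_midpoint_split(points, lo, hi):
    """One sliding-midpoint split of the axis-aligned cell [lo, hi].

    Returns (axis, cut, left_mask); left_mask marks the points assigned
    to the lower child. Illustrative sketch only (hypothetical helper).
    """
    points = np.asarray(points, dtype=float)
    lo = np.asarray(lo, dtype=float)
    hi = np.asarray(hi, dtype=float)

    # Cut orthogonal to the longest side so cells do not become too skinny.
    axis = int(np.argmax(hi - lo))
    cut = 0.5 * (lo[axis] + hi[axis])
    coords = points[:, axis]

    if coords.min() >= cut:
        # All points on the upper side: slide the cut up to the lowest point.
        cut = coords.min()
        left_mask = coords <= cut
    elif coords.max() <= cut:
        # All points on the lower side: slide the cut down to the highest point.
        cut = coords.max()
        left_mask = coords < cut
    else:
        # Points on both sides: the plain midpoint cut already works.
        left_mask = coords <= cut
    return axis, float(cut), left_mask
```

    For a tightly clustered point set in a wide cell, the cut slides from the cell midpoint to the edge of the cluster. Both children are non-empty as long as the points do not all share the same coordinate along the chosen axis.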

    Origin of Scaling Behavior of Protein Packing Density: A Sequential Monte Carlo Study of Compact Long Chain Polymers

    Single domain proteins are thought to be tightly packed. The introduction of voids by mutations is often regarded as destabilizing. In this study we show that packing density for single domain proteins decreases with chain length. We find that the radius of gyration provides a poor description of protein packing, but the alpha contact number we introduce here characterizes proteins well. We further demonstrate that a protein-like scaling relationship between packing density and chain length is observed in off-lattice self-avoiding walks. A key problem in studying compact chain polymers is the attrition problem: it is difficult to generate independent samples of compact long self-avoiding walks. We develop an algorithm based on the framework of sequential Monte Carlo and succeed in generating populations of compact long chain off-lattice polymers up to length N = 2,000. Results based on analysis of these chain polymers suggest that maintaining high packing density is characteristic only of short chain proteins. We find that the scaling behavior of packing density with chain length of proteins is a generic feature of random polymers satisfying a loose constraint on compactness. We conclude that proteins are not optimized by evolution to eliminate packing voids. Comment: 9 pages, 10 figures. Accepted by J. Chem. Phy
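
    For reference, the radius of gyration mentioned in this abstract is the root-mean-square distance of the chain's points from their centroid. The sketch below uses the standard equal-mass definition. It is an illustration, not the authors' code; the function name is hypothetical, and the alpha contact number is not reproduced here because the abstract does not define it.

```python
import numpy as np

def radius_of_gyration(coords):
    """Root-mean-square distance of points from their centroid.

    coords: array-like of shape (N, 3). Equal masses are assumed.
    """
    coords = np.asarray(coords, dtype=float)
    centered = coords - coords.mean(axis=0)
    # Mean squared distance to the centroid, then square root.
    return float(np.sqrt((centered ** 2).sum(axis=1).mean()))
```

    Two points 2 units apart, for example, each lie 1 unit from their centroid, so the function returns 1.0.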

    Multifractal Analysis of Packed Swiss Cheese Cosmologies

    The multifractal spectra of various three-dimensional representations of Packed Swiss Cheese (PSC) cosmologies in open, closed, and flat spaces are measured, and it is determined that the curvature of the space does not alter the associated fractal structure. These results are compared to observational data and simulated models of large-scale galaxy clustering to assess the viability of the PSC as a candidate for such structure formation. It is found that the PSC dimension spectra do not match those of observation, and possible solutions to this discrepancy are offered, including accounting for potential luminosity biasing effects. Various random and uniform sets are also analyzed to provide insight into the meaning of the multifractal spectrum as it relates to the observed scaling behaviors. Comment: 3 latex files, 18 ps figure

    Grain Dynamics in a Two-dimensional Granular Flow

    We have used particle tracking methods to study the dynamics of individual balls comprising a granular flow in a small-angle two-dimensional funnel. We statistically analyze many ball trajectories to examine the mechanisms of shock propagation. In particular, we study the creation of, and interactions between, shock waves. We also investigate the role of granular temperature and draw parallels to traffic flow dynamics. Comment: 17 pages, 24 figures. To appear in Phys. Rev. E. High res./color figures etc. on http://www.nbi.dk/CATS/Granular/GrainDyn.htm