
    From random walks to distances on unweighted graphs

    Large unweighted directed graphs are commonly used to capture relations between entities. A fundamental problem in the analysis of such networks is to properly define the similarity or dissimilarity between any two vertices. Despite the significance of this problem, statistical characterization of the proposed metrics has been limited. We introduce and develop a class of techniques for analyzing random walks on graphs using stochastic calculus. Using these techniques we generalize results on the degeneracy of hitting times and analyze a metric based on the Laplace transformed hitting time (LTHT). The metric serves as a natural, provably well-behaved alternative to the expected hitting time. We establish a general correspondence between hitting times of the Brownian motion and analogous hitting times on the graph. We show that the LTHT is consistent with respect to the underlying metric of a geometric graph, preserves clustering tendency, and remains robust against random addition of non-geometric edges. Tests on simulated and real-world data show that the LTHT matches theoretical predictions and outperforms alternatives. Comment: To appear in NIPS 201
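
    As a rough illustration of the quantity this abstract names, the sketch below computes E[exp(-beta * T)] for a simple random walk on a small directed graph, where T is the hitting time of a target vertex. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `ltht` and `ltht_distance`, the adjacency-dictionary input, the fixed-point iteration, and the use of the negative logarithm as a distance-like value are all choices made here for illustration.

```python
import math


def ltht(adj, target, beta, tol=1e-12, max_iter=100_000):
    """Return {v: E[exp(-beta * T_v)]}, where T_v is the number of steps a
    simple random walk started at v needs to reach `target`.

    `adj` maps every vertex to a list of its out-neighbours (assumed format).
    The values satisfy u(target) = 1 and, for v != target,
    u(v) = exp(-beta) * mean(u(w) for w in adj[v]); that system is solved
    here by fixed-point iteration, which contracts at rate exp(-beta).
    """
    if beta <= 0:
        raise ValueError("beta must be positive")
    decay = math.exp(-beta)
    u = {v: 0.0 for v in adj}
    u[target] = 1.0
    for _ in range(max_iter):
        delta = 0.0
        new_u = {}
        for v, nbrs in adj.items():
            if v == target:
                new_u[v] = 1.0
            elif not nbrs:
                new_u[v] = 0.0  # dead end: the target is never reached
            else:
                new_u[v] = decay * sum(u[w] for w in nbrs) / len(nbrs)
            delta = max(delta, abs(new_u[v] - u[v]))
        u = new_u
        if delta < tol:
            break
    return u


def ltht_distance(adj, source, target, beta):
    """Negative log of the Laplace-transformed hitting time (an assumed
    convention for turning the transform into a distance-like value)."""
    value = ltht(adj, target, beta)[source]
    return math.inf if value == 0.0 else -math.log(value)
```

    On a two-vertex graph with edges in both directions the walk always hits the target in one step, so the returned value equals beta; larger graphs average over all walk lengths, with long walks discounted exponentially.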

    On the maximal number of real embeddings of minimally rigid graphs in $\mathbb{R}^2$, $\mathbb{R}^3$ and $S^2$

    Rigidity theory studies the properties of graphs that can have rigid embeddings in a Euclidean space $\mathbb{R}^d$ or on a sphere and which in addition satisfy certain edge length constraints. One of the major open problems in this field is to determine lower and upper bounds on the number of realizations with respect to a given number of vertices. This problem is closely related to the classification of rigid graphs according to their maximal number of real embeddings. In this paper, we are interested in finding edge lengths that can maximize the number of real embeddings of minimally rigid graphs in the plane, space, and on the sphere. We use algebraic formulations to provide upper bounds. To find values of the parameters that lead to graphs with a large number of real realizations, possibly attaining the (algebraic) upper bounds, we use some standard heuristics and we also develop a new method inspired by coupler curves. We apply this new method to obtain embeddings in $\mathbb{R}^3$. One of its main novelties is that it allows us to sample efficiently from a larger number of parameters by selecting only a subset of them at each iteration. Our results include a full classification of the 7-vertex graphs according to their maximal numbers of real embeddings in the cases of the embeddings in $\mathbb{R}^2$ and $\mathbb{R}^3$, while in the case of $S^2$ we achieve this classification for all 6-vertex graphs. Additionally, by increasing the number of embeddings of selected graphs, we improve the previously known asymptotic lower bound on the maximum number of realizations. The methods and the results concerning the spatial embeddings are part of the proceedings of ISSAC 2018 (Bartzos et al., 2018).

    edge2vec: Representation learning using edge semantics for biomedical knowledge discovery

    Representation learning provides new and powerful graph analytical approaches and tools for the highly valued data science challenge of mining knowledge graphs. Since previous graph analytical methods have mostly focused on homogeneous graphs, an important current challenge is extending this methodology to richly heterogeneous graphs and knowledge domains. The biomedical sciences are such a domain, reflecting the complexity of biology, with entities such as genes, proteins, drugs, diseases, and phenotypes, and relationships such as gene co-expression, biochemical regulation, and biomolecular inhibition or activation. Therefore, the semantics of edges and nodes are critical for representation learning and knowledge discovery in real-world biomedical problems. In this paper, we propose the edge2vec model, which represents graphs considering edge semantics. An edge-type transition matrix is trained by an Expectation-Maximization approach, and a stochastic gradient descent model is employed to learn node embeddings on a heterogeneous graph via the trained transition matrix. edge2vec is validated on three biomedical domain tasks: biomedical entity classification, compound-gene bioactivity prediction, and biomedical information retrieval. Results show that by incorporating edge types into node embedding learning in heterogeneous graphs, edge2vec significantly outperforms state-of-the-art models on all three tasks. We propose this method for its added value relative to existing graph analytical methodology, and for its applicability in the real-world context of biomedical knowledge discovery. Comment: 10 page
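
    To make the role of an edge-type transition matrix concrete, the sketch below runs a random walk in which the probability of following an edge depends on the type of the edge just traversed. It is only an illustration under stated assumptions: the function name `edge_type_biased_walk`, the adjacency format, and the fixed matrix `trans` are hypothetical, and the Expectation-Maximization training of the matrix and the stochastic gradient descent embedding step described in the abstract are not shown.

```python
import random


def edge_type_biased_walk(adj, trans, start, length, rng):
    """Walk up to `length` steps from `start`, weighting each candidate edge
    by trans[previous_edge_type][candidate_edge_type].

    adj   : {node: [(neighbour, edge_type), ...]}  (assumed format)
    trans : square nested list of non-negative weights, indexed by edge type
    rng   : a random.Random instance, passed in for reproducibility
    """
    walk = [start]
    current = start
    prev_type = None
    for _ in range(length):
        options = adj.get(current, [])
        if not options:
            break  # no outgoing edges
        if prev_type is None:
            # First step: no previous edge type, so choose uniformly.
            weights = [1.0] * len(options)
        else:
            weights = [trans[prev_type][edge_type] for _, edge_type in options]
        if sum(weights) <= 0:
            break  # every continuation has zero weight
        current, prev_type = rng.choices(options, weights=weights, k=1)[0]
        walk.append(current)
    return walk
```

    Walks generated this way could then be fed to a skip-gram style embedding model, which is the usual pattern for walk-based node embeddings; whether that matches the paper's exact pipeline is not stated in the abstract.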