Graph Representation Learning with Box Embeddings
Graphs are ubiquitous data structures, present in many machine-learning tasks such as link prediction of products and node classification of scientific papers. As gradient descent drives the training of most modern machine-learning architectures, encoding graph-structured data in a differentiable representation is essential to make use of this data. Most approaches encode graph structure in Euclidean space; however, directed edges are then non-trivial to model. The naive solution is to represent each node with separate source and target vectors, but this decouples the representation, making it harder for the model to capture information along longer paths in the graph.
In this dissertation, we propose to model graphs by representing each node as a \textit{box} (a Cartesian product of intervals), where directed edges are captured by the relative containment of one box in another. We prove that the proposed box embeddings are expressive enough to represent any \emph{directed acyclic graph}. We also perform rigorous empirical evaluations of vector, hyperbolic, and region-based geometric representations on several families of synthetic and real-world directed graphs. Extensive experimental results suggest that box containment allows transitive relationships to be modeled easily. We further propose t-Box, a variant of box embeddings that learns the temperature jointly during training. t-Box uses a learned smoothing parameter to achieve better representational capacity than vector models in low dimensions, while also avoiding the performance saturation common to other geometric models in high dimensions.
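The core idea above, reading a directed edge u → v as box(v) lying inside box(u), can be sketched in a few lines. This is an illustrative toy (the `Box` class and example coordinates are assumptions for exposition, not the dissertation's implementation):

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Box:
    """A box is a Cartesian product of intervals: one (min, max) pair per dimension."""
    intervals: List[Tuple[float, float]]

def contains(outer: Box, inner: Box) -> bool:
    """A directed edge u -> v is modeled as box(v) being contained in box(u)."""
    return all(o_lo <= i_lo and i_hi <= o_hi
               for (o_lo, o_hi), (i_lo, i_hi) in zip(outer.intervals, inner.intervals))

# Toy DAG a -> b -> c. Containment is transitive, so a -> c is implied by geometry.
a = Box([(0.0, 1.0), (0.0, 1.0)])
b = Box([(0.1, 0.9), (0.2, 0.8)])
c = Box([(0.2, 0.5), (0.3, 0.6)])

assert contains(a, b) and contains(b, c)
assert contains(a, c)      # transitive edge comes for free
assert not contains(c, a)  # the reverse edge is not represented
```

Note how the transitive closure is captured without any extra parameters; this is the inductive bias toward transitivity referred to below.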
Though promising, modeling directed graphs that contain both cycles and some element of transitivity, two properties common in real-world settings, is challenging. Box embeddings, which can be thought of as representing the graph as an intersection over some learned super-graphs, have a natural inductive bias toward modeling transitivity, but (as we prove) cannot model cycles. To address this issue, we propose binary code box embeddings, where a learned binary code selects a subset of graphs for intersection. We explore several variants, including global binary codes (amounting to a union over intersections) and per-vertex binary codes (allowing greater flexibility), as well as methods of regularization. Theoretical and empirical results show that the proposed models not only preserve a useful inductive bias toward transitivity but also have sufficient representational capacity to model arbitrary graphs, including graphs with cycles.
Lastly, we discuss the case where box embeddings are not free parameters but are produced by functions. In particular, we explore whether neural networks can map node features into the box space. This is critical in many real-world scenarios: on the one hand, graphs are sparse, and the majority of vertices have only a few connections or are completely isolated; on the other hand, there may exist rich node features, such as attributes and descriptions, that could be useful for prediction tasks. The experimental analysis highlights both the effectiveness and the limitations of multi-layer-perceptron-based encoders under different circumstances.
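A minimal sketch of such a feature-to-box encoder, under stated assumptions: a single linear layer stands in for the multi-layer perceptron, and the (center, softplus-offset) parametrization of a box is one common choice, not necessarily the one used in the dissertation:

```python
import math
import random

def softplus(x: float) -> float:
    """Smooth positive map; keeps every box side-length strictly positive."""
    return math.log1p(math.exp(x))

def linear(x, W, b):
    """A single linear layer, standing in for a full MLP in this sketch."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def encode_box(features, W, b):
    """Map a node-feature vector to a box: the first half of the output is the
    box center, the second half parameterizes per-dimension side lengths."""
    out = linear(features, W, b)
    d = len(out) // 2
    center, raw = out[:d], out[d:]
    return [(c - softplus(r), c + softplus(r)) for c, r in zip(center, raw)]

random.seed(0)
in_dim, box_dim = 4, 2
W = [[random.gauss(0, 0.5) for _ in range(in_dim)] for _ in range(2 * box_dim)]
b = [0.0] * (2 * box_dim)

box = encode_box([0.3, -1.2, 0.8, 0.1], W, b)
assert len(box) == box_dim
assert all(lo < hi for lo, hi in box)  # every interval is non-degenerate
```

Because the box is a function of the features, isolated or rarely connected vertices still receive a meaningful embedding, which is the motivation given in the paragraph above.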
Algorithms for Topology Awareness in Sensor Networks
This work deals with algorithmic and geometric challenges in wireless sensor networks (WSNs). Classical algorithm theory, in which a single processor executes one sequential program with access to the complete data of the problem at hand, does not suit the needs of WSNs. Instead, we need distributed protocols in which nodes collaboratively solve problems that are too complex for a single node. First, we analyze a localization problem in which the nodes obtain a sense of the network topology and their position in it. Computing coordinates in a global coordinate system is NP-hard in almost all relevant variants, so we present a completely new approach instead: the network builds clusters and constructs an abstract graph that closely reflects the topology of the network region. The resulting topology awareness suits the needs of some applications much better than the coordinate-based approach. In the second part, we present a novel flow problem that adds battery constraints to dynamic network flows. Given a time horizon, we seek a flow from source to sink that maximizes the total amount of delivered data. As there is no prior work on this problem, we also analyze it in a centralized setting, prove complexity results for several variants, and present approximation schemes. The third part introduces the WSN simulator Shawn. By letting the user choose among different geometric communication models and data structures for the resulting graph, Shawn can adapt to many different setups, including mobile ones. Due to its design, Shawn is much faster than comparable simulation environments.
Book of Abstracts of the Sixth SIAM Workshop on Combinatorial Scientific Computing
Book of Abstracts of CSC14, edited by Bora Uçar. The Sixth SIAM Workshop on Combinatorial Scientific Computing, CSC14, was organized at the Ecole Normale Supérieure de Lyon, France, from 21st to 23rd July 2014. This two-and-a-half-day event marked the sixth in a series that started ten years ago in San Francisco, USA. The focus of CSC14 was on combinatorial mathematics and algorithms in high-performance computing, broadly interpreted. The workshop featured three invited talks, 27 contributed talks, and eight poster presentations. The invited talks focused on two fields of research: randomized algorithms for numerical linear algebra and network analysis. The contributed talks and the posters targeted modeling, analysis, bisection, clustering, and partitioning of graphs, applied in the context of networks, sparse matrix factorizations, iterative solvers, fast multipole methods, automatic differentiation, high-performance computing, and linear programming. The workshop was held at the premises of the LIP laboratory of ENS Lyon and was generously supported by the LABEX MILYON (ANR-10-LABX-0070, Université de Lyon, within the program ''Investissements d'Avenir'' ANR-11-IDEX-0007 operated by the French National Research Agency) and by SIAM.
Computational methods in protein structure comparison and analysis of protein interaction networks
Proteins are versatile biological macromolecules that perform numerous functions in a living organism. For example, proteins catalyze chemical reactions, store and transport various small molecules, and are involved in transmitting nerve signals. As the number of completely sequenced genomes grows, we are faced with the important but daunting task of assigning function to proteins encoded by newly sequenced genomes. In this thesis we contribute to this effort by developing computational methods for which one use is to facilitate protein function assignment.
Functional annotation of a newly discovered protein can often be transferred from that of evolutionarily related proteins of known function. However, distantly related proteins can still only be detected by the most accurate protein structure alignment methods. As these methods are computationally expensive, they are combined with less accurate but fast methods to allow large-scale comparative studies. In this thesis we propose a general framework to define a family of protein structure comparison methods that reduce protein structure comparison to distance computation between high-dimensional vectors and therefore are extremely fast.
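One generic way to realize such a reduction, used here purely as an illustrative stand-in for the framework described above (the distance-histogram fingerprint and its parameters are assumptions, not the thesis's actual method): summarize each structure as a fixed-length vector, after which comparison is just a vector distance.

```python
import math
from itertools import combinations

def distance_histogram(coords, bins=8, max_dist=40.0):
    """Fingerprint a structure (e.g. C-alpha coordinates) as a normalized
    histogram of pairwise residue distances: a fixed-length vector that can
    be compared without any structural alignment."""
    hist = [0.0] * bins
    pairs = list(combinations(coords, 2))
    for p, q in pairs:
        d = math.dist(p, q)
        hist[min(int(d / max_dist * bins), bins - 1)] += 1.0
    return [h / max(len(pairs), 1) for h in hist]

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Two toy "structures": comparing them is now a cheap vector distance.
s1 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
s2 = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
assert euclidean(distance_histogram(s1), distance_histogram(s1)) == 0.0
assert 0.0 < euclidean(distance_histogram(s1), distance_histogram(s2)) < 0.5
```

The speed-up comes from the fact that the expensive part (the fingerprint) is computed once per structure, while each pairwise comparison costs only a vector distance.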
Interactions among proteins can be detected through the use of several mature experimental techniques. These interactions are routinely represented by a graph, called a protein interaction network, with nodes representing the proteins and edges representing the interactions between the proteins. In this thesis we present two computational studies that explore the connection between the topology of protein interaction networks and protein biological function.
Unfortunately, protein interaction networks do not explicitly capture an important aspect of protein interactions, their dynamic nature. In this thesis, we present an automatic method that relies on graph theoretic tools for chordal and cograph graph families to extract dynamic properties of protein interactions from the network topology.
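The chordal characterization underlying such tools can be stated via perfect elimination orderings. The following sketch checks this property by brute force over all orderings (feasible only for tiny illustrative graphs, and not the method used in the thesis):

```python
from itertools import permutations

def is_perfect_elimination_ordering(adj, order):
    """`order` is a perfect elimination ordering iff, for every vertex, its
    neighbours that appear later in the order form a clique."""
    pos = {v: i for i, v in enumerate(order)}
    for v in order:
        later = [u for u in adj[v] if pos[u] > pos[v]]
        for i, a in enumerate(later):
            if any(b not in adj[a] for b in later[i + 1:]):
                return False
    return True

def is_chordal(adj):
    """A graph is chordal iff some perfect elimination ordering exists.
    Brute force over all orderings; for exposition only."""
    return any(is_perfect_elimination_ordering(adj, o) for o in permutations(adj))

c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}              # 4-cycle: not chordal
c4_chord = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}  # chord 0-2 added
assert not is_chordal(c4)
assert is_chordal(c4_chord)
```

Efficient linear-time recognition algorithms (e.g. lexicographic BFS) exist for real protein interaction networks; the brute-force version above only makes the defining property concrete.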
An intriguing question in the analysis of biological networks is whether biological characteristics of a protein, such as essentiality, can be explained by its placement in the network. In this thesis we analyze protein interaction networks for Saccharomyces cerevisiae to identify the main topological determinant of essentiality and to provide a biological explanation for the connection between the network topology and essentiality.
Graph Neural Networks for Link Prediction with Subgraph Sketching
Many Graph Neural Networks (GNNs) perform poorly compared to simple
heuristics on Link Prediction (LP) tasks. This is due to limitations in
expressive power such as the inability to count triangles (the backbone of most
LP heuristics) and because they cannot distinguish automorphic nodes (those
having identical structural roles). Both expressiveness issues can be
alleviated by learning link (rather than node) representations and
incorporating structural features such as triangle counts. Since explicit link
representations are often prohibitively expensive, recent works resorted to
subgraph-based methods, which have achieved state-of-the-art performance for
LP, but suffer from poor efficiency due to high levels of redundancy between
subgraphs. We analyze the components of subgraph GNN (SGNN) methods for link
prediction. Based on our analysis, we propose a novel full-graph GNN called
ELPH (Efficient Link Prediction with Hashing) that passes subgraph sketches as
messages to approximate the key components of SGNNs without explicit subgraph
construction. ELPH is provably more expressive than Message Passing GNNs
(MPNNs). It outperforms existing SGNN models on many standard LP benchmarks
while being orders of magnitude faster. However, it shares the common GNN
limitation that it is only efficient when the dataset fits in GPU memory.
Accordingly, we develop a highly scalable model, called BUDDY, which uses
feature precomputation to circumvent this limitation without sacrificing
predictive performance. Our experiments show that BUDDY also outperforms SGNNs
on standard LP benchmarks while being highly scalable and faster than ELPH.
Comment: 29 pages, 19 figures, 6 appendices
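The idea of passing constant-size neighbourhood sketches instead of explicit subgraphs can be illustrated with plain MinHash. This is a generic sketch of the principle, not ELPH's actual implementation (which, per the abstract, approximates several SGNN components, and the seed count and set sizes below are made up):

```python
import random

def minhash_signature(neighbourhood, seeds):
    """One minimum per hash seed: a constant-size sketch of a node's
    neighbourhood that can be passed around like an ordinary message."""
    return [min(hash((seed, x)) for x in neighbourhood) for seed in seeds]

def jaccard_estimate(sig_u, sig_v):
    """The fraction of matching minima estimates the Jaccard similarity
    of the two underlying neighbourhoods."""
    return sum(a == b for a, b in zip(sig_u, sig_v)) / len(sig_u)

random.seed(0)
seeds = [random.getrandbits(32) for _ in range(256)]

n_u = set(range(0, 60))   # neighbourhood of node u
n_v = set(range(30, 90))  # neighbourhood of node v; true Jaccard = 30/90

est = jaccard_estimate(minhash_signature(n_u, seeds), minhash_signature(n_v, seeds))
assert abs(est - 30 / 90) < 0.15  # close to the true overlap, no subgraph built
```

Structural link features such as common-neighbour counts follow from such overlap estimates, which is how explicit subgraph construction, and its redundancy, can be avoided.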
Proceedings of the 8th Cologne-Twente Workshop on Graphs and Combinatorial Optimization
The Cologne-Twente Workshop (CTW) on Graphs and Combinatorial Optimization started off as a series of workshops organized bi-annually by either Köln University or Twente University. As its importance grew over time, it re-centered its geographical focus by including northern Italy (CTW04 in Menaggio, on Lake Como, and CTW08 in Gargnano, on Lake Garda). This year, CTW (in its eighth edition) will be staged in France for the first time: more precisely in the heart of Paris, at the Conservatoire National d'Arts et Métiers (CNAM), between 2nd and 4th June 2009, by a mixed organizing committee with members from LIX, Ecole Polytechnique and CEDRIC, CNAM.
An Integer Programming approach to Bayesian Network Structure Learning
We study the problem of learning a Bayesian network structure from data using an Integer Programming approach. We review existing approaches, and in particular some recent works that formulate the problem as an Integer Programming model. After discussing some weaknesses of the existing approaches, we propose an alternative solution based on a statistical sparsification of the search space. Results show that our approach is promising, especially for large networks.
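The search space the IP model encodes can be made concrete on a toy instance. The sketch below enumerates it by brute force: each node picks one candidate parent set subject to acyclicity, maximizing a sum of local scores. The numeric scores are made up for illustration (in practice they are decomposable scores such as BIC or BDeu computed from data), and restricting the `candidates` lists is where a statistical sparsification of the search space would act:

```python
from itertools import product

# Hypothetical local scores (node, parent_set) -> score; illustrative numbers only.
score = {
    ("A", ()): -10.0, ("A", ("B",)): -9.5, ("A", ("C",)): -9.8,
    ("B", ()): -10.0, ("B", ("A",)): -8.0, ("B", ("C",)): -9.9,
    ("C", ()): -10.0, ("C", ("A",)): -9.7, ("C", ("B",)): -8.5,
}
nodes = ["A", "B", "C"]

def is_acyclic(parents):
    """Peel off nodes whose parents have all been removed (sources first);
    if we get stuck, the parent graph contains a cycle."""
    remaining = set(parents)
    while remaining:
        removable = [v for v in remaining
                     if all(p not in remaining for p in parents[v])]
        if not removable:
            return False
        remaining.difference_update(removable)
    return True

# The IP model uses one 0/1 variable per (node, parent set) plus acyclicity
# constraints; this brute-force search walks the same space explicitly.
candidates = {v: [ps for (u, ps) in score if u == v] for v in nodes}
best = max(
    (dict(zip(nodes, choice))
     for choice in product(*(candidates[v] for v in nodes))
     if is_acyclic(dict(zip(nodes, choice)))),
    key=lambda a: sum(score[(v, a[v])] for v in nodes),
)
assert best == {"A": (), "B": ("A",), "C": ("B",)}
```

Note that the cycle A -> B -> A would pair the two best local scores for A and B, but the acyclicity constraint forces A to keep the empty parent set; this is exactly the coupling that makes the problem hard and the IP formulation attractive.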