    Approximate triangle counting with vertex coloring on the UPMEM architecture

    In recent years, the need to analyze ever-increasing amounts of data has exposed DRAM bandwidth and latency as limits on the scalability of computing systems and on the speed of memory-bound workloads. To address these limitations, a paradigm known as Processing-In-Memory (PIM) has been proposed over the past decades; it moves part of the computation closer to where the data resides. This thesis presents an implementation of an approximate triangle counting algorithm for graphs, designed specifically for the UPMEM PIM architecture, the first PIM architecture commercialized in real hardware. A series of tests highlights the capabilities of the PIM architecture and demonstrates the potential improvements it offers over traditional hardware solutions. Furthermore, this thesis presents additional improvements that could be made to the proposed algorithm to take advantage of all the capabilities of the UPMEM PIM architecture.
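
The abstract does not spell out how vertex coloring enters the algorithm. As a hedged illustration only, and not as the thesis's method, one well-known coloring-based estimator assigns each vertex a uniformly random color out of `num_colors`, keeps only edges whose endpoints share a color, counts triangles in that smaller graph, and scales the count by `num_colors**2` (a triangle survives only if all three vertices share a color, which happens with probability `1/num_colors**2`). The function names and the edge-list input format below are assumptions made for this sketch, and nothing here is UPMEM-specific.

```python
import random


def count_triangles(edges):
    """Exact triangle count for a simple undirected graph given as an edge list."""
    adj = {}
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    # Each triangle is seen once from each of its three edges.
    unique_edges = {tuple(sorted((u, v))) for u, v in edges if u != v}
    return sum(len(adj[u] & adj[v]) for u, v in unique_edges) // 3


def colored_triangle_estimate(edges, num_colors, seed=None):
    """Estimate the triangle count by keeping only monochromatic edges."""
    rng = random.Random(seed)
    color = {}
    for u, v in edges:
        for w in (u, v):
            if w not in color:
                color[w] = rng.randrange(num_colors)
    mono = [(u, v) for u, v in edges if color[u] == color[v]]
    # A triangle is monochromatic with probability 1 / num_colors**2.
    return count_triangles(mono) * num_colors ** 2
```

With `num_colors=1` every edge is kept and the estimate equals the exact count; larger values shrink the graph that must be processed at the cost of higher variance.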

    Triangle Estimation Using Tripartite Independent Set Queries

    Estimating the number of triangles in a graph is one of the most fundamental problems in sublinear algorithms. In this work, we provide an approximate triangle counting algorithm that uses only polylogarithmically many queries when the number of triangles on any edge in the graph is polylogarithmically bounded. Our query oracle, Tripartite Independent Set (TIS), takes three disjoint sets of vertices A, B and C as input and answers whether there exists a triangle having one endpoint in each of these three sets. Our query model belongs to the general class of group queries (Ron and Tsur, ACM ToCT, 2016; Dell and Lapinskas, STOC 2018) and in particular is inspired by the Bipartite Independent Set (BIS) query oracle of Beame et al. (ITCS 2018). We extend the algorithmic framework of Beame et al., with TIS replacing BIS, to triangle counting, using ideas from color coding due to Alon et al. (J. ACM, 1995) and a concentration inequality for sums of random variables with bounded dependency (Janson, Rand. Struct. Alg., 2004).
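
To make the TIS query concrete, here is a minimal brute-force sketch of the oracle as the abstract defines it: given three pairwise disjoint vertex sets A, B and C, it reports whether some triangle has one vertex in each set. The name `make_tis_oracle` and the edge-list representation are assumptions for illustration; in the query model the oracle is a black box, and the algorithm never sees the graph directly.

```python
def make_tis_oracle(edges):
    """Build a brute-force TIS oracle for an undirected graph given as an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def tis(a, b, c):
        """Return True if a triangle has one vertex in each of a, b and c."""
        a, b, c = set(a), set(b), set(c)
        if a & b or a & c or b & c:
            raise ValueError("A, B and C must be pairwise disjoint")
        for u in a:
            neighbors_u = adj.get(u, set())
            for v in neighbors_u & b:
                # A common neighbor of u and v inside c closes the triangle.
                if neighbors_u & adj.get(v, set()) & c:
                    return True
        return False

    return tis


# Triangle 1-2-3 with a pendant vertex 4 attached to 3.
tis = make_tis_oracle([(1, 2), (2, 3), (1, 3), (3, 4)])
print(tis({1}, {2}, {3}))  # True
print(tis({1}, {2}, {4}))  # False
```

An estimation algorithm in this model would call `tis` on carefully chosen random sets; the sketch shows only what a single query returns.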

    Efficient Triangle Counting in Large Graphs via Degree-based Vertex Partitioning

    The number of triangles is a computationally expensive graph statistic that is frequently used in complex network analysis (e.g., transitivity ratio), in various random graph models (e.g., exponential random graph model), and in important real-world applications such as spam detection, uncovering the hidden thematic structure of the Web, and link recommendation. Counting triangles in graphs with millions or billions of edges requires algorithms that run fast, use a small amount of space, provide accurate estimates of the number of triangles, and preferably are parallelizable. In this paper we present an efficient triangle counting algorithm that can be adapted to the semistreaming model. The key idea of our algorithm is to combine the sampling algorithm of Tsourakakis et al. with the partitioning of the vertex set into a high-degree and a low-degree subset, as in the work of Alon, Yuster and Zwick, treating each set appropriately. We obtain a running time of $O\left(m + \frac{m^{3/2} \Delta \log{n}}{t \epsilon^2}\right)$ and an $\epsilon$ approximation (multiplicative error), where $n$ is the number of vertices, $m$ the number of edges and $\Delta$ the maximum number of triangles in which an edge is contained. Furthermore, we show how this algorithm can be adapted to the semistreaming model with space usage $O\left(m^{1/2}\log{n} + \frac{m^{3/2} \Delta \log{n}}{t \epsilon^2}\right)$ and a constant number of passes (three) over the graph stream. We apply our methods to various networks with several million edges and obtain excellent results. Finally, we propose a random projection based method for triangle counting and provide a sufficient condition to obtain an estimate with low variance. Comment: 12 pages. To appear in the 7th Workshop on Algorithms and Models for the Web Graph (WAW 2010).
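
The sampling ingredient can be illustrated with a simple edge-sparsification estimator, assumed here to be the kind of sampling the abstract refers to: keep each edge independently with probability `p`, count triangles in the sparsified graph, and divide by `p**3`, since a triangle survives only if all three of its edges are kept. This sketch omits the high-degree/low-degree vertex partitioning entirely; the function names and the edge-list input are assumptions for illustration.

```python
import random


def count_triangles(edges):
    """Exact triangle count for a simple undirected graph given as an edge list."""
    adj = {}
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    # Each triangle is seen once from each of its three edges.
    unique_edges = {tuple(sorted((u, v))) for u, v in edges if u != v}
    return sum(len(adj[u] & adj[v]) for u, v in unique_edges) // 3


def sampled_triangle_estimate(edges, p, seed=None):
    """Estimate the triangle count from a random edge sample (assumes no duplicate edges)."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    rng = random.Random(seed)
    kept = [e for e in edges if rng.random() < p]
    # Each triangle survives sampling with probability p**3.
    return count_triangles(kept) / p ** 3
```

Setting `p=1.0` keeps every edge and returns the exact count; smaller `p` trades accuracy for a smaller graph to process.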

    Parallel Algorithms for Small Subgraph Counting

    Subgraph counting is a fundamental problem in analyzing massive graphs, often studied in the context of social and complex networks. There is a rich literature on designing efficient, accurate, and scalable algorithms for this problem. In this work, we tackle this challenge and design several new algorithms for subgraph counting in the Massively Parallel Computation (MPC) model: Given a graph $G$ over $n$ vertices, $m$ edges and $T$ triangles, our first main result is an algorithm that, with high probability, outputs a $(1+\varepsilon)$-approximation to $T$, with optimal round and space complexity provided any $S \geq \max{(\sqrt m, n^2/m)}$ space per machine, assuming $T=\Omega(\sqrt{m/n})$. Our second main result is an $\tilde{O}_{\delta}(\log \log n)$-rounds algorithm for exactly counting the number of triangles, parametrized by the arboricity $\alpha$ of the input graph. The space per machine is $O(n^{\delta})$ for any constant $\delta$, and the total space is $O(m\alpha)$, which matches the time complexity of (combinatorial) triangle counting in the sequential model. We also prove that this result can be extended to exactly counting $k$-cliques for any constant $k$, with the same round complexity and total space $O(m\alpha^{k-2})$. Alternatively, allowing $O(\alpha^2)$ space per machine, the total space requirement reduces to $O(n\alpha^2)$. Finally, we prove that a recent result of Bera, Pashanasangi and Seshadhri (ITCS 2020) for exactly counting all subgraphs of size at most $5$, can be implemented in the MPC model in $\tilde{O}_{\delta}(\sqrt{\log n})$ rounds, $O(n^{\delta})$ space per machine and $O(m\alpha^3)$ total space. Therefore, this result also exhibits the phenomenon that a time bound in the sequential model translates to a space bound in the MPC model.
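
For context on the sequential baseline the abstract compares against, the following is a minimal sketch of a standard combinatorial exact counter: orient every edge from its lower-degree endpoint to its higher-degree endpoint (ties broken by vertex id) and intersect out-neighborhoods. It is a single-machine illustration, not the MPC algorithm of the paper; the function name and edge-list input are assumptions, and vertex ids are assumed to be mutually comparable (e.g., integers).

```python
def count_triangles_oriented(edges):
    """Exact triangle count using a degree-ordered edge orientation."""
    adj = {}
    for u, v in edges:
        if u != v:  # ignore self-loops
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    # Total order on vertices: by degree, ties broken by vertex id.
    rank = {v: (len(nbrs), v) for v, nbrs in adj.items()}
    # Out-neighbors are the neighbors that come later in the order.
    out = {v: {w for w in nbrs if rank[w] > rank[v]} for v, nbrs in adj.items()}
    # Each triangle a < b < c is found exactly once, at the oriented edge (a, b).
    return sum(len(out[u] & out[v]) for u in out for v in out[u])
```

Because each triangle is discovered only from its lowest-ranked edge, no division by three is needed, and high-degree vertices end up with small out-neighborhoods.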

    Massively Parallel Algorithms for Small Subgraph Counting

    Comparing MapReduce and pipeline implementations for counting triangles

    The Divide & Conquer paradigm offers a general method for having processors act on their own data while being scheduled in a parallel fashion. MapReduce is a programming model that follows this paradigm and allows efficient solutions to be defined by decomposing a problem into steps on subsets of the input data and combining the results of each step to produce the final results. Although it is used to implement a wide variety of computational problems, MapReduce performance can suffer whenever the replication factor grows or the size of the input exceeds the resources available at each processor. In this paper we show an alternative approach to implementing the Divide & Conquer paradigm, named pipeline. The main features of pipeline are illustrated on a parallel implementation of the well-known problem of counting triangles in a graph. This problem is especially interesting when the input graph either does not fit in memory or is dynamically generated. To evaluate the properties of pipeline, a dynamic pipeline of processes and an ad-hoc version of MapReduce are implemented in the language Go, exploiting its ability to deal with channels and spawned processes. An empirical evaluation is conducted on graphs of different sizes and densities. The observed results suggest that pipeline allows an efficient solution to the problem of counting triangles in a graph, particularly in dense and large graphs, drastically reducing the execution time with respect to the MapReduce implementation. Peer reviewed. Postprint (published version).