3,649 research outputs found

    Enumerating Maximal Bicliques from a Large Graph using MapReduce

    We consider the enumeration of maximal bipartite cliques (bicliques) from a large graph, a task central to many practical data mining problems in social network analysis and bioinformatics. We present novel parallel algorithms for the MapReduce platform, and an experimental evaluation using Hadoop MapReduce. Our algorithm clusters the input graph into smaller subgraphs and then processes the different subgraphs in parallel. Two ideas enable it to scale to large graphs: (1) redundant work between different subgraph explorations is minimized through careful pruning of the search space, and (2) the load on different reducers is balanced through an appropriate total order among the vertices. Our evaluation shows that the algorithm scales to large graphs with millions of edges and tens of millions of maximal bicliques. To our knowledge, this is the first work on maximal biclique enumeration for graphs of this scale. Comment: A preliminary version of the paper was accepted at the Proceedings of the 3rd IEEE International Congress on Big Data 201
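To make the object being enumerated concrete, the following is a minimal brute-force sketch of maximal biclique enumeration on a tiny undirected graph. It is not the MapReduce algorithm described in the abstract: there is no clustering, pruning, or vertex ordering, and the running time is exponential in the number of vertices. The function name `maximal_bicliques` and the edge-list input format are assumptions made for illustration.

```python
from itertools import combinations


def maximal_bicliques(edges):
    """Enumerate maximal bicliques of a small undirected graph by brute force.

    edges: iterable of (u, v) pairs. Self-loops are ignored.
    Returns a set of bicliques, each an unordered pair of vertex sets
    (a frozenset holding two frozensets).
    """
    # Build adjacency sets.
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    vertices = sorted(adj)

    def common_neighbors(group):
        # Vertices adjacent to every member of a non-empty group.
        members = iter(group)
        result = set(adj[next(members)])
        for v in members:
            result &= adj[v]
        return result

    found = set()
    for size in range(1, len(vertices) + 1):
        for subset in combinations(vertices, size):
            right = common_neighbors(subset)
            if not right:
                continue  # no vertex is adjacent to the whole subset
            # Closing the subset under "common neighbors of the common
            # neighbors" yields a pair that cannot be extended on either side.
            left = common_neighbors(right)
            found.add(frozenset((frozenset(left), frozenset(right))))
    return found


if __name__ == "__main__":
    sample = [(1, 2), (1, 3), (4, 2), (4, 3), (4, 5)]
    for biclique in maximal_bicliques(sample):
        print([sorted(side) for side in biclique])
```

Because every vertex subset is examined, this sketch is only usable on graphs with a handful of vertices; it is meant to illustrate the definition, not the scalable method.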

    Optimization of Real-World MapReduce Applications With Flame-MR: Practical Use Cases

    Apache Hadoop is a widely used MapReduce framework for storing and processing large amounts of data. However, it has performance issues that hinder its use in many practical cases. Although existing alternatives such as Spark or Hama can outperform Hadoop, they require rewriting the source code of the applications because of API incompatibilities. This paper studies the use of Flame-MR, an in-memory processing architecture for MapReduce applications, to improve the performance of real-world use cases transparently while keeping application compatibility. Flame-MR adapts to the characteristics of the workloads, efficiently managing custom data formats and iterative computations while also reducing workload imbalance. The experimental evaluation, conducted on high performance clusters and the Microsoft Azure cloud, shows that Flame-MR clearly outperforms Hadoop. In most cases, Flame-MR reduces execution times by more than half

    An efficient implementation of the Bellman-Ford algorithm for Kepler GPU architectures

    Finding the shortest paths from a single source to all other vertices is a common problem in graph analysis. The Bellman-Ford algorithm solves this single-source shortest path (SSSP) problem and lends itself well to parallelization on many-core architectures. Nevertheless, its high degree of parallelism comes at the cost of low work efficiency: compared to similar algorithms in the literature (e.g., Dijkstra's), it involves much more redundant work and a consequent waste of power. This article presents a parallel implementation of the Bellman-Ford algorithm that exploits the architectural characteristics of recent GPU architectures (i.e., NVIDIA Kepler, Maxwell) to improve both performance and work efficiency. The article presents several optimizations of the implementation, directed at both the algorithm and the architecture. The experimental results show that the proposed implementation achieves an average speedup of 5x over the most efficient existing parallel implementations for SSSP, that it works on graphs where those implementations cannot work or are inefficient (e.g., graphs with negative weight edges, sparse graphs), and that it significantly reduces the redundant work caused by the parallelization process
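For reference, a minimal sequential sketch of the textbook Bellman-Ford algorithm follows. It is not the GPU implementation the abstract describes; it only illustrates the edge-relaxation rounds that such an implementation parallelizes, and why the work is redundant (every edge is re-examined in every round). The function name `bellman_ford` and the `(u, v, weight)` edge format are assumptions made for illustration.

```python
from math import inf


def bellman_ford(num_vertices, edges, source):
    """Single-source shortest paths with possibly negative edge weights.

    edges: list of (u, v, weight) for directed edges u -> v,
           with vertices numbered 0 .. num_vertices - 1.
    Returns a list of distances from source (inf if unreachable).
    Raises ValueError if a negative cycle is reachable from source.
    """
    dist = [inf] * num_vertices
    dist[source] = 0

    # At most num_vertices - 1 rounds; each round relaxes every edge.
    for _ in range(num_vertices - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] != inf and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break  # distances are stable, so later rounds would do nothing

    # One extra pass: any further improvement implies a negative cycle.
    for u, v, w in edges:
        if dist[u] != inf and dist[u] + w < dist[v]:
            raise ValueError("graph contains a negative cycle reachable from source")
    return dist


if __name__ == "__main__":
    sample_edges = [(0, 1, 4), (0, 2, 5), (2, 1, -3), (1, 3, 2)]
    print(bellman_ford(4, sample_edges, 0))
```

Within one round, each edge relaxation reads and conditionally lowers a distance, so the relaxations can be issued concurrently provided the update of each distance is an atomic minimum. That property is what makes the algorithm attractive for many-core hardware despite the repeated work.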

    Hierarchical structure-and-motion recovery from uncalibrated images

    This paper addresses the structure-and-motion problem, which requires finding camera motion and 3D structure from point matches. A new pipeline, dubbed Samantha, is presented that departs from the prevailing sequential paradigm and instead embraces a hierarchical approach. This method has several advantages, such as a provably lower computational complexity, which is necessary to achieve true scalability, and better error containment, leading to more stability and less drift. Moreover, a practical autocalibration procedure makes it possible to process images without ancillary information. Experiments with real data assess the accuracy and the computational efficiency of the method. Comment: Accepted for publication in CVI

    A Novel Approach to Load Balancing in P2P Overlay Networks for Edge Systems

    Edge computing aims to address some limitations of cloud computing by bringing computation towards the edge of the system, i.e., closer to the client. A wide range of devices can be integrated into future edge computing platforms, from local datacenters and ISP points of presence to 5G towers and even user devices such as smartphones, laptops, and IoT devices. For all of these devices to communicate fruitfully, we need to build systems that enable seamless interaction and cooperation among them. However, creating and maintaining these systems is not trivial, since there are numerous types of devices with different capacities. This resource heterogeneity has to be taken into account so that different types of machines contribute differently to the management of the distributed infrastructure and the operation of the overall system becomes more efficient. In this work, we addressed the challenges identified above by exploring unstructured overlay networks, which have been shown to be manageable efficiently and in a fully decentralized way while being highly robust to failures. To that end, we devised a solution that adapts the number of neighbors of each device (i.e., how many other devices that device knows) according to the capacity of that device and the distribution of capacities of the other devices in the network, so as to ensure that the load is fairly distributed among them and, as a consequence, to improve the operation of other services atop the unstructured overlay network, for instance by reducing the latencies experienced when broadcasting information. This solution can be easily integrated into most existing peer-to-peer distributed systems, requiring just a slight adaptation to their membership protocol. 
To show the correctness and benefits of our proposal, we evaluated it against state-of-the-art decentralized solutions for managing unstructured overlay networks, combining both simulation (to observe the performance of the solution at large scale) and prototype deployments in realistic distributed infrastructures. Edge computing aims to address some limitations of cloud computing by bringing computation closer to the client. There is a huge variety of devices that can be integrated into future edge computing platforms, from local data centers and ISP points of presence to 5G towers and even client devices such as smartphones, laptops, and IoT devices. For all these devices to communicate fruitfully with one another, we need to build systems that enable effective interaction and cooperation among them. However, creating and maintaining these systems is not trivial, since there are several types of devices with different capacities. This resource heterogeneity must be taken into account so that different types of machines contribute differently to the management of the distributed infrastructure and the operation of the system becomes more efficient. In this work, we addressed the challenges identified above by exploring unstructured overlay networks, which have been shown to be manageable efficiently and in a fully decentralized way while being highly resilient to failures. To that end, we devised a solution that adapts the number of neighbors of each device (that is, how many other devices that device knows) according to its capacity and the capacity of the other devices in the network, so as to ensure that the load is distributed proportionally among them and, as a consequence, to reduce the latencies experienced by these devices. 
This solution can be easily integrated into an existing peer-to-peer distributed system, requiring only a slight adaptation to its membership protocol. We evaluated our solution by comparing it with other state-of-the-art decentralized solutions, combining simulation (to observe the performance of the solution
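
The idea of sizing each device's neighbor set according to its capacity relative to the rest of the network can be sketched as follows. This is a hypothetical illustration, not the protocol proposed in the work above: the proportional rule, the function name `target_degree`, the parameter `sampled_capacities` (capacities a node is assumed to have learned from peers, for example through its membership protocol), and the default bounds are all assumptions.

```python
def target_degree(capacity, sampled_capacities,
                  base_degree=8, min_degree=2, max_degree=32):
    """Pick how many neighbors a node should keep (hypothetical rule).

    capacity: this node's own capacity, in any consistent unit.
    sampled_capacities: non-empty list of capacities observed in the
        network (assumed to be available to the node).
    base_degree: neighbor count given to a node of average capacity.
    The result is clamped to the range [min_degree, max_degree].
    """
    mean_capacity = sum(sampled_capacities) / len(sampled_capacities)
    # Scale the baseline by how this node compares to the observed average.
    raw = base_degree * capacity / mean_capacity
    return max(min_degree, min(max_degree, round(raw)))


if __name__ == "__main__":
    observed = [10, 20, 30]
    for cap in observed:
        print(cap, target_degree(cap, observed))
```

Under this rule a node with above-average capacity keeps more neighbors and so relays a larger share of the traffic, while the clamp keeps weak nodes connected and prevents strong nodes from becoming hubs.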