Enumerating Maximal Bicliques from a Large Graph using MapReduce
We consider the enumeration of maximal bipartite cliques (bicliques) from a
large graph, a task central to many practical data mining problems in social
network analysis and bioinformatics. We present novel parallel algorithms for
the MapReduce platform, and an experimental evaluation using Hadoop MapReduce.
Our algorithm is based on clustering the input graph into smaller sized
subgraphs, followed by processing different subgraphs in parallel. Our
algorithm uses two ideas that enable it to scale to large graphs: (1) the
redundancy in work between different subgraph explorations is minimized through
a careful pruning of the search space, and (2) the load on different reducers
is balanced through the use of an appropriate total order among the vertices.
Our evaluation shows that the algorithm scales to large graphs with millions of
edges and tens of millions of maximal bicliques. To our knowledge, this is
the first work on maximal biclique enumeration for graphs of this scale.
Comment: A preliminary version of the paper was accepted at the Proceedings of
the 3rd IEEE International Congress on Big Data 201
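To make the object being enumerated concrete, the following is a minimal brute-force sketch of maximal biclique enumeration on a tiny undirected graph. It is not the parallel MapReduce algorithm of the paper; it only illustrates the definition, and it takes exponential time. The function names and the adjacency-dictionary input format are assumptions made for this illustration. It relies on the fact that, in a graph without self-loops, every maximal biclique has the form (N(N(A)), N(A)) for some non-empty vertex set A, where N(.) denotes the set of common neighbors.

```python
from itertools import combinations


def common_neighbors(adj, vertices):
    """Return the vertices adjacent to every vertex in `vertices` (non-empty)."""
    it = iter(vertices)
    result = set(adj[next(it)])
    for v in it:
        result &= adj[v]
    return result


def maximal_bicliques(adj):
    """Enumerate maximal bicliques of a small undirected graph by brute force.

    `adj` maps each vertex to the set of its neighbors; it is assumed to be
    symmetric and free of self-loops. Each biclique is returned as a
    frozenset holding its two sides (themselves frozensets), so the
    orientation of the two sides does not matter.
    """
    found = set()
    nodes = sorted(adj)
    for size in range(1, len(nodes) + 1):
        for subset in combinations(nodes, size):
            right = common_neighbors(adj, subset)
            if not right:
                continue  # no vertex is adjacent to the whole subset
            # Closing the left side makes the pair maximal.
            left = common_neighbors(adj, right)
            found.add(frozenset((frozenset(left), frozenset(right))))
    return found


if __name__ == "__main__":
    path = {0: {1}, 1: {0, 2}, 2: {1}}
    for biclique in maximal_bicliques(path):
        print([sorted(side) for side in biclique])
```

On the three-vertex path 0-1-2 the only maximal biclique has sides {1} and {0, 2}. Even this toy version shows why redundant exploration is the main cost: many different starting subsets lead to the same biclique.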
Optimization of Real-World MapReduce Applications With Flame-MR: Practical Use Cases
Apache Hadoop is a widely used MapReduce framework for storing and processing large amounts of data. However, it presents some performance issues that hinder its use in many practical cases. Although existing alternatives such as Spark or Hama can outperform Hadoop, they require rewriting the source code of the applications because of API incompatibilities. This paper studies the use of Flame-MR, an in-memory processing architecture for MapReduce applications, to improve the performance of real-world use cases transparently while keeping application compatibility. Flame-MR adapts to the characteristics of the workloads, efficiently managing custom data formats and iterative computations while also reducing workload imbalance. The experimental evaluation, conducted on high-performance clusters and the Microsoft Azure cloud, shows that Flame-MR clearly outperforms Hadoop. In most cases, Flame-MR reduces execution times by more than half.
An efficient implementation of the Bellman-Ford algorithm for Kepler GPU architectures
Finding the shortest paths from a single source to all other vertices is a common problem in graph analysis. The Bellman-Ford algorithm solves this single-source shortest path (SSSP) problem and lends itself well to parallelization on many-core architectures. Nevertheless, its high degree of parallelism comes at the cost of low work efficiency: compared to similar algorithms in the literature (e.g., Dijkstra's), it performs much more redundant work and consequently wastes power. This article presents a parallel implementation of the Bellman-Ford algorithm that exploits the architectural characteristics of recent GPU architectures (i.e., NVIDIA Kepler, Maxwell) to improve both performance and work efficiency. The article presents several optimizations of the implementation, targeting both the algorithm and the architecture. The experimental results show that the proposed implementation provides an average speedup of 5x over the most efficient existing parallel implementations for SSSP, that it works on graphs where those implementations cannot work or are inefficient (e.g., graphs with negative-weight edges, sparse graphs), and that it noticeably reduces the redundant work caused by the parallelization process.
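For reference, here is a minimal sequential sketch of the textbook Bellman-Ford algorithm. It is not the GPU implementation the article proposes, and the function name and edge-list format are assumptions made for this illustration. It shows where the redundant work mentioned in the abstract comes from: every round relaxes every edge, whether or not the edge's source distance changed. Because each relaxation in a round is independent of the edge order, the inner loop is also the natural unit to parallelize.

```python
from math import inf


def bellman_ford(num_vertices, edges, source):
    """Single-source shortest paths; supports negative edge weights.

    `edges` is a list of (u, v, weight) triples for directed edges between
    vertices numbered 0..num_vertices-1. Returns the list of distances from
    `source` (inf for unreachable vertices). Raises ValueError if a
    negative-weight cycle is reachable from the source.
    """
    dist = [inf] * num_vertices
    dist[source] = 0

    # At most |V| - 1 rounds are needed; stop early once nothing changes.
    for _ in range(num_vertices - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] != inf and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break

    # One extra pass: any further improvement implies a negative cycle.
    for u, v, w in edges:
        if dist[u] != inf and dist[u] + w < dist[v]:
            raise ValueError("negative-weight cycle reachable from the source")
    return dist


if __name__ == "__main__":
    edges = [(0, 1, 4), (0, 2, 5), (1, 2, -3), (2, 3, 2)]
    print(bellman_ford(4, edges, 0))
```

The early exit when a round changes nothing is one simple way to cut redundant rounds; it does not remove the redundant edge relaxations within a round.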
Hierarchical structure-and-motion recovery from uncalibrated images
This paper addresses the structure-and-motion problem, which requires finding
camera motion and 3D structure from point matches. A new pipeline, dubbed
Samantha, is presented that departs from the prevailing sequential paradigm
and instead embraces a hierarchical approach. This method has several
advantages, such as a provably lower computational complexity, which is necessary
to achieve true scalability, and better error containment, leading to more
stability and less drift. Moreover, a practical autocalibration procedure
makes it possible to process images without ancillary information. Experiments with real
data assess the accuracy and the computational efficiency of the method.
Comment: Accepted for publication in CVI
A Novel Approach to Load Balancing in P2P Overlay Networks for Edge Systems
Edge computing aims at addressing some limitations of cloud computing by bringing
computation towards the edge of the system, i.e., closer to the client. There is a panoply
of devices that can be integrated into future edge computing platforms, from local datacenters
and ISP points of presence to 5G towers, and even multiple user devices such as
smartphones, laptops, and IoT devices. For all of these devices to communicate fruitfully,
we need to build systems that enable the seamless interaction and cooperation among
these diverse devices. However, creating and maintaining these systems is not trivial
since there are numerous types of devices with different capacities. This resource heterogeneity
has to be taken into account so that different types of machines contribute to the
management of the distributed infrastructure differently, and the operation of the overall
system becomes more efficient.
In this work, we addressed the challenges identified above by exploring unstructured
overlay networks, which have been shown to be efficiently manageable in a
fully decentralized way while being highly robust to failures. To that end, we devised
a solution that adapts the number of neighbors of each device (i.e., how many other devices
that device knows) according to the capacity of that device and the distribution
of capacities of the other devices in the network, so as to ensure that the load is fairly distributed
among them and, as a consequence, to improve the operation of other services
atop the unstructured overlay network, for instance by reducing the latencies experienced
when broadcasting information. This solution can be easily integrated into most existing
peer-to-peer distributed systems, requiring just a slight adaptation to their membership
protocol. To show the correctness and benefits of our proposal, we evaluated it by comparing
it with state-of-the-art decentralized solutions to manage unstructured overlay
networks, combining both simulation (to observe the performance of the solution at large
scale) and prototype deployments in realistic distributed infrastructures.
Edge computing aims to address some limitations of cloud computing by
bringing computation closer to the client. There is an enormous variety of devices
that can be integrated into future edge computing platforms, from
local data centers and ISP points of presence to 5G towers and even
client devices such as smartphones, laptops, and IoT devices. For all of these devices to communicate
fruitfully with one another, we need to build systems that enable
effective interaction and cooperation among them. However, creating and maintaining these systems is not
trivial, since there are several types of devices with different capacities. This
resource heterogeneity must be taken into account so that different types
of machines contribute to the management of the distributed infrastructure in
distinct ways and the operation of the system becomes more efficient.
In this work, we tackled the challenges identified above by exploring unstructured
overlay networks, which have been shown to be manageable efficiently
and in a fully decentralized way while being highly resilient to failures. To that end, we devised
a solution that adapts the number of neighbors of each device (that is, how many other
devices that device knows) according to its capacity and the capacity
of the other devices in the network, so as to ensure that the load is distributed
proportionally among them and, as a consequence, to reduce the latencies experienced by
those devices. This solution can be easily integrated into an existing peer-to-peer
distributed system, requiring only a slight adaptation to its membership protocol.
We evaluated our solution by comparing it with other state-of-the-art decentralized
solutions, combining simulation (to observe the performance of the solution
- …