D4M 3.0: Extended Database and Language Capabilities
The D4M tool was developed to address many of today's data needs. This tool
is used by hundreds of researchers to perform complex analytics on unstructured
data. Over the past few years, the D4M toolbox has evolved to support
connectivity with a variety of new database engines, including SciDB.
D4M-Graphulo provides the ability to do graph analytics in the Apache Accumulo
database. Finally, an implementation using the Julia programming language is
also now available. In this article, we describe some of our latest additions
to the D4M toolbox and our upcoming D4M 3.0 release. We show through
benchmarking and scaling results that we can achieve fast SciDB ingest using
the D4M-SciDB connector, that using Graphulo can enable graph algorithms on
scales that can be memory limited, and that the Julia implementation of D4M
matches or exceeds the performance of the existing MATLAB(R)
implementation.
Comment: IEEE HPEC 201
Distributed Triangle Counting in the Graphulo Matrix Math Library
Triangle counting is a key algorithm for large graph analysis. The Graphulo
library provides a framework for implementing graph algorithms on the Apache
Accumulo distributed database. In this work we adapt two algorithms for
counting triangles, one that uses the adjacency matrix and another that also
uses the incidence matrix, to the Graphulo library for server-side processing
inside Accumulo. Cloud-based experiments show a similar performance profile for
these different approaches on the family of power law Graph500 graphs, for
which data skew increasingly bottlenecks. These results motivate the design of
skew-aware hybrid algorithms that we propose for future work.
Comment: Honorable mention in the 2017 IEEE HPEC's Graph Challeng
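As a minimal illustration of the adjacency-matrix approach mentioned above, the sketch below counts triangles in a small undirected graph in plain Python. It is not the Graphulo implementation, which runs server-side inside Accumulo. It relies on the standard identity that, for a symmetric 0/1 adjacency matrix A with a zero diagonal, the triangle count equals the sum of the entries of (A*A) masked by A, divided by 6. The function name and the example graph are hypothetical.

```python
def count_triangles(adj):
    """Count triangles in an undirected simple graph.

    adj is a symmetric 0/1 adjacency matrix (list of lists) with a zero diagonal.
    """
    n = len(adj)
    total = 0
    for i in range(n):
        for j in range(n):
            if adj[i][j]:
                # (A*A)[i][j] is the number of common neighbours of i and j
                total += sum(adj[i][k] * adj[k][j] for k in range(n))
    # Each triangle is counted once per ordered edge: 3 edges x 2 directions
    return total // 6


# Hypothetical example: edges 0-1, 0-2, 1-2, 2-3 (exactly one triangle)
example = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
triangles = count_triangles(example)
```

The dense triple loop is only meant to make the identity visible; a distributed version would work on sparse rows and would be sensitive to the degree skew the abstract describes.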
Database Operations in D4M.jl
Each step in the data analytics pipeline is important, including database
ingest and query. The D4M-Accumulo database connector has allowed analysts to
quickly and easily ingest to and query from Apache Accumulo using MATLAB(R)/GNU
Octave syntax. D4M.jl, a Julia implementation of D4M, provides much of the
functionality of the original D4M implementation to the Julia community. In
this work, we extend D4M.jl to include many of the same database capabilities
that the MATLAB(R)/GNU Octave implementation provides. Here we will describe
the D4M.jl database connector, demonstrate how it can be used, and show that its
performance is comparable to or better than that of the original implementation in
MATLAB(R)/GNU Octave.
Comment: IEEE HPEC 2018. arXiv admin note: text overlap with arXiv:1708.0293
SIAM Data Mining Brings It to Annual Meeting
The Data Mining Activity Group is one of SIAM's most vibrant and dynamic activity groups. To better share our enthusiasm for data mining with the broader SIAM community, our activity group organized six minisymposia at the 2016 Annual Meeting. These minisymposia included 48 talks organized by 11 SIAM members on:
- GraphBLAS (Aydın Buluç)
- Algorithms and statistical methods for noisy network analysis (Sanjukta Bhowmick & Ben Miller)
- Inferring networks from non-network data (Rajmonda Caceres, Ivan Brugere & Tanya Y. Berger-Wolf)
- Visual analytics (Jordan Crouser)
- Mining in graph data (Jennifer Webster, Mahantesh Halappanavar & Emilie Hogan)
- Scientific computing and big data (Vijay Gadepally)
These minisymposia were well received by the broader SIAM community, and below are some of the key highlights.
On Large-Scale Graph Generation with Validation of Diverse Triangle Statistics at Edges and Vertices
Researchers developing implementations of distributed graph analytic
algorithms require graph generators that yield graphs sharing the challenging
characteristics of real-world graphs (small-world, scale-free, heavy-tailed
degree distribution) with efficiently calculable ground-truth solutions to the
desired output. The reproducibility of current generators used in benchmarking is
somewhat lacking in this respect due to their randomness: the output of a
desired graph analytic can only be compared to expected values and not exact
ground truth. Nonstochastic Kronecker product graphs meet these design criteria
for several graph analytics. Here we show that many flavors of triangle
participation can be cheaply calculated while generating a Kronecker product
graph. Given two medium-sized scale-free graphs with adjacency matrices $A$ and
$B$, their Kronecker product graph has adjacency matrix $C = A \otimes B$. Such
graphs are highly compressible: $|\mathcal{E}|$ edges are represented in
$\mathcal{O}(|\mathcal{E}|^{1/2})$ memory and can be built in a distributed setting from
small data structures, making them easy to share in compressed form. Many
interesting graph calculations have worst-case complexity bounds
$\mathcal{O}(|\mathcal{E}|^p)$ and often these are reduced to
$\mathcal{O}(|\mathcal{E}|^{p/2})$
for Kronecker product graphs, when a Kronecker formula can be derived yielding
the sought calculation on $C$ in terms of related calculations on $A$ and $B$.
We focus on deriving formulas for triangle participation at vertices, $t_C$, a
vector storing the number of triangles that every vertex is involved
in, and triangle participation at edges, $\Delta_C$, a sparse matrix storing
the number of triangles at every edge.
Comment: 10 pages, 7 figures, IEEE IPDPS Graph Algorithms Building Block
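To make the idea of a Kronecker formula concrete, the sketch below checks two such identities on a tiny example in plain Python. It assumes both factor graphs are undirected and have no self-loops; under that assumption the mixed-product property of the Kronecker product gives per-vertex triangle counts of C equal to 2 * (t_A kron t_B) and per-edge counts equal to Delta_A kron Delta_B, where A and B are the factor adjacency matrices and C is their Kronecker product. These are textbook identities used here for illustration, not necessarily the exact formulas derived in the paper, and all function and variable names are hypothetical.

```python
def matmul(x, y):
    """Dense matrix product of two matrices given as lists of lists."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def kron(x, y):
    """Kronecker product of two matrices given as lists of lists."""
    return [[xv * yv for xv in xrow for yv in yrow]
            for xrow in x for yrow in y]


def vertex_triangles(adj):
    """Triangles at each vertex: diag(A^3) / 2 for a simple undirected graph."""
    cube = matmul(matmul(adj, adj), adj)
    return [cube[i][i] // 2 for i in range(len(adj))]


def edge_triangles(adj):
    """Triangles at each edge: A^2 masked elementwise by A."""
    sq = matmul(adj, adj)
    n = len(adj)
    return [[sq[i][j] * adj[i][j] for j in range(n)] for i in range(n)]


# Hypothetical small factors: a triangle, and a triangle with a pendant vertex
A = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
B = [[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]]
C = kron(A, B)

t_A, t_B = vertex_triangles(A), vertex_triangles(B)
# Vertex formula, computed from the factors only
t_C_formula = [2 * a * b for a in t_A for b in t_B]
# Edge formula, computed from the factors only
delta_C_formula = kron(edge_triangles(A), edge_triangles(B))

# Compare against direct computation on the full product graph
assert t_C_formula == vertex_triangles(C)
assert delta_C_formula == edge_triangles(C)
```

The formula side touches only the small factors, which is why ground truth for the large product graph can be obtained at a fraction of the cost of computing it directly.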
High-Performance and Power-Aware Graph Processing on GPUs
Graphs are a common representation in many problem domains, including engineering, finance, medicine, and scientific applications. Different problems map to very large graphs, often involving millions of vertices. Even though very efficient sequential implementations of graph algorithms exist, they become impractical when applied to such very large graphs. On the other hand, graphics processing units (GPUs) have become widespread architectures as they provide massive parallelism at low cost. Parallel execution on GPUs may achieve speedups of up to three orders of magnitude with respect to the sequential counterparts. Nevertheless, accelerating efficient and optimized sequential algorithms and porting (i.e., parallelizing) their implementation to such many-core architectures is a very challenging task. The task is made even harder since energy and power consumption are becoming constraints in addition to, or in some cases as an alternative to, performance. This work aims at developing a platform that provides (I) a library of parallel, efficient, and tunable implementations of the most important graph algorithms for GPUs, and (II) an advanced profiling model to analyze both performance and power consumption of the algorithm implementations. The platform's goal is twofold. Through the library, it aims at saving development effort in the parallelization task through a primitive-based approach. Through the profiling framework, it aims at customizing such primitives by considering both the architectural details and the target efficiency metrics (i.e., performance or power).
Adaptive indoor positioning system based on locating globally deployed WiFi signal sources
Recent trends in data-driven applications have encouraged expanding
location awareness to indoors. Various attributes driven by location data
indoors require large-scale deployment that could expand beyond a specific
venue to a city, country or even global coverage. Social media, asset or
personnel tracking, marketing and advertising are examples of applications
that heavily utilise location attributes. Various solutions suggest
triangulation between WiFi access points to obtain location attribution
indoors, imitating the accurate estimation that GPS achieves through satellite
constellations. However, locating signal sources deep indoors introduces
various challenges that cannot be addressed via the traditional war-driving
or war-walking methods.
This research sets out to address the problem of locating WiFi signal
sources deep indoors in unsupervised deployment, without previous
training or calibration. To achieve this, we developed a grid approach to
mitigate non-line-of-sight (NLoS) conditions by clustering signal readings
into multi-hypothesis Gaussian distributions. We have also employed
hypothesis testing classification to estimate signal attenuation through
unknown layouts to remove dependencies on the availability of indoor maps.
Furthermore, we introduced novel methods for locating signal sources
deep indoors and presented the concept of WiFi access point (WAP)
temporal profiles as an adaptive radio-map with global coverage.
The primary contribution of this research, however, lies in the
utilisation of data streaming and in the creation and maintenance of
self-organising networks of WAPs through an adaptive deployment of a
mass-spring relaxation algorithm. In addition, complementary database
utilisation components such as error estimation, position estimation and
expansion to 3D are discussed. To justify the outcome of this research, we
present results from testing the proposed system on a large-scale dataset
covering various indoor environments in different parts of the world.
Finally, we propose a scalable indoor positioning system based on received
signal strength (RSSI) measurements of WiFi access points to resolve the
indoor positioning challenge. To enable the adoption of the proposed
solution at global scale, we deployed a piece of software on a multitude of
smartphone devices to collect data occasionally without the context of
venue, environment or custom hardware. To conclude, this thesis provides
lessons for a novel adaptive crowd-sourcing system that automatically
tolerates imprecise data when locating signal sources.
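For readers unfamiliar with mass-spring relaxation, the sketch below shows the general technique in plain Python, under strong simplifying assumptions: a single free node in 2D, fixed anchor points, noise-free distance estimates used as spring rest lengths, and a fixed step size. It is not the adaptive scheme developed in the thesis, and the function name, parameters and data are all hypothetical.

```python
import math


def relax_position(anchors, rest_lengths, start, step=0.1, iterations=500):
    """Relax one free node connected by springs to fixed anchor points.

    anchors      : list of (x, y) positions that stay fixed
    rest_lengths : spring rest lengths, here the estimated distance to each anchor
    start        : initial (x, y) guess for the free node
    """
    x, y = start
    for _ in range(iterations):
        fx = fy = 0.0
        for (ax, ay), rest in zip(anchors, rest_lengths):
            dx, dy = ax - x, ay - y
            dist = math.hypot(dx, dy)
            if dist == 0.0:
                continue  # direction undefined; skip this spring for this step
            # Hooke-style force: stretched springs pull, compressed springs push
            stretch = dist - rest
            fx += stretch * dx / dist
            fy += stretch * dy / dist
        # Move the node a small step along the net force
        x += step * fx
        y += step * fy
    return x, y


# Hypothetical data: three fixed points and exact distances to the point (3, 4)
anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
true_point = (3.0, 4.0)
rest_lengths = [math.hypot(ax - true_point[0], ay - true_point[1])
                for ax, ay in anchors]
start = (sum(a[0] for a in anchors) / 3, sum(a[1] for a in anchors) / 3)
estimate = relax_position(anchors, rest_lengths, start)
```

With noisy rest lengths the springs cannot all be satisfied at once, and the node settles where the net force vanishes; that residual tension is one possible basis for an error estimate.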
Benchmarking the graphulo processing framework
Graph algorithms have wide applicability to a variety of domains and are often
used on massive datasets. Recent standardization efforts such as the GraphBLAS
specify a set of key computational kernels that hardware and software
developers can adhere to. Graphulo is a processing framework that enables
GraphBLAS kernels in the Apache Accumulo database. In our previous work, we
have demonstrated a core Graphulo operation called TableMult that
performs large-scale multiplication operations of database tables. In this
article, we present the results of scaling the Graphulo engine to larger
problems and its scalability when a greater number of resources is used.
Specifically, we present two experiments that demonstrate Graphulo scaling
performance is linear with the number of available resources. The first
experiment demonstrates cluster processing rates through Graphulo's TableMult
operator on two large graphs, scaled between and vertices.
The second experiment uses TableMult to extract a random set of rows from a
large graph ( nodes) to simulate a cued graph analytic. These
benchmarking results are of relevance to Graphulo users who wish to apply
Graphulo to their graph problems.
Comment: 5 pages, 4 figures, IEEE High Performance Extreme Computing (HPEC)
conference 201
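The idea of multiplying database tables can be sketched outside any database. The plain-Python example below treats a table as a nested dictionary of row key, column key and value, and multiplies two such tables as sparse matrices. It does not use Accumulo or the Graphulo API, and it makes no claim about how TableMult is implemented; the function name and the sample data are hypothetical.

```python
from collections import defaultdict


def sparse_table_mult(a, b):
    """Multiply two sparse tables stored as {row_key: {col_key: value}}.

    Returns C with C[i][j] = sum over k of a[i][k] * b[k][j].
    """
    result = defaultdict(dict)
    for i, a_row in a.items():
        for k, a_val in a_row.items():
            # Only rows of b whose key matches a column key of a contribute
            for j, b_val in b.get(k, {}).items():
                result[i][j] = result[i].get(j, 0) + a_val * b_val
    return dict(result)


# Hypothetical adjacency table of the path graph v1 - v2 - v3
adjacency = {
    "v1": {"v2": 1},
    "v2": {"v1": 1, "v3": 1},
    "v3": {"v2": 1},
}
# Squaring the adjacency table counts walks of length two between vertices
two_hop = sparse_table_mult(adjacency, adjacency)
```

Restricting the left-hand table to a handful of row keys before multiplying gives the kind of row extraction that the second experiment uses to simulate a cued analytic.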