A Comparison of Parallel Graph Processing Implementations
The rapidly growing number of large network analysis problems has led to the
emergence of many parallel and distributed graph processing systems---one
survey in 2014 identified over 80. Since then, the landscape has evolved; some
packages have become inactive while more are being developed. Determining the
best approach for a given problem is infeasible for most developers. To enable
easy, rigorous, and repeatable comparison of the capabilities of such systems,
we present an approach and associated software for analyzing the performance
and scalability of parallel, open-source graph libraries. We demonstrate our
approach on five graph processing packages: GraphMat, the Graph500, the Graph
Algorithm Platform Benchmark Suite, GraphBIG, and PowerGraph, using synthetic
and real-world datasets. We examine previously overlooked aspects of parallel
graph processing performance, such as phases of execution and energy usage, for
three algorithms (breadth-first search, single-source shortest paths, and
PageRank), and compare our results to Graphalytics.
Comment: 10 pages, 10 figures. Submitted to EuroPar 2017 and rejected; revised and submitted to IEEE Cluster 201
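As a point of reference for the algorithms named above, here is a minimal power-iteration PageRank sketch; the graph and parameters are illustrative, and the code is not drawn from any of the benchmarked packages:

```python
# Minimal power-iteration PageRank over an adjacency-list graph.
# Illustrative sketch only; not code from any benchmarked package.

def pagerank(adj, damping=0.85, iters=50):
    """adj: dict mapping node -> list of out-neighbors."""
    nodes = list(adj)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iters):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v in nodes:
            out = adj[v]
            if out:
                share = damping * rank[v] / len(out)
                for w in out:
                    new[w] += share
            else:
                # dangling node: distribute its rank uniformly
                for w in nodes:
                    new[w] += damping * rank[v] / n
        rank = new
    return rank

# Tiny 3-node example: 0 -> {1, 2}, 1 -> {2}, 2 -> {0}
ranks = pagerank({0: [1, 2], 1: [2], 2: [0]})
```

Benchmark suites typically measure the time per iteration of exactly this kind of sparse gather/scatter loop, which is why memory layout and parallelization strategy dominate performance.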
Ringo: Interactive Graph Analytics on Big-Memory Machines
We present Ringo, a system for analysis of large graphs. Graphs provide a way
to represent and analyze systems of interacting objects (people, proteins,
webpages) with edges between the objects denoting interactions (friendships,
physical interactions, links). Mining graphs provides valuable insights about
individual objects as well as the relationships among them.
In building Ringo, we take advantage of the fact that machines with large
memory and many cores are widely available and also relatively affordable. This
allows us to build an easy-to-use interactive high-performance graph analytics
system. Graphs also need to be built from input data, which often resides in
the form of relational tables. Thus, Ringo provides rich functionality for
manipulating raw input data tables into various kinds of graphs. Furthermore,
Ringo also provides over 200 graph analytics functions that can then be applied
to constructed graphs.
We show that a single big-memory machine provides a very attractive platform
for performing analytics on all but the largest graphs as it offers excellent
performance and ease of use as compared to alternative approaches. With Ringo,
we also demonstrate how to integrate graph analytics with an iterative process
of trial-and-error data exploration and rapid experimentation, common in data
mining workloads.
Comment: 6 pages, 2 figures
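The table-to-graph construction described above can be pictured with a small sketch; the data and column roles below are hypothetical, and this is not Ringo's actual API:

```python
# Sketch of deriving a graph from a relational table, in the spirit of
# Ringo's table-to-graph operations (hypothetical data; not Ringo's API).
from collections import defaultdict

# A "table" of (user, page) visit rows.
visits = [
    ("alice", "p1"), ("bob", "p1"),
    ("bob", "p2"), ("carol", "p2"),
]

# Project onto a user-user graph: connect users who visited the same page.
by_page = defaultdict(set)
for user, page in visits:
    by_page[page].add(user)

edges = set()
for users in by_page.values():
    ordered = sorted(users)
    for i in range(len(ordered)):
        for j in range(i + 1, len(ordered)):
            edges.add((ordered[i], ordered[j]))
```

Once such an edge set exists, the analytics functions (degree distributions, connected components, and so on) operate on the constructed graph rather than on the raw table.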
LLM: Realizing Low-Latency Memory by Exploiting Embedded Silicon Photonics for Irregular Workloads
As emerging workloads exhibit irregular memory access patterns with poor data reuse and locality, they would benefit from a DRAM that achieves low latency without sacrificing bandwidth or energy efficiency. We propose LLM (Low-Latency Memory), a co-design of the DRAM microarchitecture, the memory controller, and the LLC/DRAM interconnect that leverages embedded silicon photonics in 2.5D/3D-integrated systems-on-chip. LLM relies on wavelength-division multiplexing (WDM)-based photonic interconnects to reduce contention throughout the memory subsystem. LLM also increases bank-level parallelism, eliminates bus conflicts by using dedicated optical data paths, and reduces the access energy per bit with shorter global bitlines and smaller row buffers. We evaluate the design space of LLM for a variety of synthetic benchmarks and representative graph workloads on a full-system simulator (gem5). LLM exhibits low memory access latency for traffic with both regular and irregular access patterns. For irregular traffic, LLM achieves high bandwidth utilization (over 80% of peak throughput, compared to 20% for HBM2.0). For real workloads, LLM achieves 3× and 1.8× lower execution time than HBM2.0 and a state-of-the-art memory system with high memory-level parallelism, respectively. This study also demonstrates that, by reducing queuing on the data path, LLM can achieve on average 3.4× lower memory latency variation than HBM2.0.
MILLENNIAL GENERATION COLLEGE STUDENTS: OBSERVATIONS AND EXPERIENCES OF COUNSELING FACULTY AT SELECTED CALIFORNIA COMMUNITY COLLEGE DISTRICTS
Higher education, and specifically the California community college system, is being inundated with a large new generation of students known as millennials. They are now the majority student group, enrolled in record numbers, and California community colleges continue to evolve to accommodate them. This research explores the phenomenon of millennial college students ("millennials") through the lived experiences of California community college counseling faculty who interact with them; a phenomenological design using face-to-face interviews was employed. The faculty's observations and experiences could prove informative and help advance the purpose of this research.
The following research questions guided this study. What types of experiences have California community college counseling faculty encountered while providing counseling services to millennial college students? What types of experiences have they encountered while teaching millennial college students? Have they modified their counseling or teaching practices to better serve millennial college students? Will their observations and experiences closely align with the literature in describing millennial college students?
How to verify the precision of density-functional-theory implementations via reproducible and universal workflows
In the past decades many density-functional theory methods and codes adopting
periodic boundary conditions have been developed and are now extensively used
in condensed matter physics and materials science research. Only in 2016,
however, was their precision (i.e., the extent to which properties computed with
different codes agree with each other) systematically assessed on
elemental crystals: a first crucial step toward evaluating the reliability of such
computations. We discuss here general recommendations for verification studies
aiming at further testing precision and transferability of
density-functional-theory computational approaches and codes. We illustrate
such recommendations using a greatly expanded protocol covering the whole
periodic table from Z=1 to 96 and characterizing 10 prototypical cubic
compounds for each element: 4 unaries and 6 oxides, spanning a wide range of
coordination numbers and oxidation states. The primary outcome is a reference
dataset of 960 equations of state cross-checked between two all-electron codes,
then used to verify and improve nine pseudopotential-based approaches. Such
effort is facilitated by deploying AiiDA common workflows that perform
automatic input parameter selection, provide identical input/output interfaces
across codes, and ensure full reproducibility. Finally, we discuss the extent
to which the current results for total energies can be reused for different
goals (e.g., obtaining formation energies).
Comment: Main text: 23 pages, 4 figures. Supplementary: 68 pages
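The equation-of-state datasets mentioned above come from fitting computed energy-volume points; real verification studies fit a Birch-Murnaghan equation of state, while the toy sketch below (synthetic data, a plain quadratic least-squares fit) only illustrates the idea of extracting an equilibrium volume and bulk modulus from such a curve:

```python
# Simplified sketch: estimate equilibrium volume V0 and bulk modulus B0
# from an energy-volume curve via a quadratic fit near the minimum.
# Synthetic data; verification studies use a Birch-Murnaghan fit instead.

def quad_fit(points):
    """Least-squares fit E = a + b*V + c*V^2 via the 3x3 normal equations."""
    n = len(points)
    sv = sum(v for v, _ in points)
    sv2 = sum(v**2 for v, _ in points)
    sv3 = sum(v**3 for v, _ in points)
    sv4 = sum(v**4 for v, _ in points)
    se = sum(e for _, e in points)
    sve = sum(v * e for v, e in points)
    sv2e = sum(v**2 * e for v, e in points)
    m = [[n, sv, sv2], [sv, sv2, sv3], [sv2, sv3, sv4]]
    rhs = [se, sve, sv2e]

    def det3(a):
        return (a[0][0] * (a[1][1] * a[2][2] - a[1][2] * a[2][1])
              - a[0][1] * (a[1][0] * a[2][2] - a[1][2] * a[2][0])
              + a[0][2] * (a[1][0] * a[2][1] - a[1][1] * a[2][0]))

    d = det3(m)
    coeffs = []
    for col in range(3):  # Cramer's rule, one column at a time
        mc = [row[:] for row in m]
        for r in range(3):
            mc[r][col] = rhs[r]
        coeffs.append(det3(mc) / d)
    return coeffs  # a, b, c

# Synthetic E(V) points with a minimum at V0 = 20.
data = [(v, 0.5 * (v - 20.0)**2 - 3.0) for v in (18, 19, 20, 21, 22)]
a, b, c = quad_fit(data)
v0 = -b / (2 * c)   # equilibrium volume (vertex of the parabola)
b0 = 2 * c * v0     # bulk modulus B0 = V0 * d^2E/dV^2 at the minimum
```

Comparing V0 and B0 (and, in the full Birch-Murnaghan fit, the pressure derivative of B0) between two codes is precisely the kind of cross-check the reference dataset enables.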