Performance Characterization of Multi-threaded Graph Processing Applications on Intel Many-Integrated-Core Architecture
Intel Xeon Phi many-integrated-core (MIC) architectures usher in a new era of
terascale integration. Among emerging killer applications, parallel graph
processing has become a critical technique for analyzing connected data. In this
paper, we empirically evaluate several computing platforms, including an Intel
Xeon E5 CPU, an Nvidia GeForce GTX 1070 GPU, and a Xeon Phi 7210 processor
codenamed Knights Landing (KNL), in the domain of parallel graph processing. We
show that the KNL achieves encouraging performance when processing graphs,
making it a promising solution for accelerating multi-threaded graph
applications. We further characterize the impact of KNL architectural
enhancements on the performance of a state-of-the-art graph framework. We have
four key observations: (1) Different graph applications require distinct
numbers of threads to reach peak performance; for the same application,
different datasets may need different numbers of threads to achieve the best
performance. (2) Only a few graph applications benefit from the high-bandwidth
MCDRAM, while others favor the low-latency DDR4 DRAM. (3) The vector processing
units executing AVX-512 SIMD instructions on KNL are underutilized when running
the state-of-the-art graph framework. (4) The sub-NUMA cache clustering mode,
which offers the lowest local memory access latency, hurts the performance of
graph benchmarks that lack NUMA awareness. Finally, we suggest future work,
including system auto-tuning tools and graph framework optimizations, to fully
exploit the potential of KNL for parallel graph processing.

Comment: published as L. Jiang, L. Chen and J. Qiu, "Performance
Characterization of Multi-threaded Graph Processing Applications on
Many-Integrated-Core Architecture," 2018 IEEE International Symposium on
Performance Analysis of Systems and Software (ISPASS), Belfast, United
Kingdom, 2018, pp. 199-20
A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures
Scientific problems that depend on processing large amounts of data require
overcoming challenges in multiple areas: managing large-scale data
distribution, co-placement and scheduling of data with compute resources, and
storing and transferring large volumes of data. We analyze the ecosystems of
the two prominent paradigms for data-intensive applications, hereafter referred
to as the high-performance computing paradigm and the Apache Hadoop paradigm.
We propose a common basis, terminology, and set of functional factors upon
which to analyze the two paradigms. We discuss the concept of "Big Data Ogres"
and their facets as means of understanding and characterizing the most common
application workloads found across the two paradigms. We then discuss the
salient features of the two paradigms, and compare and contrast the two
approaches. Specifically, we examine common implementations of these
paradigms, shed light on the reasons for their current "architectures," and
discuss some typical workloads that utilize them. In spite of the significant
software distinctions, we believe there is architectural similarity. We discuss
the potential integration of different implementations across the different
levels and components. Our comparison progresses from a fully qualitative
examination of the two paradigms to a semi-quantitative methodology. We use a
simple and broadly used Ogre, K-means clustering, and characterize its
performance on a range of representative platforms, covering several
implementations from both paradigms. Our experiments provide insight into the
relative strengths of the two paradigms. We propose that the set of Ogres can
serve as a benchmark for evaluating the two paradigms along different
dimensions.

Comment: 8 pages, 2 figure
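For reference, the computational kernel behind the K-means Ogre used in the semi-quantitative comparison can be sketched as plain Lloyd's algorithm on synthetic data; this is a minimal single-node sketch, not any of the distributed implementations benchmarked in the paper:

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm: the kernel behind the K-means 'Ogre'."""
    rng = np.random.default_rng(seed)
    # Initialize centroids as k distinct random data points.
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points.
        for j in range(k):
            members = points[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids, labels

# Two well-separated synthetic clusters around 0 and 5.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
centroids, labels = kmeans(data, k=2)
```

Distributed variants (MPI, Hadoop, Spark) differ mainly in how the assignment and centroid-update steps are partitioned and reduced, which is what makes this Ogre a useful cross-paradigm benchmark.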
Parallel clustering of high-dimensional social media data streams
We introduce Cloud DIKW as an analysis environment supporting scientific
discovery through integrated parallel batch and streaming processing, and apply
it to one representative domain application: social media data stream
clustering. Recent work demonstrated that high-quality clusters can be
generated by representing the data points using high-dimensional vectors that
reflect textual content and social network information. Due to the high cost of
similarity computation, sequential implementations of even single-pass
algorithms cannot keep up with the speed of real-world streams. This paper
presents our efforts to meet the constraints of real-time social stream
clustering through parallelization. We focus on two system-level issues. First,
most stream processing engines, such as Apache Storm, organize distributed
workers in the form of a directed acyclic graph, making it difficult to
dynamically synchronize the state of parallel workers. We tackle this challenge
by creating a separate synchronization channel using a pub-sub messaging
system. Second, due to the sparsity of the high-dimensional vectors, the size
of the centroids grows quickly as new data points are assigned to the clusters.
Traditional synchronization
that directly broadcasts cluster centroids becomes too expensive and limits the
scalability of the parallel algorithm. We address this problem by communicating
only dynamic changes of the clusters rather than the whole centroid vectors.
Our algorithm under Cloud DIKW can process the Twitter 10% data stream in
real-time with 96-way parallelism. By natural improvements to Cloud DIKW,
including advanced collective communication techniques developed in our Harp
project, we will be able to process the full Twitter stream in real-time with
1000-way parallelism. Our use of powerful general software subsystems will
enable many other applications that need integration of streaming and batch
data analytics.

Comment: IEEE/ACM CCGrid 2015: 15th IEEE/ACM International Symposium on
Cluster, Cloud and Grid Computing, 201
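The delta-based synchronization idea can be illustrated with sparse centroids represented as dictionaries mapping dimension index to weight; the function names and the zero-weight sentinel for dropped dimensions below are illustrative assumptions, not the system's actual API:

```python
def centroid_delta(old, new):
    """Compute only the changed dimensions between two sparse centroids.
    Broadcasting this delta, rather than the full centroid vector, is the
    core of the delta-based synchronization idea (sketch only)."""
    delta = {}
    for dim, w in new.items():
        if old.get(dim) != w:
            delta[dim] = w
    for dim in old:
        if dim not in new:
            delta[dim] = 0.0  # sentinel: dimension dropped from the centroid
    return delta

def apply_delta(centroid, delta):
    """Replay a delta on a worker's local copy of the centroid."""
    for dim, w in delta.items():
        if w == 0.0:
            centroid.pop(dim, None)
        else:
            centroid[dim] = w
    return centroid

old = {0: 1.0, 3: 2.0}
new = {0: 1.0, 3: 2.5, 7: 0.4}
delta = centroid_delta(old, new)  # only the changed dimensions travel
```

When only a handful of dimensions change per update, the delta is far smaller than the full high-dimensional centroid, which is what keeps broadcast cost bounded as clusters grow.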
Hedge fund incentives, management commitment and survivorship
Management ownership in hedge funds sends conflicting signals: signals that reduce investors' perception of survivorship risk. We document that decisions on management ownership are purposely self-selected. Such decisions are most likely motivated by unique incentive mechanisms embedded in hedge funds. We examine the impact of managerial ownership decisions on fund survivorship risk by accounting for unobserved fund manager motivations that affect both ownership decisions and survivorship risk. Our findings suggest that the conventional argument that management commitment can reduce survival risk (and therefore align the interests of managers and investors) is significantly overstated. These results are robust to using alternative ownership measures and controlling for different samples.
Finding Complex Biological Relationships in Recent PubMed Articles Using Bio-LDA
The overwhelming amount of available scholarly literature in the life
sciences poses significant challenges to scientists wishing to keep up with
important developments related to their research, but also provides a useful
resource for the discovery of recent information concerning genes, diseases,
compounds and the interactions between them. In this paper, we describe an
algorithm called Bio-LDA that uses extracted biological terminology to
automatically identify latent topics, and provides a variety of measures to
uncover putative relations among topics and bio-terms. Relationships identified
using those approaches are combined with existing data in life science datasets
to provide additional insight. Three case studies demonstrate the utility of
the Bio-LDA model, including association prediction, association search, and
connectivity map generation. This combined approach offers new opportunities
for knowledge discovery in many areas of biology, including target
identification, lead hopping, and drug repurposing.

Comment: 14 pages, 8 figures, 10 table
Multidimensional Scaling by Deterministic Annealing with Iterative Majorization Algorithm
Multidimensional Scaling (MDS) is a dimension reduction method for information visualization that is set up as a non-linear optimization problem. It is applicable to many data-intensive scientific problems, including studies of DNA sequences, but tends to get trapped in local minima. Deterministic Annealing (DA) has been applied to many optimization problems to avoid local minima. We apply the DA approach to the MDS problem in this paper and show that our proposed DA approach improves the mapping quality and shows high reliability across a variety of experimental results. Furthermore, its execution time is similar to that of the un-annealed approach. We use different data sets to compare the proposed DA approach with both a well-known algorithm called SMACOF and an MDS method with distance smoothing that aims to avoid local optima. Our proposed DA method outperforms the SMACOF algorithm and the distance-smoothing MDS algorithm in terms of mapping quality, and shows much less sensitivity with respect to initial configurations and stopping conditions. We also investigate various temperature cooling parameters for our deterministic annealing method within an exponential cooling scheme.
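For context, the SMACOF baseline that the proposed DA method is compared against can be sketched as follows: iterative majorization of the unweighted raw stress via the Guttman transform. This is a minimal sketch of the baseline only, not the deterministic-annealing variant introduced in the paper:

```python
import numpy as np

def smacof(D, dim=2, iters=500, seed=0):
    """Minimal unweighted SMACOF: majorize the raw stress
    sum_{i<j} (D_ij - ||x_i - x_j||)^2 via repeated Guttman transforms."""
    n = D.shape[0]
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, dim))  # random initial configuration
    for _ in range(iters):
        dist = np.linalg.norm(X[:, None] - X[None, :], axis=2)
        with np.errstate(divide="ignore", invalid="ignore"):
            ratio = np.where(dist > 0, D / dist, 0.0)
        # Build B(X): b_ij = -D_ij / d_ij(X) off-diagonal, rows sum to zero.
        B = -ratio
        np.fill_diagonal(B, 0.0)
        np.fill_diagonal(B, -B.sum(axis=1))
        X = B @ X / n  # Guttman transform (all weights equal to 1)
    return X

# Toy example: recover a 3-point configuration from its distance matrix.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)
X = smacof(D, dim=2)
```

Each Guttman transform is guaranteed not to increase the stress, which is why SMACOF converges reliably but, as the abstract notes, only to a local minimum; the DA approach addresses exactly that limitation.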
MapReduce in the Clouds for Science
The utility computing model introduced by cloud computing, combined with the rich set of cloud infrastructure services, offers a very viable alternative to traditional servers and computing clusters. The MapReduce distributed data processing architecture has become the weapon of choice for data-intensive analyses in the clouds and in commodity clusters due to its excellent fault tolerance, scalability, and ease of use. Currently, there are several options for using MapReduce in cloud environments, such as using MapReduce as a service, setting up one's own MapReduce cluster on cloud instances, or using specialized cloud MapReduce runtimes that take advantage of cloud infrastructure services. In this paper, we introduce AzureMapReduce, a novel MapReduce runtime built using the Microsoft Azure cloud infrastructure services. The AzureMapReduce architecture successfully leverages the high-latency, eventually consistent, yet highly scalable Azure infrastructure services to provide an efficient, on-demand alternative to traditional MapReduce clusters. Further, we evaluate the use and performance of MapReduce frameworks, including AzureMapReduce, in cloud environments for scientific applications, using sequence assembly and sequence alignment as use cases.
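As a reminder of the programming model these runtimes implement, here is a toy single-process word-count sketch of the map, shuffle, and reduce phases; it illustrates the model only, and none of AzureMapReduce's distributed shuffle, fault tolerance, or Azure infrastructure services are modeled:

```python
from itertools import groupby
from operator import itemgetter

def map_phase(doc):
    # Map: emit (key, value) pairs; here, (word, 1) for each word.
    return [(word, 1) for word in doc.split()]

def reduce_phase(key, values):
    # Reduce: combine all values emitted for one key.
    return key, sum(values)

def mapreduce(docs):
    """Toy single-process MapReduce: map every input, shuffle by
    sorting and grouping on key, then reduce each group."""
    pairs = [kv for doc in docs for kv in map_phase(doc)]
    pairs.sort(key=itemgetter(0))  # shuffle: bring equal keys together
    return dict(reduce_phase(k, [v for _, v in grp])
                for k, grp in groupby(pairs, key=itemgetter(0)))

counts = mapreduce(["a b a", "b c"])
```

In a real runtime, the map and reduce calls run on distributed workers and the shuffle moves intermediate pairs across the network; in AzureMapReduce that coordination is built on Azure's queue and storage services.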