Performance Characterization of Multi-threaded Graph Processing Applications on Intel Many-Integrated-Core Architecture
Intel Xeon Phi many-integrated-core (MIC) architectures usher in a new era of
terascale integration. Among emerging killer applications, parallel graph
processing has become a critical technique for analyzing connected data. In this
paper, we empirically evaluate several computing platforms, including an Intel
Xeon E5 CPU, an Nvidia GeForce GTX 1070 GPU, and a Xeon Phi 7210 processor
codenamed Knights Landing (KNL), in the domain of parallel graph processing. We
show that the KNL achieves encouraging performance when processing graphs,
making it a promising solution for accelerating multi-threaded graph
applications. We further characterize the impact of KNL architectural
enhancements on the performance of a state-of-the-art graph framework. We have
four key observations: (1) Different graph applications require distinct
numbers of threads to reach peak performance; for the same application,
different datasets may need different numbers of threads to achieve the best
performance. (2) Only a few graph applications benefit from the high-bandwidth
MCDRAM, while others favor the low-latency DDR4 DRAM. (3) The vector processing
units executing AVX-512 SIMD instructions on KNL are underutilized when running
the state-of-the-art graph framework. (4) The sub-NUMA cache clustering mode,
which offers the lowest local memory access latency, hurts the performance of
graph benchmarks that lack NUMA awareness. Finally, we suggest future work,
including system auto-tuning tools and graph framework optimizations, to fully
exploit the potential of KNL for parallel graph processing.

Comment: published as L. Jiang, L. Chen and J. Qiu, "Performance
Characterization of Multi-threaded Graph Processing Applications on
Many-Integrated-Core Architecture," 2018 IEEE International Symposium on
Performance Analysis of Systems and Software (ISPASS), Belfast, United
Kingdom, 2018, pp. 199-20
A Tale of Two Data-Intensive Paradigms: Applications, Abstractions, and Architectures
Scientific problems that depend on processing large amounts of data require
overcoming challenges in multiple areas: managing large-scale data
distribution, co-placement and scheduling of data with compute resources, and
storing and transferring large volumes of data. We analyze the ecosystems of
the two prominent paradigms for data-intensive applications, hereafter referred
to as the high-performance computing paradigm and the Apache Hadoop paradigm.
We propose a common basis, terminology, and set of functional factors upon
which to analyze the two paradigms. We discuss the concept of "Big Data Ogres"
and their facets as means of understanding and characterizing the most common
application workloads found across the two paradigms. We then discuss the
salient features of the two paradigms, and compare and contrast the two
approaches. Specifically, we examine common implementations of these
paradigms, shed light on the reasons for their current "architectures," and
discuss some typical workloads that utilize them. In spite of the significant
software distinctions, we believe there is architectural similarity. We discuss
the potential integration of different implementations across the different
levels and components. Our comparison progresses from a fully qualitative
examination of the two paradigms to a semi-quantitative methodology. We use a
simple and broadly used Ogre, K-means clustering, and characterize its
performance on a range of representative platforms, covering several
implementations from both paradigms. Our experiments provide insight into the
relative strengths of the two paradigms. We propose that the set of Ogres can
serve as a benchmark for evaluating the two paradigms along different
dimensions.

Comment: 8 pages, 2 figure
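For reference, the computational kernel behind the K-means Ogre used in the semi-quantitative comparison can be sketched as plain Lloyd's algorithm on synthetic data; this is a minimal single-node sketch, not any of the distributed implementations benchmarked in the paper:

```python
import numpy as np

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's algorithm: the kernel behind the K-means 'Ogre'."""
    rng = np.random.default_rng(seed)
    # Initialize centroids as k distinct random data points.
    centroids = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid as the mean of its assigned points.
        for j in range(k):
            members = points[labels == j]
            if len(members):
                centroids[j] = members.mean(axis=0)
    return centroids, labels

# Two well-separated synthetic clusters around 0 and 5.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
centroids, labels = kmeans(data, k=2)
```

Distributed variants (MPI, Hadoop, Spark) differ mainly in how the assignment and centroid-update steps are partitioned and reduced, which is what makes this Ogre a useful cross-paradigm benchmark.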
Parallel clustering of high-dimensional social media data streams
We introduce Cloud DIKW as an analysis environment supporting scientific
discovery through integrated parallel batch and streaming processing, and apply
it to one representative domain application: social media data stream
clustering. Recent work demonstrated that high-quality clusters can be
generated by representing the data points using high-dimensional vectors that
reflect textual content and social network information. Due to the high cost of
similarity computation, sequential implementations of even single-pass
algorithms cannot keep up with the speed of real-world streams. This paper
presents our efforts to meet the constraints of real-time social stream
clustering through parallelization. We focus on two system-level issues. First,
most stream processing engines, such as Apache Storm, organize distributed
workers in the form of a directed acyclic graph, making it difficult to
dynamically synchronize the state of parallel workers. We tackle this challenge
by creating a separate synchronization channel using a pub-sub messaging
system. Second, due to the sparsity of the high-dimensional vectors, the size
of the centroids grows quickly as new data points are assigned to the clusters.
Traditional synchronization
that directly broadcasts cluster centroids becomes too expensive and limits the
scalability of the parallel algorithm. We address this problem by communicating
only dynamic changes of the clusters rather than the whole centroid vectors.
Our algorithm under Cloud DIKW can process the Twitter 10% data stream in
real-time with 96-way parallelism. By natural improvements to Cloud DIKW,
including advanced collective communication techniques developed in our Harp
project, we will be able to process the full Twitter stream in real-time with
1000-way parallelism. Our use of powerful general software subsystems will
enable many other applications that need integration of streaming and batch
data analytics.

Comment: IEEE/ACM CCGrid 2015: 15th IEEE/ACM International Symposium on
Cluster, Cloud and Grid Computing, 201
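The delta-based synchronization idea can be illustrated with sparse centroids represented as dictionaries mapping dimension index to weight; the function names and the zero-weight sentinel for dropped dimensions below are illustrative assumptions, not the system's actual API:

```python
def centroid_delta(old, new):
    """Compute only the changed dimensions between two sparse centroids.
    Broadcasting this delta, rather than the full centroid vector, is the
    core of the delta-based synchronization idea (sketch only)."""
    delta = {}
    for dim, w in new.items():
        if old.get(dim) != w:
            delta[dim] = w
    for dim in old:
        if dim not in new:
            delta[dim] = 0.0  # sentinel: dimension dropped from the centroid
    return delta

def apply_delta(centroid, delta):
    """Replay a delta on a worker's local copy of the centroid."""
    for dim, w in delta.items():
        if w == 0.0:
            centroid.pop(dim, None)
        else:
            centroid[dim] = w
    return centroid

old = {0: 1.0, 3: 2.0}
new = {0: 1.0, 3: 2.5, 7: 0.4}
delta = centroid_delta(old, new)  # only the changed dimensions travel
```

When only a handful of dimensions change per update, the delta is far smaller than the full high-dimensional centroid, which is what keeps broadcast cost bounded as clusters grow.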
Hedge fund incentives, management commitment and survivorship
Management ownership in hedge funds sends conflicting signals: signals that reduce investors' perception of survivorship risk. We document that decisions on management ownership are purposely self-selected. Such decisions are most likely motivated by unique incentive mechanisms embedded in hedge funds. We examine the impact of managerial ownership decisions on fund survivorship risk by accounting for unobserved fund manager motivations that affect both ownership decisions and survivorship risk. Our findings suggest that the conventional argument that management commitment can reduce survival risk (and therefore align the interests of managers and investors) is significantly overstated. These results are robust to using alternative ownership measures and controlling for different samples.
Finding Complex Biological Relationships in Recent PubMed Articles Using Bio-LDA
The overwhelming amount of available scholarly literature in the life
sciences poses significant challenges to scientists wishing to keep up with
important developments related to their research, but also provides a useful
resource for the discovery of recent information concerning genes, diseases,
compounds and the interactions between them. In this paper, we describe an
algorithm called Bio-LDA that uses extracted biological terminology to
automatically identify latent topics, and provides a variety of measures to
uncover putative relations among topics and bio-terms. Relationships identified
using those approaches are combined with existing data in life science datasets
to provide additional insight. Three case studies demonstrate the utility of
the Bio-LDA model, including association prediction, association search, and
connectivity map generation. This combined approach offers new opportunities
for knowledge discovery in many areas of biology, including target
identification, lead hopping, and drug repurposing.

Comment: 14 pages, 8 figures, 10 table
Multidimensional Scaling by Deterministic Annealing with Iterative Majorization Algorithm
Multidimensional Scaling (MDS) is a dimension reduction method for information visualization that is set up as a non-linear optimization problem. It is applicable to many data-intensive scientific problems, including studies of DNA sequences, but tends to get trapped in local minima. Deterministic Annealing (DA) has been applied to many optimization problems to avoid local minima. We apply the DA approach to the MDS problem in this paper and show that our proposed DA approach improves the mapping quality and shows high reliability across a variety of experimental results. Furthermore, its execution time is similar to that of the un-annealed approach. We use different data sets to compare the proposed DA approach with both a well-known algorithm called SMACOF and an MDS method with distance smoothing that aims to avoid local optima. Our proposed DA method outperforms the SMACOF algorithm and the distance-smoothing MDS algorithm in terms of mapping quality, and shows much less sensitivity with respect to initial configurations and stopping conditions. We also investigate various temperature cooling parameters for our deterministic annealing method within an exponential cooling scheme.
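For context, the SMACOF baseline that the proposed DA method is compared against can be sketched as follows: iterative majorization of the unweighted raw stress via the Guttman transform. This is a minimal sketch of the baseline only, not the deterministic-annealing variant introduced in the paper:

```python
import numpy as np

def smacof(D, dim=2, iters=500, seed=0):
    """Minimal unweighted SMACOF: majorize the raw stress
    sum_{i<j} (D_ij - ||x_i - x_j||)^2 via repeated Guttman transforms."""
    n = D.shape[0]
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, dim))  # random initial configuration
    for _ in range(iters):
        dist = np.linalg.norm(X[:, None] - X[None, :], axis=2)
        with np.errstate(divide="ignore", invalid="ignore"):
            ratio = np.where(dist > 0, D / dist, 0.0)
        # Build B(X): b_ij = -D_ij / d_ij(X) off-diagonal, rows sum to zero.
        B = -ratio
        np.fill_diagonal(B, 0.0)
        np.fill_diagonal(B, -B.sum(axis=1))
        X = B @ X / n  # Guttman transform (all weights equal to 1)
    return X

# Toy example: recover a 3-point configuration from its distance matrix.
pts = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
D = np.linalg.norm(pts[:, None] - pts[None, :], axis=2)
X = smacof(D, dim=2)
```

Each Guttman transform is guaranteed not to increase the stress, which is why SMACOF converges reliably but, as the abstract notes, only to a local minimum; the DA approach addresses exactly that limitation.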
MapReduce in the Clouds for Science
The utility computing model introduced by cloud computing, combined with the rich set of cloud infrastructure services, offers a very viable alternative to traditional servers and computing clusters. The MapReduce distributed data processing architecture has become the weapon of choice for data-intensive analyses in the clouds and in commodity clusters due to its excellent fault tolerance, scalability, and ease of use. Currently, there are several options for using MapReduce in cloud environments, such as using MapReduce as a service, setting up one's own MapReduce cluster on cloud instances, or using specialized cloud MapReduce runtimes that take advantage of cloud infrastructure services. In this paper, we introduce AzureMapReduce, a novel MapReduce runtime built using the Microsoft Azure cloud infrastructure services. The AzureMapReduce architecture successfully leverages the high-latency, eventually consistent, yet highly scalable Azure infrastructure services to provide an efficient, on-demand alternative to traditional MapReduce clusters. Further, we evaluate the use and performance of MapReduce frameworks, including AzureMapReduce, in cloud environments for scientific applications, using sequence assembly and sequence alignment as use cases.
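As a reminder of the programming model these runtimes implement, here is a toy single-process word-count sketch of the map, shuffle, and reduce phases; it illustrates the model only, and none of AzureMapReduce's distributed shuffle, fault tolerance, or Azure infrastructure services are modeled:

```python
from itertools import groupby
from operator import itemgetter

def map_phase(doc):
    # Map: emit (key, value) pairs; here, (word, 1) for each word.
    return [(word, 1) for word in doc.split()]

def reduce_phase(key, values):
    # Reduce: combine all values emitted for one key.
    return key, sum(values)

def mapreduce(docs):
    """Toy single-process MapReduce: map every input, shuffle by
    sorting and grouping on key, then reduce each group."""
    pairs = [kv for doc in docs for kv in map_phase(doc)]
    pairs.sort(key=itemgetter(0))  # shuffle: bring equal keys together
    return dict(reduce_phase(k, [v for _, v in grp])
                for k, grp in groupby(pairs, key=itemgetter(0)))

counts = mapreduce(["a b a", "b c"])
```

In a real runtime, the map and reduce calls run on distributed workers and the shuffle moves intermediate pairs across the network; in AzureMapReduce that coordination is built on Azure's queue and storage services.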