Darwinian Data Structure Selection

Author: Basios M.
Binder R. V.
Dan H.
Li L.
Li L.
Li L.
Nagel F.
Nanavati J.
Petke J.
Vlissides J.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/08/2018

Data structure selection and tuning is laborious but can vastly improve an application's performance and memory footprint. Some data structures share a common interface and enjoy multiple implementations. We call them Darwinian Data Structures (DDS), since we can subject their implementations to survival of the fittest. We introduce ARTEMIS, a multi-objective, cloud-based search-based optimisation framework that automatically finds optimal, tuned DDS modulo a test suite, then changes an application to use that DDS. ARTEMIS achieves substantial performance improvements for every project in 5 Java projects from the DaCapo benchmark, 8 popular projects and 30 uniformly sampled projects from GitHub. For execution time, CPU usage, and memory consumption, ARTEMIS finds at least one solution that improves all measures for 86% (37/43) of the projects. The median improvement across the best solutions is 4.8%, 10.1%, and 5.1% for runtime, memory, and CPU usage, respectively. These aggregate results understate ARTEMIS's potential impact. Some of the benchmarks it improves are libraries or utility functions. Two examples are gson, a ubiquitous Java serialization framework, and xalan, Apache's XML transformation tool. ARTEMIS improves gson by 16.5%, 1%, and 2.2% for memory, runtime, and CPU; ARTEMIS improves xalan's memory consumption by 23.5%. Every client of these projects will benefit from these performance improvements. Comment: 11 page
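
The idea of implementations that share an interface competing on measured cost can be illustrated with a small sketch. This is not ARTEMIS: it is a plain exhaustive comparison on a single objective (wall-clock time), and the classes `ListQueue` and `DequeQueue`, the `workload` function, and `select_fittest` are all hypothetical names introduced only for illustration.

```python
import time
from collections import deque


class ListQueue:
    """FIFO queue backed by a Python list (hypothetical candidate)."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop_front(self):
        return self._items.pop(0)

    def __len__(self):
        return len(self._items)


class DequeQueue:
    """FIFO queue backed by collections.deque (hypothetical candidate)."""

    def __init__(self):
        self._items = deque()

    def push(self, item):
        self._items.append(item)

    def pop_front(self):
        return self._items.popleft()

    def __len__(self):
        return len(self._items)


def workload(queue_cls, n=5000):
    """Stand-in for an application run: fill the queue, then drain it."""
    queue = queue_cls()
    for i in range(n):
        queue.push(i)
    total = 0
    while len(queue):
        total += queue.pop_front()
    return total


def select_fittest(candidates, expected, repeats=3):
    """Return (fastest class, timings) among candidates that pass the check."""
    timings = {}
    for cls in candidates:
        if workload(cls) != expected:
            continue  # fails the stand-in "test suite": not a valid replacement
        runs = []
        for _ in range(repeats):
            start = time.perf_counter()
            workload(cls)
            runs.append(time.perf_counter() - start)
        timings[cls.__name__] = min(runs)  # best of several runs reduces noise
    best_name = min(timings, key=timings.get)
    best = next(c for c in candidates if c.__name__ == best_name)
    return best, timings


best, timings = select_fittest([ListQueue, DequeQueue], sum(range(5000)))
print(best.__name__, timings)
```

Which candidate wins depends on the workload and the machine, which is why the selection is measured rather than assumed. A multi-objective search, as the abstract describes, would also track memory and CPU usage instead of time alone.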

arXiv.org e-Print Archive

Crossref

UCL Discovery

Vega: A Computer Vision Processing Enhancement Framework with Graph-based Acceleration

Author: Dong Shi
Gutierrez Julian
Kaeli David
Publication venue: AIS Electronic Library (AISeL)
Publication date: 07/01/2020

The popularity of Computer Vision (CV) algorithms has been on the rise given their growing dependence on machine learning and deep neural networks. The resulting improvement in inference accuracy has revolutionized a number of fields. However, given that CV algorithms consist of many different stages, each having different computing characteristics, their execution is frequently irregular and inefficient, unable to leverage the full potential of the computing platform. Presently, supporting real-time video processing for high-resolution images on edge systems involves a significant amount of programming effort and performance tuning. To overcome this challenge, we present Vega, a parallel graph-based framework that enables better utilization of multi-core edge computing platforms. Vega provides a highly flexible and user-friendly interface to execute a range of CV algorithms efficiently, leveraging multiple external libraries for performance. First, Vega maps independent stages of a CV algorithm to nodes in a pipeline graph. Next, it dynamically schedules nodes on a multi-core CPU using multi-threading. From our experimental results, our framework improves performance of all selected algorithms by at least 1.75x and up to 4.82x on the same platform. We analyze the impact of using our framework in terms of hardware utilization, frame processing latency, and throughput.
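
The two steps named in the abstract (stages as graph nodes, then dynamic scheduling on threads) can be sketched roughly as follows. This is not Vega's API: `run_graph`, the `nodes` and `deps` dictionaries, and the toy stages are assumptions made for illustration, and the "frame" is just a list of numbers.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_graph(nodes, deps, frame, max_workers=4):
    """Run every node once; a node starts as soon as its dependencies finish.

    nodes: name -> function(frame, inputs); inputs maps dependency name -> result
    deps:  name -> list of names that must finish first (missing key = no deps)
    """
    results = {}
    pending = set(nodes)
    running = {}  # future -> node name
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while pending or running:
            # Submit every pending node whose dependencies are all available.
            ready = [n for n in pending
                     if all(d in results for d in deps.get(n, []))]
            for name in ready:
                inputs = {d: results[d] for d in deps.get(name, [])}
                running[pool.submit(nodes[name], frame, inputs)] = name
                pending.discard(name)
            if not running:
                raise ValueError("dependency cycle or unknown dependency")
            # Block until at least one running node completes, then record it.
            done, _ = wait(list(running), return_when=FIRST_COMPLETED)
            for future in done:
                results[running.pop(future)] = future.result()
    return results


# Toy stages: "scale" and "shift" are independent, "merge" needs both.
nodes = {
    "scale": lambda frame, inputs: [x * 2 for x in frame],
    "shift": lambda frame, inputs: [x + 1 for x in frame],
    "merge": lambda frame, inputs: [a + b for a, b in
                                    zip(inputs["scale"], inputs["shift"])],
}
deps = {"merge": ["scale", "shift"]}
print(run_graph(nodes, deps, [1, 2, 3])["merge"])  # [4, 7, 10]
```

In CPython, threads only run truly in parallel when the stage functions release the global interpreter lock, as many native numerical and imaging libraries do, so the sketch shows the scheduling pattern rather than a performance result.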

ScholarSpace at University of Hawai'i at Manoa

AIS Electronic Library (AISeL)

Evaluation and Implementation of n-Gram-Based Algorithm for Fast Text Comparison

Author: Jamro Ernest
Pietroń Marcin
Russek Pawel
Szczepka Paweł
Wiatr Kazimierz
Wielgosz Maciej
Żurek Dominik
Publication venue: Institute of Informatics, Slovak Academy of Sciences
Publication date: 29/11/2017

This paper presents a study of an n-gram-based document comparison method. The method is intended to build a large-scale plagiarism detection system. The work focuses not only on the efficiency of the text similarity extraction but also on the execution performance of the implemented algorithms. We took notice of detection performance, storage requirements and execution time of the proposed approach. The obtained results show the trade-offs between detection quality and computational requirements. The GPGPU and multi-CPU platforms were considered to implement the algorithms and to achieve good execution speed. The method consists of two main algorithms: a document's feature extraction and fast text comparison. The winnowing algorithm is used to generate a compressed representation of the analyzed documents. The authors designed and implemented a dedicated test framework for the algorithm. That allowed for the tuning, evaluation, and optimization of the parameters. Well-known metrics (e.g. precision, recall) were used to evaluate detection performance. The authors conducted the tests to determine the performance of the winnowing algorithm for obfuscated and unobfuscated texts for different window and n-gram sizes. Also, a simplified version of the text comparison algorithm was proposed and evaluated to reduce the computational complexity of the text comparison process. The paper also presents GPGPU and multi-CPU implementations of the algorithms for different data structures. The implementation speed was tested for different algorithms' parameters and the size of data. The scalability of the algorithm on multi-CPU platforms was verified. The authors of the paper provide the repository of software tools and programs used to perform the conducted experiments. he appropriate fast document comparison system. Its performance is given in the paper
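
A minimal sketch of the two parts mentioned above, winnowing-based feature extraction followed by a comparison of the resulting fingerprints, might look like this. It is a textbook-style illustration, not the authors' implementation: character-level n-grams, the CRC32 hash, the Jaccard comparison, and the default values of `k` and `w` are all assumptions.

```python
import zlib


def kgram_hashes(text, k):
    """Hash every k-character n-gram of the text after normalization."""
    normalized = "".join(ch.lower() for ch in text if ch.isalnum())
    return [zlib.crc32(normalized[i:i + k].encode("utf-8"))
            for i in range(len(normalized) - k + 1)]


def winnow(hashes, w):
    """Keep the minimum hash of each window of w consecutive hashes.

    Ties go to the rightmost minimum; a (position, hash) pair selected by
    several overlapping windows is stored only once.
    """
    fingerprints = set()
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        smallest = min(window)
        # Offset of the rightmost occurrence of the minimum in this window.
        offset = w - 1 - window[::-1].index(smallest)
        fingerprints.add((start + offset, smallest))
    return fingerprints


def similarity(text_a, text_b, k=5, w=4):
    """Jaccard similarity between the fingerprint hash sets of two texts."""
    fp_a = {h for _, h in winnow(kgram_hashes(text_a, k), w)}
    fp_b = {h for _, h in winnow(kgram_hashes(text_b, k), w)}
    if not fp_a and not fp_b:
        return 0.0
    return len(fp_a & fp_b) / len(fp_a | fp_b)


original = "The winnowing algorithm selects a subset of n-gram hashes."
reworded = "the WINNOWING algorithm, selects a subset of n-gram hashes!"
print(similarity(original, reworded))  # 1.0: differs only in case and punctuation
```

Larger windows keep fewer fingerprints, which lowers storage and comparison cost but can miss short matches; this is the kind of trade-off between detection quality and computational requirements that the abstract refers to.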

Computing and Informatics (E-Journal - Institute of Informatics, SAS, Bratislava)

Garbage collection auto-tuning for Java MapReduce on Multi-Cores

Author: Allen E.
Chu C.
Dean J.
DeWitt D.
Gavin Brown
George Kovoor
Jeremy Singer
Kovoor G.
M
Mikel Luján
Printezis T.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2011

MapReduce has been widely accepted as a simple programming pattern that can form the basis for efficient, large-scale, distributed data processing. The success of the MapReduce pattern has led to a variety of implementations for different computational scenarios. In this paper we present MRJ, a MapReduce Java framework for multi-core architectures. We evaluate its scalability on a four-core, hyperthreaded Intel Core i7 processor, using a set of standard MapReduce benchmarks. We investigate the significant impact that Java runtime garbage collection has on the performance and scalability of MRJ. We propose the use of memory management auto-tuning techniques based on machine learning. With our auto-tuning approach, we are able to achieve MRJ performance within 10% of optimal on 75% of our benchmark tests.

Crossref

Enlighten