July 11th, 2017
Performance clearly matters to users. The most common software update on the App Store *by far* is "Bug fixes and performance enhancements." Now that the Moore's Law free lunch has ended, programmers have to work hard to get high performance for their applications. But why is performance so hard to deliver? I will first explain why our current approaches to evaluating and optimizing performance don't work, especially on modern hardware and for modern applications. I will then present two systems that address these challenges. Stabilizer is a tool that enables statistically sound performance evaluation, making it possible to understand the impact of optimizations and to conclude, for example, that the -O2 and -O3 optimization levels are statistically indistinguishable from noise (unfortunately true).
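A minimal sketch of the statistical side of this argument, assuming a hypothetical `workload` standing in for the program under test (Stabilizer's actual contribution, repeatedly re-randomizing code, stack, and heap layout during execution, is not shown): time many runs and compare configurations by confidence intervals rather than by single measurements.

```cpp
// Sound performance comparison needs distributions, not single numbers:
// run the workload many times and report a mean with a confidence interval.
#include <chrono>
#include <cmath>
#include <cstdio>
#include <vector>

static volatile long sink;  // keeps the compiler from deleting the work

static void workload() {    // hypothetical stand-in for the code under test
    long acc = 0;
    for (long i = 0; i < 50'000'000; ++i) acc += i % 7;
    sink = acc;
}

int main() {
    const int runs = 30;
    std::vector<double> samples;
    for (int i = 0; i < runs; ++i) {
        auto start = std::chrono::steady_clock::now();
        workload();
        auto stop = std::chrono::steady_clock::now();
        samples.push_back(std::chrono::duration<double>(stop - start).count());
    }
    double mean = 0;
    for (double s : samples) mean += s;
    mean /= runs;
    double var = 0;
    for (double s : samples) var += (s - mean) * (s - mean);
    var /= (runs - 1);
    // Normal-approximation 95% interval: mean +/- 1.96 * standard error.
    double half = 1.96 * std::sqrt(var / runs);
    std::printf("mean %.4f s, 95%% CI +/- %.4f s\n", mean, half);
    return 0;
}
```

Without layout re-randomization, such samples remain biased by incidental layout effects (link order, environment size, and so on), which is precisely the problem Stabilizer addresses.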
Since compiler optimizations have largely run out of steam, we need better profiling support, especially for modern concurrent, multi-threaded applications. Coz is a novel "causal profiler" that lets programmers optimize for throughput or latency, and that pinpoints and accurately predicts the impact of optimizations. Coz's approach unlocks numerous previously unknown optimization opportunities. Guided by Coz, we improved the performance of Memcached by 9% and SQLite by 25%, and accelerated six PARSEC applications by as much as 68%; in most cases, these optimizations involved modifying under 10 lines of code. This talk is based on work with Charlie Curtsinger published at ASPLOS 2013 (Stabilizer) and SOSP 2015 (Coz), which received a Best Paper Award and was selected as a CACM Research Highlight.
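As a concrete illustration of how a program is prepared for causal profiling, Coz's header (coz.h, shipped with the profiler) provides progress-point macros; the sketch below marks one completed unit of work per iteration so Coz can measure throughput. The `process` function is a hypothetical stand-in for real work.

```cpp
// Throughput profiling with Coz: mark each completed unit of work with a
// progress point. Coz virtually speeds up one line at a time and reports
// the predicted effect on the rate at which progress points are reached.
#include <coz.h>  // provided by Coz (https://github.com/plasma-umass/coz)

static long process(long item) {  // hypothetical stand-in for real work
    long acc = 0;
    for (long i = 0; i < item % 1000; ++i) acc += i;
    return acc;
}

int main() {
    long total = 0;
    for (long item = 0; item < 1'000'000; ++item) {
        total += process(item);
        COZ_PROGRESS;  // one transaction finished; Coz measures this rate
    }
    return total == 0;
}
```

Assuming a standard install, the program is built with debug information and run under the profiler, e.g. `g++ -g -O2 app.cpp -o app && coz run --- ./app` (exact link flags, such as -ldl for the dlsym-based progress points, vary by platform); Coz's COZ_BEGIN/COZ_END macros can be used instead to optimize the latency between two points.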
Mesh: automatically compacting memory for C/C++ applications
Programs written in C and C++ — and languages implemented in C, like Python and Ruby — can suffer from serious memory fragmentation, leading to low memory utilization, degraded performance, and application failure due to memory exhaustion. This talk introduces Mesh, a plug-in replacement for malloc that, for the first time, eliminates fragmentation in unmodified applications through compaction. A key challenge is that, unlike in garbage-collected environments, the addresses of allocated objects in C/C++ are directly exposed to programmers, and applications may do things like stash addresses in integers and store flags in the low bits of aligned addresses. This hostile environment makes it impossible to safely relocate objects, as the runtime cannot precisely locate and update pointers.
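The two pointer-hiding idioms named above can be made concrete with a short C++ sketch (an illustration, not code from Mesh) showing why a relocating runtime could never safely move the allocation:

```cpp
// Why relocation is unsafe in C/C++: these disguised pointers are invisible
// to the runtime, so it could never find and update them after moving p.
#include <cassert>
#include <cstdint>
#include <cstdlib>

int main() {
    void* p = std::malloc(64);

    // Idiom 1: stash the address in an integer; nothing marks it as a pointer.
    uintptr_t hidden = reinterpret_cast<uintptr_t>(p);

    // Idiom 2: store a flag in the low bit of the (16-byte-aligned) address.
    uintptr_t tagged = reinterpret_cast<uintptr_t>(p) | 0x1;

    // The program can recover the real pointer whenever it likes...
    void* recovered = reinterpret_cast<void*>(tagged & ~uintptr_t{1});
    assert(recovered == p);

    // ...so moving the object would silently leave `hidden` and `tagged`
    // dangling. Mesh therefore compacts memory without relocating objects.
    std::free(reinterpret_cast<void*>(hidden));
    return 0;
}
```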
30 Years of Software Refactoring Research: A Systematic Literature Review
Due to the growing complexity of software systems, the last ten years have seen a dramatic increase in research on, and industry demand for, tools and techniques for software refactoring, traditionally defined as a set of program transformations intended to improve the system design while preserving its behavior. Refactoring studies have expanded beyond code-level restructuring: refactoring is applied at different levels (architecture, model, requirements, etc.), adopted in many domains beyond the object-oriented paradigm (cloud computing, mobile, web, etc.), used in industrial settings, and aimed at objectives beyond improving the design, including other non-functional requirements (e.g., performance and security). The challenges addressed by refactoring work thus now extend beyond code transformation to include, among others, scheduling the opportune time to carry out refactoring, recommending specific refactoring activities, detecting refactoring opportunities, and testing the correctness of applied refactorings. As a result, refactoring research efforts are fragmented across several research communities, domains, and objectives. To structure the field and its existing results, this paper provides a systematic literature review, analyzing 3183 research papers on refactoring from the last three decades to offer the most comprehensive review of existing refactoring research studies. Based on this survey, we created a taxonomy to classify the existing research, identified research trends, and highlighted gaps in the literature and avenues for further research.
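To ground the traditional definition above, here is a minimal, hypothetical C++ example of one classic behavior-preserving transformation, Extract Function; both versions compute the same result, and only the structure changes:

```cpp
// Extract Function: give a tangled condition a name and a single home.
#include <vector>

// Before: the validity rule is buried inside the loop.
double invoiceTotalBefore(const std::vector<double>& prices) {
    double total = 0;
    for (double price : prices) {
        if (price > 0) total += price;  // skip bogus entries
    }
    return total;
}

// After: the rule is extracted; behavior is preserved.
static bool isValidPrice(double price) { return price > 0; }

double invoiceTotalAfter(const std::vector<double>& prices) {
    double total = 0;
    for (double price : prices) {
        if (isValidPrice(price)) total += price;
    }
    return total;
}
```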
GPU LSM: A Dynamic Dictionary Data Structure for the GPU
We develop a dynamic dictionary data structure for the GPU, supporting fast insertions and deletions, based on the Log-Structured Merge tree (LSM). Our implementation on an NVIDIA K40c GPU has an average update (insertion or deletion) rate of 225 M elements/s, 13.5x faster than merging items into a sorted array. The GPU LSM supports the retrieval operations lookup, count, and range query at average rates of 75 M, 32 M, and 23 M queries/s, respectively. The trade-off for dynamic updates is that the sorted array is almost twice as fast on retrievals. We believe that our GPU LSM is the first dynamic general-purpose dictionary data structure for the GPU. (11 pages; accepted to appear in the Proceedings of the IEEE International Parallel and Distributed Processing Symposium, IPDPS'18.)
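The log-structured merge idea behind these rates can be sketched sequentially (a simplified CPU illustration of the technique, keys only, without the paper's massively parallel batching or deletion handling): insert batches are sorted, then cascaded down levels of sorted runs, merging whenever a level is already occupied.

```cpp
// Minimal LSM dictionary: levels[i] is empty or a sorted run; an insert
// batch is sorted, then merged downward until it finds an empty level.
#include <algorithm>
#include <cstdio>
#include <iterator>
#include <vector>

struct LsmDictionary {
    std::vector<std::vector<int>> levels;

    void insertBatch(std::vector<int> batch) {
        std::sort(batch.begin(), batch.end());
        for (size_t i = 0; ; ++i) {
            if (i == levels.size()) levels.emplace_back();
            if (levels[i].empty()) {          // free slot: park the run here
                levels[i] = std::move(batch);
                return;
            }
            // Level occupied: merge both sorted runs and carry the result down.
            std::vector<int> merged;
            merged.reserve(levels[i].size() + batch.size());
            std::merge(levels[i].begin(), levels[i].end(),
                       batch.begin(), batch.end(), std::back_inserter(merged));
            levels[i].clear();
            batch = std::move(merged);
        }
    }

    bool lookup(int key) const {              // probe every non-empty run
        for (const auto& run : levels)
            if (std::binary_search(run.begin(), run.end(), key)) return true;
        return false;
    }
};

int main() {
    LsmDictionary d;
    d.insertBatch({5, 3, 9});
    d.insertBatch({1, 7, 2});
    std::printf("%d %d\n", d.lookup(7), d.lookup(4));  // prints "1 0"
}
```

Lookups must probe each run with a binary search, which is why a single sorted array is roughly twice as fast on retrievals (one probe instead of one per level) while the LSM wins decisively on updates.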
Washington University Record, September 22, 2000