Improving I/O performance through an in-kernel disk simulator
This paper presents two mechanisms that can significantly improve the read I/O performance of both hard and solid-state drives: KDSim and REDCAP. KDSim is an in-kernel disk simulator that provides a framework for simultaneously simulating the performance obtained by different I/O system mechanisms and algorithms, and for dynamically turning them on and off, or selecting between different options or policies, to improve overall system performance. REDCAP is a RAM-based disk cache that effectively enlarges the built-in cache present in disk drives. Using KDSim, this cache is dynamically activated or deactivated according to the throughput achieved. Results show that, by using KDSim and REDCAP together, a system can improve its I/O performance by up to 88% for workloads with some spatial locality on both hard and solid-state drives, while achieving the same performance as a "regular" system for workloads with random or sequential access patterns.
Leveraging Program Analysis to Reduce User-Perceived Latency in Mobile Applications
Reducing network latency in mobile applications is an effective way of
improving the mobile user experience and has tangible economic benefits. This
paper presents PALOMA, a novel client-centric technique for reducing the
network latency by prefetching HTTP requests in Android apps. Our work
leverages string analysis and callback control-flow analysis to automatically
instrument apps using PALOMA's rigorous formulation of scenarios that address
"what" and "when" to prefetch. PALOMA has been shown to incur significant
runtime savings (several hundred milliseconds per prefetchable HTTP request),
both when applied on a reusable evaluation benchmark we have developed and on
real applications. Comment: ICSE 201
Three-dimensional memory vectorization for high bandwidth media memory systems
Vector processors have good performance, cost and adaptability when targeting multimedia applications. However, for a significant number of media programs, conventional memory configurations fail to deliver enough memory references per cycle to feed the SIMD functional units. This paper addresses the memory bandwidth problem. We propose a novel mechanism suitable for 2-dimensional vector architectures and targeted at providing high effective bandwidth for SIMD memory instructions. The basis of this mechanism is the extension of the scope of vectorization at the memory level, so that 3-dimensional memory patterns can be fetched into a second-level register file. By fetching long blocks of data and by reusing 2-dimensional memory streams at this second-level register file, we obtain a significant increase in the effective memory bandwidth. As side benefits, the new 3-dimensional load instructions provide high robustness to memory latency and a significant reduction in cache activity, thus reducing power and energy requirements. At the cost of 50% more area than a regular SIMD register file, we have measured an average speed-up of 13% and the potential for power savings of 30% in the L2 cache.
CloudTree: A Library to Extend Cloud Services for Trees
In this work, we propose a library that enables the creation and
management of tree data structures on a cloud from a cloud client. As a proof of concept,
we implement a new cloud service CloudTree. With CloudTree, users are able to
organize big data into tree data structures of their choice that are physically
stored in a cloud. We use caching, prefetching, and aggregation techniques in
the design and implementation of CloudTree to enhance performance. We have
implemented the services of Binary Search Trees (BST) and Prefix Trees as
current members in CloudTree and have benchmarked their performance using the
Amazon Cloud. The ideas and techniques in the design and implementation of the BST
and prefix tree are generic, and can thus also be used for other types of trees,
such as B-trees, and for other link-based data structures such as linked lists and
graphs. Preliminary experimental results show that CloudTree is useful and
efficient for various big data applications.
Survey of Branch Prediction, Pipelining, Memory Systems as Related to Computer Architecture
This paper is a survey of topics introduced in Computer Engineering Course CEC470: Computer Architecture (CEC470). The topics covered in this paper provide much more depth than what was provided in CEC470, in addition to exploring new concepts not touched on in the course. Topics presented include branch prediction, pipelining, registers, memory, and the operating system, as well as some general design considerations for computer architecture as a whole.
The design considerations explored include a discussion of the two instruction encodings of the ARM Instruction Set Architecture, known as ARM and Thumb, as well as an exploration of the differences between heterogeneous and homogeneous multi-processors.
Further sections explain the interoperability of various portions of the computer architecture with a focus on performance optimizations. Branch prediction is introduced, and the quality improvement it provides is detailed. An explanation of pipelining is given, followed by a discussion of why pipelining may be difficult on different types of processors. Registers, one of the fundamental parts of a computer, are explained in detail, as is their importance to computer systems as a whole.
The memory and operating systems sections tie this paper together by delving deeper into the architecture of computers, then resurfacing with how the software and hardware interact through the operating system.
This paper concludes by tying the discussed sections together and presenting the importance of computer architecture.
Performance analysis and improvement of PostgreSQL
PostgreSQL is a database management system used in many different applications throughout industry. As databases are often the bottleneck in application performance, their performance becomes crucial. Better performance can be achieved either by using more and faster hardware, or by making the software more efficient. In this master's thesis we perform a performance analysis of the PostgreSQL database server from the perspective of compiler optimizations, file systems, and software prefetching. We also show how a data structure used in PostgreSQL can benefit from manually introduced software prefetching, since it is hard for the compiler to predict cache misses and insert prefetch instructions in a profitable way. PostgreSQL is a popular database server used in large parts of industry. Database servers are used to store, process, and retrieve data for companies, organizations, and universities, which rely on them in their work. Often, however, these databases become the bottleneck for how quickly work can be done, which is why we have analyzed and improved a popular database server.