774 research outputs found

Adaptive runtime-assisted block prefetching on chip-multiprocessors

Author: Carpenter Paul M.
García Flores Víctor
Navarro Nacho
Ramirez Alex
Rico Carro Alejandro
Villavieja Prados Carlos
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Memory stalls are a significant source of performance degradation in modern processors. Data prefetching is a widely adopted and well-studied technique for alleviating this problem. Prefetching can be performed by the hardware, or be initiated and controlled by software. Software-controlled prefetching covers a wide variety of schemes, including runtime-directed prefetching and, more specifically, runtime-directed block prefetching. This paper proposes a hybrid prefetching mechanism that integrates a software-driven block prefetcher with existing hardware prefetching techniques. Our runtime-assisted software prefetcher brings large blocks of data on-chip with the support of a low-cost hardware engine, and synergizes with existing hardware prefetchers that manage locality at a finer granularity. The runtime system that drives the prefetch engine dynamically selects which cache to prefetch to. Our evaluation on a set of scientific benchmarks obtains a maximum speedup of 32 % and an average speedup of 10 % compared to a baseline with hardware prefetching only. As a result, we also achieve a reduction in energy-to-solution of up to 18 %, and 3 % on average. Peer Reviewed. Postprint (author's final draft)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Architecture for Cooperative Prefetching in P2P Video-on- Demand System

Author: Abbasi Ubaid
Ahmed Toufik
Publication venue: 'Academy and Industry Research Collaboration Center (AIRCC)'
Publication date: 11/05/2010

Most P2P VoD schemes focus on service architectures and overlay optimization without considering segment rarity and the performance of prefetching strategies. As a result, they cannot adequately support VCR-oriented service in heterogeneous environments where clients use free VCR controls. Despite the remarkable popularity of VoD systems, there exists no prior work that studies the performance gap between different prefetching strategies. In this paper, we analyze the performance of different prefetching strategies. Our analytical characterization brings not only a better understanding of several fundamental tradeoffs in prefetching strategies, but also important insights into the design of P2P VoD systems. On the basis of this analysis, we propose a cooperative prefetching strategy called "cooching". In this strategy, the segments requested in VCR interactivities are prefetched into the session beforehand using the information collected through gossips. We evaluate our strategy through extensive simulations. The results indicate that the proposed strategy outperforms the existing prefetching mechanisms. Comment: 13 Pages, IJCN

arXiv.org e-Print Archive

CiteSeerX

Crossref

Prefetching Schemes and Performance Analysis for TV on Demand Services

Author: Arvidsson Åke
Du Manxing
Gavler Anders
Kihl Maria
Lagerstedt Christina
Zhang Huimin
Publication venue: IARIA
Publication date: 01/01/2015

TV-on-Demand services have become one of the most popular Internet applications and continuously attract high user interest. With rapidly increasing user demand, existing network conditions may not be able to ensure a low start-up delay for video playback. Prefetching has been broadly investigated as a way to cope with this start-up latency, which is also known as user-perceived latency. In this paper, two datasets from different IPTV providers are used to analyse TV program request patterns. Based on the results, we propose a prefetching scheme at the user end that preloads videos before the user requests them. For both datasets, our prefetching scheme significantly improves the cache hit ratio compared to passive caching, and we note that prefetching performance could be improved further by customizing prefetching schemes for different video categories. We further present a cost model to determine the optimal number of videos to prefetch. We also discuss whether there is enough time for prefetching. Finally, we discuss further factors that may affect prefetching performance, such as the jump patterns at different times of day and the distribution of each video's viewing length.

Lund University Publications
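
The comparison between passive caching and user-end prefetching in the abstract above can be illustrated with a toy replay. This is a minimal sketch under stated assumptions, not the scheme from the paper: the "preload the next episode of the same series" rule, the LRU eviction policy, the `hit_ratio` function, and the request trace are all hypothetical and chosen only to show how a hit ratio is measured.

```python
from collections import OrderedDict


def hit_ratio(requests, capacity, prefetch_next=False):
    """Replay `requests` against an LRU cache and return the hit ratio.

    requests: list of (series, episode) tuples (hypothetical trace format).
    prefetch_next: if True, also preload the next episode of the same
    series after every request (an assumed rule, for illustration only).
    """
    cache = OrderedDict()  # keys ordered from least to most recently used

    def insert(video):
        cache[video] = None
        cache.move_to_end(video)
        while len(cache) > capacity:
            cache.popitem(last=False)  # evict the least recently used video

    hits = 0
    for series, episode in requests:
        if (series, episode) in cache:
            hits += 1
        insert((series, episode))
        if prefetch_next:
            insert((series, episode + 1))
    return hits / len(requests)


# Made-up trace: a viewer watches a drama in order, with a news item repeated.
requests = [("news", 1), ("drama", 1), ("drama", 2),
            ("drama", 3), ("news", 1), ("drama", 4)]
passive = hit_ratio(requests, capacity=4)
prefetching = hit_ratio(requests, capacity=4, prefetch_next=True)
print(f"passive: {passive:.2f}, prefetching: {prefetching:.2f}")
```

On this trace the prefetching variant hits on the sequential drama episodes that passive caching has never seen, but it also evicts the repeated news item, which hints at why the number of videos to prefetch needs a cost model.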

Leveraging Program Analysis to Reduce User-Perceived Latency in Mobile Applications

Author: Mickens James W
Netravali Ravi
Ossa B De La
PRESTO
Ravindranath Lenin
Wang Xiao Sophia
Wu C.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 20/10/2018

Reducing network latency in mobile applications is an effective way of improving the mobile user experience and has tangible economic benefits. This paper presents PALOMA, a novel client-centric technique for reducing network latency by prefetching HTTP requests in Android apps. Our work leverages string analysis and callback control-flow analysis to automatically instrument apps using PALOMA's rigorous formulation of scenarios that address "what" and "when" to prefetch. PALOMA has been shown to yield significant runtime savings (several hundred milliseconds per prefetchable HTTP request), both when applied to a reusable evaluation benchmark we have developed and to real applications. Comment: ICSE 201

arXiv.org e-Print Archive

Crossref

A Video Timeline with Bookmarks and Prefetch State for Faster Video Browsing

Author: Carlier Axel
Charvillat Vincent
Ooi Wei Tsang
Publication venue: HAL CCSD
Publication date: 01/01/2015

Reducing seek latency by predicting what users will access is important for user experience, particularly during video browsing, where users seek frequently to skim through a video. Much existing research has striven to predict user access patterns more accurately to improve the prefetching hit rate. This paper proposes a different approach whereby the prefetch hit rate is improved by biasing users to seek to prefetched content with higher probability, through changing the video player user interface. Through a user study, we demonstrate that our player interface can lead to up to $4\times$ more seeks to bookmarked segments and reduce seek latency by 40%, compared to a video player interface commonly used today. The user study also shows that the user experience and the understanding of the video content when browsing are not compromised by the changes in seek behavior.

Crossref

Scientific Publications of the University of Toulouse II Le Mirail

Open Archive Toulouse Archive Ouverte

Prefetching techniques for client server object-oriented database systems

Author: Knafla Nils
Publication venue: The University of Edinburgh
Publication date: 01/01/1999

The performance of many object-oriented database applications suffers from page fetch latency, which is determined by the expense of disk access. In this work we suggest several prefetching techniques to avoid, or at least to reduce, page fetch latency. In practice no prediction technique is perfect and no prefetching technique can entirely eliminate the delay due to page fetch latency. Therefore we are interested in the trade-off between the level of accuracy required for obtaining good results in terms of elapsed time reduction and the processing overhead needed to achieve this level of accuracy. If prefetching accuracy is high, the total elapsed time of an application can be reduced significantly. If prefetching accuracy is low, many incorrect pages are prefetched and the extra load on the client, network, server and disks decreases the performance of the whole system. Access patterns of object-oriented databases are often complex and usually hard to predict accurately. The ..

CiteSeerX

Edinburgh Research Archive

When Backpressure Meets Predictive Scheduling

Author: Chen Minghua
Huang Longbo
Liu Xin
Zhang Shaoquan
Publication date: 04/09/2013

Motivated by the increasing popularity of learning and predicting human user behavior in communication and computing systems, in this paper, we investigate the fundamental benefit of predictive scheduling, i.e., predicting and pre-serving arrivals, in controlled queueing systems. Based on a lookahead window prediction model, we first establish a novel equivalence between the predictive queueing system with a fully-efficient scheduling scheme and an equivalent queueing system without prediction. This connection allows us to analytically demonstrate that predictive scheduling necessarily improves system delay performance and can drive it to zero with increasing prediction power. We then propose the Predictive Backpressure (PBP) algorithm for achieving optimal utility performance in such predictive systems. PBP efficiently incorporates prediction into stochastic system control and avoids the great complication due to the exponential state space growth in the prediction window size. We show that PBP can achieve a utility performance that is within $O(\epsilon)$ of the optimal, for any $\epsilon > 0$, while guaranteeing that the system delay distribution is a shifted-to-the-left version of that under the original Backpressure algorithm. Hence, the average packet delay under PBP is strictly better than that under Backpressure, and vanishes with increasing prediction window size. This implies that the resulting utility-delay tradeoff with predictive scheduling beats the known optimal $[O(\epsilon), O(\log(1/\epsilon))]$ tradeoff for systems without prediction.

arXiv.org e-Print Archive

CiteSeerX

Crossref
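
The idea of pre-serving arrivals inside a lookahead window, as described in the abstract above, can be sketched with a single discrete-time queue. This is not the PBP algorithm: it is a toy model that assumes perfect prediction, one FIFO queue, and a fixed service rate, and the function name `average_delay` and the arrival trace are hypothetical. It only illustrates how average delay shrinks as the window grows.

```python
def average_delay(arrivals, rate, window):
    """Average per-packet delay in a single FIFO queue.

    arrivals[t] is the number of packets that arrive in slot t.
    The server handles up to `rate` packets per slot and may pre-serve
    packets that will arrive up to `window` slots in the future
    (window=0 means no prediction). Perfect prediction is assumed.
    """
    # One entry per packet: the slot in which it actually arrives.
    pending = [t for t, count in enumerate(arrivals) for _ in range(count)]
    total_packets = len(pending)
    total_delay = 0
    head = 0  # index of the next unserved packet
    slot = 0
    while head < total_packets:
        served = 0
        while (served < rate and head < total_packets
               and pending[head] <= slot + window):
            # A pre-served packet experiences zero delay.
            total_delay += max(0, slot - pending[head])
            head += 1
            served += 1
        slot += 1
    return total_delay / total_packets


# Made-up bursty trace: two bursts of three packets, service rate of one.
arrivals = [0, 0, 3, 0, 0, 3]
for w in (0, 1, 2):
    print(w, average_delay(arrivals, rate=1, window=w))
```

With this trace the server is idle before each burst, so a window of two slots lets it absorb every burst ahead of time and the average delay reaches zero.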

IMP: Indirect Memory Prefetcher

Author: Devadas Srinivas
Hughes Christopher J.
Satish Nadathur
Yu Xiangyao
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/12/2015
Field of study

Machine learning, graph analytics and sparse linear algebra-based applications are dominated by irregular memory accesses resulting from following edges in a graph or non-zero elements in a sparse matrix. These accesses have little temporal or spatial locality, and thus incur long memory stalls and large bandwidth requirements. A traditional streaming or striding prefetcher cannot capture these irregular access patterns. A majority of these irregular accesses come from indirect patterns of the form A[B[i]]. We propose an efficient hardware indirect memory prefetcher (IMP) to capture this access pattern and hide latency. We also propose a partial cacheline accessing mechanism for these prefetches to reduce the network and DRAM bandwidth pressure from the lack of spatial locality. Evaluated on 7 applications, IMP shows 56% speedup on average (up to 2.3×) compared to a baseline 64 core system with streaming prefetchers. This is within 23% of an idealized system. With partial cacheline accessing, we see another 9.4% speedup on average (up to 46.6%). Intel Science and Technology Center for Big Dat

DSpace@MIT
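
The indirect pattern A[B[i]] named in the abstract above can be illustrated with a small software model. This is a sketch under stated assumptions and not the IMP hardware design: the cache is a tiny LRU set of lines, prefetches are assumed to complete instantly, accesses to B itself are not modelled, and `LINE_WORDS`, `count_a_misses`, and the index values are hypothetical.

```python
from collections import OrderedDict

LINE_WORDS = 8  # assumed number of array elements per cache line


def count_a_misses(index_array, capacity_lines, prefetch_distance=0):
    """Count demand misses on A for the loop `for i: use A[B[i]]`.

    index_array plays the role of B. With prefetch_distance > 0, each
    iteration i also prefetches the line holding A[B[i + distance]],
    which is the address an indirect prefetcher would compute.
    """
    cache = OrderedDict()  # line numbers, least to most recently used

    def touch(line):
        cache[line] = None
        cache.move_to_end(line)
        while len(cache) > capacity_lines:
            cache.popitem(last=False)  # evict the least recently used line

    misses = 0
    for i, idx in enumerate(index_array):
        line = idx // LINE_WORDS
        if line not in cache:
            misses += 1
        touch(line)
        ahead = i + prefetch_distance
        if prefetch_distance > 0 and ahead < len(index_array):
            touch(index_array[ahead] // LINE_WORDS)
    return misses


# Made-up index values that each fall in a different line of A, so a
# streaming or striding prefetcher would find no pattern to follow.
B = [40, 3, 97, 58, 12, 71, 25, 86]
print(count_a_misses(B, capacity_lines=4))                       # no prefetch
print(count_a_misses(B, capacity_lines=4, prefetch_distance=2))  # indirect
```

In this model every access misses without prefetching, while reading B two iterations ahead leaves only the first two accesses uncovered. A real prefetcher must also account for memory latency when choosing the distance.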