Byte-Range Asynchronous Locking in Distributed Settings
This paper investigates a mutual exclusion algorithm for distributed systems. We introduce a new algorithm based on the Naimi-Trehel algorithm, retaining its distributed approach while allowing processes to request partial locks. Such ranged locks offer semantics close to POSIX file locking, where threads lock parts of a shared file. We evaluate our algorithm by comparing its performance with the original Naimi-Trehel algorithm and with a centralized mutual exclusion algorithm. The considered performance metric is the average time to obtain a lock.
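The POSIX byte-range semantics the paper mirrors can be illustrated with Python's `fcntl.lockf`, which locks only a slice of a file rather than the whole file. This is a minimal sketch of the locking model, not the paper's distributed algorithm; the file path is arbitrary:

```python
import fcntl
import os

# Create a 4 KiB scratch file whose byte ranges we will lock.
path = "/tmp/ranged_lock_demo"
with open(path, "wb") as f:
    f.write(b"\0" * 4096)

fd = os.open(path, os.O_RDWR)

# Lock bytes [0, 1024): other processes may still lock [1024, 4096).
fcntl.lockf(fd, fcntl.LOCK_EX, 1024, 0, os.SEEK_SET)

# A second, non-overlapping range can be held independently.
fcntl.lockf(fd, fcntl.LOCK_EX, 1024, 2048, os.SEEK_SET)

# Release both ranges.
fcntl.lockf(fd, fcntl.LOCK_UN, 1024, 0, os.SEEK_SET)
fcntl.lockf(fd, fcntl.LOCK_UN, 1024, 2048, os.SEEK_SET)
os.close(fd)
```

Because the two ranges are disjoint, a second process could have taken an exclusive lock on the middle range the whole time; this is exactly the concurrency that whole-file (or whole-resource Naimi-Trehel) locking forfeits.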
Virtual Machine Support for Many-Core Architectures: Decoupling Abstract from Concrete Concurrency Models
The upcoming many-core architectures require software developers to exploit
concurrency to utilize available computational power. Today's high-level
language virtual machines (VMs), which are a cornerstone of software
development, do not provide sufficient abstraction for concurrency concepts. We
analyze concrete and abstract concurrency models and identify the challenges
they impose for VMs. To provide sufficient concurrency support in VMs, we
propose to integrate concurrency operations into VM instruction sets.
Since there will always be VMs optimized for special purposes, our goal is to
develop a methodology to design instruction sets with concurrency support.
Therefore, we also propose a list of trade-offs that have to be investigated to
advise the design of such instruction sets.
As a first experiment, we implemented one instruction set extension for
shared memory and one for non-shared memory concurrency. From our experimental
results, we derived a list of requirements for a full-grown experimental
environment for further research.
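The two extension flavors mentioned above, shared-memory and non-shared-memory concurrency, can be contrasted in a small sketch. This is illustrative only: the paper targets VM instruction sets, whereas the sketch uses ordinary library primitives.

```python
import threading
import queue

# Shared-memory style: threads coordinate via a lock around shared state.
counter = 0
lock = threading.Lock()

def add_shared():
    global counter
    for _ in range(10_000):
        with lock:          # explicit mutual exclusion on shared state
            counter += 1

# Non-shared-memory style: workers only exchange messages.
inbox = queue.Queue()

def add_messages():
    for _ in range(10_000):
        inbox.put(1)        # send a message instead of mutating shared state

threads = [threading.Thread(target=add_shared) for _ in range(2)]
threads += [threading.Thread(target=add_messages) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sum(inbox.get() for _ in range(inbox.qsize()))
print(counter, total)  # both styles arrive at 20000
```

An instruction-set extension would make one of these coordination patterns a primitive of the VM rather than a library call, which is where the trade-off list in the paper applies.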
B+-tree Index Optimization by Exploiting Internal Parallelism of Flash-based Solid State Drives
Previous research addressed the potential problems of the hard-disk-oriented
design of DBMSs on flashSSDs. In this paper, we focus on exploiting the
potential benefits of flashSSDs. First, we examine the internal parallelism of
flashSSDs by running benchmarks on various devices. Then, we suggest
algorithm-design principles for benefiting from this internal parallelism. We
present a new I/O request concept, called psync I/O, that can exploit the
internal parallelism of flashSSDs within a single process. Based on
these ideas, we introduce B+-tree optimization methods in order to utilize
internal parallelism. By integrating the results of these methods, we present a
B+-tree variant, PIO B-tree. We confirmed that each optimization method
substantially enhances the index performance. Consequently, PIO B-tree enhanced
B+-tree's insert performance by a factor of up to 16.3, while improving
point-search performance by a factor of 1.2. The range search of PIO B-tree was
up to 5 times faster than that of the B+-tree. Moreover, PIO B-tree
outperformed other flash-aware indexes in various synthetic workloads. We also
confirmed that PIO B-tree outperforms B+-tree in index traces collected inside
the PostgreSQL DBMS with the TPC-C benchmark.
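psync I/O is the paper's own concept; the underlying idea of keeping several outstanding requests from one process can be approximated with threads and `os.pread`, as in this hedged sketch (file name and sizes are arbitrary):

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Write a small test file of 16 pages, 4 KiB each.
PAGE = 4096
path = "/tmp/parallel_read_demo"
with open(path, "wb") as f:
    for i in range(16):
        f.write(bytes([i]) * PAGE)

fd = os.open(path, os.O_RDONLY)

def read_page(page_no):
    # pread carries its own offset, so concurrent calls need no shared seek.
    return os.pread(fd, PAGE, page_no * PAGE)

# Issue all 16 page reads concurrently; a flashSSD can serve several of
# them in parallel across its internal channels, unlike one-at-a-time
# synchronous reads.
with ThreadPoolExecutor(max_workers=8) as pool:
    pages = list(pool.map(read_page, range(16)))

os.close(fd)
ok = all(pages[i][0] == i for i in range(16))
print(ok)  # True: every page came back from its own offset
```

A B+-tree can exploit the same effect by batching the leaf reads of a range scan into one parallel submission instead of following pointers one page at a time.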
Towards a Performance Model for Special Purpose ORB Middleware
General purpose middleware has been shown effective in meeting diverse functional requirements for a wide range of distributed systems. Advanced middleware projects have also supported single quality-of-service dimensions such as real-time, fault tolerance, or small memory footprint. However, there is limited experience supporting multiple quality-of-service dimensions in middleware to meet the needs of special purpose applications. Even though general purpose middleware can cover an entire spectrum of functionality by supporting the union of all features required by each application, this approach breaks down for distributed real-time and embedded systems. For example, the breadth of features supported may interfere with small memory footprint requirements. In this paper, we describe experiments comparing application-level and mechanism-level real-time performance of a representative sensor-network application running on three middleware alternatives: (1) a real-time object request broker (ORB) for small-footprint networked embedded sensor nodes, that we have named nORB, (2) TAO, a robust and widely-used general-purpose Real-Time CORBA ORB, and (3) ACE, the low-level middleware framework upon which both nORB and TAO are based. This paper makes two main contributions to the state of the art in customized middleware for distributed real-time and embedded applications. First, we present mechanism-level timing measurements for each of the alternative middleware layers and compare them to the observed performance of the sensor-network application. Second, we provide a preliminary performance model for the observed application timing behavior based on the mechanism-level measurements in each case, and suggest further potential performance optimizations that we plan to study as future work
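Mechanism-level timing of the kind described here boils down to sampling the latency of one operation many times and summarizing the distribution. A generic micro-benchmark harness might look like this sketch (the measured operation is a placeholder, not any ORB mechanism):

```python
import time
import statistics

def measure(op, iterations=10_000):
    """Time one operation repeatedly; report latency percentiles in ns."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        op()
        samples.append(time.perf_counter_ns() - start)
    samples.sort()
    return {
        "median_ns": samples[len(samples) // 2],
        "p99_ns": samples[int(len(samples) * 0.99)],
        "mean_ns": statistics.fmean(samples),
    }

# Stand-in for one mechanism-level step (marshalling, dispatch, ...):
stats = measure(lambda: None)
```

Reporting the tail (p99) alongside the median matters for real-time middleware, since worst-case rather than average latency drives schedulability.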
A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Data Grids have been adopted as the platform for scientific communities that
need to share, access, transport, process and manage large data collections
distributed worldwide. They combine high-end computing technologies with
high-performance networking and wide-area storage management techniques. In
this paper, we discuss the key concepts behind Data Grids and compare them with
other data sharing and distribution paradigms such as content delivery
networks, peer-to-peer networks and distributed databases. We then provide
comprehensive taxonomies that cover various aspects of architecture, data
transportation, data replication and resource allocation and scheduling.
Finally, we map the proposed taxonomy to various Data Grid systems not only to
validate the taxonomy but also to identify areas for future exploration.
Through this taxonomy, we aim to categorise existing systems to better
understand their goals and their methodology. This would help evaluate their
applicability for solving similar problems. This taxonomy also provides a "gap
analysis" of this area through which researchers can potentially identify new
issues for investigation. Finally, we hope that the proposed taxonomy and
mapping also helps to provide an easy way for new practitioners to understand
this complex area of research.
Exploiting method semantics in client cache consistency protocols for object-oriented databases
PhD thesis. Data-shipping systems are commonly used in client-server object-oriented databases. They are intended to utilise clients' resources and improve scalability by allowing clients to run transactions locally after fetching the required database items from the database server. A consequence of this is that a database item can be cached at more than one client, which raises issues of client cache consistency and concurrency control. A number of client cache consistency protocols have been studied, and some approaches to concurrency control for object-oriented databases have been proposed. Existing client consistency protocols, however, do not consider method semantics in concurrency control. This study proposes a client cache consistency protocol in which method semantics can be exploited in concurrency control. It identifies issues regarding the use of method semantics for the protocol and investigates its performance using simulation. The performance results show that this can yield gains over existing protocols. The study also shows the potential benefits of an asynchronous version of the protocol.
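Exploiting method semantics typically means letting invocations that commute proceed concurrently where plain read/write locks would conflict. A minimal sketch of such a compatibility check follows; the method names and table are illustrative, not the thesis's actual protocol:

```python
# Two invocations conflict only if they do not commute. increment and
# decrement commute with each other (the final value is order-independent),
# but a read does not commute with anything that writes. A lock manager
# can consult this table instead of classifying operations as read/write.
COMMUTES = {
    ("increment", "increment"): True,
    ("increment", "decrement"): True,
    ("decrement", "decrement"): True,
    ("read", "read"): True,
    ("read", "increment"): False,
    ("read", "decrement"): False,
}

def compatible(m1, m2):
    key = (m1, m2) if (m1, m2) in COMMUTES else (m2, m1)
    return COMMUTES.get(key, False)

# Two clients incrementing the same cached counter need not invalidate
# each other's cache entries:
print(compatible("increment", "decrement"))  # True
# ...but a reader must still see a consistent value:
print(compatible("increment", "read"))       # False
```

Under a read/write scheme both pairs above would conflict; the semantic table admits strictly more concurrency, which is the source of the reported performance gains.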
The Impact of a Memory-Centric Radio Network on Data-Access Performance
Future 5G-based mobile networks will be largely defined by virtualized network functions (VNFs). The related computing is moving to the cloud, where a set of servers runs all the software components of the VNFs. Such a software component can run on any server in the mobile network's cloud infrastructure. The servers conventionally communicate over a TCP/IP network.
To realize the planned low-latency use cases in 5G, some servers are placed in data centers near the end users (edge clouds). Many of these use cases involve data accesses from one VNF to another, or to other network elements. These accesses should take as little time as possible to stay within the stringent latency requirements of the new use cases. As a possible approach to this, a novel memory-centric platform was studied. The main ideas of the memory-centric platform are to collapse the hierarchy between volatile and persistent memory by utilizing non-volatile memory (NVM), and to use memory-semantic communication between computer components.
In this work, a surrogate memory-centric platform was set up as a storage system for VNFs, and the latency of data accesses from a VNF application was measured in different experiments. Measurements against a current platform showed that the memory-centric platform was significantly faster to access than the current, TCP/IP-based platform. Measurements of RAM access at different memory bandwidths within the memory-centric platform showed that the order of latency was roughly independent of the available memory bandwidth. These results suggest that a memory-centric platform is a promising alternative storage system for edge clouds. However, more research is needed on how other service qualities, such as low latency variation, are met by a memory-centric platform in a VNF environment.
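The gap between socket-based and memory-semantic access that the thesis measures can be reproduced in miniature: compare a loopback TCP round trip with a plain load from a shared memory mapping. This is a toy comparison on one host, not the thesis's hardware setup:

```python
import mmap
import socket
import time

def time_ns(op, n=1000):
    start = time.perf_counter_ns()
    for _ in range(n):
        op()
    return (time.perf_counter_ns() - start) / n

# TCP loopback round trip: small request, small reply.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
client = socket.create_connection(server.getsockname())
conn, _ = server.accept()

def tcp_access():
    client.sendall(b"get")
    conn.recv(16)
    conn.sendall(b"value")
    client.recv(16)

# Memory-semantic access: a direct load from a shared mapping.
shared = mmap.mmap(-1, 4096)
shared[:5] = b"value"

def mem_access():
    return shared[0:5]

tcp_ns = time_ns(tcp_access)
mem_ns = time_ns(mem_access)
client.close(); conn.close(); server.close()
print(mem_ns < tcp_ns)  # the memory load is far cheaper per access
```

Even with both endpoints in one process and no real network, the socket path pays for syscalls and protocol handling on every access, which is the overhead a memory-semantic fabric removes.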
Algorithm Libraries for Multi-Core Processors
By providing parallelized versions of established algorithm libraries, we make it easier for programmers to exploit the multiple cores of modern processors. The Multi-Core STL provides basic algorithms for internal memory, while the parallelized STXXL enables multi-core acceleration for algorithms on large data sets stored on disk. Some parallelized geometric algorithms are introduced into CGAL. Further, we design and implement sorting algorithms for huge data sets in distributed external memory.
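The libraries described are C++, but the drop-in structure of a parallel library sort, partition, sort chunks concurrently, merge, can be sketched in Python (note the GIL caveat in the comments; the sketch shows the structure, not the speedup):

```python
import heapq
import random
from concurrent.futures import ThreadPoolExecutor

def parallel_sorted(data, workers=4):
    """Sort chunks concurrently, then merge: the classic structure
    behind multi-core library sorts."""
    chunk = (len(data) + workers - 1) // workers
    parts = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    # CPython's GIL limits the gain for pure-Python sorted(); the
    # libraries in the paper run native code on real threads.
    with ThreadPoolExecutor(workers) as pool:
        sorted_parts = list(pool.map(sorted, parts))
    # k-way merge of the sorted runs, as in external-memory sorting.
    return list(heapq.merge(*sorted_parts))

data = [random.randrange(1_000_000) for _ in range(100_000)]
result = parallel_sorted(data)
print(result == sorted(data))  # True
```

The same chunk-sort-merge shape scales from in-memory multi-core sorting to the disk-resident and distributed external-memory sorting the abstract mentions, with runs sized to memory instead of core count.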
Distributed GraphLab: A Framework for Machine Learning in the Cloud
While high-level data parallel frameworks, like MapReduce, simplify the
design and implementation of large-scale data processing systems, they do not
naturally or efficiently support many important data mining and machine
learning algorithms and can lead to inefficient learning systems. To help fill
this critical void, we introduced the GraphLab abstraction which naturally
expresses asynchronous, dynamic, graph-parallel computation while ensuring data
consistency and achieving a high degree of parallel performance in the
shared-memory setting. In this paper, we extend the GraphLab framework to the
substantially more challenging distributed setting while preserving strong data
consistency guarantees. We develop graph based extensions to pipelined locking
and data versioning to reduce network congestion and mitigate the effect of
network latency. We also introduce fault tolerance to the GraphLab abstraction
using the classic Chandy-Lamport snapshot algorithm and demonstrate how it can
be easily implemented by exploiting the GraphLab abstraction itself. Finally,
we evaluate our distributed implementation of the GraphLab abstraction on a
large Amazon EC2 deployment and show 1-2 orders of magnitude performance gains
over Hadoop-based implementations.
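The Chandy-Lamport snapshot algorithm the paper reuses records each process's local state plus the messages in flight on its incoming channels, delimited by markers. A toy two-process simulation with FIFO channels (not GraphLab's implementation) shows the mechanics:

```python
from collections import deque

class Process:
    """A process whose state is an integer; messages add to it."""
    def __init__(self, name, state):
        self.name, self.state = name, state
        self.inbox = deque()          # FIFO channel into this process
        self.snapshot = None          # recorded local state
        self.channel_record = []      # in-flight messages captured
        self.recording = False

    def receive(self):
        msg = self.inbox.popleft()
        if msg == "MARKER":
            if self.snapshot is None:  # first marker: record local state
                self.snapshot = self.state
            self.recording = False     # marker closes the channel record
        else:
            if self.recording:
                self.channel_record.append(msg)
            self.state += msg

p, q = Process("p", 10), Process("q", 20)

# p initiates: records its own state, sends a marker to q, and starts
# recording its incoming channel.
p.snapshot = p.state
p.recording = True
q.inbox.append("MARKER")

# q had already sent 5 to p before seeing the marker: that message is
# in flight and belongs to the channel's recorded state.
p.inbox.append(5)
p.inbox.append("MARKER")   # q forwards the marker after processing it

q.receive()                 # q records state 20
p.receive()                 # p captures the in-flight 5
p.receive()                 # marker arrives; channel record is closed

print(p.snapshot, q.snapshot, p.channel_record)  # 10 20 [5]
```

The recovered global state (p=10, q=20, channel carrying 5) is consistent even though p's live state has already moved on to 15, which is exactly why the algorithm can run without stopping the computation.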