Byte-Range Asynchronous Locking in Distributed Settings
This paper investigates a mutual exclusion algorithm for distributed systems. We introduce a new algorithm based on the Naimi-Trehel algorithm, retaining its distributed approach while allowing processes to request partial locks. Such ranged locks offer semantics close to POSIX file locking, where threads lock parts of a shared file. We evaluate our algorithm by comparing its performance with the original Naimi-Trehel algorithm and with a centralized mutual exclusion algorithm. The considered performance metric is the average time to obtain a lock.
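The POSIX byte-range semantics the paper mirrors can be illustrated with Python's `fcntl.lockf`, which locks only a slice of a file rather than the whole file. This is a minimal sketch of the locking model, not the paper's distributed algorithm; the file path is arbitrary:

```python
import fcntl
import os

# Create a 4 KiB scratch file whose byte ranges we will lock.
path = "/tmp/ranged_lock_demo"
with open(path, "wb") as f:
    f.write(b"\0" * 4096)

fd = os.open(path, os.O_RDWR)

# Lock bytes [0, 1024): other processes may still lock [1024, 4096).
fcntl.lockf(fd, fcntl.LOCK_EX, 1024, 0, os.SEEK_SET)

# A second, non-overlapping range can be held independently.
fcntl.lockf(fd, fcntl.LOCK_EX, 1024, 2048, os.SEEK_SET)

# Release both ranges.
fcntl.lockf(fd, fcntl.LOCK_UN, 1024, 0, os.SEEK_SET)
fcntl.lockf(fd, fcntl.LOCK_UN, 1024, 2048, os.SEEK_SET)
os.close(fd)
```

Because the two ranges are disjoint, a second process could have taken an exclusive lock on the middle range the whole time; this is exactly the concurrency that whole-file (or whole-resource Naimi-Trehel) locking forfeits.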
Virtual Machine Support for Many-Core Architectures: Decoupling Abstract from Concrete Concurrency Models
The upcoming many-core architectures require software developers to exploit
concurrency to utilize available computational power. Today's high-level
language virtual machines (VMs), which are a cornerstone of software
development, do not provide sufficient abstraction for concurrency concepts. We
analyze concrete and abstract concurrency models and identify the challenges
they impose for VMs. To provide sufficient concurrency support in VMs, we
propose to integrate concurrency operations into VM instruction sets.
Since there will always be VMs optimized for special purposes, our goal is to
develop a methodology to design instruction sets with concurrency support.
Therefore, we also propose a list of trade-offs that have to be investigated to
advise the design of such instruction sets.
As a first experiment, we implemented one instruction set extension for
shared memory and one for non-shared memory concurrency. From our experimental
results, we derived a list of requirements for a full-grown experimental
environment for further research.
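The two extension flavors mentioned above, shared-memory and non-shared-memory concurrency, can be contrasted in a small sketch. This is illustrative only: the paper targets VM instruction sets, whereas the sketch uses ordinary library primitives.

```python
import threading
import queue

# Shared-memory style: threads coordinate via a lock around shared state.
counter = 0
lock = threading.Lock()

def add_shared():
    global counter
    for _ in range(10_000):
        with lock:          # explicit mutual exclusion on shared state
            counter += 1

# Non-shared-memory style: workers only exchange messages.
inbox = queue.Queue()

def add_messages():
    for _ in range(10_000):
        inbox.put(1)        # send a message instead of mutating shared state

threads = [threading.Thread(target=add_shared) for _ in range(2)]
threads += [threading.Thread(target=add_messages) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = sum(inbox.get() for _ in range(inbox.qsize()))
print(counter, total)  # both styles arrive at 20000
```

An instruction-set extension would make one of these coordination patterns a primitive of the VM rather than a library call, which is where the trade-off list in the paper applies.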
B+-tree Index Optimization by Exploiting Internal Parallelism of Flash-based Solid State Drives
Previous research addressed the potential problems of the hard-disk-oriented
design of DBMSs on flashSSDs. In this paper, we focus on exploiting the
potential benefits of flashSSDs. First, we examine the internal parallelism of
flashSSDs by running benchmarks on various devices. Then, we suggest
algorithm-design principles for benefiting from this internal parallelism. We
present a new I/O request concept, called psync I/O, that can exploit the
internal parallelism of flashSSDs within a single process. Based on
these ideas, we introduce B+-tree optimization methods in order to utilize
internal parallelism. By integrating the results of these methods, we present a
B+-tree variant, PIO B-tree. We confirmed that each optimization method
substantially enhances the index performance. Consequently, PIO B-tree enhanced
B+-tree's insert performance by a factor of up to 16.3, while improving
point-search performance by a factor of 1.2. The range search of PIO B-tree was
up to 5 times faster than that of the B+-tree. Moreover, PIO B-tree
outperformed other flash-aware indexes in various synthetic workloads. We also
confirmed that PIO B-tree outperforms B+-tree in index traces collected inside
the PostgreSQL DBMS with the TPC-C benchmark.
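psync I/O is the paper's own concept; the underlying idea of keeping several outstanding requests from one process can be approximated with threads and `os.pread`, as in this hedged sketch (file name and sizes are arbitrary):

```python
import os
from concurrent.futures import ThreadPoolExecutor

# Write a small test file of 16 pages, 4 KiB each.
PAGE = 4096
path = "/tmp/parallel_read_demo"
with open(path, "wb") as f:
    for i in range(16):
        f.write(bytes([i]) * PAGE)

fd = os.open(path, os.O_RDONLY)

def read_page(page_no):
    # pread carries its own offset, so concurrent calls need no shared seek.
    return os.pread(fd, PAGE, page_no * PAGE)

# Issue all 16 page reads concurrently; a flashSSD can serve several of
# them in parallel across its internal channels, unlike one-at-a-time
# synchronous reads.
with ThreadPoolExecutor(max_workers=8) as pool:
    pages = list(pool.map(read_page, range(16)))

os.close(fd)
ok = all(pages[i][0] == i for i in range(16))
print(ok)  # True: every page came back from its own offset
```

A B+-tree can exploit the same effect by batching the leaf reads of a range scan into one parallel submission instead of following pointers one page at a time.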
Towards a Performance Model for Special Purpose ORB Middleware
General purpose middleware has been shown effective in meeting diverse functional requirements for a wide range of distributed systems. Advanced middleware projects have also supported single quality-of-service dimensions such as real-time, fault tolerance, or small memory footprint. However, there is limited experience supporting multiple quality-of-service dimensions in middleware to meet the needs of special purpose applications. Even though general purpose middleware can cover an entire spectrum of functionality by supporting the union of all features required by each application, this approach breaks down for distributed real-time and embedded systems. For example, the breadth of features supported may interfere with small memory footprint requirements. In this paper, we describe experiments comparing application-level and mechanism-level real-time performance of a representative sensor-network application running on three middleware alternatives: (1) a real-time object request broker (ORB) for small-footprint networked embedded sensor nodes, that we have named nORB, (2) TAO, a robust and widely-used general-purpose Real-Time CORBA ORB, and (3) ACE, the low-level middleware framework upon which both nORB and TAO are based. This paper makes two main contributions to the state of the art in customized middleware for distributed real-time and embedded applications. First, we present mechanism-level timing measurements for each of the alternative middleware layers and compare them to the observed performance of the sensor-network application. Second, we provide a preliminary performance model for the observed application timing behavior based on the mechanism-level measurements in each case, and suggest further potential performance optimizations that we plan to study as future work
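Mechanism-level timing of the kind described here boils down to sampling the latency of one operation many times and summarizing the distribution. A generic micro-benchmark harness might look like this sketch (the measured operation is a placeholder, not any ORB mechanism):

```python
import time
import statistics

def measure(op, iterations=10_000):
    """Time one operation repeatedly; report latency percentiles in ns."""
    samples = []
    for _ in range(iterations):
        start = time.perf_counter_ns()
        op()
        samples.append(time.perf_counter_ns() - start)
    samples.sort()
    return {
        "median_ns": samples[len(samples) // 2],
        "p99_ns": samples[int(len(samples) * 0.99)],
        "mean_ns": statistics.fmean(samples),
    }

# Stand-in for one mechanism-level step (marshalling, dispatch, ...):
stats = measure(lambda: None)
```

Reporting the tail (p99) alongside the median matters for real-time middleware, since worst-case rather than average latency drives schedulability.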
A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Data Grids have been adopted as the platform for scientific communities that
need to share, access, transport, process and manage large data collections
distributed worldwide. They combine high-end computing technologies with
high-performance networking and wide-area storage management techniques. In
this paper, we discuss the key concepts behind Data Grids and compare them with
other data sharing and distribution paradigms such as content delivery
networks, peer-to-peer networks and distributed databases. We then provide
comprehensive taxonomies that cover various aspects of architecture, data
transportation, data replication and resource allocation and scheduling.
Finally, we map the proposed taxonomy to various Data Grid systems not only to
validate the taxonomy but also to identify areas for future exploration.
Through this taxonomy, we aim to categorise existing systems to better
understand their goals and their methodology. This would help evaluate their
applicability for solving similar problems. This taxonomy also provides a "gap
analysis" of this area through which researchers can potentially identify new
issues for investigation. Finally, we hope that the proposed taxonomy and
mapping also helps to provide an easy way for new practitioners to understand
this complex area of research.
Exploiting method semantics in client cache consistency protocols for object-oriented databases
PhD thesis. Data-shipping systems are commonly used in client-server object-oriented databases. They are intended to utilise clients' resources and improve scalability by allowing clients to run transactions locally after fetching the required database items from the database server. A consequence of this is that a database item can be cached at more than one client, which raises issues of client cache consistency and concurrency control. A number of client cache consistency protocols have been studied, and some approaches to concurrency control for object-oriented databases have been proposed. Existing client consistency protocols, however, do not consider method semantics in concurrency control. This study proposes a client cache consistency protocol in which method semantics can be exploited in concurrency control. It identifies issues regarding the use of method semantics for the protocol and investigates its performance using simulation. The performance results show that this can yield gains over existing protocols. The study also shows the potential benefits of an asynchronous version of the protocol.
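Exploiting method semantics typically means letting invocations that commute proceed concurrently where plain read/write locks would conflict. A minimal sketch of such a compatibility check follows; the method names and table are illustrative, not the thesis's actual protocol:

```python
# Two invocations conflict only if they do not commute. increment and
# decrement commute with each other (the final value is order-independent),
# but a read does not commute with anything that writes. A lock manager
# can consult this table instead of classifying operations as read/write.
COMMUTES = {
    ("increment", "increment"): True,
    ("increment", "decrement"): True,
    ("decrement", "decrement"): True,
    ("read", "read"): True,
    ("read", "increment"): False,
    ("read", "decrement"): False,
}

def compatible(m1, m2):
    key = (m1, m2) if (m1, m2) in COMMUTES else (m2, m1)
    return COMMUTES.get(key, False)

# Two clients incrementing the same cached counter need not invalidate
# each other's cache entries:
print(compatible("increment", "decrement"))  # True
# ...but a reader must still see a consistent value:
print(compatible("increment", "read"))       # False
```

Under a read/write scheme both pairs above would conflict; the semantic table admits strictly more concurrency, which is the source of the reported performance gains.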
The Impact of a Memory-Centric Radio Network on Data-Access Performance
Future 5G-based mobile networks will be largely defined by virtualized network functions (VNFs). The related computing is moving to the cloud, where a set of servers runs all the software components of the VNFs. Such a software component can run on any server in the mobile network's cloud infrastructure. The servers conventionally communicate over a TCP/IP network.
To realize the planned low-latency use cases in 5G, some servers are placed in data centers near the end users (edge clouds). Many of these use cases involve data accesses from one VNF to another, or to other network elements. These accesses should take as little time as possible to stay within the stringent latency requirements of the new use cases. As a possible approach to this, a novel memory-centric platform was studied. The main ideas of the memory-centric platform are to collapse the hierarchy between volatile and persistent memory by utilizing non-volatile memory (NVM), and to use memory-semantic communication between computer components.
In this work, a surrogate memory-centric platform was set up as a storage system for VNFs, and the latency of data accesses from a VNF application was measured in different experiments. Measurements against a current platform showed that the memory-centric platform was significantly faster to access than the current, TCP/IP-based platform. Measurements of RAM access at different memory bandwidths within the memory-centric platform showed that the order of latency was roughly independent of the available memory bandwidth. These results suggest that a memory-centric platform is a promising alternative storage system for edge clouds. However, more research is needed on how other service qualities, such as low latency variation, are met by a memory-centric platform in a VNF environment.
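The gap between socket-based and memory-semantic access that the thesis measures can be reproduced in miniature: compare a loopback TCP round trip with a plain load from a shared memory mapping. This is a toy comparison on one host, not the thesis's hardware setup:

```python
import mmap
import socket
import time

def time_ns(op, n=1000):
    start = time.perf_counter_ns()
    for _ in range(n):
        op()
    return (time.perf_counter_ns() - start) / n

# TCP loopback round trip: small request, small reply.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
client = socket.create_connection(server.getsockname())
conn, _ = server.accept()

def tcp_access():
    client.sendall(b"get")
    conn.recv(16)
    conn.sendall(b"value")
    client.recv(16)

# Memory-semantic access: a direct load from a shared mapping.
shared = mmap.mmap(-1, 4096)
shared[:5] = b"value"

def mem_access():
    return shared[0:5]

tcp_ns = time_ns(tcp_access)
mem_ns = time_ns(mem_access)
client.close(); conn.close(); server.close()
print(mem_ns < tcp_ns)  # the memory load is far cheaper per access
```

Even with both endpoints in one process and no real network, the socket path pays for syscalls and protocol handling on every access, which is the overhead a memory-semantic fabric removes.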
Algorithm Libraries for Multi-Core Processors
By providing parallelized versions of established algorithm libraries, we make it easier for programmers to exploit the multiple cores of modern processors. The Multi-Core STL provides basic algorithms for internal memory, while the parallelized STXXL enables multi-core acceleration for algorithms on large data sets stored on disk. Some parallelized geometric algorithms are introduced into CGAL. Further, we design and implement sorting algorithms for huge data sets in distributed external memory.
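The libraries described are C++, but the drop-in structure of a parallel library sort, partition, sort chunks concurrently, merge, can be sketched in Python (note the GIL caveat in the comments; the sketch shows the structure, not the speedup):

```python
import heapq
import random
from concurrent.futures import ThreadPoolExecutor

def parallel_sorted(data, workers=4):
    """Sort chunks concurrently, then merge: the classic structure
    behind multi-core library sorts."""
    chunk = (len(data) + workers - 1) // workers
    parts = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    # CPython's GIL limits the gain for pure-Python sorted(); the
    # libraries in the paper run native code on real threads.
    with ThreadPoolExecutor(workers) as pool:
        sorted_parts = list(pool.map(sorted, parts))
    # k-way merge of the sorted runs, as in external-memory sorting.
    return list(heapq.merge(*sorted_parts))

data = [random.randrange(1_000_000) for _ in range(100_000)]
result = parallel_sorted(data)
print(result == sorted(data))  # True
```

The same chunk-sort-merge shape scales from in-memory multi-core sorting to the disk-resident and distributed external-memory sorting the abstract mentions, with runs sized to memory instead of core count.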
Distributed GraphLab: A Framework for Machine Learning in the Cloud
While high-level data parallel frameworks, like MapReduce, simplify the
design and implementation of large-scale data processing systems, they do not
naturally or efficiently support many important data mining and machine
learning algorithms and can lead to inefficient learning systems. To help fill
this critical void, we introduced the GraphLab abstraction which naturally
expresses asynchronous, dynamic, graph-parallel computation while ensuring data
consistency and achieving a high degree of parallel performance in the
shared-memory setting. In this paper, we extend the GraphLab framework to the
substantially more challenging distributed setting while preserving strong data
consistency guarantees. We develop graph based extensions to pipelined locking
and data versioning to reduce network congestion and mitigate the effect of
network latency. We also introduce fault tolerance to the GraphLab abstraction
using the classic Chandy-Lamport snapshot algorithm and demonstrate how it can
be easily implemented by exploiting the GraphLab abstraction itself. Finally,
we evaluate our distributed implementation of the GraphLab abstraction on a
large Amazon EC2 deployment and show 1-2 orders of magnitude performance gains
over Hadoop-based implementations.
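The Chandy-Lamport snapshot algorithm the paper reuses records each process's local state plus the messages in flight on its incoming channels, delimited by markers. A toy two-process simulation with FIFO channels (not GraphLab's implementation) shows the mechanics:

```python
from collections import deque

class Process:
    """A process whose state is an integer; messages add to it."""
    def __init__(self, name, state):
        self.name, self.state = name, state
        self.inbox = deque()          # FIFO channel into this process
        self.snapshot = None          # recorded local state
        self.channel_record = []      # in-flight messages captured
        self.recording = False

    def receive(self):
        msg = self.inbox.popleft()
        if msg == "MARKER":
            if self.snapshot is None:  # first marker: record local state
                self.snapshot = self.state
            self.recording = False     # marker closes the channel record
        else:
            if self.recording:
                self.channel_record.append(msg)
            self.state += msg

p, q = Process("p", 10), Process("q", 20)

# p initiates: records its own state, sends a marker to q, and starts
# recording its incoming channel.
p.snapshot = p.state
p.recording = True
q.inbox.append("MARKER")

# q had already sent 5 to p before seeing the marker: that message is
# in flight and belongs to the channel's recorded state.
p.inbox.append(5)
p.inbox.append("MARKER")   # q forwards the marker after processing it

q.receive()                 # q records state 20
p.receive()                 # p captures the in-flight 5
p.receive()                 # marker arrives; channel record is closed

print(p.snapshot, q.snapshot, p.channel_record)  # 10 20 [5]
```

The recovered global state (p=10, q=20, channel carrying 5) is consistent even though p's live state has already moved on to 15, which is exactly why the algorithm can run without stopping the computation.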