Search CORE

66 research outputs found

A peer to peer approach to large scale information monitoring

Author: Liu Ling
Publication venue: Georgia Institute of Technology
Publication date: 31/10/2008
Field of study

Issued as final reportNational Science Foundation (U.S.

Scholarly Materials And Research @ Georgia Tech

String Matching with Multicore CPUs: Performing Better with the Aho-Corasick Algorithm

Author: Arudchutha S.
Nishanthy T.
Ragel R. G.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 05/03/2014
Field of study

Multiple string matching is known as locating all the occurrences of a given number of patterns in an arbitrary string. It is used in bio-computing applications where the algorithms are commonly used for retrieval of information such as sequence analysis and gene/protein identification. Extremely large amount of data in the form of strings has to be processed in such bio-computing applications. Therefore, improving the performance of multiple string matching algorithms is always desirable. Multicore architectures are capable of providing better performance by parallelizing the multiple string matching algorithms. The Aho-Corasick algorithm is the one that is commonly used in exact multiple string matching algorithms. The focus of this paper is the acceleration of Aho-Corasick algorithm through a multicore CPU based software implementation. Through our implementation and evaluation of results, we prove that our method performs better compared to the state of the art

arXiv.org e-Print Archive

Crossref

HEC: Collaborative Research: SAM^2 Toolkit: Scalable and Adaptive Metadata Management for High-End Computing

Author: Zhu Yifeng
Publication venue: DigitalCommons@UMaine
Publication date: 04/06/2010
Field of study

The increasing demand for Exa-byte-scale storage capacity by high end computing applications requires a higher level of scalability and dependability than that provided by current file and storage systems. The proposal deals with file systems research for metadata management of scalable cluster-based parallel and distributed file storage systems in the HEC environment. It aims to develop a scalable and adaptive metadata management (SAM2) toolkit to extend features of and fully leverage the peak performance promised by state-of-the-art cluster-based parallel and distributed file storage systems used by the high performance computing community. There is a large body of research on data movement and management scaling, however, the need to scale up the attributes of cluster-based file systems and I/O, that is, metadata, has been underestimated. An understanding of the characteristics of metadata traffic, and an application of proper load-balancing, caching, prefetching and grouping mechanisms to perform metadata management correspondingly, will lead to a high scalability. It is anticipated that by appropriately plugging the scalable and adaptive metadata management components into the state-of-the-art cluster-based parallel and distributed file storage systems one could potentially increase the performance of applications and file systems, and help translate the promise and potential of high peak performance of such systems to real application performance improvements. The project involves the following components: 1. Develop multi-variable forecasting models to analyze and predict file metadata access patterns. 2. Develop scalable and adaptive file name mapping schemes using the duplicative Bloom filter array technique to enforce load balance and increase scalability 3. Develop decentralized, locality-aware metadata grouping schemes to facilitate the bulk metadata operations such as prefetching. 4. Develop an adaptive cache coherence protocol using a distributed shared object model for client-side and server-side metadata caching. 5. Prototype the SAM2 components into the state-of-the-art parallel virtual file system PVFS2 and a distributed storage data caching system, set up an experimental framework for a DOE CMS Tier 2 site at University of Nebraska-Lincoln and conduct benchmark, evaluation and validation studies

University of Maine

A Bag-of-Tasks Scheduler Tolerant to Temporal Failures in Clouds

Author: Arantes Luciana
Drummond Lúcia Maria de A.
Sens Pierre
Teylo Luan
Publication venue
Publication date: 24/10/2018
Field of study

Cloud platforms have emerged as a prominent environment to execute high performance computing (HPC) applications providing on-demand resources as well as scalability. They usually offer different classes of Virtual Machines (VMs) which ensure different guarantees in terms of availability and volatility, provisioning the same resource through multiple pricing models. For instance, in Amazon EC2 cloud, the user pays per hour for on-demand VMs while spot VMs are unused instances available for lower price. Despite the monetary advantages, a spot VM can be terminated, stopped, or hibernated by EC2 at any moment. Using both hibernation-prone spot VMs (for cost sake) and on-demand VMs, we propose in this paper a static scheduling for HPC applications which are composed by independent tasks (bag-of-task) with deadline constraints. However, if a spot VM hibernates and it does not resume within a time which guarantees the application's deadline, a temporal failure takes place. Our scheduling, thus, aims at minimizing monetary costs of bag-of-tasks applications in EC2 cloud, respecting its deadline and avoiding temporal failures. To this end, our algorithm statically creates two scheduling maps: (i) the first one contains, for each task, its starting time and on which VM (i.e., an available spot or on-demand VM with the current lowest price) the task should execute; (ii) the second one contains, for each task allocated on a VM spot in the first map, its starting time and on which on-demand VM it should be executed to meet the application deadline in order to avoid temporal failures. The latter will be used whenever the hibernation period of a spot VM exceeds a time limit. Performance results from simulation with task execution traces, configuration of Amazon EC2 VM classes, and VMs market history confirms the effectiveness of our scheduling and that it tolerates temporal failures

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

PList-based Divide and Conquer Parallel Programming

Author: Adrian Sterca
Darius Bufnea
Virginia Niculescu
Publication venue: 'Croatian Communications and Information Society'
Publication date: 01/01/2020
Field of study

This paper details an extension of a Java parallel programming framework – JPLF. The JPLF framework is a programming framework that helps programmers build parallel programs using existing building blocks. The framework is based on {em PowerLists} and PList Theories and it naturally supports multi-way Divide and Conquer. By using this framework, the programmer is exempted from dealing with all the complexities of writing parallel programs from scratch. This extension to the JPLF framework adds PLists support to the framework and so, it enlarges the applicability of the framework to a larger set of parallel solvable problems. Using this extension, we may apply more flexible data division strategies. In addition, the length of the input lists no longer has to be a power of two – as required by the PowerLists theory. In this paper we unveil new applications that emphasize the new class of computations that can be executed within the JPLF framework. We also give a detailed description of the data structures and functions involved in the PLists extension of the JPLF, and extended performance experiments are described and analyzed

HRČAK - Portal of Croatian Scientific and Professional Journals

Hrčak - Portal of scientific journals of Croatia

Recommended from our members

Performance analysis and improvement of InfiniBand networks. Modelling and effective Quality-of-Service mechanisms for interconnection networks in cluster computing systems.

Author: Yan Shihang
Publication venue: Department of Computing, School of Computing, Informatics and Media
Publication date: 01/01/2012
Field of study

The InfiniBand Architecture (IBA) network has been proposed as a new industrial standard with high-bandwidth and low-latency suitable for constructing high-performance interconnected cluster computing systems. This architecture replaces the traditional bus-based interconnection with a switch-based network for the server Input-Output (I/O) and inter-processor communications. The efficient Quality-of-Service (QoS) mechanism is fundamental to ensure the import at QoS metrics, such as maximum throughput and minimum latency, leaving aside other aspects like guarantee to reduce the delay, blocking probability, and mean queue length, etc. Performance modelling and analysis has been and continues to be of great theoretical and practical importance in the design and development of communication networks. This thesis aims to investigate efficient and cost-effective QoS mechanisms for performance analysis and improvement of InfiniBand networks in cluster-based computing systems. Firstly, a rate-based source-response link-by-link admission and congestion control function with improved Explicit Congestion Notification (ECN) packet marking scheme is developed. This function adopts the rate control to reduce congestion of multiple-class traffic. Secondly, a credit-based flow control scheme is presented to reduce the mean queue length, throughput and response time of the system. In order to evaluate the performance of this scheme, a new queueing network model is developed. Theoretical analysis and simulation experiments show that these two schemes are quite effective and suitable for InfiniBand networks. Finally, to obtain a thorough and deep understanding of the performance attributes of InfiniBand Architecture network, two efficient threshold function flow control mechanisms are proposed to enhance the QoS of InfiniBand networks; one is Entry Threshold that sets the threshold for each entry in the arbitration table, and other is Arrival Job Threshold that sets the threshold based on the number of jobs in each Virtual Lane. Furthermore, the principle of Maximum Entropy is adopted to analyse these two new mechanisms with the Generalized Exponential (GE)-Type distribution for modelling the inter-arrival times and service times of the input traffic. Extensive simulation experiments are conducted to validate the accuracy of the analytical models

Bradford Scholars

Traces Generation To Simulate Large-Scale Distributed Applications

Author: Dalle Olivier
Mancini Emilio
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/12/2011
Field of study

International audienceIn order to study the performance of scheduling algorithms, simulators of parallel and distributed applications need accurate models of the application's behavior during execution. For this purpose, traces of low-level events collected during the actual execution of real applications are needed. Collecting such traces is a difficult task due to the timing, to the interference of instrumentation code, and to the storage and transfer of the collected data. To address this problem we propose a comprehensive software architecture, which instruments the application's executables, gather hierarchically the traces, and post-process them in order to feed simulation models. We designed it to be scalable, modular and extensible

INRIA a CCSD electronic archive server

Connectivity-guaranteed and obstacle-adaptive deployment schemes for mobile sensor networks

Author: Jarvis Stephen A.
Kermarrec Anne-Marie
Tan Guang
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008
Field of study

Mobile sensors can relocate and self-deploy into a network. While focusing on the problems of coverage, existing deployment schemes largely over-simplify the conditions for network connectivity: they either assume that the communication range is large enough for sensors in geometric neighborhoods to obtain location information through local communication, or they assume a dense network that remains connected. In addition, an obstacle-free field or full knowledge of the field layout is often assumed. We present new schemes that are not governed by these assumptions, and thus adapt to a wider range of application scenarios. The schemes are designed to maximize sensing coverage and also guarantee connectivity for a network with arbitrary sensor communication/sensing ranges or node densities, at the cost of a small moving distance. The schemes do not need any knowledge of the field layout, which can be irregular and have obstacles/holes of arbitrary shape. Our first scheme is an enhanced form of the traditional virtual-force-based method, which we term the Connectivity-Preserved Virtual Force (CPVF) scheme. We show that the localized communication, which is the very reason for its simplicity, results in poor coverage in certain cases. We then describe a Floor-based scheme which overcomes the difficulties of CPVF and, as a result, significantly outperforms it and other state-of-the-art approaches. Throughout the paper our conclusions are corroborated by the results from extensive simulations

HAL-CentraleSupelec

CiteSeerX

Crossref

University of Birmingham Research Portal

INRIA a CCSD electronic archive server

Warwick Research Archives Portal Repository

HAL-Rennes 1