502 research outputs found

Evaluating Cache Coherent Shared Virtual Memory for Heterogeneous Multicore Chips

Author: Hechtman Blake A.
Sorin Daniel J.
Publication date: 01/01/2013

The trend in industry is towards heterogeneous multicore processors (HMCs), including chips with CPUs and massively-threaded throughput-oriented processors (MTTOPs) such as GPUs. Although current homogeneous chips tightly couple the cores with cache-coherent shared virtual memory (CCSVM), this is not the communication paradigm used by any current HMC. In this paper, we present a CCSVM design for a CPU/MTTOP chip, as well as an extension of the pthreads programming model, called xthreads, for programming this HMC. Our goal is to evaluate the potential performance benefits of tightly coupling heterogeneous cores with CCSVM

arXiv.org e-Print Archive

CiteSeerX

Crossref

Fleets: Scalable Services in a Factored Operating System

Author: Agarwal Anant
Beckmann Nathan
Belay Adam
Gruenwald Charles, III
Kasture Harshad
Miller Jason E.
Modzelewski Kevin
Wentzlaff David
Youseff Lamia
Publication date: 09/03/2011

Current monolithic operating systems are designed for uniprocessor systems, and their architecture reflects this. The rise of multicore and cloud computing is drastically changing the tradeoffs in operating system design. The culture of scarce computational resources is being replaced with one of abundant cores, where spatial layout of processes supplants time multiplexing as the primary scheduling concern. Efforts to parallelize monolithic kernels have been difficult and only marginally successful, and new approaches are needed. This paper presents fleets, a novel way of constructing scalable OS services. With fleets, traditional OS services are factored out of the kernel and moved into user space, where they are further parallelized into a distributed set of concurrent, message-passing servers. We evaluate fleets within fos, a new factored operating system designed from the ground up with scalability as the first-order design constraint. This paper details the main design principles of fleets, and how the system architecture of fos enables their construction. We describe the design and implementation of three critical fleets (network stack, page allocation, and file system) and compare with Linux. These comparisons show that fos achieves superior performance and has better scalability than Linux for large multicores; at 32 cores, fos's page allocator performs 4.5 times better than Linux, and fos's network stack performs 2.5 times better. Additionally, we demonstrate how fleets can adapt to changing resource demand, and the importance of spatial scheduling for good performance in multicores

DSpace@MIT

Parallel Pipelines for DNA Sequence Alignment on a Cluster of Multicores: A Comparison of Communication Models

Author: Chichizola Franco
De Giusti Armando Eduardo
De Giusti Laura Cristina
Naiouf Marcelo
Rucci Enzo
Publication date: 29/08/2019

HPC (high performance computing) based on clusters of multicores is one of the main research lines in parallel programming. It is important to study the impact of the shared-memory, message-passing, and hybrid programming paradigms on these architectures in order to exploit their power efficiently. The Smith-Waterman algorithm, which performs local alignment of DNA sequences and thereby establishes the degree of similarity between two sequences, is used as a case study. In this paper, the Smith-Waterman algorithm is parallelized by means of a pipeline scheme, owing to the data dependencies inherent to the problem, using each of the communication/synchronization models mentioned above; a comparative analysis is then carried out. Finally, experimental results are presented, as well as future research lines.

Facultad de Informática

Servicio de Difusión de la Creación Intelectual
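
As an illustration of the data dependencies mentioned in the abstract above, the following is a minimal sketch of the Smith-Waterman scoring recurrence. It is not the authors' implementation. The function name and the scoring values (match, mismatch, and a linear gap penalty) are assumptions chosen for the example. Each cell depends on its left, upper, and upper-left neighbours, which is why a block of rows cannot be completed before the rows above it have produced their boundary values, and why a pipeline is a natural way to parallelize the computation.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of sequences a and b.

    Scoring values are illustrative assumptions (linear gap penalty).
    Only two rows of the score matrix are kept in memory.
    """
    prev = [0] * (len(b) + 1)  # row i-1 of the score matrix
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)  # row i; column 0 stays 0
        for j in range(1, len(b) + 1):
            # Diagonal move: align a[i-1] with b[j-1].
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Vertical and horizontal moves: open a gap in one sequence.
            up = prev[j] + gap
            left = curr[j - 1] + gap
            # Local alignment: a score never drops below zero.
            curr[j] = max(0, diag, up, left)
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTT", "GGACGGG"))
```

In a pipelined version, each stage would own a band of rows and forward the boundary row of each column block to the next stage; the communication model used for that hand-off (shared memory, message passing, or both) is what the paper compares.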

Parallel Pipelines for DNA Sequence Alignment on a Cluster of Multicores: A Comparison of Communication Models

Author: Chichizola Franco
De Giusti Armando Eduardo
De Giusti Laura Cristina
Naiouf Marcelo
Rucci Enzo
Publication date: 15/09/2012

Dynamic Information Flow Tracking on Multicores

Author: Gupta Rajiv
Kim Ho-Seop
Nagarajan Vijay
Wu Youfeng
Publication date: 01/01/2008

Dynamic Information Flow Tracking (DIFT) is a promising technique for detecting software attacks. Due to the computationally intensive nature of the technique, prior efficient implementations [21, 6] rely on specialized hardware support whose only purpose is to enable DIFT. Prior software implementations, by contrast, are either too slow [17, 15], increasing execution time by as much as fourfold for SPEC integer programs, or not transparent [31], requiring source code modifications. In this paper, we propose the use of chip multiprocessors (CMP) to perform DIFT transparently and efficiently. We spawn a helper thread that is scheduled on a separate core and is responsible only for performing information flow tracking operations. This entails the communication of registers and flags between the main and helper threads. We explore software (shared memory) and hardware (dedicated interconnect) approaches to enable this communication. Finally, we propose a novel application of the DIFT infrastructure in which, in addition to detecting the software attack, DIFT assists in identifying the cause of the bug in the code that enabled the exploit in the first place. We conducted detailed simulations to evaluate the overhead of performing DIFT and found it to be 48% for SPEC integer programs

CiteSeerX

Edinburgh Research Explorer

S-Net for multi-memory multicores

Author: Grelck C.
Julku J.
Penczek F.
Peterson L.
Pontelli E.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2010

Copyright ACM, 2010. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 5th ACM SIGPLAN Workshop on Declarative Aspects of Multicore Programming: http://doi.acm.org/10.1145/1708046.1708054

S-Net is a declarative coordination language and component technology aimed at modern multi-core/many-core architectures and systems-on-chip. It builds on the concept of stream processing to structure dynamically evolving networks of communicating asynchronous components. Components themselves are implemented using a conventional language suitable for the application domain. This two-level software architecture maintains a familiar sequential development environment for large parts of an application and offers a high-level declarative approach to component coordination. In this paper we present a conservative language extension for the placement of components and component networks in a multi-memory environment, i.e. architectures that associate individual compute cores or groups thereof with private memories. We describe a novel distributed runtime system layer that complements our existing multithreaded runtime system for shared memory multicores. Particular emphasis is put on efficient management of data communication. Last but not least, we present preliminary experimental data

VTT Research System

University of Hertfordshire Research Archive

International Migration, Integration and Social Cohesion online publications

Automatic mapping tasks to cores : Evaluating AMTHA Algorithm in multicore architectures

Author: Chichizola Franco
De Giusti Armando Eduardo
De Giusti Laura Cristina
Luque Fadón Emilio
Naiouf Marcelo
Publication date: 01/10/2009

The AMTHA (Automatic Mapping Task on Heterogeneous Architectures) algorithm for task-to-processor assignment and the MPAHA (Model of Parallel Algorithms on Heterogeneous Architectures) model are presented. The use of AMTHA is analyzed for multicore processor-based architectures, considering the communication model used among processes. The results of the tests carried out are presented, comparing the real execution times of a set of synthetic applications on multicores with the predictions obtained with AMTHA. Finally, current lines of research are presented, focusing on clusters of multicores and hybrid programming paradigms.

Presented at the IX Workshop Procesamiento Distribuido y Paralelo (WPDP)

Red de Universidades con Carreras en Informática (RedUNCI)

DNA sequence alignment: hybrid parallel programming on a multicore cluster

Author: Chichizola Franco
De Giusti Armando Eduardo
De Giusti Laura Cristina
Naiouf Marcelo
Rucci Enzo
Publication date: 15/10/2019

DNA sequence alignment is one of the most important operations in computational biology. In 1981, Smith and Waterman developed a method for local sequence alignment. Because of its computational and memory requirements, various heuristics have been developed to reduce execution time at the expense of accuracy, so these heuristics do not guarantee that the best alignment is found. For this reason, it is interesting to study how to apply the computing power of different parallel platforms to speed up the sequence alignment process without losing accuracy. In this article, a new parallelization strategy (HI-M) for the Smith-Waterman algorithm on a multi-core cluster is presented, configuring a pipeline with a hybrid communication model. Additionally, a performance analysis is carried out and compared with two previously presented parallel solutions. Finally, experimental results are presented, as well as future research lines.

Facultad de Informática