830 research outputs found

    Development of Cluster Computing – A Review

    This paper reviews cluster computing in depth, covering the following works: Cluster Computing: A Mobile Code Approach by R.B. Patel and Manpreet Singh (2006); Performance Evaluation of Parallel Applications Using Message Passing Interface In Network of Workstations Of Different Computing Powers by Rajkumar Sharma, Priyesh Kanungo and Manohar Chandwani (2011); On the Performance of MPI-OpenMP on a 12 nodes Multi-core Cluster by Abdelgadir Tageldin, Al-Sakib Khan Pathan, Mohiuddin Ahmed (2011); Dynamic Load Balancing in Parallel Processing on Non-Homogeneous Clusters by Armando E. De Giusti, Marcelo R. Naiouf, Laura C. De Giusti, Franco Chichizola (2005); Performance Evaluation of Computation Intensive Tasks in Grid by P. Raghu, K. Sriram (2011); Automatic Distribution of Vision-Tasks on Computing Clusters by Thomas Muller, Binh An Tran and Alois Knoll (2011); Terminology And Taxonomy Parallel Computing Architecture by Amardeep Singh, Satinder Pal Singh, Vandana, Sukhnandan Kaur (2011); Research of Distributed Algorithm based on Parallel Computer Cluster System by Xu He-li, Liu Yan (2010); Cluster Computing Using Orders Based Transparent Parallelizing by Vitaliy D. Pavlenko, Victor V. Burdejnyj (2007); and VCE: A New Personated Virtual Cluster Engine for Cluster Computing by Mohsen Sharifi, Masoud Hassani, Ehsan Mousavi Khaneghah, Seyedeh Leili Mirtaheri (2008).
    Keywords: Cluster computing, Cluster Architectures, Dynamic and Static Load Balancing, Distributed Systems, Homogeneous and Non-Homogeneous Processors, Multicore clusters, Parallel computing, Parallel Computer Vision, Task parallelism, Terminology and taxonomy, Virtualization, Virtual Cluster

    HERO: Heterogeneous Embedded Research Platform for Exploring RISC-V Manycore Accelerators on FPGA

    Heterogeneous embedded systems on chip (HESoCs) co-integrate a standard host processor with programmable manycore accelerators (PMCAs) to combine general-purpose computing with domain-specific, efficient processing capabilities. While leading companies successfully advance their HESoC products, research lags behind due to the challenges of building a prototyping platform that unites an industry-standard host processor with an open research PMCA architecture. In this work we introduce HERO, an FPGA-based research platform that combines a PMCA composed of clusters of RISC-V cores, implemented as soft cores on an FPGA fabric, with a hard ARM Cortex-A multicore host processor. The PMCA architecture mapped on the FPGA is silicon-proven, scalable, configurable, and fully modifiable. HERO includes a complete software stack that consists of a heterogeneous cross-compilation toolchain with support for OpenMP accelerator programming, a Linux driver, and runtime libraries for both host and PMCA. HERO is designed to facilitate rapid exploration on all software and hardware layers: run-time behavior can be accurately analyzed by tracing events, and modifications can be validated through fully automated hardware and software builds and executed tests. We demonstrate the usefulness of HERO by means of case studies from our research.

    Blazes: Coordination Analysis for Distributed Programs

    Distributed consistency is perhaps the most discussed topic in distributed systems today. Coordination protocols can ensure consistency, but in practice they cause undesirable performance unless used judiciously. Scalable distributed architectures avoid coordination whenever possible, but under-coordinated systems can exhibit behavioral anomalies under fault, which are often extremely difficult to debug. This raises significant challenges for distributed system architects and developers. In this paper we present Blazes, a cross-platform program analysis framework that (a) identifies program locations that require coordination to ensure consistent executions, and (b) automatically synthesizes application-specific coordination code that can significantly outperform general-purpose techniques. We present two case studies, one using annotated programs in the Twitter Storm system, and another using the Bloom declarative language.
    Comment: Updated to include additional materials from the original technical report: derivation rules, output stream label

    Parallelizing Scale Invariant Feature Transform on a Distributed Memory Cluster

    Scale Invariant Feature Transform (SIFT) is a widely used computer vision algorithm for extracting features from images. We explored accelerating an existing implementation of this algorithm with message passing in order to analyze large data sets. We successfully tested two approaches to data decomposition in order to parallelize SIFT on a distributed memory cluster.
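The abstract does not say which two data decompositions were tested. As a purely hypothetical illustration of what a data decomposition for a message-passing program can look like, the sketch below assigns a contiguous block of work items (for example, image indices) to each rank. The function name `block_range` and the example sizes are assumptions made for illustration and are not taken from the paper.

```python
def block_range(n_items, n_ranks, rank):
    """Return the half-open range [start, stop) of items owned by `rank`.

    Items are split into contiguous blocks; when n_items is not divisible
    by n_ranks, the first (n_items % n_ranks) ranks get one extra item.
    """
    base, extra = divmod(n_items, n_ranks)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop


# Hypothetical example: 10 images distributed over 4 ranks.
ranges = [block_range(10, 4, r) for r in range(4)]
print(ranges)
```

In a real message-passing program each process would call such a function with its own rank and handle only its block, which keeps the per-process workload within one item of the others.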

    Incremental closeness centrality in distributed memory

    Networks are commonly used to model traffic patterns, social interactions, or web pages. The vertices in a network do not possess the same characteristics: some vertices are naturally more connected and some vertices can be more important. Closeness centrality (CC) is a global metric that quantifies how important a given vertex is in the network. When the network is dynamic and keeps changing, the relative importance of the vertices also changes. The cost of the best known algorithm for computing CC scores makes it impractical to recompute them from scratch after each modification. In this paper, we propose Streamer, a distributed memory framework for incrementally maintaining the closeness centrality scores of a network upon changes. It leverages pipelined, replicated parallelism and SpMM-based BFSs, and it takes NUMA effects into account. It makes maintaining the closeness centrality values of real-life networks with millions of interactions significantly faster and obtains almost linear speedups on a 64-node cluster with 8 threads per node.
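To make the from-scratch cost concrete, here is a minimal sketch of a non-incremental closeness centrality computation on an unweighted graph: one breadth-first search per vertex. It assumes a common definition, the reciprocal of the sum of shortest-path distances to all reachable vertices, which may differ in detail from the one used in the paper. It is not the Streamer algorithm, and the names `bfs_distances` and `closeness` are illustrative.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every vertex reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def closeness(adj):
    """Closeness score per vertex: 1 / (sum of distances to reachable vertices).

    Assumed definition; isolated vertices get a score of 0.0.
    """
    scores = {}
    for v in adj:
        far = sum(bfs_distances(adj, v).values())
        scores[v] = 1.0 / far if far > 0 else 0.0
    return scores


# Path graph a - b - c: the middle vertex is the most central.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
scores = closeness(path)

# Adding one edge (a - c) changes the score of vertices it touches,
# which is why dynamic networks call for incremental maintenance.
triangle = {"a": ["b", "c"], "b": ["a", "c"], "c": ["b", "a"]}
updated = closeness(triangle)
```

Each call to `closeness` runs a full BFS from every vertex, so recomputing after every edge change repeats all of that work; this is the cost an incremental approach aims to avoid.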