23 research outputs found

    Data Aggregation techniques in Sensor Networks

    No full text
    Advances in computing technology have led to wireless sensors capable of observing and reporting various real-world phenomena in a time-sensitive manner. However, such systems suffer from bandwidth, energy and throughput constraints that limit the amount of information transferred end-to-end. Data aggregation is a known technique for alleviating these problems, but existing schemes are limited by their lack of adaptation to dynamic network topologies and unpredictable traffic patterns. In this project, we propose three novel data aggregation schemes: in-network data aggregation, grid-based data aggregation and hybrid data aggregation, which increase throughput, decrease congestion and save energy. Our simulation results show that the end-to-end transmission delay is reduced by a factor of 2.3, the throughput increases by a factor of 2.4 under heavy load conditions and the energy dissipated is reduced by a factor of 2.2. We conclude our evaluation by proposing a hybrid aggregation scheme through which sensor nodes can dynamically change from one aggregation technique to another in an unpredictable environment and adapt to dynamic changes in the network.
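As a rough illustration of why in-network aggregation can reduce traffic, the following minimal sketch compares link-level transmissions with and without aggregation on a small routing tree. It is not the scheme from the abstract above: the tree layout, the node names, the averaging function and the helper names (`transmissions_without_aggregation`, `aggregate_in_network`) are all assumptions made for this example.

```python
from collections import defaultdict


def depth(node, parent):
    """Number of hops from a node up to the sink (the node whose parent is None)."""
    hops = 0
    while parent[node] is not None:
        node = parent[node]
        hops += 1
    return hops


def transmissions_without_aggregation(parent, readings):
    """Every reading is forwarded hop by hop to the sink as its own packet."""
    return sum(depth(node, parent) for node in readings)


def aggregate_in_network(parent, readings):
    """Each node merges its children's partial (sum, count) with its own reading
    and sends a single packet to its parent. Returns (average, transmissions)."""
    children = defaultdict(list)
    sink = None
    for node, p in parent.items():
        if p is None:
            sink = node
        else:
            children[p].append(node)

    transmissions = 0

    def collect(node):
        nonlocal transmissions
        total, count = 0.0, 0
        if node in readings:
            total += readings[node]
            count += 1
        for child in children[node]:
            child_total, child_count = collect(child)
            transmissions += 1  # one combined packet from child to this node
            total += child_total
            count += child_count
        return total, count

    total, count = collect(sink)
    return total / count, transmissions


# Hypothetical topology: two relay nodes below the sink, three leaf sensors.
parent = {"sink": None, "a": "sink", "b": "sink", "c": "a", "d": "a", "e": "b"}
readings = {"a": 20.0, "b": 22.0, "c": 21.0, "d": 19.0, "e": 23.0}

print(transmissions_without_aggregation(parent, readings))
print(aggregate_in_network(parent, readings))
```

In this toy tree, aggregation sends one packet per link, whereas forwarding raw readings sends one packet per reading per hop. The saving only holds for aggregates such as sums or averages that can be merged from partial results.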

    DDSS: A Low-Overhead Distributed Data Sharing Substrate for Cluster-Based Data-Centers over Modern Interconnects

    No full text
    Information sharing is a key aspect of distributed applications such as database servers and web servers. Information sharing also assists services such as caching and reconfiguration. In the past, information sharing has been implemented using ad hoc messaging protocols, which often incur high overheads and do not scale well. This paper presents a new design for a scalable, low-overhead Distributed Data Sharing Substrate (DDSS). DDSS is designed to support efficient data management and coherence models by leveraging the features of modern interconnects. It is implemented over the OpenFabrics interface and is portable across multiple interconnects, including iWARP-capable networks in LAN/WAN environments. Experimental evaluations with networks such as InfiniBand and iWARP-capable Ammasso through data-center services show an order-of-magnitude performance improvement and demonstrate the load-resilient nature of the substrate. Application-level evaluations with Distributed STORM achieve close to 19% performance improvement over the traditional implementation, while evaluations with a checkpointing application suggest that DDSS is highly scalable.

    Designing Efficient Cooperative Caching Schemes for Multi-Tier Data-Centers over RDMA-enabled Networks

    No full text
    Caching has been a very important technique for improving the performance and scalability of web-serving data-centers. The research community has proposed cooperation among caching servers to achieve higher performance benefits. Existing cooperative cache designs often duplicate part of the cached data on multiple servers for higher performance while optimizing the data-fetch costs for multiple similar requests. With the advent of RDMA-enabled interconnects, the basic factors involved in these cost estimates have changed. Further, utilizing the large pool of resources available across the tiers in today's multi-tier data-centers is of obvious importance. Hence, a systematic study of the various trade-offs involved is needed. In this paper, we present cooperative cache schemes that are designed to benefit from these trends. In particular, we design schemes that take advantage of the RDMA capabilities of networks and the multiple-tier resources of modern multi-tier data-centers. Our designs are implemented on InfiniBand-based clusters to work in conjunction with Apache-based servers. Our experimental results show that our schemes improve throughput by up to 35% over the basic cooperative caching schemes and 180% over simple single-node caching schemes.

    Designing Passive Synchronization for MPI-2 One-Sided Communication to Maximize Overlap

    No full text
    Scientific computing has seen immense growth in recent years. MPI has become the de facto standard parallel programming model for distributed-memory systems. The MPI-2 standard also introduced the one-sided programming model. Overlapping computation and communication is an important goal for one-sided applications. While the passive synchronization mechanism for MPI-2 one-sided communication allows for good overlap, the actual overlap achieved is often limited by the design of both the MPI library and the application. In this paper, we aim to improve the performance of MPI-2 one-sided communication. In particular, we focus on the following aspects: (i) designing one-sided passive synchronization (Direct Passive) support using InfiniBand atomic operations to handle both exclusive and shared locks; (ii) enhancing one-sided communication progress to provide scope for better overlap that one-sided applications can leverage; and (iii) studying the overlap potential of passive synchronization and its impact on applications. We demonstrate the possible benefits of our approaches for the MPI-2 SPLASH LU application benchmark. Our results show an improvement of up to 87% for a 64-process run over the existing design.

    Accurate Load Monitoring for Cluster-based Web Data-Centers over RDMA-enabled Networks

    No full text
    Monitoring a pool of resources in a cluster-based web data-center environment can be critical for the successful deployment of applications such as web servers and database servers. In particular, the monitored information assists system-level services like load balancing in enabling the data-center environment to adapt efficiently to changing system load and traffic patterns. This information is not only critical in terms of accuracy and content, but it must also be gathered without impacting performance or affecting other applications. In this paper, we propose two accurate load monitoring schemes for a web data-center environment, namely user-level load monitoring (ULM) and kernel-level load monitoring (KLM), and evaluate their benefits with respect to overall system load balancing. In our approach, we use the Remote Direct Memory Access (RDMA) operation (in user space or kernel space) provided by RDMA-enabled interconnects like InfiniBand. We further leverage the information provided by certain kernel data structures in designing these schemes, without requiring any modifications to existing data-center applications. Our experimental results show that the KLM and ULM schemes achieve an improvement of 22% and 12% in a single data-center and an improvement of 25% and 11% per web-site in shared data-centers, respectively. More importantly, our schemes take advantage of RDMA operations to access portions of kernel memory that are not exposed to user space, enabling accurate load monitoring. Further, our design is resilient and well-conditioned to the load on the servers compared with two-sided communication protocols such as TCP/IP.

    MPI-2 One-sided Usage and Implementation for Read Modify Write Operations: A Case Study with HPCC

    No full text
    MPI-2's one-sided communication interface is being explored in scientific applications. One of the important operations in a one-sided model is read-modify-write. MPI-2 semantics provide the MPI_Put, MPI_Get and MPI_Accumulate operations, which can be used to implement read-modify-write functionality. The different strategies yield varying performance benefits depending on the underlying one-sided implementation. In this paper, we use the HPCC Random Access benchmark, which primarily uses read-modify-write operations, as a case study for evaluating the different implementation strategies. Currently this benchmark is implemented using MPI two-sided semantics. In this work, we design and evaluate MPI-2 versions of the HPCC Random Access benchmark using one-sided operations. To improve performance, we explore two different optimizations: (i) software-based aggregation and (ii) hardware-based atomic operations. We implement aggregation techniques using MPI_Accumulate with datatypes to improve the performance of the one-sided implementation. To study the impact of hardware capabilities provided by modern interconnects, we implement a prototype of Accumulate for MPI_SUM (Direct Accumulate) using InfiniBand's atomic fetch-and-add operation. We evaluate our different approaches on an InfiniBand cluster. The software-based aggregation outperforms the basic one-sided scheme without aggregation by a factor of 4.38. The hardware-based scheme shows an improvement by a factor of 2.62 compared with the basic one-sided scheme. Our study shows that the software-based aggregation performs best. We also demonstrate the potential and scalability of the hardware-based approach. Keywords: MPI-2, One-sided, HPCC, Accumulate, InfiniBand
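The idea behind software-based aggregation can be sketched without MPI: instead of issuing one remote accumulate per update, updates bound for the same target are buffered and applied in one batched operation. The following is a plain-Python simulation under stated assumptions, not the paper's implementation. The function names, the per-rank tables, the use of addition as the update, and the `batch_size` parameter are all hypothetical. In real MPI code a batch would roughly correspond to a single MPI_Accumulate call with a derived datatype.

```python
from collections import defaultdict


def apply_one_by_one(table_size, num_ranks, updates):
    """Baseline: every (rank, index, value) update is its own simulated remote operation."""
    tables = [[0] * table_size for _ in range(num_ranks)]
    remote_ops = 0
    for rank, index, value in updates:
        tables[rank][index] += value
        remote_ops += 1
    return tables, remote_ops


def apply_aggregated(table_size, num_ranks, updates, batch_size):
    """Buffer updates per target rank and apply each full buffer as one simulated operation."""
    tables = [[0] * table_size for _ in range(num_ranks)]
    pending = defaultdict(list)
    remote_ops = 0

    def flush(rank):
        nonlocal remote_ops
        if pending[rank]:
            for index, value in pending[rank]:
                tables[rank][index] += value
            pending[rank].clear()
            remote_ops += 1  # the whole batch counts as one remote operation

    for rank, index, value in updates:
        pending[rank].append((index, value))
        if len(pending[rank]) >= batch_size:
            flush(rank)
    for rank in list(pending):  # send whatever is left in partially filled buffers
        flush(rank)
    return tables, remote_ops


# Hypothetical workload: six updates spread over two target ranks.
updates = [(0, 1, 5), (1, 0, 2), (0, 1, 3), (0, 2, 1), (1, 3, 4), (0, 0, 7)]
baseline_tables, baseline_ops = apply_one_by_one(4, 2, updates)
batched_tables, batched_ops = apply_aggregated(4, 2, updates, batch_size=3)
print(baseline_ops, batched_ops, baseline_tables == batched_tables)
```

Because addition is commutative and associative, reordering updates into batches leaves the final tables unchanged while reducing the number of simulated remote operations. The sketch models only the operation count, not network latency or the cost of building datatypes.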

    Automatic Path Migration over InfiniBand: Early Experiences

    No full text
    The high computational power of commodity PCs, combined with the emergence of low-latency, high-bandwidth interconnects, has accelerated the trend toward cluster computing. Clusters with InfiniBand are being deployed, as reflected in the TOP 500 Supercomputer rankings. However, the increasing scale of these clusters has reduced the Mean Time Between Failures (MTBF) of components. The network is one such component: failure of Network Interface Cards (NICs), cables and/or switches breaks existing path(s) of communication. InfiniBand provides a hardware mechanism, Automatic Path Migration (APM), which allows user-transparent detection of and recovery from network fault(s) without application restart. In this paper, we design a set of modules that work together to provide network fault tolerance for user-level applications by leveraging the APM feature. Our performance evaluation at the MPI layer shows that APM incurs negligible overhead in the absence of faults in the system. In the presence of network faults, APM incurs negligible overhead for reasonably long-running applications.