    Reconfigurable interconnects in DSM systems: a focus on context switch behavior

    Recent advances in reconfigurable optical interconnect technologies allow for the fabrication of low-cost, run-time adaptable interconnects in large distributed shared-memory (DSM) multiprocessor machines. This makes it possible to use adaptable interconnection networks that alleviate the large bottleneck caused by the gap between processing speed and memory access time over the network. In this paper we study the scheduling of tasks by the operating system (OS) kernel and its influence on communication between the processing nodes of the system, focusing on the traffic generated just after a context switch. We aim to use these results as a basis for proposing a potential reconfiguration of the network that could provide a significant speedup.

    The AXIOM software layers

    The AXIOM project aims at developing a heterogeneous computing board (SMP-FPGA). The software layers developed in the AXIOM project are explained. OmpSs provides an easy way to execute heterogeneous codes on multiple cores. People and objects will soon share the same digital network for information exchange, in a world known as the age of cyber-physical systems. The general expectation is that people and systems will interact in real time. This puts pressure on system design to support increasing demands for computational power while keeping a low power envelope. Additionally, modular scaling and easy programmability are important to ensure that these systems become widespread. This whole set of expectations imposes scientific and technological challenges that need to be properly addressed. The AXIOM project (Agile, eXtensible, fast I/O Module) will research new hardware/software architectures for cyber-physical systems to meet such expectations. The technical approach aims at solving fundamental problems to enable easy programmability of heterogeneous multi-core multi-board systems. AXIOM proposes the use of the task-based OmpSs programming model, leveraging low-level communication interfaces provided by the hardware. Modular scalability will be possible thanks to a fast interconnect embedded into each module. To this end, an innovative ARM and FPGA-based board will be designed, with enhanced capabilities for interfacing with the physical world. Its effectiveness will be demonstrated with key scenarios such as Smart Video-Surveillance and Smart Living/Home (domotics). Peer reviewed. Postprint (author's final draft).

    Smart Grid Communications: Overview of Research Challenges, Solutions, and Standardization Activities

    Optimization of energy consumption in future intelligent energy networks (or Smart Grids) will be based on grid-integrated near-real-time communications between various grid elements in generation, transmission, distribution and loads. This paper discusses some of the challenges and opportunities of communications research in the areas of smart grid and smart metering. In particular, we focus on some of the key communications challenges for realizing interoperable and future-proof smart grid/metering networks, smart grid security and privacy, and how some of the existing networking technologies can be applied to energy management. Finally, we also discuss the coordinated standardization efforts in Europe to harmonize communications standards and protocols. Comment: To be published in IEEE Communications Surveys and Tutorials

    Scenarios for the development of smart grids in the UK: synthesis report

    ‘Smart grid’ is a catch-all term for the smart options that could transform the ways society produces, delivers and consumes energy, and potentially the way we conceive of these services. Delivering energy more intelligently will be fundamental to decarbonising the UK electricity system at the least possible cost, while maintaining security and reliability of supply. Smarter energy delivery is expected to allow the integration of more low carbon technologies and to be much more cost effective than traditional methods, as well as contributing to economic growth by opening up new business and innovation opportunities. Innovating new options for energy system management could lead to cost savings of up to £10bn, even if low carbon technologies do not emerge. This saving will be much higher if UK renewable energy targets are achieved. Building on extensive expert feedback and input, this report describes four smart grid scenarios which consider how the UK’s electricity system might develop to 2050. The scenarios outline how political decisions, as well as those made in regulation, finance, technology, consumer and social behaviour, market design or response, might affect the decisions of other actors and limit or allow the availability of future options. The project aims to explore the degree of uncertainty around the current direction of the electricity system and the complex interactions of a whole host of factors that may lead to any one of a wide range of outcomes. Our addition to this discussion will help decision makers to understand the implications of possible actions and better plan for the future, whilst recognising that it may take any one of a number of forms.

    Implementing Multithreaded Protocols for Release Consistency on Top of the Generic DSM-PM2 Platform

    DOI: 10.1007/3-540-47840-X_18. DSM-PM2 is an implementation platform designed to facilitate experimental studies of consistency protocols for distributed shared memory. This platform provides basic building blocks, allowing for easy design, implementation and evaluation of a large variety of multithreaded consistency protocols within a unified framework. DSM-PM2 is portable over a large variety of cluster architectures, using various communication interfaces (TCP, MPI, BIP, SCI, VIA, etc.). This paper presents the design of two multithreaded protocols implementing the release consistency model. We evaluate the impact of these consistency protocols on the overall performance of a typical distributed application, for two clusters with different interconnection networks and communication interfaces.

    Packet Compression in GPU Architectures

    A graphics processing unit (GPU) supports multiple operations in parallel by executing them on a multi-thread unit known as a warp, i.e., multiple threads running the same instruction. Each time a miss occurs in the private cache of a Streaming Multiprocessor (SM), the request is sent over the network to the shared L2 cache and then down to the Memory Controller (MC), which supplies the memory block. The interconnect delay becomes a bottleneck due to the large number of requests from different SMs and the multiple replies from the MCs. Compression can be used to mitigate the performance bottleneck caused by this large volume of data. In this work, I apply various compression algorithms and propose a new compression scheme, Data Segment Matching (DSM). I apply approximation to floating-point elements to improve compressibility and develop a prediction model to identify the number of approximation bits. I focus on compression techniques to resolve this bottleneck. Evaluations using a cycle-accurate simulator show that, when applied to packet compression in the interconnection network, the proposed scheme improves Instructions per Cycle (IPC) by 12% on average across various benchmarks, with compressibility of 50% in integer-type benchmarks and 35% in floating-point-type benchmarks.

    Venice: Exploring Server Architectures for Effective Resource Sharing

    Consolidated server racks are quickly becoming the backbone of IT infrastructure for science, engineering, and business alike. These servers are still largely built and organized as they were when they were distributed, individual entities. Given that many fields increasingly rely on analytics of huge datasets, it makes sense to support flexible resource utilization across servers to improve cost-effectiveness and performance. We introduce Venice, a family of data-center server architectures that builds a strong communication substrate as a first-class resource for server chips. Venice provides a diverse set of resource-joining mechanisms that enable user programs to efficiently leverage non-local resources. To better understand the implications of design decisions about system support for resource sharing, we have constructed a hardware prototype that allows us to more accurately measure end-to-end performance of at-scale applications and to explore tradeoffs among performance, power, and resource-sharing transparency. We present results from our initial studies analyzing these tradeoffs when sharing memory, accelerators, or NICs. We find that it is particularly important to reduce or hide latency, that data-sharing access patterns should match the features of the communication channels employed, and that inter-channel collaboration can be exploited for better performance.