270 research outputs found

    Write-limited sorts and joins for persistent memory

    Get PDF
    To mitigate the impact of the widening gap between the memory needs of CPUs and what standard memory technology can deliver, system architects have introduced a new class of memory technology termed persistent memory. Persistent memory is byteaddressable, but exhibits asymmetric I/O: writes are typically one order of magnitude more expensive than reads. Byte addressability combined with I/O asymmetry render the performance profile of persistent memory unique. Thus, it becomes imperative to find new ways to seamlessly incorporate it into database systems. We do so in the context of query processing. We focus on the fundamental operations of sort and join processing. We introduce the notion of write-limited algorithms that effectively minimize the I/O cost. We give a high-level API that enables the system to dynamically optimize the workflow of the algorithms; or, alternatively, allows the developer to tune the write profile of the algorithms. We present four different techniques to incorporate persistent memory into the database processing stack in light of this API. We have implemented and extensively evaluated all our proposals. Our results show that the algorithms deliver on their promise of I/O-minimality and tunable performance. We showcase the merits and deficiencies of each implementation technique, thus taking a solid first step towards incorporating persistent memory into query processing. 1

    Dynamic Select Approach for Memory Allocation

    Get PDF
    When we need to use Memory allocation for relatively huge datasets, then we may have a possibility to encounter the exception that is OutOfMemoryException. This exception shows that memory is not available for the allocation. But exception does not occur due to limited memory system, it usually occurs due to non availability of virtual address space for that byte of data. This issue is because of the current implementation of memory allocation which uses single array byte as backing store. When the data set is huge the backing store of memory allocation space also requires more contiguous memory than that is available in the virtual address space. If there is no contiguous memory available for the process then it encounters the exception of OutOfMemoryException even there is enough space available but not continuous. This research proposed an approach for dynamically selecting the best memory allocator for every application. The proposed approach does not need any type of contiguous memory for storing the data in stream. This approach uses a dynamic list of small chunks as backing storage that are allocated on demand when the stream is used. If there is no contiguous memory available in the Stream then memory allocation can be done from these small chunks of memory with no OutOfMemoryException

    Autotuning for Automatic Parallelization on Heterogeneous Systems

    Get PDF

    Pulsar: Design and Simulation Methodology for Dynamic Bandwidth Allocation in Photonic Network-on-Chip Architectures in Heterogeneous Multicore Systems

    Get PDF
    As the computing industry moved toward faster and more energy-efficient solutions, multicore computers proved to be dependable. Soon after, the Network-on-Chip (NoC) paradigm made headway as an effective method of connecting multiple cores on a single chip. These on-chip networks have been used to relay communication between homogeneous and heterogeneous sets of cores and core clusters. However, the variation in bandwidth requirements of heterogeneous systems is often neglected. Therefore, at a given moment, bandwidth may be in excess at one node while it is insufficient at another leading to lower performance and higher energy costs. This work proposes and examines dynamic schemes for the allocation of photonic channels in a Photonic Network-on-Chip (PNoC) as an alternative to their static-provision counterparts and proposes a method of simulating and selecting the characteristics of a dynamic system at the time of design as to achieve maximum system performance in a Photonic Network-on-Chip for a given application type

    Control Plane Hardware Design for Optical Packet Switched Data Centre Networks

    Get PDF
    Optical packet switching for intra-data centre networks is key to addressing traffic requirements. Photonic integration and wavelength division multiplexing (WDM) can overcome bandwidth limits in switching systems. A promising technology to build a nanosecond-reconfigurable photonic-integrated switch, compatible with WDM, is the semiconductor optical amplifier (SOA). SOAs are typically used as gating elements in a broadcast-and-select (B\&S) configuration, to build an optical crossbar switch. For larger-size switching, a three-stage Clos network, based on crossbar nodes, is a viable architecture. However, the design of the switch control plane, is one of the barriers to packet switching; it should run on packet timescales, which becomes increasingly challenging as line rates get higher. The scheduler, used for the allocation of switch paths, limits control clock speed. To this end, the research contribution was the design of highly parallel hardware schedulers for crossbar and Clos network switches. On a field-programmable gate array (FPGA), the minimum scheduler clock period achieved was 5.0~ns and 5.4~ns, for a 32-port crossbar and Clos switch, respectively. By using parallel path allocation modules, one per Clos node, a minimum clock period of 7.0~ns was achieved, for a 256-port switch. For scheduler application-specific integrated circuit (ASIC) synthesis, this reduces to 2.0~ns; a record result enabling scalable packet switching. Furthermore, the control plane was demonstrated experimentally. Moreover, a cycle-accurate network emulator was developed to evaluate switch performance. Results showed a switch saturation throughput at a traffic load 60\% of capacity, with sub-microsecond packet latency, for a 256-port Clos switch, outperforming state-of-the-art optical packet switches
    corecore