270 research outputs found

Write-limited sorts and joins for persistent memory

Author: Chen S., Kevin L., Kim H., Myers D., Qureshi M. K.
Publication venue: 'VLDB Endowment'
Publication date: 01/01/2014

To mitigate the impact of the widening gap between the memory needs of CPUs and what standard memory technology can deliver, system architects have introduced a new class of memory technology termed persistent memory. Persistent memory is byte-addressable, but exhibits asymmetric I/O: writes are typically one order of magnitude more expensive than reads. Byte addressability combined with I/O asymmetry renders the performance profile of persistent memory unique. Thus, it becomes imperative to find new ways to seamlessly incorporate it into database systems. We do so in the context of query processing. We focus on the fundamental operations of sort and join processing. We introduce the notion of write-limited algorithms that effectively minimize the I/O cost. We give a high-level API that enables the system to dynamically optimize the workflow of the algorithms; or, alternatively, allows the developer to tune the write profile of the algorithms. We present four different techniques to incorporate persistent memory into the database processing stack in light of this API. We have implemented and extensively evaluated all our proposals. Our results show that the algorithms deliver on their promise of I/O-minimality and tunable performance. We showcase the merits and deficiencies of each implementation technique, thus taking a solid first step towards incorporating persistent memory into query processing.
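The read/write asymmetry described in this abstract can be illustrated with a small counting experiment. The sketch below is not one of the paper's algorithms; it compares two textbook sorts (selection sort and insertion sort) purely to show how an algorithm can spend extra reads to save writes. The function names and the `write_penalty` parameter are hypothetical, and the default penalty of 10 only mirrors the "one order of magnitude" figure quoted above.

```python
def selection_sort_counted(items):
    """Selection sort that counts element reads and writes (read-heavy, write-light)."""
    a = list(items)
    reads = writes = 0
    n = len(a)
    for i in range(n - 1):
        m = i
        for j in range(i + 1, n):
            reads += 2              # compare a[j] with a[m]
            if a[j] < a[m]:
                m = j
        if m != i:
            a[i], a[m] = a[m], a[i]
            writes += 2             # one swap is two element writes
    return a, reads, writes


def insertion_sort_counted(items):
    """Insertion sort that counts element reads and writes (shifts cause many writes)."""
    a = list(items)
    reads = writes = 0
    for i in range(1, len(a)):
        key = a[i]
        reads += 1
        j = i - 1
        while j >= 0:
            reads += 1              # read a[j] for the comparison
            if a[j] <= key:
                break
            a[j + 1] = a[j]         # shift one element to the right
            writes += 1
            j -= 1
        a[j + 1] = key
        writes += 1
    return a, reads, writes


def weighted_cost(reads, writes, write_penalty=10):
    """Assumed cost model: a write costs `write_penalty` times as much as a read."""
    return reads + write_penalty * writes


data = list(range(8, 0, -1))        # reversed input of eight keys
sel_sorted, sel_reads, sel_writes = selection_sort_counted(data)
ins_sorted, ins_reads, ins_writes = insertion_sort_counted(data)
print("selection:", sel_reads, "reads,", sel_writes, "writes,",
      "cost", weighted_cost(sel_reads, sel_writes))
print("insertion:", ins_reads, "reads,", ins_writes, "writes,",
      "cost", weighted_cost(ins_reads, ins_writes))
```

On this input, selection sort issues fewer writes than insertion sort, which is the kind of trade-off a write-limited design tunes.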

CiteSeerX

Crossref

Edinburgh Research Explorer

Dynamic Select Approach for Memory Allocation

Author: Jyoti Raina Bakaya
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 30/09/2017

Allocating memory for relatively large datasets can raise an OutOfMemoryException, which indicates that memory is not available for the allocation. The exception usually does not occur because the system has too little memory; it occurs because no contiguous range of virtual address space is available for the data. The cause is the current implementation of memory allocation, which uses a single byte array as the backing store. When the dataset is large, the backing store requires more contiguous memory than is available in the virtual address space. If the process cannot obtain contiguous memory, it raises an OutOfMemoryException even though enough memory is available, just not contiguously. This research proposes an approach for dynamically selecting the best memory allocator for each application. The proposed approach does not need contiguous memory to store the data in a stream. It uses a dynamic list of small chunks as backing storage, allocated on demand as the stream is used. If no contiguous memory is available for the stream, memory can still be allocated from these small chunks without an OutOfMemoryException.
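The chunked backing store described in this abstract can be sketched as follows. This is a minimal illustration of the data structure only, not the paper's implementation; the class name `ChunkedStream`, its methods, and the default `chunk_size` are assumptions made for the example.

```python
class ChunkedStream:
    """Append-only byte stream backed by a list of small fixed-size chunks.

    Hypothetical sketch: no single large contiguous buffer is ever requested;
    a new chunk is allocated only when the previous one is full.
    """

    def __init__(self, chunk_size=4096):
        if chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        self.chunk_size = chunk_size
        self.chunks = []   # list of bytearray, each of length chunk_size
        self.length = 0    # total number of bytes written so far

    def write(self, data):
        """Append data, allocating new chunks on demand. Returns bytes written."""
        data = bytes(data)
        offset = 0
        while offset < len(data):
            idx, pos = divmod(self.length, self.chunk_size)
            if idx == len(self.chunks):
                self.chunks.append(bytearray(self.chunk_size))
            n = min(self.chunk_size - pos, len(data) - offset)
            self.chunks[idx][pos:pos + n] = data[offset:offset + n]
            self.length += n
            offset += n
        return len(data)

    def read(self, start, size):
        """Return up to `size` bytes beginning at logical position `start`."""
        if start < 0 or size < 0:
            raise ValueError("start and size must be non-negative")
        end = min(start + size, self.length)
        out = bytearray()
        while start < end:
            idx, pos = divmod(start, self.chunk_size)
            n = min(self.chunk_size - pos, end - start)
            out += self.chunks[idx][pos:pos + n]
            start += n
        return bytes(out)


stream = ChunkedStream(chunk_size=4)
stream.write(b"hello world")
print(len(stream.chunks), stream.read(3, 6))
```

A logical position is mapped to a chunk index and an offset within that chunk, so reads and writes that cross chunk boundaries are split into per-chunk copies.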

International Journal on Recent and Innovation Trends in Computing and Communication

CHERIvoke: Characterising pointer revocation using CHERI capabilities for temporal memory safety

Author: Ainsworth S, Filardo NW, Jones TM, Moore SW, Neumann PG, Richardson A, Roe M, Rugg P, Watson RNM, Woodruff J, Xia H
Publication venue: Proceedings of the Annual International Symposium on Microarchitecture, MICRO
Publication date: 01/01/2019

A lack of temporal safety in low-level languages has led to an epidemic of use-after-free exploits. These have surpassed in number and severity even the infamous buffer-overflow exploits violating spatial safety. Capability addressing can directly enforce spatial safety for the C language by enforcing bounds on pointers and by rendering pointers unforgeable. Nevertheless, an efficient solution for strong temporal memory safety remains elusive. CHERI is an architectural extension to provide hardware capability addressing that is seeing significant commercial and open-source interest. We show that CHERI capabilities can be used as a foundation to enable low-cost heap temporal safety by facilitating out-of-date pointer revocation, as capabilities enable precise and efficient identification and invalidation of pointers, even when using unsafe languages such as C. We develop CHERIvoke, a technique for deterministic and fast sweeping revocation to enforce temporal safety on CHERI systems. CHERIvoke quarantines freed data before periodically using a small shadow map to revoke all dangling pointers in a single sweep of memory, and provides a tunable trade-off between performance and heap growth. We evaluate the performance of such a system using high-performance x86 processors, and further analytically examine its primary overheads. When configured with a heap-size overhead of 25%, we find that CHERIvoke achieves an average execution-time overhead of under 5%, far below the overheads associated with traditional garbage collection, revocation, or page-table systems. EP/K026399/1, EP/P020011/1, EP/K008528/
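The quarantine-then-sweep idea in this abstract can be modelled in a few lines. The sketch below is a toy software model under stated assumptions, not CHERIvoke itself: memory is a Python list of pointer slots, each pointer is represented by the base address of the object it refers to, and the names `ToyHeap`, `GRANULE`, and `quarantine_limit` are hypothetical.

```python
GRANULE = 16  # assumed allocation granule in bytes


class ToyHeap:
    """Toy model: freed regions are quarantined, then revoked in one sweep."""

    def __init__(self, quarantine_limit):
        self.quarantine_limit = quarantine_limit
        self.quarantine = []        # (base, size) regions freed but not yet reusable
        self.quarantined_bytes = 0
        self.shadow = set()         # granule indices painted as "to be revoked"
        self.free_list = []         # regions that are safe to reuse

    def free(self, base, size, memory):
        """Quarantine a region; sweep once the quarantine is large enough.

        Returns the number of pointers revoked by this call (0 if no sweep ran).
        """
        self.quarantine.append((base, size))
        self.quarantined_bytes += size
        if self.quarantined_bytes >= self.quarantine_limit:
            return self.sweep(memory)
        return 0

    def sweep(self, memory):
        """Paint the shadow map, then invalidate every dangling pointer in memory."""
        for base, size in self.quarantine:
            first = base // GRANULE
            last = (base + size + GRANULE - 1) // GRANULE
            self.shadow.update(range(first, last))
        revoked = 0
        for i, ptr in enumerate(memory):
            if ptr is not None and ptr // GRANULE in self.shadow:
                memory[i] = None    # revoke the dangling pointer
                revoked += 1
        # Only after the sweep may the quarantined regions be reused.
        self.free_list.extend(self.quarantine)
        self.quarantine.clear()
        self.quarantined_bytes = 0
        self.shadow.clear()
        return revoked


heap = ToyHeap(quarantine_limit=64)
memory = [0, 32, 32, None, 64]      # pointer slots holding object base addresses
heap.free(32, 32, memory)           # below the limit: quarantined, no sweep yet
heap.free(64, 32, memory)           # limit reached: one sweep revokes both objects
print(memory)
```

Raising `quarantine_limit` makes sweeps rarer at the cost of holding more freed memory, which corresponds to the performance versus heap-growth trade-off the abstract mentions.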

Apollo (Cambridge)

Autotuning for Automatic Parallelization on Heterogeneous Systems

Author: Pfaffe Philip
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 01/01/2020

KITopen

Pulsar: Design and Simulation Methodology for Dynamic Bandwidth Allocation in Photonic Network-on-Chip Architectures in Heterogeneous Multicore Systems

Author: Opong-Mensah Kwadwo
Publication venue: RIT Scholar Works
Publication date: 01/08/2015

As the computing industry moved toward faster and more energy-efficient solutions, multicore computers proved to be dependable. Soon after, the Network-on-Chip (NoC) paradigm made headway as an effective method of connecting multiple cores on a single chip. These on-chip networks have been used to relay communication between homogeneous and heterogeneous sets of cores and core clusters. However, the variation in bandwidth requirements of heterogeneous systems is often neglected. Therefore, at a given moment, bandwidth may be in excess at one node while it is insufficient at another, leading to lower performance and higher energy costs. This work proposes and examines dynamic schemes for the allocation of photonic channels in a Photonic Network-on-Chip (PNoC) as an alternative to their static-provision counterparts. It also proposes a method of simulating and selecting the characteristics of a dynamic system at design time, so as to achieve maximum system performance in a PNoC for a given application type.

RIT Scholar Works

Control Plane Hardware Design for Optical Packet Switched Data Centre Networks

Author: Andreades Paris
Publication venue: UCL (University College London)
Publication date: 28/01/2020

Optical packet switching for intra-data centre networks is key to addressing traffic requirements. Photonic integration and wavelength division multiplexing (WDM) can overcome bandwidth limits in switching systems. A promising technology to build a nanosecond-reconfigurable photonic-integrated switch, compatible with WDM, is the semiconductor optical amplifier (SOA). SOAs are typically used as gating elements in a broadcast-and-select (B&S) configuration, to build an optical crossbar switch. For larger-size switching, a three-stage Clos network, based on crossbar nodes, is a viable architecture. However, the design of the switch control plane is one of the barriers to packet switching; it should run on packet timescales, which becomes increasingly challenging as line rates get higher. The scheduler, used for the allocation of switch paths, limits control clock speed. To this end, the research contribution was the design of highly parallel hardware schedulers for crossbar and Clos network switches. On a field-programmable gate array (FPGA), the minimum scheduler clock period achieved was 5.0 ns and 5.4 ns, for a 32-port crossbar and Clos switch, respectively. By using parallel path allocation modules, one per Clos node, a minimum clock period of 7.0 ns was achieved, for a 256-port switch. For scheduler application-specific integrated circuit (ASIC) synthesis, this reduces to 2.0 ns; a record result enabling scalable packet switching. Furthermore, the control plane was demonstrated experimentally. Moreover, a cycle-accurate network emulator was developed to evaluate switch performance. Results showed a switch saturation throughput at a traffic load 60% of capacity, with sub-microsecond packet latency, for a 256-port Clos switch, outperforming state-of-the-art optical packet switches.

UCL Discovery