Demystifying the Characteristics of 3D-Stacked Memories: A Case Study for Hybrid Memory Cube
Three-dimensional (3D)-stacking technology, which enables the integration of
DRAM and logic dies, offers high bandwidth and low energy consumption. This
technology also empowers new memory designs for executing tasks not
traditionally associated with memories. A practical 3D-stacked memory is Hybrid
Memory Cube (HMC), which provides significant access bandwidth and low power
consumption in a small area. Although several studies have taken advantage of
the novel architecture of HMC, its characteristics in terms of latency and
bandwidth or their correlation with temperature and power consumption have not
been fully explored. This paper is the first, to the best of our knowledge, to
characterize the thermal behavior of HMC in a real environment using the AC-510
accelerator and to identify temperature as a new limitation for this
state-of-the-art design space. Moreover, besides bandwidth studies, we
deconstruct factors that contribute to latency and reveal their sources for
high- and low-load accesses. The results of this paper demonstrate essential
behaviors and performance bottlenecks for future explorations of
packet-switched and 3D-stacked memories.
Performance Implications of NoCs on 3D-Stacked Memories: Insights from the Hybrid Memory Cube
Memories that exploit three-dimensional (3D)-stacking technology, which
integrates memory and logic dies in a single stack, are becoming popular. These
memories, such as Hybrid Memory Cube (HMC), utilize a network-on-chip (NoC)
design for connecting their internal structural organizations. This novel usage
of NoC, in addition to aiding processing-in-memory capabilities, enables
numerous benefits such as high bandwidth and memory-level parallelism. However,
the implications of NoCs on the characteristics of 3D-stacked memories in terms
of memory access latency and bandwidth have not been fully explored. This paper
addresses this knowledge gap by (i) characterizing an HMC prototype on the
AC-510 accelerator board and revealing its access latency behaviors, and (ii)
investigating the implications of such behaviors on system and software
designs.
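To make the latency discussion concrete, the sketch below shows the classic
pointer-chasing approach to measuring unloaded memory access latency from the
host side. It is a minimal illustration only, not the AC-510/HMC measurement
flow from this work; the buffer size, iteration count, and timing method are
assumptions chosen for clarity.

#include <algorithm>
#include <chrono>
#include <cstdio>
#include <numeric>
#include <random>
#include <vector>

int main() {
    const std::size_t n = 1 << 24;        // ~16M entries, far beyond on-chip caches
    std::vector<std::size_t> next(n);

    // Build a random cyclic permutation so every load depends on the previous
    // one; this exposes full access latency rather than bandwidth.
    std::vector<std::size_t> order(n);
    std::iota(order.begin(), order.end(), 0);
    std::shuffle(order.begin(), order.end(), std::mt19937_64{42});
    for (std::size_t i = 0; i < n; ++i)
        next[order[i]] = order[(i + 1) % n];

    const std::size_t iters = 1 << 26;
    std::size_t idx = order[0];
    auto t0 = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iters; ++i)
        idx = next[idx];                  // serialized dependent loads
    auto t1 = std::chrono::steady_clock::now();

    double ns = std::chrono::duration<double, std::nano>(t1 - t0).count();
    // Print idx so the chase is not optimized away.
    std::printf("avg load-to-use latency: %.1f ns (end index %zu)\n", ns / iters, idx);
    return 0;
}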
Genetic characterization of chicken infectious anaemia viruses isolated in Korea and their pathogenicity in chicks
Chicken infectious anaemia virus (CIAV) causes severe anaemia and immunosuppression through horizontal or vertical transmission in young chickens. In particular, vertical transmission of the virus through the egg can lead to significant economic losses due to increased mortality in the broiler industry. Here, 28 complete CIAV sequences circulating in Korea were characterized for the first time using newly designed primers. Phylogenetic analysis based on the complete sequences revealed that the CIAV isolates were divided into four groups, IIa (2/28, 7.1%), IIb (9/28, 32.1%), IIIa (8/28, 28.6%) and IIIb (9/28, 32.1%), and exhibited a close relationship to each other. The major groups were IIb, IIIa and IIIb, and no strains clustered with the vaccine strain available in Korea. For viral titration, we also developed a new quantitative PCR assay that is highly sensitive, reliable and simple. To investigate the pathogenicity of the three major genotypes, the 18R001 (IIb), 08AQ017A (IIIa), and 17AD008 (IIIb) isolates were used to challenge one-day-old specific-pathogen-free (SPF) chicks. Each CIAV strain caused anaemia, severe growth retardation and immunosuppression in chickens regardless of genotype. Notably, the 17AD008 strain showed stable cellular adaptability and a higher virus titre in vitro as well as higher pathogenicity in vivo. Taken together, our study provides valuable information on the molecular characteristics, genetic diversity and pathogenicity of CIAV, which can improve the management and control of CIA on poultry farms.
Copernicus: Characterizing the Performance Implications of Compression Formats Used in Sparse Workloads
Sparse matrices are the key ingredients of several application domains, from
scientific computation to machine learning. The primary challenge with sparse
matrices has been efficiently storing and transferring data, for which many
sparse formats have been proposed to eliminate zero entries. Such
formats, essentially designed to optimize memory footprint, may not be as
successful in performing faster processing. In other words, although they allow
faster data transfer and improve memory bandwidth utilization -- the classic
challenge of sparse problems -- their decompression mechanism can potentially
create a computation bottleneck. Not only does this challenge remain
unresolved, it becomes more serious with the advent of domain-specific
architectures (DSAs), as they aim to improve performance more aggressively. The
performance implications of using various formats along with DSAs, however, have
not been extensively studied by prior work. To fill this knowledge gap, we
characterize the impact of using seven frequently used sparse formats on
performance, based on a DSA for sparse matrix-vector multiplication (SpMV),
implemented on an FPGA using high-level synthesis (HLS) tools, a growing and
popular method for developing DSAs. Seeking a fair comparison, we tailor and
optimize the HLS implementation of decompression for each format. We thoroughly
explore diverse metrics, including decompression overhead, latency, balance
ratio, throughput, memory bandwidth utilization, resource utilization, and
power consumption, on a variety of real-world and synthetic sparse workloads.
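To illustrate the kind of decompression work a sparse format imposes, here is a
minimal CSR-based SpMV sketch in C++. The structure and loop are generic
textbook CSR, not the paper's HLS kernels, and they stand in for only one of
the seven formats studied; the indirect gather through col_idx is the traversal
overhead the abstract refers to as decompression.

#include <cstddef>
#include <vector>

// Compressed Sparse Row (CSR) storage: only nonzeros are kept, with per-row
// offsets into the column-index and value arrays.
struct CsrMatrix {
    std::size_t rows = 0;
    std::vector<std::size_t> row_ptr;  // rows + 1 offsets into col_idx/vals
    std::vector<std::size_t> col_idx;  // column index of each nonzero
    std::vector<double> vals;          // nonzero values, row-major
};

// y = A * x. The gather through col_idx is the "decompression" work the format
// imposes on the compute pipeline in addition to the multiply-adds.
std::vector<double> spmv_csr(const CsrMatrix& a, const std::vector<double>& x) {
    std::vector<double> y(a.rows, 0.0);
    for (std::size_t r = 0; r < a.rows; ++r)
        for (std::size_t k = a.row_ptr[r]; k < a.row_ptr[r + 1]; ++k)
            y[r] += a.vals[k] * x[a.col_idx[k]];
    return y;
}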
HPerf: A Lightweight Profiler for Task Distribution on CPU+GPU Platforms
Research areas: Computer architecture, Programming analysis
Heterogeneous computing has emerged as one of
the major computing platforms in many domains. Although
there have been several proposals to aid programming for
heterogeneous computing platforms, optimizing applications
on heterogeneous computing platforms is not an easy task.
Identifying which parallel regions (or tasks) should run on
GPUs or CPUs is one of the critical decisions to improve
performance. In this paper, we propose a profiler, HPerf, to identify
an efficient task distribution on CPU+GPU systems with
low profiling overhead. HPerf is a hierarchical profiler. First,
it performs lightweight profiling and then, if necessary, it
performs detailed profiling to measure caching and data
transfer costs. Compared to a brute-force approach, HPerf
reduces the profiling overhead significantly, and compared to
a naive decision, HPerf improves the performance of OpenCL
applications by up to 25%.
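As a rough illustration of the hierarchical idea described above, the following
C++ sketch makes a CPU-versus-GPU placement decision in two stages: a cheap
timing pass first, and a detailed pass for transfer and caching costs only when
the cheap pass is inconclusive. The function names, thresholds, and cost model
are assumptions for illustration, not HPerf's actual interface.

#include <cstdio>
#include <string>

struct LightProfile { double cpu_ms; double gpu_kernel_ms; };
struct DetailedProfile { double transfer_ms; double cache_penalty_ms; };

// Stage 1: cheap wall-clock timing of the task on each device.
// (Stub values here; a real pass would wrap timers or OpenCL profiling events.)
LightProfile light_profile(const std::string&) { return {12.0, 8.0}; }

// Stage 2: costlier pass measuring data-transfer and caching effects,
// run only when stage 1 is inconclusive.
DetailedProfile detailed_profile(const std::string&) { return {5.0, 1.0}; }

std::string choose_device(const std::string& task) {
    LightProfile lp = light_profile(task);
    // A wide margin in the cheap pass lets us skip the detailed one entirely.
    if (lp.gpu_kernel_ms * 2.0 < lp.cpu_ms) return "GPU";
    if (lp.cpu_ms * 2.0 < lp.gpu_kernel_ms) return "CPU";
    // Otherwise fold transfer and caching costs into the GPU estimate.
    DetailedProfile dp = detailed_profile(task);
    double gpu_total = lp.gpu_kernel_ms + dp.transfer_ms + dp.cache_penalty_ms;
    return gpu_total < lp.cpu_ms ? "GPU" : "CPU";
}

int main() {
    std::printf("run this task on: %s\n", choose_device("example_kernel").c_str());
    return 0;
}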