Leading Effect of CP Violation with Four Generations
In the Standard Model with a fourth generation of quarks, we study the
relation between the Jarlskog invariants and the triangle areas in the 4-by-4
CKM matrix. To identify the leading effects that may probe CP violation in processes involving quarks, we invoke small-mass and small-angle expansions, and show that these leading effects are considerably enhanced relative to the three-generation case by the large masses of the fourth-generation quarks. We
discuss the leading effect in several cases, in particular the possibility of
large CP violation in such processes, which echoes the heightened recent interest prompted by experimental hints.
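For context, the three-generation baseline that this abstract generalizes is standard: the single Jarlskog invariant J fixes the common area of all six unitarity triangles of the 3-by-3 CKM matrix. A reference form of that relation, in standard notation (quoted as background, not from the paper itself):

```latex
% Standard three-generation relation between the Jarlskog invariant J
% and the unitarity triangles: one rephasing invariant fixes them all.
\mathrm{Im}\!\left( V_{\alpha i} V_{\beta j} V_{\alpha j}^{*} V_{\beta i}^{*} \right)
  = J \sum_{\gamma,k} \epsilon_{\alpha\beta\gamma}\,\epsilon_{ijk},
\qquad
\mathrm{Area}(\triangle) = \frac{|J|}{2}.
```

With four generations there are multiple independent invariants rather than a single J, which is why the paper must relate a family of invariants to a family of triangle areas.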
Efficient Memory Management for GPU-based Deep Learning Systems
GPUs (graphics processing units) are used for many data-intensive applications, and deep learning systems are among their most important consumers today. As deep learning applications adopt deeper and larger models to achieve higher accuracy, memory management becomes an important research topic for deep learning systems, given that GPU memory is limited. Many approaches have been proposed to address this issue, e.g., model compression and memory swapping; however, they either degrade model accuracy or require substantial manual intervention. In this paper, we propose two orthogonal approaches that reduce the memory cost from the system perspective. Both are transparent to the models and thus do not affect model accuracy. They exploit the iterative nature of deep learning training to derive the lifetime and read/write order of all variables. With the lifetime semantics, we implement a memory pool with minimal fragmentation. However, the underlying optimization problem is NP-complete, so we propose a heuristic algorithm that reduces memory usage by up to 13.3% compared with NVIDIA's default memory pool, at equal time complexity. With the read/write semantics, variables that are not in use can be swapped out from GPU to CPU to reduce the memory footprint. We propose multiple swapping strategies that automatically decide which variable to swap and when to swap it out (and back in), reducing the memory cost by up to 34.2% without communication overhead.
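The lifetime-based pool can be illustrated with a small sketch. The offset-assignment problem the abstract calls NP-complete is the classic dynamic storage allocation problem; the first-fit, largest-first heuristic below is a hypothetical stand-in for the paper's actual algorithm, assuming each variable's size and lifetime [alloc_step, free_step) have been profiled from one training iteration.

```python
from typing import Dict, List, Tuple

def assign_offsets(vars_info: List[Tuple[str, int, int, int]]) -> Dict[str, int]:
    """vars_info entries: (name, size, alloc_step, free_step).

    Variables whose lifetimes overlap get disjoint address ranges;
    variables with disjoint lifetimes may share the same offsets.
    """
    placed = []    # (offset, size, alloc_step, free_step)
    offsets = {}
    # Heuristic ordering: place larger variables first.
    for name, size, a, f in sorted(vars_info, key=lambda v: -v[1]):
        # Address ranges already occupied during this variable's lifetime.
        busy = sorted((off, off + sz) for off, sz, a2, f2 in placed
                      if a < f2 and a2 < f)
        offset = 0
        for lo, hi in busy:              # first-fit scan over the gaps
            if offset + size <= lo:
                break
            offset = max(offset, hi)
        offsets[name] = offset
        placed.append((offset, size, a, f))
    return offsets

# Example: the two activations have disjoint lifetimes, so they share
# one region; a naive allocator would need 8192 bytes, this pool 6144.
layout = assign_offsets([
    ("weights", 4096, 0, 10),
    ("act1",    2048, 1, 3),
    ("act2",    2048, 4, 6),
])
print(layout)   # {'weights': 0, 'act1': 4096, 'act2': 4096}
```

Reusing addresses across non-overlapping lifetimes is exactly what the lifetime semantics buys over a pool that only sees allocate/free calls in arrival order.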
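The swapping side can be sketched the same way. The gap-based policy below, with its min_gap and prefetch_lead parameters, is an illustrative assumption rather than the paper's actual strategies: it evicts a tensor only when the idle interval between two consecutive accesses is long enough to hide the transfers.

```python
from collections import defaultdict

def plan_swaps(accesses, min_gap=4, prefetch_lead=2):
    """accesses: ordered (step, var) events from one profiled iteration.

    Returns (swap_out, swap_in) command lists as (step, var) pairs.
    """
    steps = defaultdict(list)
    for step, var in accesses:
        steps[var].append(step)
    swap_out, swap_in = [], []
    for var, ts in steps.items():
        for prev, nxt in zip(ts, ts[1:]):
            if nxt - prev >= min_gap:                 # gap hides both copies
                swap_out.append((prev + 1, var))      # evict right after use
                swap_in.append((nxt - prefetch_lead, var))  # prefetch early
    return sorted(swap_out), sorted(swap_in)

# Forward activations written early and read again only in the backward
# pass are the natural candidates for eviction.
trace = [(0, "act1"), (1, "act2"), (2, "act3"),
         (8, "act3"), (9, "act2"), (10, "act1")]
outs, ins = plan_swaps(trace)
print(outs)  # [(1, 'act1'), (2, 'act2'), (3, 'act3')]
print(ins)   # [(6, 'act3'), (7, 'act2'), (8, 'act1')]
```

In a real system these commands would be issued on a separate copy stream so that transfers overlap with compute kernels, which is what allows swapping to avoid adding communication overhead to the iteration time.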
As-Built and Post-treated Microstructures of an Electron Beam Melting (EBM) Produced Nickel-Based Superalloy
The microstructures of an electron beam melted (EBM) nickel-based superalloy (Alloy 718) were comprehensively investigated in as-built and post-treated conditions, with separate attention to the contour (outer periphery) and hatch (core) regions of the build. The hatch region exhibited columnar grains with strong 〈001〉 texture in the build direction, while the contour region had a mix of columnar and equiaxed grains with no preferred crystallographic texture. Both regions exhibited nearly identical hardness and carbide content; however, the contour region showed a higher number density of fine carbides than the hatch. The as-built material was subjected to two distinct post-treatments: (1) hot isostatic pressing (HIP) and (2) HIP plus heat treatment (HIP + HT), the latter carried out as a single cycle inside the HIP vessel. Both post-treatments resulted in nearly an order of magnitude decrease in defect content in both the hatch and contour regions. HIP + HT led to grain coarsening in the contour but did not alter the microstructure in the hatch region. Different factors that may be responsible for grain growth, such as grain size, grain orientation, grain boundary curvature and secondary phase particles, are discussed. The differences in carbide sizes between the hatch and contour regions appeared to decrease after post-treatment. After HIP + HT, both the hatch and contour regions exhibited similarly higher hardness than the as-built material.