MSREP: A Fast yet Light Sparse Matrix Framework for Multi-GPU Systems
Sparse linear algebra kernels play a critical role in numerous applications,
ranging from exascale scientific simulation to large-scale data analytics.
Offloading these kernels to a single GPU is no longer viable in such
applications, simply because rapidly growing data volumes can exceed the
memory capacity and computing power of one GPU. Multi-GPU systems, now
ubiquitous in supercomputers and data centers, offer great potential for
scaling up large sparse linear algebra kernels. In this work, we design MSREP,
a novel sparse matrix representation framework for multi-GPU systems, to scale
sparse linear algebra operations based on our augmented sparse matrix formats
in a balanced manner. Unlike dense operations, sparsity significantly
intensifies the difficulty of distributing the computational workload evenly
among multiple GPUs. We enhance three mainstream sparse data formats -- CSR,
CSC, and COO -- to enable fine-grained data distribution. We take sparse
matrix-vector multiplication (SpMV) as an example to demonstrate the
efficiency of our MSREP framework. In addition, MSREP can easily be extended
to support other sparse linear algebra kernels based on the three fundamental
formats (i.e., CSR, CSC, and COO).
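To make the CSR layout mentioned above concrete, here is a minimal sketch of SpMV over a CSR-encoded matrix. This is a plain single-device illustration of the base format, not the MSREP framework or its augmented, multi-GPU variants; the array names are assumptions for illustration.

```python
# Minimal SpMV sketch over the standard CSR format (values / col_idx / row_ptr).
# This illustrates the base format only, not MSREP's augmented distribution scheme.
def csr_spmv(values, col_idx, row_ptr, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form."""
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        # row_ptr[i]:row_ptr[i+1] delimits the nonzeros of row i.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

# A = [[10,  0,  0],
#      [ 0, 20, 30],
#      [ 0,  0, 40]]
values  = [10.0, 20.0, 30.0, 40.0]
col_idx = [0, 1, 2, 2]
row_ptr = [0, 1, 3, 4]
print(csr_spmv(values, col_idx, row_ptr, [1.0, 2.0, 3.0]))  # [10.0, 130.0, 120.0]
```

Because `row_ptr` gives each row's nonzero count up front, a framework can split rows (or even partial rows) across devices to balance work, which is the load-balancing problem the abstract highlights.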
New-Sum: A Novel Online ABFT Scheme for General Iterative Methods
Emerging high-performance computing platforms, with large component counts and lower power margins, are anticipated to be more susceptible to soft errors in both logic circuits and memory subsystems. We present an online algorithm-based fault tolerance (ABFT) approach to efficiently detect and recover from soft errors in general iterative methods. We design a novel checksum-based encoding scheme for matrix-vector multiplication that is resilient to both arithmetic and memory errors. Our design decouples the checksum updating process from the actual computation and allows adaptive checksum overhead control. Building on this new encoding mechanism, we propose two online ABFT designs that can effectively recover from errors when combined with a checkpoint/rollback scheme. These designs can address scenarios with different error rates. Our ABFT approaches apply to a wide range of iterative solvers that primarily rely on matrix-vector multiplication and vector linear operations. We evaluate our designs through comprehensive analytical and empirical analysis. Experimental evaluation on the Stampede supercomputer demonstrates the low performance overheads incurred by our two ABFT schemes for preconditioned CG (0.4% and 2.2%) and preconditioned BiCGSTAB (1.0% and 4.0%) on the largest SPD matrix from the UFL Sparse Matrix Collection. The evaluation also demonstrates the flexibility and effectiveness of our proposed designs for detecting and recovering various types of soft errors in general iterative methods.
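The checksum idea behind ABFT for matrix-vector products can be sketched as follows. This is a hedged illustration of the classic checksum invariant (a soft error in y perturbs the identity c^T y = (c^T A) x), not the paper's New-Sum encoding, whose decoupled, adaptive-overhead scheme differs in detail; function names are assumptions.

```python
# Illustrative checksum check for y = A @ x, in the spirit of classic ABFT.
# With checksum vector c = all-ones, the invariant is sum(y) == (c^T A) @ x.
# This is NOT the paper's New-Sum scheme, only the underlying idea.
def checked_matvec(A, x, tol=1e-9):
    n_rows, n_cols = len(A), len(A[0])
    # Encode: column checksums c^T A (in real ABFT, precomputed/updated offline).
    col_checksum = [sum(A[i][j] for i in range(n_rows)) for j in range(n_cols)]
    # The actual computation.
    y = [sum(A[i][j] * x[j] for j in range(n_cols)) for i in range(n_rows)]
    # Verify: a soft error in y (or in the product) breaks the invariant.
    expected = sum(cs * xj for cs, xj in zip(col_checksum, x))
    if abs(sum(y) - expected) > tol:
        raise RuntimeError("soft error detected in matrix-vector product")
    return y

A = [[1.0, 2.0], [3.0, 4.0]]
print(checked_matvec(A, [1.0, 1.0]))  # [3.0, 7.0]
```

In an iterative solver, a failed check would trigger the checkpoint/rollback recovery the abstract describes rather than raising an exception.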
RenAIssance: A Survey into AI Text-to-Image Generation in the Era of Large Model
Text-to-image generation (TTI) refers to models that process text input and
generate high-fidelity images based on text descriptions. Text-to-image
generation using neural networks can be traced back to the emergence of the
Generative Adversarial Network (GAN), followed by the autoregressive
Transformer. Diffusion models are a prominent type of generative model that
produce images through the systematic introduction of noise over repeated
steps. Owing to their impressive results in image synthesis, diffusion models
have been cemented as the dominant image decoder used by text-to-image models,
bringing text-to-image generation to the forefront of machine-learning (ML)
research. In the era of large models, scaling up model size and integrating
large language models have further improved the performance of TTI models,
producing results nearly indistinguishable from real-world images and
revolutionizing the way we retrieve images. Our exploratory study has led us
to believe that text-to-image models can be scaled further through the
combination of innovative model architectures and prediction enhancement
techniques. We have divided this survey into five main sections, in which we
detail the frameworks of the major literature in order to examine the
different types of text-to-image generation methods. We then provide a
detailed comparison and critique of these methods and offer possible pathways
of improvement for future work. Looking ahead, we argue that TTI development
could yield impressive productivity improvements for content creation,
particularly in the context of the AIGC era, and could be extended to more
complex tasks such as video generation and 3D generation.
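The "systematic introduction of noise over repeated steps" that the survey attributes to diffusion models can be sketched with the standard forward (noising) process; the closed-form noising equation is x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps with eps drawn from a standard normal. The schedule value and vector size below are illustrative assumptions, and the reverse (denoising) network that actually generates images is omitted.

```python
import math
import random

# Toy sketch of one forward-diffusion noising step:
#   x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps,  eps ~ N(0, I)
# alpha_bar_t (cumulative noise schedule) here is an arbitrary illustrative value.
def forward_diffuse(x0, alpha_bar_t, rng=random):
    keep = math.sqrt(alpha_bar_t)        # how much of the clean signal survives
    noise = math.sqrt(1.0 - alpha_bar_t)  # how much Gaussian noise is mixed in
    return [keep * v + noise * rng.gauss(0.0, 1.0) for v in x0]

x0 = [1.0, -1.0, 0.5]            # stand-in for a flattened image
noisy = forward_diffuse(x0, alpha_bar_t=0.5)
print(len(noisy) == len(x0))     # True: same shape, signal partially replaced by noise
```

A trained diffusion model learns to invert steps like this one, iteratively denoising pure noise into an image conditioned on the text prompt.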