Deep Learning for Identifying Breast Cancer
Medical images play an increasingly important role in the prevention and diagnosis of disease, but they often contain massive amounts of data, and professional interpretation requires years of study and accumulated clinical experience. The storage and computing power underlying deep learning can therefore be used to process large volumes of medical data effectively. Breast cancer causes great harm to female patients, and early diagnosis is the most effective means of prevention and treatment, so this project creates a new, optimized auxiliary diagnosis model for breast cancer based on ResNet. The model analyzes and processes medical images to realize computer-aided diagnosis and provide a scientific basis for diagnosing breast cancer patients.
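The abstract does not give implementation details, but the residual connection that ResNet-based models like this one rely on can be sketched in plain NumPy. This is a hypothetical two-layer fully-connected block for illustration, not the authors' model:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """A minimal residual block: y = relu(x + W2 @ relu(W1 @ x)).
    The skip connection lets the signal (and gradients) bypass the
    learned transformation, which is what makes very deep stacks
    trainable in ResNet-style architectures."""
    out = relu(w1 @ x)
    out = w2 @ out
    return relu(x + out)

rng = np.random.default_rng(0)
x = rng.normal(size=4)
w1 = rng.normal(size=(4, 4)) * 0.1
w2 = rng.normal(size=(4, 4)) * 0.1
y = residual_block(x, w1, w2)

# With zero weights the block reduces to relu(x): the identity on
# non-negative inputs, so an "unhelpful" layer can do no harm.
identity = residual_block(relu(x), np.zeros((4, 4)), np.zeros((4, 4)))
```

In a real diagnosis pipeline one would stack many such blocks (convolutional rather than fully connected) and attach a classification head for the benign/malignant decision.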
Towards Lightweight Data Integration using Multi-workflow Provenance and Data Observability
Modern large-scale scientific discovery requires multidisciplinary
collaboration across diverse computing facilities, including High Performance
Computing (HPC) machines and the Edge-to-Cloud continuum. Integrated data
analysis plays a crucial role in scientific discovery, especially in the
current AI era, by enabling Responsible AI development, FAIR, Reproducibility,
and User Steering. However, the heterogeneous nature of science poses
challenges such as dealing with multiple supporting tools, cross-facility
environments, and efficient HPC execution. Building on data observability,
adapter system design, and provenance, we propose MIDA: an approach for
lightweight runtime Multi-workflow Integrated Data Analysis. MIDA defines data
observability strategies and adaptability methods for various parallel systems
and machine learning tools. With observability, it intercepts the dataflows in
the background without requiring instrumentation while integrating domain,
provenance, and telemetry data at runtime into a unified database ready for
user steering queries. We conduct experiments showing end-to-end multi-workflow
analysis integrating data from Dask and MLFlow in a real distributed deep
learning use case for materials science that runs on multiple environments with
up to 276 GPUs in parallel. We show near-zero overhead running up to 100,000
tasks on 1,680 CPU cores on the Summit supercomputer.
Comment: 10 pages, 5 figures, 2 listings, 42 references. Paper accepted at IEEE eScience'2
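The abstract describes intercepting dataflows without instrumenting user code and integrating provenance and telemetry into one queryable store. A minimal sketch of that adapter idea, using a hypothetical decorator and an in-memory stand-in for the unified database (not the actual MIDA API):

```python
import time
from functools import wraps

# Toy stand-in for the unified runtime database: one record per task.
PROVENANCE_DB = []

def observe(workflow):
    """Hypothetical observability adapter: wrap a task function, capture
    its inputs, outputs, and telemetry as a side effect, and integrate
    the records into a store that is queryable while the run proceeds."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = fn(*args, **kwargs)
            PROVENANCE_DB.append({
                "workflow": workflow,
                "task": fn.__name__,
                "inputs": {"args": args, "kwargs": kwargs},
                "output": result,
                "elapsed_s": time.perf_counter() - start,
            })
            return result
        return wrapper
    return decorator

@observe("materials-dl")
def preprocess(sample):
    return sample * 2

@observe("materials-dl")
def train_step(sample):
    return sample + 1

# The tasks themselves are unmodified; provenance accumulates in the
# background, ready for user-steering-style queries across workflows.
out = train_step(preprocess(10))
steering_view = [r["task"] for r in PROVENANCE_DB if r["workflow"] == "materials-dl"]
```

In MIDA's setting the equivalent records would come from adapters for Dask, MLFlow, and other tools, landing in a shared database rather than a Python list.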
Review of Elements of Parallel Computing
As the title clearly states, this book is about parallel computing. Modern computers are no longer characterized by a single, fully sequential CPU. Instead, they have one or more multicore/manycore processors. The purpose of such parallel architectures is to enable the simultaneous execution of instructions, in order to achieve faster computations. In high performance computing, clusters of parallel processors are used to achieve PFLOPS performance, which is necessary for scientific and Big Data applications.
Mastering parallel computing means having deep knowledge of parallel architectures, parallel programming models, parallel algorithms, parallel design patterns, and performance analysis and optimization techniques. The design of parallel programs requires a lot of creativity, because there is no universal recipe that allows one to achieve the best possible efficiency for any problem.
The book presents the fundamental concepts of parallel computing from the point of view of algorithmic and implementation patterns. The idea is that, while the hardware keeps changing, the same principles of parallel computing are reused. The book surveys key algorithmic structures and programming models, together with an abstract representation of the underlying hardware. Parallel programming patterns are purposely not illustrated using the formal design-patterns approach, in order to keep the presentation informal and friendly, suited to novices.
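The simplest of the algorithmic patterns such a book covers is the parallel map: the same independent computation applied to every item of a collection. A minimal stand-alone illustration (not taken from the book) using Python's standard library:

```python
from concurrent.futures import ThreadPoolExecutor

def work(x):
    # Stand-in for an expensive, independent computation.
    return x * x

data = list(range(8))

# Sequential baseline.
sequential = [work(x) for x in data]

# The "map" pattern: because the items are independent, the runtime is
# free to execute them simultaneously; the result order is preserved.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(work, data))
```

The pattern is the same whether the executor is a thread pool on one multicore CPU or a cluster scheduler, which is exactly the book's point about principles outliving hardware.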
Scientific Image Restoration Anywhere
The use of deep learning models within scientific experimental facilities
frequently requires low-latency inference, so that, for example, quality
control operations can be performed while data are being collected. Edge
computing devices can be useful in this context, as their low cost and compact
form factor permit them to be co-located with the experimental apparatus. Can
such devices, with their limited resources, perform neural network
feed-forward computations efficiently and effectively? We explore this question
by evaluating the performance and accuracy of a scientific image restoration
model, for which both model input and output are images, on edge computing
devices. Specifically, we evaluate deployments of TomoGAN, an image-denoising
model based on generative adversarial networks developed for low-dose x-ray
imaging, on the Google Edge TPU and NVIDIA Jetson. We adapt TomoGAN for edge
execution, evaluate model inference performance, and propose methods to address
the accuracy drop caused by model quantization. We show that these edge
computing devices can deliver accuracy comparable to that of a full-fledged CPU
or GPU model, at speeds that are more than adequate for use in the intended
deployments, denoising a 1024 x 1024 image in less than a second. Our
experiments also show that the Edge TPU models can provide 3x faster inference
response than a CPU-based model and 1.5x faster than an edge GPU-based model.
This combination of high speed and low cost permits image restoration anywhere.
Comment: 6 pages, 8 figures, 1 table
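The accuracy drop the paper addresses comes from model quantization: replacing float32 weights with low-precision integers for edge accelerators. A minimal NumPy sketch of symmetric int8 post-training quantization (an illustration of the general step, not TomoGAN's actual pipeline):

```python
import numpy as np

def quantize_int8(w):
    """Symmetric post-training quantization of a weight tensor to int8.
    Each weight is snapped to one of 255 evenly spaced levels; the
    rounding error introduced here is the source of the accuracy drop
    seen when deploying to integer-only accelerators like the Edge TPU."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(1)
w = rng.normal(size=(64, 64)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)

# Per-weight error is bounded by half a quantization step (scale / 2).
max_err = float(np.abs(w - w_hat).max())
```

Mitigations such as the paper's proposed methods typically work by making the network tolerant of exactly this bounded perturbation, e.g. through calibration or fine-tuning.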
Distributed Sparse Computing and Communication for Big Graph Analytics and Deep Learning
Sparsity can be found in the underlying structure of many real-world, computationally expensive problems, including big graph analytics and large-scale sparse deep neural networks. In addition, when carefully investigated, many of these problems contain a broad substratum of parallelism suitable for parallel and distributed execution of sparse computation. Usually, however, dense computation is preferred to its sparse alternative: sparse computation is not only hard to parallelize, due to the irregular nature of sparse data, but also complicated to implement, since a dense algorithm must be rewritten as a sparse one. Hence, robust sparse computation requires customized data structures to encode the sparsity of the data and new algorithms to mask the complexity of the computation. By carefully exploiting sparse data structures and algorithms, however, sparse computation can reduce memory consumption, communication volume, and processing power, and thus decisively push the scalability boundaries compared to its dense equivalent.
In this dissertation, I explain how to use parallel and distributed computing techniques in the presence of sparsity to solve large scientific problems, including graph analytics and deep learning. To this end, I leverage the duality between graph theory and sparse linear algebra primitives, and thus solve graph analytics and deep learning problems with sparse matrix operations. My contributions are fourfold: (1) the design and implementation of a new distributed compressed sparse matrix data structure that reduces both computation and communication volumes and is suitable for sparse matrix-vector and sparse matrix-matrix operations; (2) a new MPI*X parallelism model that treats threads as the basic units of computing and communication; (3) optimization of sparse matrix-matrix multiplication through different hashing techniques; and (4) a new data-then-model parallelism that mitigates the effect of stragglers in sparse deep learning by combining data and model parallelism. Altogether, these contributions provide a set of data structures and algorithms to accelerate and scale sparse computing and communication.
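The dissertation's distributed data structure is not specified in the abstract, but it builds on the standard compressed sparse row (CSR) encoding, whose benefit is easy to see in a single-node sketch: a matrix-vector product touches only the stored nonzeros.

```python
import numpy as np

def csr_spmv(data, indices, indptr, x):
    """Sparse matrix-vector multiply over a CSR-encoded matrix.
    Only stored nonzeros are read, which is where the memory and
    communication savings over the dense equivalent come from."""
    y = np.zeros(len(indptr) - 1)
    for row in range(len(y)):
        start, end = indptr[row], indptr[row + 1]
        y[row] = np.dot(data[start:end], x[indices[start:end]])
    return y

# The matrix [[1, 0, 2],
#             [0, 0, 3],
#             [4, 5, 0]] in CSR form: nonzero values, their column
# indices, and per-row offsets into both arrays.
data = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
indices = np.array([0, 2, 2, 0, 1])
indptr = np.array([0, 2, 3, 5])
x = np.array([1.0, 1.0, 1.0])
y = csr_spmv(data, indices, indptr, x)
```

A distributed variant, as in the dissertation, additionally partitions rows across processes and communicates only the vector entries each partition's column indices actually reference.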
HPC as a Service: A naive model
Applications in Big Data, Machine Learning, Deep Learning, and other engineering and scientific research require a great deal of computing power, making High-Performance Computing (HPC) an important field. But access to supercomputers is out of reach for the majority. Nowadays, supercomputers are actually clusters of computers, usually built from commodity hardware. Such clusters are called Beowulf clusters; their history goes back to 1994, when NASA built a supercomputer by clustering commodity hardware. In recent times, much effort has gone into building HPC clusters even from single-board computers (SBCs). Although creating clusters of commodity hardware is possible, it is a cumbersome task; moreover, maintaining such systems is difficult and requires special expertise and time. The concept of the cloud is to provide on-demand resources, whether services, platforms, or even infrastructure, by sharing a large resource pool. Cloud computing has resolved problems such as hardware maintenance and the need for networking expertise. This work brings concepts from cloud computing to HPC in order to obtain the benefits of the cloud. The main target is to create a system capable of providing computing power as a service, further referred to as Supercomputer as a Service. A prototype was built using Raspberry Pi (RPi) 3B and 3B+ single-board computers; the reason for using RPi boards was the increasing popularity of ARM processors in the field of HPC.
Comment: 2019 8th International Conference on Information and Communication Technologies (ICICT), Karachi, Pakistan, 2019
Prototype of machine learning “as a service” for CMS physics in signal vs background discrimination
Big volumes of data are collected and analysed by the LHC experiments at CERN. The success of these scientific challenges is ensured by a great amount of computing power and storage capacity, operated over high-performance networks, within very complex LHC computing models on the LHC Computing Grid infrastructure. Now in Run-2 data taking, the LHC has an ambitious and broad experimental programme for the coming decades: it includes large investments in detector hardware, and it similarly requires commensurate investment in R&D in software and computing to acquire, manage, process, and analyse the sheer amounts of data to be recorded in the High-Luminosity LHC (HL-LHC) era.
The new rise of Artificial Intelligence - related to the current Big Data era, to technological progress, and to the democratization and efficient allocation of resources at affordable costs through cloud solutions - poses new challenges but also offers extremely promising techniques, not only for the commercial world but also for scientific enterprises such as HEP experiments. Machine Learning and Deep Learning are rapidly evolving approaches to characterising and describing data, with the potential to radically change how data are reduced and analysed, including at the LHC.
This thesis aims to contribute to the construction of a Machine Learning “as a service” solution for CMS physics needs, namely an end-to-end data service that serves trained Machine Learning models to the CMS software framework. Toward this ambitious goal, this thesis contributes, first, a proof of concept of a first prototype of such an infrastructure and, second, a specific physics use case: signal versus background discrimination in the study of CMS all-hadronic top quark decays, carried out with scalable Machine Learning techniques.
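Signal-versus-background discrimination at its core is a binary classifier trained on event features. A toy NumPy sketch with synthetic Gaussian "signal" and "background" samples (purely illustrative; the real use case feeds kinematic features of all-hadronic top decays into far richer, scalable models):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical two-feature toy sample: signal and background as
# overlapping Gaussian clouds, labels 1 and 0.
n = 500
signal = rng.normal(loc=1.0, scale=1.0, size=(n, 2))
background = rng.normal(loc=-1.0, scale=1.0, size=(n, 2))
X = np.vstack([signal, background])
y = np.concatenate([np.ones(n), np.zeros(n)])

# Logistic-regression discriminant trained by plain gradient descent;
# its output is the per-event "signal-ness" score an analysis would cut on.
w, b = np.zeros(2), 0.0
for _ in range(500):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

scores = 1.0 / (1.0 + np.exp(-(X @ w + b)))
accuracy = float(np.mean((scores > 0.5) == y))
```

In the "as a service" setting the trained weights would live behind a network endpoint queried by the CMS software framework, rather than in the analysis process itself.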