4 research outputs found

    Distributed Training and Optimization Of Neural Networks

    Deep learning models are yielding increasingly better performance thanks to multiple factors. To be successful, a model may have a large number of parameters or a complex architecture and be trained on a large dataset. This leads to heavy requirements on computing resources and turnaround time, even more so when hyper-parameter optimization is performed (e.g. a search over model architectures). While this challenge goes beyond particle physics, we review the various ways to carry out the necessary computations in parallel, and put them in the context of high energy physics. Comment: 20 pages, 4 figures, 2 tables. Submitted for review. To appear in "Artificial Intelligence for Particle Physics", World Scientific Publishing.
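    The most common starting point for the parallel computations surveyed in this review is data parallelism, where each worker holds a replica of the model and gradients are averaged across workers after every step. As a rough illustration, the sketch below is a minimal data-parallel training loop assuming PyTorch's torch.distributed and DistributedDataParallel; the model, dataset and hyper-parameters are placeholders, not anything taken from the paper.

    # Minimal data-parallel training sketch using PyTorch DistributedDataParallel.
    # Assumes the script is launched with `torchrun --nproc_per_node=<N> train.py`;
    # the model, dataset, and hyper-parameters below are illustrative placeholders.
    import os
    import torch
    import torch.distributed as dist
    from torch.nn.parallel import DistributedDataParallel as DDP
    from torch.utils.data import DataLoader, TensorDataset, DistributedSampler

    def main():
        dist.init_process_group(backend="nccl")           # one process per GPU
        local_rank = int(os.environ["LOCAL_RANK"])
        torch.cuda.set_device(local_rank)

        # Placeholder model and synthetic dataset.
        model = torch.nn.Sequential(
            torch.nn.Linear(128, 256), torch.nn.ReLU(), torch.nn.Linear(256, 10)
        ).cuda(local_rank)
        model = DDP(model, device_ids=[local_rank])       # gradients all-reduced across ranks

        data = TensorDataset(torch.randn(4096, 128), torch.randint(0, 10, (4096,)))
        sampler = DistributedSampler(data)                # shards the dataset across ranks
        loader = DataLoader(data, batch_size=64, sampler=sampler)

        optim = torch.optim.SGD(model.parameters(), lr=0.1)
        loss_fn = torch.nn.CrossEntropyLoss()

        for epoch in range(2):
            sampler.set_epoch(epoch)                      # reshuffle consistently across ranks
            for x, y in loader:
                x, y = x.cuda(local_rank), y.cuda(local_rank)
                optim.zero_grad()
                loss_fn(model(x), y).backward()           # backward triggers gradient all-reduce
                optim.step()

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()

    Data parallelism scales the effective batch size with the number of workers; the other schemes discussed in the review (model parallelism, parallel hyper-parameter search) distribute the computation along different dimensions.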

    The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism

    We present scalable hybrid-parallel algorithms for training large-scale 3D convolutional neural networks. Deep learning-based emerging scientific workflows often require model training with large, high-dimensional samples, which can make training much more costly and even infeasible due to excessive memory usage. We solve these challenges by extensively applying hybrid parallelism throughout the end-to-end training pipeline, including both computations and I/O. Our hybrid-parallel algorithm extends the standard data parallelism with spatial parallelism, which partitions a single sample in the spatial domain, realizing strong scaling beyond the mini-batch dimension with a larger aggregated memory capacity. We evaluate our proposed training algorithms with two challenging 3D CNNs, CosmoFlow and 3D U-Net. Our comprehensive performance studies show that good weak and strong scaling can be achieved for both networks using up to 2K GPUs. More importantly, we enable training of CosmoFlow with much larger samples than previously possible, realizing an order-of-magnitude improvement in prediction accuracy. Comment: 12 pages, 10 figures.
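    The key idea behind extending data parallelism with spatial parallelism is that a single large sample is partitioned along its spatial dimensions, with each device computing on its shard plus a small halo of neighbouring voxels so that convolutions at shard boundaries stay exact. The following single-process sketch illustrates that decomposition for one 3D convolution; in a real hybrid-parallel setup each shard would reside on a different GPU and the halos would be exchanged over the interconnect. The shapes and shard count are illustrative, not taken from the paper.

    # Single-process sketch of spatial-domain partitioning: one 3D sample is split
    # along the depth axis into shards, each shard is convolved together with a
    # one-voxel halo, and the results tile back to the full-volume output.
    import torch
    import torch.nn.functional as F

    conv = torch.nn.Conv3d(1, 8, kernel_size=3)            # no built-in padding
    volume = torch.randn(1, 1, 64, 64, 64)                  # one large sample (N, C, D, H, W)
    padded = F.pad(volume, (1, 1, 1, 1, 1, 1))              # explicit boundary padding

    full = conv(padded)                                      # reference: full-volume result

    num_shards = 4                                           # would be one GPU per shard
    out_depth = volume.shape[2] // num_shards                # output depth per shard
    pieces = []
    for r in range(num_shards):
        lo = r * out_depth                                   # input slice = output slice + 2-voxel halo
        hi = (r + 1) * out_depth + 2
        pieces.append(conv(padded[:, :, lo:hi]))             # local compute on one shard

    tiled = torch.cat(pieces, dim=2)
    print(torch.allclose(tiled, full, atol=1e-5))            # True: shards tile the full output

    Because the sample itself is split, the per-device memory footprint shrinks with the number of shards, which is what allows samples too large for a single GPU to be trained at all.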

    Just-in-time deep learning for real-time X-ray computed tomography

    Real-time X-ray tomography pipelines, such as the one implemented by RECAST3D, compute and visualize tomographic reconstructions in milliseconds, and enable the observation of dynamic experiments in synchrotron beamlines and laboratory scanners. For extending real-time reconstruction with image processing and analysis components, Deep Neural Networks (DNNs) are a promising technology, due to their strong performance and much faster run-times compared to conventional algorithms. DNNs may prevent experiment repetition by simplifying real-time steering and optimization of the ongoing experiment. The main challenge of integrating DNNs into real-time tomography pipelines, however, is that they need to learn their task from representative data before the start of the experiment. In scientific environments, such training data may not exist, and other uncertain and variable factors, such as the set-up configuration, reconstruction parameters, or user interaction, cannot easily be anticipated beforehand, either. To overcome these problems, we developed just-in-time learning, an online DNN training strategy that takes advantage of the spatio-temporal continuity of consecutive reconstructions in the tomographic pipeline. This allows training and deploying comparatively small DNNs during the experiment. We provide software implementations, and study the feasibility and challenges of the approach by training the self-supervised Noise2Inverse denoising task with X-ray data replayed from real-world dynamic experiments.
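    As a rough illustration of the just-in-time idea, the sketch below trains a deliberately small network online on a stream of incoming reconstructions and deploys it immediately on each new frame. The stream, network, and self-supervised target here are placeholder stand-ins; the paper's actual Noise2Inverse training splits the measured projection data, which is not reproduced in this sketch.

    # Hypothetical sketch of an online ("just-in-time") training loop: as each new
    # reconstruction arrives, a small CNN takes a few gradient steps on it and is
    # then used immediately for inference. The stream and the self-supervised
    # target (a noisier copy mapped to the incoming frame) are placeholders and
    # not the paper's actual Noise2Inverse formulation.
    import torch
    import torch.nn.functional as F

    net = torch.nn.Sequential(                        # deliberately small DNN
        torch.nn.Conv2d(1, 16, 3, padding=1), torch.nn.ReLU(),
        torch.nn.Conv2d(16, 1, 3, padding=1),
    )
    optim = torch.optim.Adam(net.parameters(), lr=1e-3)

    def reconstruction_stream(n_frames=50):
        """Placeholder for the live stream of reconstructed slices."""
        for _ in range(n_frames):
            yield torch.rand(1, 1, 128, 128)

    for frame in reconstruction_stream():
        # A few quick training steps on the newest frame, relying on the
        # spatio-temporal continuity of consecutive reconstructions.
        noisy_input = frame + 0.1 * torch.randn_like(frame)   # stand-in noisy view
        for _ in range(4):
            optim.zero_grad()
            F.mse_loss(net(noisy_input), frame).backward()
            optim.step()

        with torch.no_grad():                         # deploy immediately on the live frame
            denoised = net(frame)

    Keeping the network small is what makes training during the experiment feasible: each incoming frame only has to fund a handful of gradient steps before the model is usable on the next one.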