
    Montage: a grid portal and software toolkit for science-grade astronomical image mosaicking

    Montage is a portable software toolkit for constructing custom, science-grade mosaics by composing multiple astronomical images. The mosaics constructed by Montage preserve the astrometry (position) and photometry (intensity) of the sources in the input images. The mosaic to be constructed is specified by the user in terms of a set of parameters, including dataset and wavelength to be used, location and size on the sky, coordinate system and projection, and spatial sampling rate. Many astronomical datasets are massive and are stored in distributed archives that are, in most cases, remote with respect to the available computational resources. Montage can be run on both single- and multi-processor computers, including clusters and grids. Standard grid tools are used to run Montage when the data or computers used to construct a mosaic are located remotely on the Internet. This paper describes the architecture, algorithms, and usage of Montage as both a software toolkit and a grid portal. Timing results are provided to show how Montage performance scales with the number of processors on a cluster computer. In addition, we compare the performance of two methods of running Montage in parallel on a grid. (Comment: 16 pages, 11 figures)
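
    The standard Montage workflow chains its command-line tools: build an image table, derive a common header, reproject, and co-add. Below is a minimal sketch assuming the Montage binaries (mImgtbl, mMakeHdr, mProjExec, mAdd) are installed and on the PATH; it omits the background-rectification stage, and the exact flags should be checked against the Montage documentation.

```python
# Minimal driver for a background-free Montage mosaic. Tool names follow
# the documented Montage flow; verify flags against the local installation.
import os
import subprocess

def run(cmd):
    # Echo and execute one Montage step, stopping on any failure.
    print(" ".join(cmd))
    subprocess.run(cmd, check=True)

raw, proj = "raw", "proj"          # input FITS directory, reprojection workspace
os.makedirs(proj, exist_ok=True)

run(["mImgtbl", raw, "images.tbl"])                       # catalog the input images
run(["mMakeHdr", "images.tbl", "template.hdr"])           # derive a common output header
run(["mProjExec", "-p", raw, "images.tbl",
     "template.hdr", proj, "stats.tbl"])                  # reproject every input image
run(["mImgtbl", proj, "pimages.tbl"])                     # catalog the reprojected images
run(["mAdd", "-p", proj, "pimages.tbl",
     "template.hdr", "mosaic.fits"])                      # co-add into the final mosaic
```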

    Serial-batch scheduling – the special case of laser-cutting machines

    The dissertation deals with a problem in the field of short-term production planning, namely the scheduling of laser-cutting machines. The decisions to be made are the grouping of production orders into batches (batching) and the sequencing of these batches on one or more machines (scheduling). This problem is known in the literature as the "batch scheduling problem" and, owing to the interdependencies between the batching and scheduling decisions, belongs to the class of combinatorial optimization problems. The concepts and methods used are drawn mainly from production planning, operations research, and machine learning.
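
    To make the batching/scheduling interplay concrete, here is a small illustrative heuristic (not the dissertation's method): orders sharing a hypothetical material/thickness family are grouped into capacity-limited batches, a serial batch takes one shared setup plus the sum of its job times, and batches are sequenced by the shortest-processing-time rule.

```python
# Illustrative serial-batch scheduling heuristic. On a serial-batch machine
# a batch's processing time is one setup plus the sum of its jobs' times;
# SPT sequencing then minimizes total batch completion time on one machine.
from dataclasses import dataclass

@dataclass
class Job:
    name: str
    family: str   # e.g. material + sheet thickness (hypothetical grouping key)
    time: float   # cutting time

def build_batches(jobs, capacity, setup):
    by_family = {}
    for j in jobs:
        by_family.setdefault(j.family, []).append(j)
    batches = []
    for family, js in by_family.items():
        for i in range(0, len(js), capacity):        # greedy capacity-limited split
            group = js[i:i + capacity]
            batches.append((family, group, setup + sum(j.time for j in group)))
    return batches

def schedule(batches):
    t, plan = 0.0, []
    for family, group, p in sorted(batches, key=lambda b: b[2]):  # SPT order
        t += p
        plan.append((family, [j.name for j in group], t))
    return plan

jobs = [Job("A", "steel-3mm", 4), Job("B", "steel-3mm", 2),
        Job("C", "alu-2mm", 3), Job("D", "steel-3mm", 5)]
for family, names, done in schedule(build_batches(jobs, capacity=2, setup=1)):
    print(family, names, "finishes at", done)
```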

    ASCR/HEP Exascale Requirements Review Report

    This draft report summarizes and details the findings, results, and recommendations derived from the ASCR/HEP Exascale Requirements Review meeting held in June 2015. The main conclusions are as follows. 1) Larger, more capable computing and data facilities are needed to support HEP science goals in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of demand on the 2025 timescale is at least two orders of magnitude greater -- and in some cases more -- than what is currently available. 2) The growth rate of data produced by simulations is overwhelming the current ability of both facilities and researchers to store and analyze it. Additional resources and new techniques for data analysis are urgently needed. 3) Data rates and volumes from HEP experimental facilities are also straining the ability to store and analyze large and complex data volumes. Appropriately configured leadership-class facilities can play a transformational role in enabling scientific discovery from these datasets. 4) Close integration of HPC simulation and data analysis will aid greatly in interpreting results from HEP experiments. Such integration will minimize data movement and facilitate interdependent workflows. 5) Long-range planning between HEP and ASCR will be required to meet HEP's research needs. To best use ASCR HPC resources, the experimental HEP program needs a) an established long-term plan for access to ASCR computational and data resources, b) an ability to map workflows onto HPC resources, c) the ability for ASCR facilities to accommodate workflows run by collaborations that can have thousands of individual members, d) to transition codes to the next-generation HPC platforms that will be available at ASCR facilities, and e) to build up and train a workforce capable of developing and using simulations and analysis to support HEP scientific research on next-generation systems. (Comment: 77 pages, 13 figures; draft report, subject to further revision)

    Optimizing Integrated Terminal Airspace Operations Under Uncertainty

    In the terminal airspace, integrated departures and arrivals have the potential to increase operational efficiency. Recent research has developed genetic-algorithm-based schedulers for integrated arrival and departure operations under uncertainty. This paper presents an alternative method using a machine job-shop scheduling formulation to model the integrated airspace operations. A multistage stochastic programming approach is chosen to formulate the problem, and candidate solutions are obtained by solving sample average approximation problems of finite sample size. Because the solutions are approximate, the proposed algorithm incorporates the computation of statistical bounds to estimate the optimality of the candidate solutions. A proof-of-concept study is conducted on a baseline implementation of a simple problem considering a fleet mix of 14 aircraft evolving in a model of the Los Angeles terminal airspace. A more thorough statistical analysis is also performed to evaluate the impact of the number of scenarios considered in the sampled problem. To handle the extensive sampling computations, a multithreading technique is introduced.
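
    The statistical-bound idea can be sketched on a toy stand-in problem. In the standard SAA recipe, averaging the optimal values of several independently sampled problems gives a statistical lower bound on the true optimum, while evaluating one candidate solution on a large fresh sample gives an upper bound; their gap estimates optimality. The one-dimensional "buffer time" problem below is purely illustrative, not the paper's multistage formulation.

```python
# Sample average approximation (SAA) with statistical optimality bounds,
# on a toy problem: choose a buffer time x trading delay against earliness
# under an uncertain taxi time xi ~ N(10, 2).
import random, statistics

def cost(x, xi):
    return 2.0 * max(xi - x, 0.0) + 0.5 * max(x - xi, 0.0)  # delay vs. earliness

def solve_saa(n, grid, rng):
    # Minimize the empirical mean cost over a candidate grid of buffer times.
    sample = [rng.gauss(10.0, 2.0) for _ in range(n)]
    value, x = min((sum(cost(x, xi) for xi in sample) / n, x) for x in grid)
    return x, value

rng = random.Random(0)
grid = [i * 0.25 for i in range(80)]

# Lower-bound estimate: mean optimal value of M independent SAA problems
# (E[min] <= min E, so this underestimates the true optimum on average).
lb = statistics.mean(solve_saa(50, grid, rng)[1] for _ in range(20))

# Upper-bound estimate: evaluate one candidate on a large fresh sample.
x_hat, _ = solve_saa(50, grid, rng)
fresh = [rng.gauss(10.0, 2.0) for _ in range(20000)]
ub = sum(cost(x_hat, xi) for xi in fresh) / len(fresh)

print(f"candidate x = {x_hat:.2f}, optimality gap estimate = {ub - lb:.3f}")
```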

    Exascale Deep Learning for Climate Analytics

    We extract pixel-level masks of extreme weather patterns using variants of Tiramisu and DeepLabv3+ neural networks. We describe improvements to the software frameworks, input pipeline, and network training algorithms necessary to efficiently scale deep learning on the Piz Daint and Summit systems. The Tiramisu network scales to 5300 P100 GPUs with a sustained throughput of 21.0 PF/s and parallel efficiency of 79.0%. DeepLabv3+ scales up to 27360 V100 GPUs with a sustained throughput of 325.8 PF/s and a parallel efficiency of 90.7% in single precision. By taking advantage of the FP16 Tensor Cores, a half-precision version of the DeepLabv3+ network achieves a peak and sustained throughput of 1.13 EF/s and 999.0 PF/s, respectively. (Comment: 12 pages, 5 tables, 4 figures; Supercomputing Conference, November 11-16, 2018, Dallas, TX, US)
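
    Parallel efficiency here is aggregate sustained throughput divided by the GPU count times a single-GPU baseline. A back-of-envelope check using only the numbers quoted above (the per-GPU baselines are inferred from those figures, not independently measured):

```python
# Sanity-check the quoted scaling numbers:
# efficiency = aggregate PF/s / (gpus * single-GPU baseline).
runs = {
    "Tiramisu / P100":   dict(gpus=5300,  agg_pflops=21.0,  eff=0.790),
    "DeepLabv3+ / V100": dict(gpus=27360, agg_pflops=325.8, eff=0.907),
}
for name, r in runs.items():
    per_gpu_tflops = r["agg_pflops"] * 1e3 / r["gpus"]   # achieved TF/s per GPU
    baseline_tflops = per_gpu_tflops / r["eff"]          # implied single-GPU rate
    print(f"{name}: {per_gpu_tflops:.2f} TF/s per GPU "
          f"(implied single-GPU baseline {baseline_tflops:.2f} TF/s)")
```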

    The Panchromatic Hubble Andromeda Treasury

    The Panchromatic Hubble Andromeda Treasury (PHAT) is an ongoing HST Multicycle Treasury program to image ~1/3 of M31's star-forming disk in 6 filters, from the UV to the NIR. The full survey will resolve the galaxy into more than 100 million stars with projected radii from 0-20 kpc over a contiguous 0.5 square degree area in 828 orbits, producing imaging in the F275W and F336W filters with WFC3/UVIS, F475W and F814W with ACS/WFC, and F110W and F160W with WFC3/IR. The resulting wavelength coverage gives excellent constraints on stellar temperature, bolometric luminosity, and extinction for most spectral types. The photometry reaches SNR=4 at F275W=25.1, F336W=24.9, F475W=27.9, F814W=27.1, F110W=25.5, and F160W=24.6 for single pointings in the uncrowded outer disk; however, the optical and NIR data are crowding limited, and the deepest reliable magnitudes are up to 5 magnitudes brighter in the inner bulge. All pointings are dithered and produce Nyquist-sampled images in F475W, F814W, and F160W. We describe the observing strategy, photometry, astrometry, and data products, along with extensive tests of photometric stability, crowding errors, spatially dependent photometric biases, and telescope pointing control. We report on initial fits to the structure of M31's disk, derived from the density of RGB stars, in a way that is independent of the assumed M/L and is robust to variations in dust extinction. These fits also show that the 10 kpc ring is not just a region of enhanced recent star formation, but is instead a dynamical structure containing a significant overdensity of stars with ages >1 Gyr. (Abridged) (Comment: 48 pages, including 22 pages of figures. Accepted to the Astrophysical Journal Supplements. Some figures slightly degraded to reduce submission size.)
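
    The quoted depths are the magnitudes at which photometry reaches SNR=4. The textbook relation sigma_m ≈ 2.5/ln(10)/SNR ≈ 1.0857/SNR (a standard conversion, not a PHAT-specific formula) turns that limit into a magnitude uncertainty of roughly 0.27 mag:

```python
# Convert the survey's SNR=4 limits into magnitude uncertainties using the
# standard relation sigma_m = 2.5 / ln(10) / SNR.
import math

depths = {"F275W": 25.1, "F336W": 24.9, "F475W": 27.9,
          "F814W": 27.1, "F110W": 25.5, "F160W": 24.6}
snr = 4.0
sigma_m = 2.5 / math.log(10) / snr   # ~0.27 mag at the quoted limits
for band, mag in depths.items():
    print(f"{band}: limit {mag:.1f} mag, sigma ~ {sigma_m:.2f} mag")
```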

    Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code

    This paper introduces Tiramisu, a polyhedral framework designed to generate high-performance code for multiple platforms including multicores, GPUs, and distributed machines. Tiramisu introduces a scheduling language with novel extensions to explicitly manage the complexities that arise when targeting these systems. The framework is designed for the areas of image processing, stencils, linear algebra, and deep learning. Tiramisu has two main features: it relies on a flexible representation based on the polyhedral model, and it has a rich scheduling language allowing fine-grained control of optimizations. Tiramisu uses a four-level intermediate representation that allows full separation between the algorithms, loop transformations, data layouts, and communication. This separation simplifies targeting multiple hardware architectures with the same algorithm. We evaluate Tiramisu by writing a set of image processing, deep learning, and linear algebra benchmarks and comparing them with state-of-the-art compilers and hand-tuned libraries. We show that Tiramisu matches or outperforms existing compilers and libraries on different hardware architectures, including multicore CPUs, GPUs, and distributed machines. (Comment: arXiv admin note: substantial text overlap with arXiv:1803.0041)
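
    The algorithm/schedule separation can be illustrated conceptually (in Python, deliberately not Tiramisu's actual C++ API): the same blur computation runs under two different loop schedules, naive and tiled, without the algorithm definition changing.

```python
# Conceptual sketch of algorithm/schedule separation: blur_at defines WHAT
# to compute; each schedule_* function defines HOW the loop nest executes.
def blur_at(img, y, x):                      # the algorithm: a 1-D box blur
    return (img[y][x - 1] + img[y][x] + img[y][x + 1]) / 3.0

def schedule_naive(img, out, h, w):          # schedule 1: plain loop nest
    for y in range(h):
        for x in range(1, w - 1):
            out[y][x] = blur_at(img, y, x)

def schedule_tiled(img, out, h, w, ty=32, tx=32):  # schedule 2: tiled for locality
    for yy in range(0, h, ty):
        for xx in range(1, w - 1, tx):
            for y in range(yy, min(yy + ty, h)):
                for x in range(xx, min(xx + tx, w - 1)):
                    out[y][x] = blur_at(img, y, x)

img = [[float(x + y) for x in range(64)] for y in range(64)]
out = [[0.0] * 64 for _ in range(64)]
schedule_tiled(img, out, 64, 64)             # same result as schedule_naive
```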

    FTCS finite difference scheme GPGPU parallel computing for the heat conduction equation

    This paper demonstrates the advantages of parallel computing by numerically solving the two-dimensional heat conduction equation with the forward-in-time, central-in-space (FTCS) explicit finite difference method. Two levels of parallelization are considered and compared with a traditional serial implementation. The results highlight the importance of parallel computing for large problems requiring a very high number of calculations, for which serial execution is impractical because of excessive run time. The first section briefly reviews the basic concepts of parallel computing. The FTCS finite difference method for the parabolic heat equation is then outlined, describing how the heat flow equation is derived in two dimensions and the particulars of the finite difference discretization. A specific initial-boundary value problem is then solved with the FTCS method, and pseudocode is provided for one serial and two parallel implementations. Finally, the results are discussed and conclusions are presented.
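
    A minimal serial FTCS sketch for the two-dimensional heat equation u_t = α(u_xx + u_yy) is given below; the grid size, diffusivity, and boundary values are illustrative choices, not taken from the paper. The explicit scheme is stable only for dt ≤ dx²/(4α) on a square grid.

```python
# Serial FTCS update for the 2-D heat equation on a unit square with a
# Dirichlet "hot" top edge; vectorized over the interior with NumPy.
import numpy as np

n, alpha = 128, 1.0
dx = 1.0 / (n - 1)
dt = 0.25 * dx * dx / alpha                 # largest stable explicit step
u = np.zeros((n, n))
u[0, :] = 1.0                               # hot top edge, cold elsewhere

r = alpha * dt / dx**2
for _ in range(5000):
    # Each step reads the previous field and writes the interior in one pass.
    u[1:-1, 1:-1] += r * (u[2:, 1:-1] + u[:-2, 1:-1]
                          + u[1:-1, 2:] + u[1:-1, :-2]
                          - 4.0 * u[1:-1, 1:-1])
print("interior mean temperature:", u[1:-1, 1:-1].mean())
```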