
    ASCR/HEP Exascale Requirements Review Report

    This draft report summarizes and details the findings, results, and recommendations derived from the ASCR/HEP Exascale Requirements Review meeting held in June 2015. The main conclusions are as follows. 1) Larger, more capable computing and data facilities are needed to support HEP science goals in all three frontiers: Energy, Intensity, and Cosmic. The expected scale of the demand at the 2025 timescale is at least two orders of magnitude -- and in some cases greater -- than that available currently. 2) The growth rate of data produced by simulations is overwhelming the current ability of both facilities and researchers to store and analyze it. Additional resources and new techniques for data analysis are urgently needed. 3) Data rates and volumes from HEP experimental facilities are also straining the ability to store and analyze large and complex data volumes. Appropriately configured leadership-class facilities can play a transformational role in enabling scientific discovery from these datasets. 4) A close integration of HPC simulation and data analysis will aid greatly in interpreting results from HEP experiments. Such an integration will minimize data movement and facilitate interdependent workflows. 5) Long-range planning between HEP and ASCR will be required to meet HEP's research needs. To best use ASCR HPC resources, the experimental HEP program needs a) an established long-term plan for access to ASCR computational and data resources, b) an ability to map workflows onto HPC resources, c) the ability for ASCR facilities to accommodate workflows run by collaborations that can have thousands of individual members, d) to transition codes to the next-generation HPC platforms that will be available at ASCR facilities, and e) to build up and train a workforce capable of developing and using simulations and analysis to support HEP scientific research on next-generation systems.
    Comment: 77 pages, 13 figures; draft report, subject to further revision

    Deep generative modeling for single-cell transcriptomics.

    Single-cell transcriptome measurements can reveal unexplored biological diversity, but they suffer from technical noise and bias that must be modeled to account for the resulting uncertainty in downstream analyses. Here we introduce single-cell variational inference (scVI), a ready-to-use scalable framework for the probabilistic representation and analysis of gene expression in single cells (https://github.com/YosefLab/scVI). scVI uses stochastic optimization and deep neural networks to aggregate information across similar cells and genes and to approximate the distributions that underlie observed expression values, while accounting for batch effects and limited sensitivity. We used scVI for a range of fundamental analysis tasks including batch correction, visualization, clustering, and differential expression, and achieved high accuracy for each task.
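
    The sketch below illustrates the kind of model the abstract describes: a variational autoencoder trained by stochastic optimization, with a negative-binomial decoder for counts and a one-hot batch covariate. It is not the scVI package API; the class name, layer sizes, and likelihood details here are illustrative assumptions only.

```python
# Illustrative (not scVI's actual code): a count-data VAE with a
# negative-binomial decoder and a batch covariate, trained by minimizing
# the negative evidence lower bound (ELBO) with stochastic optimization.
import torch
import torch.nn as nn
import torch.nn.functional as F

class CountVAE(nn.Module):
    def __init__(self, n_genes, n_batches, n_latent=10, n_hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_genes + n_batches, n_hidden), nn.ReLU())
        self.mu = nn.Linear(n_hidden, n_latent)
        self.logvar = nn.Linear(n_hidden, n_latent)
        self.decoder = nn.Sequential(
            nn.Linear(n_latent + n_batches, n_hidden), nn.ReLU(),
            nn.Linear(n_hidden, n_genes))
        # gene-wise inverse dispersion of the negative binomial
        self.log_theta = nn.Parameter(torch.zeros(n_genes))

    def forward(self, counts, batch_onehot):
        # encode log-transformed counts plus the batch label
        h = self.encoder(torch.cat([torch.log1p(counts), batch_onehot], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        # decode: per-gene proportions scaled by each cell's observed library size
        props = F.softmax(self.decoder(torch.cat([z, batch_onehot], dim=-1)), dim=-1)
        mean = props * counts.sum(dim=-1, keepdim=True)
        theta = torch.exp(self.log_theta)
        # negative-binomial log-likelihood (mean/inverse-dispersion form)
        ll = (torch.lgamma(counts + theta) - torch.lgamma(theta)
              - torch.lgamma(counts + 1)
              + theta * torch.log(theta / (theta + mean))
              + counts * torch.log(mean / (theta + mean))).sum(-1)
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1)
        return -(ll - kl).mean()  # negative ELBO to minimize
```

    A training loop would draw minibatches of cells and apply a standard optimizer (e.g. Adam) to this loss; the learned latent space can then serve the downstream tasks mentioned above (batch correction, visualization, clustering).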

    Forecasting the cost of processing multi-join queries via hashing for main-memory databases (Extended version)

    Database management systems (DBMSs) carefully optimize complex multi-join queries to avoid expensive disk I/O. As servers today feature tens or hundreds of gigabytes of RAM, a significant fraction of many analytic databases becomes memory-resident. Even after careful tuning for an in-memory environment, a linear disk I/O cost model such as the one implemented in PostgreSQL may select query plans that run up to 2X slower than the optimal multi-join query plan over memory-resident data. This paper introduces a memory I/O cost model to identify good evaluation strategies for complex query plans with multiple hash-based equi-joins over memory-resident data. The proposed cost model is carefully validated for accuracy using three different systems, including an Amazon EC2 instance, to control for hardware-specific differences. Prior work in parallel query evaluation has advocated right-deep and bushy trees for multi-join queries due to their greater parallelization and pipelining potential. A surprising finding is that this conventional wisdom from shared-nothing disk-based systems does not directly apply to the modern shared-everything memory hierarchy. As corroborated by our model, the performance gap between the optimal left-deep and right-deep query plan can grow to about 10X as the number of joins in the query increases.
    Comment: 15 pages, 8 figures, extended version of the paper to appear in SoCC'1
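
    To make the left-deep versus right-deep contrast concrete, here is a toy memory-access cost model in the spirit of what the abstract describes. The cost constants, the cache threshold, and the working-set assumptions are invented for illustration; they are not the calibrated parameters or the exact formulation of the paper's model.

```python
# Toy illustration: cost of a pipeline of hash equi-joins under a simple
# memory-access model. A left-deep plan keeps one hash table live at a time
# but materializes intermediates; a right-deep plan builds all hash tables
# up front and pipelines probes, so its probe working set is larger.
CACHE_TUPLES = 1_000_000                 # hash table size (tuples) that stays cache-resident
HIT, MISS, BUILD, MATERIALIZE = 1.0, 4.0, 2.0, 2.0   # invented relative costs

def probe_cost(n_probes, working_set):
    """Probing is cheaper while the live hash-table working set fits in cache."""
    return n_probes * (HIT if working_set <= CACHE_TUPLES else MISS)

def left_deep_cost(cards, out_cards):
    """cards: base-table cardinalities R0..Rk; out_cards[i]: rows after join i."""
    tables = cards[1:]
    cost, probe_side = 0.0, cards[0]
    for i, build_side in enumerate(tables):
        cost += build_side * BUILD                   # build hash table on one base table
        cost += probe_cost(probe_side, build_side)   # only this table is live
        if i < len(tables) - 1:
            cost += out_cards[i] * MATERIALIZE       # write out the intermediate result
        probe_side = out_cards[i]
    return cost

def right_deep_cost(cards, out_cards):
    """All hash tables are built first; the leftmost input streams through them."""
    tables = cards[1:]
    cost = sum(c * BUILD for c in tables)
    working_set = sum(tables)                        # all tables live during the pipeline
    stream = cards[0]
    for i in range(len(tables)):
        cost += probe_cost(stream, working_set)      # pipelined probes, no materialization
        stream = out_cards[i]
    return cost

# Example: small dimension tables that fit in cache individually but not
# together; in this toy model the left-deep plan comes out cheaper.
base = [50_000_000, 800_000, 800_000, 800_000]
inter = [10_000_000, 2_000_000, 500_000]
print(left_deep_cost(base, inter), right_deep_cost(base, inter))
```

    The point of the sketch is only the mechanism: once probe cost depends on whether the live hash tables fit in cache, pipelining everything (right-deep) stops being automatically better, which is the direction of the paper's finding.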

    Extensive light profile fitting of galaxy-scale strong lenses

    We investigate the merits of a massive forward modeling of ground-based optical imaging as a diagnostic for the strong lensing nature of Early-Type Galaxies, in whose light blurred and faint Einstein rings can hide. We simulate several thousand mock strong lenses under ground- and space-based conditions as arising from the deflection of an exponential disk by a foreground de Vaucouleurs light profile whose lensing potential is described by a Singular Isothermal Ellipsoid. We then fit for the lensed light distribution with sl_fit after having subtracted off the foreground light emission (ideal case) and also after having fitted the deflector's light with galfit. By setting thresholds in the output parameter space, we can decide the lens/not-a-lens status of each system. We finally apply our strategy to a sample of 517 lens candidates present in the CFHTLS data to test the consistency of our selection approach. The efficiency of the fast modeling method at recovering the main lens parameters, like Einstein radius, total magnification, or total lensed flux, is quite comparable under CFHT and HST conditions when the deflector is perfectly subtracted off (only possible in simulations), fostering a sharp distinction between the good and the bad candidates. Conversely, for a more realistic subtraction, a substantial fraction of the lensed light is absorbed into the deflector's model, which biases the subsequent fitting of the rings and thus disturbs the selection process. We quantify completeness and purity of the lens finding method in both cases. This suggests that the main limitation currently resides in the subtraction of the foreground light. Provided further enhancement of the latter, the direct forward modeling of large numbers of galaxy-galaxy strong lenses thus appears tractable and could constitute a competitive lens finder in the next generation of wide-field imaging surveys.
    Comment: A&A accepted version, minor changes (13 pages, 10 figures)
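
    The core simulation step described above is ray-tracing an exponential-disk source through a Singular Isothermal Ellipsoid (SIE) deflector. The sketch below uses the standard analytic SIE deflection (Kormann et al. 1994); the Einstein-radius normalization convention and all parameter values are illustrative assumptions, not those of the paper's pipeline, and PSF blurring, noise, and the de Vaucouleurs foreground are omitted.

```python
# Minimal sketch: lens an exponential-disk source with an SIE deflector
# to produce a noise-free mock ring/arc image on an image-plane grid.
import numpy as np

def sie_deflection(x, y, theta_e, q):
    """Deflection angles of an SIE with axis ratio q <= 1, major axis along x.
    Normalization conventions for theta_e differ between codes; this is one
    common choice, used here purely for illustration."""
    qp = np.sqrt(1.0 - q**2)
    psi = np.sqrt(q**2 * x**2 + y**2) + 1e-12        # avoid the central singularity
    scale = theta_e * np.sqrt(q) / qp
    ax = scale * np.arctan(qp * x / psi)
    ay = scale * np.arctanh(qp * y / psi)
    return ax, ay

def exponential_disk(x, y, x0, y0, r_s, i0=1.0):
    """Circular exponential-disk surface brightness centered at (x0, y0)."""
    r = np.hypot(x - x0, y - y0)
    return i0 * np.exp(-r / r_s)

# Image-plane grid (arcsec), deflector at the origin.
n, half = 200, 3.0
grid = np.linspace(-half, half, n)
X, Y = np.meshgrid(grid, grid)

ax, ay = sie_deflection(X, Y, theta_e=1.2, q=0.8)
# Lens equation: source-plane position = image-plane position - deflection.
lensed = exponential_disk(X - ax, Y - ay, x0=0.1, y0=0.05, r_s=0.2)
# 'lensed' holds the unblurred lensed surface brightness; in the simulations
# described above, PSF convolution and noise would be added before fitting.
```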