    Achieving High Speed CFD simulations: Optimization, Parallelization, and FPGA Acceleration for the unstructured DLR TAU Code

    Today, large-scale parallel simulations are fundamental tools for handling complex problems. The number of processors in current computation platforms has recently increased, so it is necessary to optimize application performance and enhance the scalability of massively parallel systems. In addition, new heterogeneous architectures, which combine conventional processors with specific hardware such as FPGAs to accelerate the most time-consuming functions, are considered a strong alternative for boosting performance. In this paper, the performance of the DLR TAU code is analyzed and optimized. The improvement of the code efficiency is addressed through three key activities: optimization, parallelization, and hardware acceleration. First, a profiling analysis of the most time-consuming processes of the Reynolds-averaged Navier-Stokes flow solver on a three-dimensional unstructured mesh is performed. Then, the scalability of the code is studied, and new partitioning algorithms are tested to identify the most suitable ones for the selected applications. Finally, a feasibility study on the application of FPGAs and GPUs for the hardware acceleration of CFD simulations is presented.

    Large-Eddy Simulations of Flow and Heat Transfer in Complex Three-Dimensional Multilouvered Fins

    The paper describes the computational procedure and results from large-eddy simulations in a complex three-dimensional louver geometry. The three-dimensionality in the louver geometry occurs along the height of the fin, where the angled louver transitions to the flat landing and joins with the tube surface. The transition region is characterized by a swept leading edge and decreasing flow area between louvers. Preliminary results show a high-energy, compact vortex jet forming in this region. The jet forms in the vicinity of the louver junction with the flat landing and is drawn under the louver in the transition region. Its interaction with the surface of the louver produces vorticity of the opposite sign, which aids in augmenting heat transfer on the louver surface. The top surface of the louver in the transition region experiences large velocities in the vicinity of the surface and exhibits higher heat transfer coefficients than the bottom surface. Air Conditioning and Refrigeration Project 9

    Quantum Computing with Very Noisy Devices

    In theory, quantum computers can efficiently simulate quantum physics, factor large numbers and estimate integrals, thus solving otherwise intractable computational problems. In practice, quantum computers must operate with noisy devices called "gates" that tend to destroy the fragile quantum states needed for computation. The goal of fault-tolerant quantum computing is to compute accurately even when gates have a high probability of error each time they are used. Here we give evidence that accurate quantum computing is possible with error probabilities above 3% per gate, which is significantly higher than what was previously thought possible. However, the resources required for computing at such high error probabilities are excessive. Fortunately, they decrease rapidly with decreasing error probabilities. If we had quantum resources comparable to the considerable resources available in today's digital computers, we could implement non-trivial quantum computations at error probabilities as high as 1% per gate. Comment: 47 page

    Online Tensor Methods for Learning Latent Variable Models

    We introduce an online tensor decomposition based approach for two latent variable modeling problems, namely (1) community detection, in which we learn the latent communities that the social actors in social networks belong to, and (2) topic modeling, in which we infer hidden topics of text articles. We consider decomposition of moment tensors using stochastic gradient descent (SGD). We optimize the multilinear operations in SGD and avoid directly forming the tensors, to save computational and storage costs. We present optimized algorithms on two platforms. Our GPU-based implementation exploits the parallelism of SIMD architectures to allow for maximum speed-up by a careful optimization of storage and data transfer, whereas our CPU-based implementation uses efficient sparse matrix computations and is suitable for large sparse datasets. For the community detection problem, we demonstrate accuracy and computational efficiency on Facebook, Yelp and DBLP datasets, and for the topic modeling problem, we also demonstrate good performance on the New York Times dataset. We compare our results to state-of-the-art algorithms such as the variational method, and report a gain in accuracy and a gain of several orders of magnitude in execution time. Comment: JMLR 201

    Fifty Years of Mincer Earnings Regressions

    The Mincer earnings function is the cornerstone of a large literature in empirical economics. This paper discusses the theoretical foundations of the Mincer model and examines the empirical support for it using data from Decennial Censuses and Current Population Surveys. While data from 1940 and 1950 Censuses provide some support for Mincer's model, data from later decades are inconsistent with it. We examine the importance of relaxing functional form assumptions in estimating internal rates of return to schooling and of accounting for taxes, tuition, nonlinearity in schooling, and nonseparability between schooling and work experience. Inferences about trends in rates of return to high school and college obtained from our more general model differ substantially from inferences drawn from estimates based on a Mincer earnings regression. Important differences also arise between cohort-based and cross-sectional estimates of the rate of return to schooling. In the recent period of rapid technological progress, widely used cross-sectional applications of the Mincer model produce dramatically biased estimates of cohort returns to schooling. We also examine the implications of accounting for uncertainty and agent expectation formation. Even when the static framework of Mincer is maintained, accounting for uncertainty substantially affects the return estimates. Considering the sequential resolution of uncertainty over time in a dynamic setting gives rise to option values, which fundamentally changes the analysis of schooling decisions. In the presence of sequential resolution of uncertainty and option values, the internal rate of return - a cornerstone of classical human capital theory - is not a useful guide to policy analysis.