Achieving High Speed CFD simulations: Optimization, Parallelization, and FPGA Acceleration for the unstructured DLR TAU Code
Today, large-scale parallel simulations are fundamental tools for handling complex problems. The number of processors in current computing platforms has recently increased, so it is necessary to optimize application performance and to enhance scalability on massively parallel systems. In addition, new heterogeneous architectures, which combine conventional processors with specific hardware such as FPGAs to accelerate the most time-consuming functions, are considered a strong alternative for boosting performance.
In this paper, the performance of the DLR TAU code is analyzed and optimized. The improvement of the code efficiency is addressed through three key activities: optimization, parallelization, and hardware acceleration. First, a profiling analysis of the most time-consuming processes of the Reynolds-averaged Navier-Stokes flow solver on a three-dimensional unstructured mesh is performed. Then, the scalability of the code is studied with new partitioning algorithms to identify the most suitable ones for the selected applications. Finally, a feasibility study on the application of FPGAs and GPUs for the hardware acceleration of CFD simulations is presented.
Large-Eddy Simulations of Flow and Heat Transfer in Complex Three-Dimensional Multilouvered Fins
The paper describes the computational procedure and
results from large-eddy simulations in a complex three-dimensional
louver geometry. The three-dimensionality in the
louver geometry occurs along the height of the fin, where the
angled louver transitions to the flat landing and joins with the
tube surface. The transition region is characterized by a swept
leading edge and decreasing flow area between louvers.
Preliminary results show a high energy compact vortex jet
forming in this region. The jet forms in the vicinity of the louver
junction with the flat landing and is drawn under the louver in
the transition region. Its interaction with the surface of the
louver produces vorticity of the opposite sign, which aids in
augmenting heat transfer on the louver surface. The top surface
of the louver in the transition region experiences large velocities
in the vicinity of the surface and exhibits higher heat transfer
coefficients than the bottom surface.
Air Conditioning and Refrigeration Project 9
Quantum Computing with Very Noisy Devices
In theory, quantum computers can efficiently simulate quantum physics, factor
large numbers and estimate integrals, thus solving otherwise intractable
computational problems. In practice, quantum computers must operate with noisy
devices called "gates" that tend to destroy the fragile quantum states needed
for computation. The goal of fault-tolerant quantum computing is to compute
accurately even when gates have a high probability of error each time they are
used. Here we give evidence that accurate quantum computing is possible with
error probabilities above 3% per gate, which is significantly higher than what
was previously thought possible. However, the resources required for computing
at such high error probabilities are excessive. Fortunately, they decrease
rapidly with decreasing error probabilities. If we had quantum resources
comparable to the considerable resources available in today's digital
computers, we could implement non-trivial quantum computations at error
probabilities as high as 1% per gate.
Comment: 47 page
Online Tensor Methods for Learning Latent Variable Models
We introduce an online tensor decomposition based approach for two latent
variable modeling problems, namely (1) community detection, in which we learn
the latent communities that the social actors in social networks belong to, and
(2) topic modeling, in which we infer hidden topics of text articles. We
consider decomposition of moment tensors using stochastic gradient descent. We
conduct optimization of multilinear operations in SGD and avoid directly
forming the tensors, to save computational and storage costs. We present
optimized algorithms on two platforms. Our GPU-based implementation exploits the
parallelism of SIMD architectures to allow for maximum speed-up by a careful
optimization of storage and data transfer, whereas our CPU-based implementation
uses efficient sparse matrix computations and is suitable for large sparse
datasets. For the community detection problem, we demonstrate accuracy and
computational efficiency on Facebook, Yelp and DBLP datasets, and for the topic
modeling problem, we also demonstrate good performance on the New York Times
dataset. We compare our results to state-of-the-art algorithms such as the
variational method, and report a gain in accuracy and a speed-up of several orders
of magnitude in execution time.
Comment: JMLR 201
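The idea of avoiding direct formation of the tensors, mentioned in the abstract above, can be illustrated with a minimal sketch. It assumes a plain empirical third-moment tensor T = (1/n) sum_i x_i (x) x_i (x) x_i, which is a simplification and not the moment tensor used in the paper; the function names `implicit_contract` and `explicit_contract` are hypothetical. The contraction T(I, v, v) is computed directly from the samples and then checked against a reference that builds the full d x d x d tensor.

```python
import random


def implicit_contract(samples, v):
    """Return T(I, v, v) for T = (1/n) * sum_i x_i (x) x_i (x) x_i, never storing T.

    Cost is O(n * d) time and O(d) extra memory.
    """
    n, d = len(samples), len(v)
    out = [0.0] * d
    for x in samples:
        s = sum(xj * vj for xj, vj in zip(x, v))  # inner product <x, v>
        w = s * s / n                             # weight <x, v>^2 / n
        for j in range(d):
            out[j] += w * x[j]
    return out


def explicit_contract(samples, v):
    """Reference version: build the full d x d x d tensor, then contract with v twice."""
    n, d = len(samples), len(v)
    T = [[[0.0] * d for _ in range(d)] for _ in range(d)]
    for x in samples:
        for a in range(d):
            for b in range(d):
                for c in range(d):
                    T[a][b][c] += x[a] * x[b] * x[c] / n
    return [
        sum(T[a][b][c] * v[b] * v[c] for b in range(d) for c in range(d))
        for a in range(d)
    ]


# Synthetic data (an assumption for illustration only).
random.seed(0)
samples = [[random.gauss(0, 1) for _ in range(4)] for _ in range(50)]
v = [random.gauss(0, 1) for _ in range(4)]

fast = implicit_contract(samples, v)
slow = explicit_contract(samples, v)
```

Both functions return the same vector up to floating-point rounding, but the implicit version needs O(d) memory instead of O(d^3), which is the kind of saving that matters when d is large.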
Fifty Years of Mincer Earnings Regressions
The Mincer earnings function is the cornerstone of a large literature in empirical economics. This paper discusses the theoretical foundations of the Mincer model and examines the empirical support for it using data from Decennial Censuses and Current Population Surveys. While data from 1940 and 1950 Censuses provide some support for Mincer's model, data from later decades are inconsistent with it. We examine the importance of relaxing functional form assumptions in estimating internal rates of return to schooling and of accounting for taxes, tuition, nonlinearity in schooling, and nonseparability between schooling and work experience. Inferences about trends in rates of return to high school and college obtained from our more general model differ substantially from inferences drawn from estimates based on a Mincer earnings regression. Important differences also arise between cohort-based and cross-sectional estimates of the rate of return to schooling. In the recent period of rapid technological progress, widely used cross-sectional applications of the Mincer model produce dramatically biased estimates of cohort returns to schooling. We also examine the implications of accounting for uncertainty and agent expectation formation. Even when the static framework of Mincer is maintained, accounting for uncertainty substantially affects the return estimates. Considering the sequential resolution of uncertainty over time in a dynamic setting gives rise to option values, which fundamentally changes the analysis of schooling decisions. In the presence of sequential resolution of uncertainty and option values, the internal rate of return - a cornerstone of classical human capital theory - is not a useful guide to policy analysis.
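For reference, the Mincer earnings function discussed in the abstract above is commonly written in the following form. The symbols are notational assumptions rather than the paper's own: $w$ is earnings, $s$ is years of schooling, $x$ is years of work experience, and $\varepsilon$ is a mean-zero residual.

```latex
\ln w(s, x) = \alpha_0 + \rho_s\, s + \beta_0\, x + \beta_1\, x^2 + \varepsilon
```

In this specification log earnings are linear in schooling and quadratic in experience, with the two entering separably. The coefficient $\rho_s$ is read as a rate of return to schooling only under additional assumptions, such as the absence of tuition and taxes, which are among the restrictions the abstract says are relaxed.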