    A Parallel Iterative Method for Computing Molecular Absorption Spectra

    We describe a fast parallel iterative method for computing molecular absorption spectra within TDDFT linear response and using the LCAO method. We use a local basis of "dominant products" to parametrize the space of orbital products that occur in the LCAO approach. In this basis, the dynamical polarizability is computed iteratively within an appropriate Krylov subspace. The iterative procedure uses a matrix-free GMRES method to determine the (interacting) density response. The resulting code is about one order of magnitude faster than our previous full-matrix method. This acceleration makes the speed of our TDDFT code comparable with codes based on Casida's equation. The implementation of our method uses hybrid MPI and OpenMP parallelization in which load balancing and memory access are optimized. To validate our approach and to establish benchmarks, we compute spectra of large molecules on various types of parallel machines. The methods developed here are fairly general and we believe they will find useful applications in molecular physics/chemistry, even for problems that are beyond TDDFT, such as organic semiconductors, particularly in photovoltaics. Comment: 20 pages, 17 figures, 3 tables

    Powerful Trend Function Tests That Are Robust to Strong Serial Correlation with an Application to the Prebisch-Singer Hypothesis

    In this paper we propose tests for hypotheses regarding the parameters of the deterministic trend function of a univariate time series. The tests do not require knowledge of the form of serial correlation in the data and they are robust to strong serial correlation. The data can contain a unit root and the tests still have the correct size asymptotically. The tests we analyze are standard heteroskedasticity autocorrelation (HAC) robust tests based on nonparametric kernel variance estimators. We analyze these tests using the fixed-b asymptotic framework recently proposed by Kiefer and Vogelsang (2002). This analysis allows us to analyze the power properties of the tests with regard to bandwidth and kernel choices. Our analysis shows that among popular kernels, there are specific kernel and bandwidth choices that deliver tests with maximal power within a specific class of tests. Based on the theoretical results, we propose a data dependent bandwidth rule that maximizes integrated power. Our recommended test is shown to have power that dominates a related test proposed by Vogelsang (1998). We apply the recommended test to the logarithm of a net barter terms of trade series and we find that this series has a statistically significant negative slope. This finding is consistent with the well known Prebisch-Singer hypothesis.

    Preemptive Thread Block Scheduling with Online Structural Runtime Prediction for Concurrent GPGPU Kernels

    Recent NVIDIA Graphics Processing Units (GPUs) can execute multiple kernels concurrently. On these GPUs, the thread block scheduler (TBS) uses the FIFO policy to schedule their thread blocks. We show that FIFO leaves performance to chance, resulting in significant loss of performance and fairness. To improve performance and fairness, we propose using the preemptive Shortest Remaining Time First (SRTF) policy instead. Although SRTF requires an estimate of the runtime of GPU kernels, we show that such an estimate can be easily obtained using online profiling and exploiting a simple observation on GPU kernels' grid structure. Specifically, we propose a novel Structural Runtime Predictor. Using a simple Staircase model of GPU kernel execution, we show that the runtime of a kernel can be predicted by profiling only the first few thread blocks. We evaluate an online predictor based on this model on benchmarks from ERCBench, and find that it can estimate the actual runtime reasonably well after the execution of only a single thread block. Next, we design a thread block scheduler that is both concurrent kernel-aware and uses this predictor. We implement the SRTF policy and evaluate it on two-program workloads from ERCBench. SRTF improves STP by 1.18x and ANTT by 2.25x over FIFO. When compared to MPMax, a state-of-the-art resource allocation policy for concurrent kernels, SRTF improves STP by 1.16x and ANTT by 1.3x. To improve fairness, we also propose SRTF/Adaptive, which controls resource usage of concurrently executing kernels to maximize fairness. SRTF/Adaptive improves STP by 1.12x, ANTT by 2.23x and Fairness by 2.95x compared to FIFO. Overall, our implementation of SRTF achieves system throughput to within 12.64% of Shortest Job First (SJF, an oracle optimal scheduling policy), bridging 49% of the gap between FIFO and SJF. Comment: 14 pages, full pre-review version of PACT 2014 poster
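
    The idea of predicting a kernel's runtime from a few profiled thread blocks and then scheduling by shortest remaining time can be illustrated with a rough sketch. This is not the paper's predictor or scheduler: the formula (number of "waves" of resident thread blocks times a measured per-block time) is only one plausible reading of a staircase-shaped execution profile, and every name below (`staircase_runtime`, `remaining_time`, `srtf_pick`, the dictionary keys) is hypothetical.

```python
import math


def staircase_runtime(total_blocks, resident_blocks, block_time):
    """Estimate runtime assuming blocks execute in equal-length waves.

    Assumption (not taken from the abstract): at most `resident_blocks`
    thread blocks run at once, and each wave lasts `block_time`, the
    duration measured by profiling an early thread block.
    """
    waves = math.ceil(total_blocks / resident_blocks)
    return waves * block_time


def remaining_time(kernel):
    """Predicted time left for a kernel described by a plain dict."""
    left = kernel["total_blocks"] - kernel["done_blocks"]
    return staircase_runtime(left, kernel["resident_blocks"], kernel["block_time"])


def srtf_pick(kernels):
    """Return the name of the kernel with the shortest predicted remaining time."""
    return min(kernels, key=remaining_time)["name"]


# Two hypothetical kernels competing for the GPU.
kernels = [
    {"name": "A", "total_blocks": 100, "done_blocks": 0,
     "resident_blocks": 10, "block_time": 1.0},
    {"name": "B", "total_blocks": 20, "done_blocks": 0,
     "resident_blocks": 10, "block_time": 2.0},
]
next_kernel = srtf_pick(kernels)
```

    In this toy setting kernel B is predicted to finish sooner (2 waves of 2.0 time units versus 10 waves of 1.0), so an SRTF-style policy would favor its thread blocks first.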

    Asteroseismic measurement of surface-to-core rotation in a main-sequence A star, KIC 11145123

    We have discovered rotationally split core g-mode triplets and surface p-mode triplets and quintuplets in a terminal age main-sequence A star, KIC 11145123, that shows both δ Sct p-mode pulsations and γ Dor g-mode pulsations. This gives the first robust determination of the rotation of the deep core and surface of a main-sequence star, essentially model independently. We find its rotation to be nearly uniform with a period near 100 d, but we show with high confidence that the surface rotates slightly faster than the core. A strong angular momentum transfer mechanism must be operating to produce the nearly rigid rotation, and a mechanism other than viscosity must be operating to produce a more rapidly rotating surface than core. Our asteroseismic result, along with previous asteroseismic constraints on internal rotation in some B stars, and measurements of internal rotation in some subgiant, giant and white dwarf stars, has made angular momentum transport in stars throughout their lifetimes an observational science.