671 research outputs found

    The Hierarchical Formation of the Galactic Disk

    I review the results of recent cosmological simulations of galaxy formation that highlight the importance of satellite accretion in the formation of galactic disks. Tidal debris of disrupted satellites may contribute to the disk component if they are compact enough to survive the decay and circularization of the orbit as dynamical friction brings the satellite into the disk plane. This process may add a small but non-negligible fraction of stars to the thin and thick disks, and may reconcile the presence of very old stars with the protracted merging history expected in a hierarchically clustering universe. I discuss various lines of evidence which suggest that this process may have been important during the formation of the Galactic disk. Comment: paper to be read at the "Penetrating Bars through Masks of Cosmic Dust" conference in South Africa
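    For orientation (this estimate is not taken from the paper itself), the rate at which dynamical friction drags a satellite towards the disk plane is usually gauged with Chandrasekhar's formula; the standard textbook sinking time for a satellite of mass M_sat on a roughly circular orbit of radius r in a halo with circular speed v_c is:

```latex
% Standard Chandrasekhar dynamical-friction sinking time (textbook estimate,
% e.g. Binney & Tremaine); \ln\Lambda is the Coulomb logarithm.
\[
  t_{\mathrm{df}} \;\simeq\; \frac{1.17}{\ln\Lambda}\,
                             \frac{r^{2}\, v_{c}}{G\, M_{\mathrm{sat}}}
\]
```

    More massive satellites therefore sink faster, while compactness determines whether the satellite survives tidal stripping long enough for its debris to reach the disk plane.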

    Building Intelligent Systems: Artificial Intelligence Research at NASA Ames Research Center

    The goal of building autonomous, intelligent systems is the major focus of both basic and applied research in artificial intelligence within NASA. This paper discusses the components that make up that goal and describes ongoing work at the NASA Ames Research Center to achieve it. NASA provides a unique environment for fostering the development of intelligent computational systems; the test domains, especially Space Station systems, represent both a well-defined need and a marvelous series of increasingly difficult problems for testing research ideas. One can see a clear progression of systems through research settings (both within and external to NASA) to Space Station testbeds to systems which actually fly on the Space Station.

    Computational advances in gravitational microlensing: a comparison of CPU, GPU, and parallel, large data codes

    To assess how future progress in gravitational microlensing computation at high optical depth will rely on both hardware and software solutions, we compare a direct inverse ray-shooting code implemented on a graphics processing unit (GPU) with both a widely-used hierarchical tree code on a single-core CPU, and a recent implementation of a parallel tree code suitable for a CPU-based cluster supercomputer. We examine the accuracy of the tree codes through comparison with the direct code over a much wider range of parameter space than has been feasible before. We demonstrate that all three codes achieve comparable accuracy, and that the choice of approach depends on the scale and nature of the microlensing problem under investigation. On current hardware there is little difference in the processing speed of the single-core CPU tree code and the GPU direct code; however, the recent plateau in single-core CPU speeds means the existing tree code is no longer able to take advantage of Moore's-law-like increases in processing speed. Instead, we anticipate a rapid increase in GPU capabilities in the next few years, which is advantageous to the direct code. We suggest that progress in other areas of astrophysical computation may benefit from a transition to GPUs through the use of "brute force" algorithms, rather than attempting to port the current best solution directly to a GPU language -- for certain classes of problems, the simple implementation on GPUs may already be no worse than an optimised single-core CPU version. Comment: 11 pages, 4 figures, accepted for publication in New Astronomy
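    As an illustration of the "brute force" direct approach referred to above (a sketch only, not the authors' code; the function name and parameter values are invented for the example), inverse ray-shooting maps a dense grid of image-plane rays back to the source plane through the point-mass lens equation and counts rays per source-plane pixel:

```python
# Minimal "brute force" inverse ray-shooting sketch (illustrative only).
# Units are normalised so that a point mass m has Einstein radius sqrt(m);
# external convergence and shear are omitted for brevity.
import numpy as np

def shoot_rays(lens_x, lens_y, lens_m, n_rays_1d=1000, image_half_width=10.0,
               src_half_width=5.0, n_src_pix=200):
    """Shoot a regular grid of rays through a field of point-mass lenses and
    bin them on a source-plane grid; the counts trace the magnification map."""
    # Regular grid of rays in the image (lens) plane
    coords = np.linspace(-image_half_width, image_half_width, n_rays_1d)
    x1, x2 = np.meshgrid(coords, coords)
    x1, x2 = x1.ravel(), x2.ravel()

    # Deflection: exact sum over every lens for every ray
    # (the O(N_rays * N_lenses) "brute force" step)
    y1, y2 = x1.copy(), x2.copy()
    for lx, ly, m in zip(lens_x, lens_y, lens_m):
        dx, dy = x1 - lx, x2 - ly
        r2 = dx * dx + dy * dy
        y1 -= m * dx / r2
        y2 -= m * dy / r2

    # Bin the deflected rays on the source plane; counts per pixel ~ magnification
    edges = np.linspace(-src_half_width, src_half_width, n_src_pix + 1)
    mag_map, _, _ = np.histogram2d(y1, y2, bins=[edges, edges])
    return mag_map

# Example: 100 randomly placed equal-mass microlenses
rng = np.random.default_rng(0)
mmap = shoot_rays(rng.uniform(-8, 8, 100), rng.uniform(-8, 8, 100), np.ones(100))
```

    A tree code replaces the exact inner sum over lenses with a hierarchical approximation for distant masses, whereas the GPU direct code keeps the exact sum and relies on massive parallelism over rays.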

    Data analysis techniques for gravitational wave observations

    Astrophysical sources of gravitational waves fall broadly into three categories: (i) transient or burst, (ii) periodic or continuous-wave, and (iii) stochastic. Each type of source requires a different data analysis strategy. In this talk, various data analysis strategies are reviewed: optimal filtering is used for extracting binary inspirals; Fourier transforms over Doppler-shifted time intervals are computed for long-duration periodic sources; and optimally weighted cross-correlations are used for the stochastic background. Some recent schemes which efficiently search for inspirals will be described, and the performance of some of these techniques on real data will be discussed. Finally, some results on the cancellation of systematic noises in the Laser Interferometer Space Antenna (LISA) will be presented and future directions indicated.
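    To make the inspiral strategy concrete, the sketch below shows a minimal frequency-domain matched (optimal) filter. It is illustrative only and not code from the talk; the function name and the discrete normalisation conventions are assumptions made for the example.

```python
# Minimal frequency-domain matched-filter sketch for an inspiral search.
import numpy as np

def matched_filter_snr(data, template, psd, dt):
    """Noise-weighted correlation of `data` against an inspiral `template`.

    `data` and `template` are real time series sampled at interval `dt`;
    `psd` is the one-sided noise power spectral density evaluated at the
    np.fft.rfftfreq(len(data), dt) frequencies. Returns an SNR time series,
    one value per candidate arrival time of the template. Normalisation
    follows the usual discrete conventions, up to edge-bin effects.
    """
    n = len(data)
    df = 1.0 / (n * dt)
    d_f = np.fft.rfft(data) * dt        # discrete approximations of the
    h_f = np.fft.rfft(template) * dt    # continuous Fourier transforms
    # Optimal filter: weight the data by the template and the inverse noise spectrum
    integrand = d_f * np.conj(h_f) / psd
    z = (2.0 / dt) * np.fft.irfft(integrand, n)                  # filter output vs. time shift
    sigma = np.sqrt(4.0 * df * np.sum(np.abs(h_f) ** 2 / psd))   # template norm <h|h>^(1/2)
    return z / sigma
```

    A real search maximises this statistic over a bank of templates and over the unknown coalescence phase; only the maximisation over arrival time (via the inverse transform) is kept here.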

    Profile-driven parallelisation of sequential programs

    Traditional parallelism detection in compilers is performed by means of static analysis, more specifically data and control dependence analysis. The information that is available at compile time, however, is inherently limited and therefore restricts the parallelisation opportunities. Furthermore, applications written in C – which represent the majority of today's scientific, embedded and system software – utilise many low-level features and an intricate programming style that forces the compiler to make even more conservative assumptions. Despite the numerous proposals to handle this uncertainty at compile time using speculative optimisation and parallelisation, the software industry still lacks pragmatic approaches that extract coarse-grain parallelism to exploit the multiple processing units of modern commodity hardware.

    This thesis introduces a novel approach for extracting and exploiting multiple forms of coarse-grain parallelism from sequential applications written in C. We utilise profiling information to overcome the limitations of static data and control-flow analysis, enabling more aggressive parallelisation. Profiling is performed using an instrumentation scheme operating at the Intermediate Representation (IR) level of the compiler. In contrast to existing approaches that depend on low-level binary tools and debugging information, IR profiling provides precise and direct correlation of profiling information back to the IR structures of the compiler. Additionally, our approach is orthogonal to existing automatic parallelisation approaches, and additional fine-grain parallelism may be exploited.

    We demonstrate the applicability and versatility of the proposed methodology using two studies that target different forms of parallelism. First, we focus on the exploitation of loop-level parallelism that is abundant in many scientific and embedded applications. We evaluate our parallelisation strategy against the NAS and SPEC FP benchmarks on two different multi-core platforms (a shared-memory Intel Xeon SMP and a heterogeneous distributed-memory IBM Cell blade). Empirical evaluation shows that our approach not only yields significant improvements when compared with state-of-the-art parallelising compilers, but comes close to and sometimes exceeds the performance of manually parallelised codes. On average, our methodology achieves 96% of the performance of the hand-tuned parallel benchmarks on the Intel Xeon platform, and a significant speedup on the Cell platform.

    The second study addresses the problem of partially sequential loops, typically found in implementations of multimedia codecs. We develop a more powerful whole-program representation based on the Program Dependence Graph (PDG) that supports profiling, partitioning and code generation for pipeline parallelism. In addition, we demonstrate how this enhances conventional pipeline parallelisation by incorporating support for multi-level loops and pipeline stage replication in a uniform and automatic way. Experimental results using a set of complex multimedia and stream processing benchmarks confirm the effectiveness of the proposed methodology, which yields speedups of up to 4.7 on an eight-core Intel Xeon machine.
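    As a conceptual illustration of the loop-level (independent-iteration) parallelism targeted by the first study, the sketch below shows a sequential loop and its parallelised form. This is an assumption-laden illustration in Python with invented function names, not the thesis's approach: the thesis transforms C programs inside a parallelising compiler, whereas here the transformation is written out by hand to show its shape.

```python
# Conceptual sketch of loop-level (DOALL) parallelisation: iterations with no
# cross-iteration dependences are distributed across worker processes.
from concurrent.futures import ProcessPoolExecutor
from functools import partial

def loop_body(i, a, b):
    # One iteration: the result for index i depends only on inputs at index i,
    # so iterations may execute in any order or concurrently.
    return a[i] * a[i] + b[i]

def sequential_loop(a, b):
    return [loop_body(i, a, b) for i in range(len(a))]

def parallel_loop(a, b, workers=4):
    # The parallelised form. A profile-driven parallelizer would apply the
    # equivalent transformation only after profiling indicates that no harmful
    # cross-iteration dependences occur at run time.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(partial(loop_body, a=a, b=b),
                             range(len(a)), chunksize=256))

if __name__ == "__main__":
    a = list(range(10_000))
    b = list(range(10_000))
    assert parallel_loop(a, b) == sequential_loop(a, b)
```

    The profile-driven element of the thesis lies in deciding, from observed runtime behaviour, that such a transformation is safe and profitable; the sketch shows only the shape of the resulting parallel loop.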