6,640 research outputs found

    Partial parallelization of VMEC system

    An Integrated Tool for Loop Calculations: aITALC

    aITALC, a new tool for automating loop calculations in high energy physics, is described. The package creates Fortran code for two-fermion scattering processes automatically, starting from the generation and analysis of the Feynman graphs. We describe the modules of the tool and the intercommunication between them, and illustrate its use with three examples. Comment: 24 pages, 5 figures, 8 tables

    Museums & Society 2034: Trends and Potential Futures

    What challenges will society and museums face in the next quarter-century? How will the demographic profile of America change between now and 2034? How will energy and infrastructure costs affect the sustainability of museums? What will Web 3.0 -- or 5.0 or 6.0 -- look like? Will the "real" survive the assault of the "virtual"? Will the number of leisure-time alternatives continue to grow? Will the lines between work and leisure, public and private, continue to blur? Most importantly, how will museums face these challenges and shape the future they will have to inhabit? This report, commissioned by the Center for the Future of Museums at the American Association of Museums, projects current social trends to 2034 and suggests how museums can face future challenges while continuing to meet their mission of public service. The report focuses on four major trends: demographic shifts, globalization, the revolution in information and communication technologies, and new cultural assumptions about the primacy of the individual as creator and curator.

    AUTOMATING DATA-LAYOUT DECISIONS IN DOMAIN-SPECIFIC LANGUAGES

    A long-standing challenge in High-Performance Computing (HPC) is the simultaneous achievement of programmer productivity and hardware computational efficiency. The challenge has been exacerbated by the onset of multi- and many-core CPUs and accelerators. Only a few expert programmers have been able to hand-code domain-specific data transformations and vectorization schemes needed to extract the best possible performance on such architectures. In this research, we examined the possibility of automating these methods by developing a Domain-Specific Language (DSL) framework. Our DSL approach extends C++14 by embedding into it a high-level data-parallel array language, and by using a domain-specific compiler to compile to hybrid-parallel code. We also implemented an array index-space transformation algebra within this high-level array language to manipulate array data-layouts and data-distributions. The compiler introduces a novel method for SIMD auto-vectorization based on array data-layouts. Our new auto-vectorization technique is shown to outperform the default auto-vectorization strategy by up to 40% for stencil computations. The compiler also automates distributed data movement with overlapping of local compute with remote data movement using polyhedral integer set analysis. Along with these main innovations, we developed a new technique using C++ template metaprogramming for developing embedded DSLs using C++. We also proposed a domain-specific compiler intermediate representation that simplifies data flow analysis of abstract DSL constructs. We evaluated our framework by constructing a DSL for the HPC grand-challenge domain of lattice quantum chromodynamics. Our DSL yielded performance gains of up to twice the flop rate over existing production C code for selected kernels. This gain in performance was obtained while using less than one-tenth the lines of code. 
    The performance of this DSL was also competitive with the best hand-optimized and hand-vectorized code, and was an order of magnitude better than existing production DSLs. Doctor of Philosophy
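
What a data-layout transformation does can be illustrated with a minimal sketch. This is not the thesis's DSL or compiler: the function names `aos_offset`, `soa_offset` and `relayout` are hypothetical, and Python is used only for brevity. The sketch maps the same logical index (site, component) to a linear offset under an array-of-structures layout and under a structure-of-arrays layout. The second layout stores one component of consecutive sites contiguously, an arrangement that SIMD units generally favor.

```python
def aos_offset(site, comp, n_sites, n_comps):
    """Array-of-structures: all components of one site are adjacent."""
    return site * n_comps + comp


def soa_offset(site, comp, n_sites, n_comps):
    """Structure-of-arrays: one component of all sites is adjacent."""
    return comp * n_sites + site


def relayout(data, n_sites, n_comps, src, dst):
    """Copy a flat list from the `src` layout into the `dst` layout."""
    out = [None] * (n_sites * n_comps)
    for site in range(n_sites):
        for comp in range(n_comps):
            value = data[src(site, comp, n_sites, n_comps)]
            out[dst(site, comp, n_sites, n_comps)] = value
    return out


# Three sites with two components each, stored site by site (illustrative data).
aos = [10, 11, 20, 21, 30, 31]
soa = relayout(aos, 3, 2, aos_offset, soa_offset)
print(soa)
```

Because the layout is just an index mapping, the same loop over logical indices works for either storage order. That separation between logical index space and memory layout is the property a layout-transforming compiler can exploit.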

    Factors shaping the evolution of electronic documentation systems

    The main goal is to prepare the space station technical and managerial structure for likely changes in the creation, capture, transfer, and utilization of knowledge. By anticipating advances, the design of Space Station Project (SSP) information systems can be tailored to facilitate a progression of increasingly sophisticated strategies as the space station evolves. Future generations of advanced information systems will use increases in power to deliver environmentally meaningful, contextually targeted, interconnected data (knowledge). The concept of a Knowledge Base Management System emerges when the problem is framed as how information systems can perform such a conversion of raw data. Such a system would include traditional management functions for large space databases. Added artificial intelligence features might encompass co-existing knowledge representation schemes; effective control structures for deductive, plausible, and inductive reasoning; means for knowledge acquisition, refinement, and validation; explanation facilities; and dynamic human intervention. The major areas covered include: alternative knowledge representation approaches; advanced user interface capabilities; computer-supported cooperative work; the evolution of information system hardware; standardization, compatibility, and connectivity; and organizational impacts of information-intensive environments.

    Parallel computations based on domain decompositions and integrated radial basis functions for fluid flow problems

    The thesis reports a contribution to the development of parallel algorithms based on the Domain Decomposition (DD) method and the Compact Local Integrated Radial Basis Function (CLIRBF) method. This development aims to solve large scale fluid flow problems more efficiently by using parallel high performance computing (HPC). With the help of the DD method, one big problem can be separated into sub-problems and solved on parallel machines. In terms of numerical analysis, for each sub-problem, the overall condition number of the system matrix is significantly reduced. This is one of the main reasons for the stability, high accuracy and efficiency of parallel algorithms. The developed methods have been successfully applied to solve several benchmark problems with both rectangular and non-rectangular boundaries. In parallel computation, there is a challenge called the Distributed Termination Detection (DTD) problem. DTD concerns discovering whether all processes in a distributed system have finished their job. In a distributed system, this problem is not trivial because there is neither a global synchronised clock nor a shared memory. Taking into account the specific requirements of parallel algorithms, a new algorithm is proposed and called the Bitmap DTD. This algorithm is designed to work with the DD method for solving Partial Differential Equations (PDEs). The Bitmap DTD algorithm is inspired by the Credit/Recovery DTD class (or weight-throw). The distinguishing feature of this algorithm is the use of a bitmap to carry the snapshot of the system from process to process. The proposed algorithm has the following characteristics: (i) it allows any process to detect termination (symmetry); (ii) it does not require any central control agent (decentralisation); (iii) termination detection delay is of the order of the diameter of the network; and (iv) the message complexity of the proposed algorithm is optimal.
    In the first attempt, the combination of the DD method and the CLIRBF based collocation approach yields an effective parallel algorithm to solve PDEs. This approach has enabled not only the problem to be solved separately in each subdomain by a Central Processing Unit (CPU) but also compact local stencils to be independently treated. The present algorithm has achieved high throughput in solving large scale problems. The procedure is illustrated by several numerical examples including the benchmark lid-driven cavity flow problem. A new parallel algorithm is developed using the Control Volume Method (CVM) for the solution of PDEs. The goal is to develop an efficient parallel algorithm especially for problems with non-rectangular domains. When combined with the CLIRBF approach, the resultant method can produce high-order accurate and economical solutions for problems with complex boundaries. The algorithm is verified by solving two benchmark problems including the square lid-driven cavity flow and the triangular lid-driven cavity flow. In both cases, the accuracy is in good agreement with benchmark values. In terms of efficiency, the results show that the method has a very high efficiency profile and for some specific cases a super-linear speed-up is achieved. Although the overlapping method yields a straightforward implementation and stable convergence, overlapping of sub-domains makes it less applicable for complex domains. The method even generates more computing overhead for each subdomain as the overlapping area grows. Hence, a parallel algorithm based on non-overlapping DD and CLIRBF has been developed for solving Navier-Stokes equations where a CLIRBF scheme is used to solve the problem in each subdomain. A relaxation factor is employed for the transmission conditions at the interface of sub-domains to ensure the convergence of the iterative method while the Bitmap DTD algorithm is used to achieve the global termination.
    The parallel algorithm is demonstrated through two fluid flow problems, namely the natural convection in concentric annuli (Boussinesq fluids) and the lid-driven cavity flow (viscous fluids). The results confirm the high efficiency of the present method in comparison with a sequential algorithm. A super-linear efficiency is also observed for a range of numbers of CPUs. Finally, when comparing the overlapping and non-overlapping parallel algorithms, it is found that the non-overlapping one is less stable. The numerical results show that the non-overlapping method is not able to converge for high Reynolds numbers while the overlapping method reaches the same convergence profile as the sequential CLIRBF method. Thus, in this research when dealing with non-Newtonian fluids and large scale problems, the overlapping method is preferred to the non-overlapping one. The flow of Oldroyd-B fluid through a planar contraction was considered as a benchmark problem. In this problem, the singularity of stress at the re-entrant corners always poses difficulty to numerical methods in obtaining stable solutions at high Weissenberg numbers. In this work, a high resolution simulation of the flow is obtained and the streamline contours are shown to be in good agreement with other results.
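
The role of a relaxation factor at a sub-domain interface, as mentioned in the abstract above, can be seen in a toy problem. The sketch below is not the thesis's CLIRBF scheme. It assumes a one-dimensional model problem, u'' = 0 on [0, 1] with u(0) = 0 and u(1) = 1, split at a hypothetical interface position `gamma`. It applies a relaxed Dirichlet-Neumann iteration in which both sub-domain solves are done analytically. The function name `interface_iteration` and the parameter names `gamma`, `theta` and `lam` are illustrative.

```python
def interface_iteration(gamma, theta, lam=0.0, tol=1e-12, max_iter=200):
    """Relaxed Dirichlet-Neumann iteration for u'' = 0, u(0) = 0, u(1) = 1.

    gamma : interface position, 0 < gamma < 1
    theta : relaxation factor applied to the interface value
    lam   : initial guess for u at the interface
    Returns (interface value, iterations used, converged flag).
    """
    for k in range(1, max_iter + 1):
        # Left sub-domain, Dirichlet data lam at gamma: u1(x) = lam * x / gamma,
        # so the flux passed to the right sub-domain is u1'(gamma) = lam / gamma.
        flux = lam / gamma
        # Right sub-domain, Neumann data at gamma and u2(1) = 1:
        # u2(x) = 1 + flux * (x - 1); its trace at the interface is u2(gamma).
        trace = 1.0 - flux * (1.0 - gamma)
        # Relaxed update of the interface value.
        new_lam = theta * trace + (1.0 - theta) * lam
        if abs(new_lam - lam) < tol:
            return new_lam, k, True
        lam = new_lam
    return lam, max_iter, False


# With relaxation the interface value approaches the exact solution u(gamma) = gamma.
print(interface_iteration(gamma=0.3, theta=0.2))
# Without relaxation (theta = 1) the same iteration fails to converge.
print(interface_iteration(gamma=0.3, theta=1.0))
```

For this toy problem the error in the interface value is multiplied by 1 - theta/gamma at each step. The iteration therefore converges only when that factor has magnitude below one, which is why a suitably small relaxation factor restores convergence.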

    Changing Trains at Wigan: Digital Preservation and the Future of Scholarship

    This paper examines the impact of the emerging digital landscape on long term access to material created in digital form and its use for research; it examines challenges, risks and expectations.

    A Modular Approach to Adaptive Reactive Streaming Systems

    The latest generations of FPGA devices offer large resource counts that provide the headroom to implement large-scale and complex systems. However, there are increasing challenges for the designer, not just because of pure size and complexity, but also in effectively harnessing the flexibility and programmability of the FPGA. A central issue is the need to integrate modules from diverse sources to promote modular design and reuse. Further, the capability to perform dynamic partial reconfiguration (DPR) of FPGA devices means that implemented systems can be made reconfigurable, allowing components to be changed during operation. However, use of DPR typically requires low-level planning of the system implementation, adding to the design challenge. This dissertation presents ReShape: a high-level approach for designing systems by interconnecting modules, which gives a ‘plug and play’ look and feel to the designer, is supported by tools that carry out implementation and verification functions, and is carried through to support system reconfiguration during operation. The emphasis is on the inter-module connections and abstracting the communication patterns that are typical between modules – for example, the streaming of data that is common in many FPGA-based systems, or the reading and writing of data to and from memory modules. ShapeUp is also presented as the static precursor to ReShape. In both, the details of wiring and signaling are hidden from view, via metadata associated with individual modules. ReShape allows system reconfiguration at the module level, by supporting type checking of replacement modules and by managing the overall system implementation, via metadata associated with its FPGA floorplan. The methodology and tools have been implemented in a prototype for a broad domain-specific setting – networking systems – and have been validated on real telecommunications design projects.