150 research outputs found

    Efficient algorithms for the fast computation of space charge effects caused by charged particles in particle accelerators

    Get PDF
    In this dissertation, a Poisson solver is improved with three parts: the efficient integrated Green's function; the discrete cosine transform of the efficient integrated Green's function values; the implicitly zero-padded fast Fourier transform for charge density. In addition, the high performance computing technology is utilized for the further improvement of efficiency, such as: OpenMP API, OpenMP+CUDA, MPI, and MPI+OpenMP parallelizations. The examples and simulation results are matched with the results of the commonly used Poisson solver to demonstrate the accuracy performance

    Advanced modeling of materials with PAOFLOW 2.0:New features and software design

    Get PDF
    Recent research in materials science opens exciting perspectives to design novel quantum materials and devices, but it calls for quantitative predictions of properties which are not accessible in standard first principles packages. PAOFLOW, is a software tool that constructs tight-binding Hamiltonians from self consistent electronic wavefunctions by projecting onto a set of atomic orbitals. The electronic structure provides numerous materials properties that otherwise would have to be calculated via phenomenological models. In this paper, we describe recent re-design of the code as well as the new features and improvements in performance. In particular, we have implemented symmetry operations for unfolding equivalent k-points, which drastically reduces the runtime requirements of first principles calculations, and we have provided internal routines of projections onto atomic orbitals enabling generation of real space atomic orbitals. Moreover, we have included models for non-constant relaxation time in electronic transport calculations, doubling the real space dimensions of the Hamiltonian as well as the construction of Hamiltonians directly from analytical models. Importantly, PAOFLOW has been now converted into a Python package, and is streamlined for use directly within other Python codes. The new object oriented design treats PAOFLOW's computational routines as class methods, providing an API for explicit control of each calculation.</p

    Microwave Tomography Using Stochastic Optimization And High Performance Computing

    Get PDF
    This thesis discusses the application of parallel computing in microwave tomography for detection and imaging of dielectric objects. The main focus is on microwave tomography with the use of a parallelized Finite Difference Time Domain (FDTD) forward solver in conjunction with non-linear stochastic optimization based inverse solvers. Because such solvers require very heavy computation, their investigation has been limited in favour of deterministic inverse solvers that make use of assumptions and approximations of the imaging target. Without the use of linearization assumptions, a non-linear stochastic microwave tomography system is able to resolve targets of arbitrary permittivity contrast profiles while avoiding convergence to local minima of the microwave tomography optimization space. This work is focused on ameliorating this computational load with the use of heavy parallelization. The presented microwave tomography system is capable of modelling complex, heterogeneous, and dispersive media using the Debye model. A detailed explanation of the dispersive FDTD is presented herein. The system uses scattered field data due to multiple excitation angles, frequencies, and observation angles in order to improve target resolution, reduce the ill-posedness of the microwave tomography inverse problem, and improve the accuracy of the complex permittivity profile of the imaging target. The FDTD forward solver is parallelized with the use of the Common Unified Device Architecture (CUDA) programming model developed by NVIDIA corporation. In the forward solver, the time stepping of the fields are computed on a Graphics Processing Unit (GPU). In addition the inverse solver makes use of the Message Passing Interface (MPI) system to distribute computation across multiple work stations. The FDTD method was chosen due to its ease of parallelization using GPU computing, in addition to its ability to simulate wideband excitation signals during a single forward simulation. We investigated the use of distributed Particle Swarm Optimization (PSO) and Differential Evolution (DE) methods in the inverse solver for this microwave tomography system. In these optimization algorithms, candidate solutions are farmed out to separate workstations to be evaluated. As fitness evaluations are returned asynchronously, the optimization algorithm updates the population of candidate solutions and gives new candidate solutions to be evaluated to open workstations. In this manner, we used a total of eight graphics processing units during optimization with minimal downtime. Presented in this thesis is a microwave tomography algorithm that does not rely on linearization assumptions, capable of imaging a target in a reasonable amount of time for clinical applications. The proposed algorithm was tested using numerical phantoms that with material parameters similar to what one would find in normal or malignant human tissue

    Revealing atomic-scale vacancy-solute interaction in nickel

    Full text link
    Imaging individual vacancies in solids and revealing their interactions with solute atoms remains one of the frontiers in microscopy and microanalysis. Here we study a creep-deformed binary Ni-2 at.% Ta alloy. Atom probe tomography reveals a random distribution of Ta. Field ion microscopy, with contrast interpretation supported by density-functional theory and time-of-flight mass spectrometry, evidences a positive correlation of tantalum with vacancies. Our results support solute-vacancy binding, which explains improvement in creep resistance of Ta-containing Ni-based superalloys and helps guide future material design strategies.Comment: Submitted to Physics Review Lette

    Mechanism of collective interstitial ordering in Fe–C alloys

    Get PDF
    Collective interstitial ordering is at the core of martensite formation in Fe–C-based alloys, laying the foundation for high-strength steels. Even though this ordering has been studied extensively for more than a century, some fundamental mechanisms remain elusive. Here, we show the unexpected effects of two correlated phenomena on the ordering mechanism: anharmonicity and segregation. The local anharmonicity in the strain fields induced by interstitials substantially reduces the critical concentration for interstitial ordering, up to a factor of three. Further, the competition between interstitial ordering and segregation results in an effective decrease of interstitial segregation into extended defects for high interstitial concentrations. The mechanism and corresponding impact on interstitial ordering identified here enrich the theory of phase transitions in materials and constitute a crucial step in the design of ultra-high-performance alloys

    XcalableMP PGAS Programming Language

    Get PDF
    XcalableMP is a directive-based parallel programming language based on Fortran and C, supporting a Partitioned Global Address Space (PGAS) model for distributed memory parallel systems. This open access book presents XcalableMP language from its programming model and basic concept to the experience and performance of applications described in XcalableMP.  XcalableMP was taken as a parallel programming language project in the FLAGSHIP 2020 project, which was to develop the Japanese flagship supercomputer, Fugaku, for improving the productivity of parallel programing. XcalableMP is now available on Fugaku and its performance is enhanced by the Fugaku interconnect, Tofu-D. The global-view programming model of XcalableMP, inherited from High-Performance Fortran (HPF), provides an easy and useful solution to parallelize data-parallel programs with directives for distributed global array and work distribution and shadow communication. The local-view programming adopts coarray notation from Coarray Fortran (CAF) to describe explicit communication in a PGAS model. The language specification was designed and proposed by the XcalableMP Specification Working Group organized in the PC Consortium, Japan. The Omni XcalableMP compiler is a production-level reference implementation of XcalableMP compiler for C and Fortran 2008, developed by RIKEN CCS and the University of Tsukuba. The performance of the XcalableMP program was used in the Fugaku as well as the K computer. A performance study showed that XcalableMP enables a scalable performance comparable to the message passing interface (MPI) version with a clean and easy-to-understand programming style requiring little effort
    • …
    corecore