592 research outputs found

    Multiprocessor Out-of-Core FFTs with Distributed Memory and Parallel Disks

    This paper extends an earlier out-of-core Fast Fourier Transform (FFT) method for a uniprocessor with the Parallel Disk Model (PDM) to use multiple processors. Four out-of-core multiprocessor methods are examined. Operationally, these methods differ in the size of the mini-butterfly computed in memory and in how the data are organized on the disks and in the distributed memory of the multiprocessor. The methods also perform differing amounts of I/O and communication. Two of them have the remarkable property that even though they are computing the FFT on a multiprocessor, all interprocessor communication occurs outside the mini-butterfly computations. Performance results on a small workstation cluster indicate that, except for unusual combinations of problem size and memory size, the methods that do not perform interprocessor communication during the mini-butterfly computations require approximately 86% of the time of those that do. Moreover, the faster methods are much easier to implement.

    Determining an Out-of-Core FFT Decomposition Strategy for Parallel Disks by Dynamic Programming

    We present an out-of-core FFT algorithm based on the in-core FFT method developed by Swarztrauber. Our algorithm uses a recursive divide-and-conquer strategy, and each stage in the recursion presents several possibilities for how to split the problem into subproblems. We give a recurrence for the algorithm's I/O complexity on the Parallel Disk Model and show how to use dynamic programming to determine optimal splits at each recursive stage. The algorithm to determine the optimal splits takes only Theta(lg^2 N) time for an N-point FFT, and it is practical. The out-of-core FFT algorithm itself takes considerably longer.

    Wave modes of collective vortex gyration in dipolar-coupled-dot-array magnonic crystals

    Lattice vibration modes are collective excitations in periodic arrays of atoms or molecules. These modes determine novel transport properties in solid crystals. Analogously, in periodic arrangements of magnetic vortex-state disks, collective vortex motions have been predicted. Here, we experimentally observe wave modes of collective vortex gyration in one-dimensional (1D) periodic arrays of magnetic disks using time-resolved scanning transmission x-ray microscopy. The observed modes are interpreted based on micromagnetic simulation and numerical calculation of coupled Thiele equations. Dispersion of the modes is found to be strongly affected by both vortex polarization and chirality ordering, as revealed by the explicit analytical form of 1D infinite arrays. A thorough understanding thereof is fundamental both for lattice vibrations and vortex dynamics, which we demonstrate for 1D magnonic crystals. Such magnetic disk arrays with vortex-state ordering, referred to as magnetic metastructure, offer potential implementation into information processing devices.

    Optimizing the Dimensional Method for Performing Multidimensional, Multiprocessor, Out-of-Core FFTs

    We present an improved version of the Dimensional Method for computing multidimensional Fast Fourier Transforms (FFTs) on a multiprocessor system when the data consist of too many records to fit into memory. Data are spread across parallel disks and processed in sections. We use the Parallel Disk Model for analysis. The simple Dimensional Method performs the 1-dimensional FFTs for each dimension in turn. Between each dimension, an out-of-core permutation is used to rearrange the data to contiguous locations. The improved Dimensional Method processes multiple dimensions at a time. We show that determining an optimal sequence and groupings of dimensions is NP-complete. We then analyze the effects of two modifications to the Dimensional Method independently: processing multiple dimensions at one time, and processing single dimensions in a different order. Finally, we show a lower bound on the I/O complexity of the Dimensional Method and present an algorithm that is approximately asymptotically optimal.
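
    The simple Dimensional Method described in this abstract can be illustrated with a small in-memory sketch. This is not the paper's out-of-core algorithm: it assumes the data fit in memory, uses a naive O(n^2) DFT in place of an FFT, and uses a plain transpose as a stand-in for the out-of-core permutation between dimensions. All function names are hypothetical.

```python
import cmath


def dft_1d(x):
    """Naive O(n^2) discrete Fourier transform of a 1-D sequence."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def transpose(a):
    """In-memory stand-in for the permutation that makes the next
    dimension contiguous (assumption: data fit in memory)."""
    return [list(row) for row in zip(*a)]


def dft_2d_dimensional(a):
    """2-D DFT computed one dimension at a time."""
    rows_done = [dft_1d(row) for row in a]              # transform along dimension 2
    cols_contiguous = transpose(rows_done)              # rearrange between dimensions
    cols_done = [dft_1d(row) for row in cols_contiguous]  # transform along dimension 1
    return transpose(cols_done)                         # restore the original layout


def dft_2d_direct(a):
    """Reference 2-D DFT evaluated straight from the definition."""
    n1, n2 = len(a), len(a[0])
    return [
        [
            sum(
                a[j1][j2] * cmath.exp(-2j * cmath.pi * (j1 * k1 / n1 + j2 * k2 / n2))
                for j1 in range(n1)
                for j2 in range(n2)
            )
            for k2 in range(n2)
        ]
        for k1 in range(n1)
    ]
```

    Comparing `dft_2d_dimensional` with `dft_2d_direct` on a small array confirms that transforming each dimension in turn yields the full multidimensional transform, which is the property the Dimensional Method relies on.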

    Out-of-Core Hydrodynamic Simulations for Cosmological Applications

    We present an out-of-core hydrodynamic code for high resolution cosmological simulations that require terabytes of memory. Out-of-core computation refers to the technique of using disk space as virtual memory and transferring data in and out of main memory at high I/O bandwidth. The code is based on a two-level mesh scheme where short-range physics is solved on a high-resolution, localized mesh while long-range physics is captured on a lower resolution, global mesh. The two-level mesh gravity solver allows FFTs to operate on data stored entirely in memory, which is much faster than the alternative of computing the transforms out-of-core through non-sequential disk accesses. We also describe an out-of-core initial conditions generator that is used to prepare large data sets for cosmological simulations. The out-of-core code is accurate, cost-effective, and memory-efficient, and the current version is implemented to run in parallel on shared-memory machines. I/O overhead is reduced to less than 10% by performing disk operations concurrently with numerical calculations. The current computational setup, which includes a 32 processor Alpha server and a 3 TB striped SCSI disk array, allows us to run cosmological simulations with up to 4000^3 grid cells and 2000^3 dark matter particles. Comment: 19 pages, 10 figures; accepted by New Astronom
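
    The idea of performing disk operations concurrently with numerical calculations can be sketched with a background loader thread and a bounded queue. This is a generic illustration, not the code described in the abstract: `read_chunk` is a hypothetical stand-in for a disk read that returns synthetic data, and the chunk sizes are arbitrary.

```python
import queue
import threading


def read_chunk(i, chunk_size=4):
    """Hypothetical stand-in for a disk read: returns chunk i of a synthetic data set."""
    return [float(i * chunk_size + k) for k in range(chunk_size)]


def process_out_of_core(num_chunks, compute):
    """Apply `compute` to every chunk while later chunks load in the background."""
    # At most two chunks wait in the queue ahead of the one being processed.
    buf = queue.Queue(maxsize=2)

    def loader():
        for i in range(num_chunks):
            buf.put(read_chunk(i))  # blocks while the queue is full
        buf.put(None)               # sentinel: no more data

    thread = threading.Thread(target=loader, daemon=True)
    thread.start()

    results = []
    while True:
        chunk = buf.get()
        if chunk is None:
            break
        results.append(compute(chunk))  # runs while the loader fetches ahead
    thread.join()
    return results
```

    For example, `process_out_of_core(3, sum)` sums each of three synthetic chunks in order. The bounded queue caps how much data is held in memory at once, which is the constraint that makes a computation out-of-core in the first place.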

    ERS-1 SAR data processing

    To take full advantage of the synthetic aperture radar (SAR) to be flown on board the European Space Agency's Remote Sensing Satellite (ERS-1) (1989) and the Canadian Radarsat (1990), the implementation of a receiving station in Alaska is being studied to gather and process SAR data pertaining in particular to regions within the station's range of reception. The current SAR data processing requirement is estimated to be on the order of 5 minutes per day. The Interim Digital SAR Processor (IDP), which was under continual development through Seasat (1978) and SIR-B (1984), can process slightly more than 2 minutes of ERS-1 data per day. On the other hand, the Advanced Digital SAR Processor (ADSP), currently under development for the Shuttle Imaging Radar C (SIR-C, 1988) and the Venus Radar Mapper (VRM, 1988), is capable of processing ERS-1 SAR data at a real time rate. To better suit the anticipated ERS-1 SAR data processing requirement, both a modified IDP and an ADSP derivative are being examined. For the modified IDP, a pipelined architecture is proposed for the mini-computer plus array processor arrangement to improve throughput. For the ADSP derivative, a simplified version is proposed to enhance ease of implementation and maintainability while maintaining real time throughput rates. These processing systems are discussed and evaluated.

    A Study of the Dynamic Behavior of Coupled Magnetic Vortices in Magnetic Disk Arrays

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Materials Science and Engineering, February 2020. 김상국.

    A magnetic vortex is a distinctive spin arrangement that forms stably in ferromagnetic structures a few micrometers in size or smaller. It consists of a vortex core, tens of nanometers in size and magnetized perpendicular to the film plane, and surrounding spins arranged in an in-plane curling configuration. When an external magnetic field or current is applied, the vortex core gyrates. Combining the two magnetization directions of the core with the two rotation senses of the surrounding spins gives four ground states of equal energy, and because the vortex is thermally very stable it can be applied in non-volatile information storage devices. In addition, the collective gyration of vortex cores that appears among several coupled vortices can serve as a new medium for signal transmission, so its applicability to information processing devices has been studied. This thesis focuses on the dynamic behavior of magnetic vortices and the dynamic interaction between them, using micromagnetic simulations and experiments. Its main subjects are vortex coupled modes in magnetic disk arrays, methods of vortex-core reversal, and control of the propagation of vortex-core gyration signals. Using these methods of controlling vortex dynamics, a new-concept RS latch logic device and time-division and frequency-division demultiplexer devices are proposed and their operating characteristics studied. Devices based on magnetic vortices have many advantages: they are non-volatile, have almost unlimited endurance, and consume little energy. The properties of magnetic vortices are also very easy to control, so they have potential for application in future spintronic devices. The results of this work show the feasibility of implementing logic devices and information processing devices based on magnetic vortices as a next-generation spintronics technology.

    In the sub-micrometer-size ferromagnetic structure, the magnetic vortex is in a strongly stable ground state characterized by an in-plane curling magnetization surrounding an out-of-plane magnetization in the central region. The magnetic vortex is characterized by clockwise (CW) or counter-clockwise (CCW) curling in-plane magnetizations around a single vortex core, in which the magnetizations are oriented perpendicularly, either upward or downward. In isolated disks, applied external forces induce vortex excitations, among which a translational mode exists in which the vortex core gyrates around its equilibrium position at a characteristic eigenfrequency. Vortex-core switching can be accomplished with low power consumption when vortex gyrations are resonantly excited. Moreover, the gyration modes of individual vortex cores in a periodic array of patterned vortex-state disks are coupled with each other, thus yielding collectively coupled motions of the individual cores. On the basis of such novel dynamic characteristics, non-volatile memory and information processing devices using magnetic vortices have been proposed. This work focused on the dynamic interaction between vortex-state ferromagnetic structures and its applications, utilizing micromagnetic simulations, analytical calculations, and experiments. The dynamic behaviors of vortex-gyration-coupled modes, vortex-core switching, and propagation of vortex-core gyration signals in magnetic-disk-network devices are investigated. Based on combinations of the novel dynamic characteristics of vortices in dipolar-coupled disks, new-concept RS latch logic and time- and frequency-division demultiplexer device operations are explored. Magnetic vortices have many advantages, such as non-volatility, almost unlimited endurance, and low power operation. Furthermore, the rich tunability of magnetic vortices makes them adoptable as future spintronics devices. This work can pave the way for possible implementation of logic gates and information processing devices based on coupled magnetic vortices.

    Contents:
    1. Introduction
    2. Research Background
    2.1. Magnetization dynamics and micromagnetics
    2.1.1. Landau-Lifshitz-Gilbert equation
    2.1.2. Effective fields in the LLG equation
    2.2. Vortices in magnetic microstructures and their dynamics
    2.2.1. Vortex core gyration
    2.2.2. Vortex core switching
    2.2.3. Interaction between magnetic vortices
    2.3. Experimental methods
    2.3.1. Photo lithography
    2.3.2. Electron beam lithography
    2.3.3. Anisotropic magneto resistance in vortex
    3. Vortex Core Switching by Propagation of a Gyration-Coupled Mode
    3.1. Micromagnetic simulation conditions
    3.2. Coupled modes of gyration for the two types of vortex-state configurations
    3.3. Concept design of reset-set latch device
    3.4. Magnitude of oscillating magnetic field and radius of disks dependent switching behavior
    3.5. Reset-set latch logic operation
    4. Control of Gyration Signal Propagation in Coupled Magnetic Vortices
    4.1. Dynamics of the single and coupled disk array
    4.2. Control of gyration signal propagation by in-plane bias field
    4.3. Control of gyration signal propagation by vortex core switching
    4.4. Concept design of time-division demultiplexer device and its operation
    4.5. Concept design of frequency-division demultiplexer device and its operation
    5. Electrical Measurement of the Gyrotropic Resonance of a Magnetic Vortex in Circular and Chopped Disks
    5.1. Sample fabrication
    5.2. DC AMR measurement
    5.3. AC AMR measurement by rectification technique
    6. Summary
    Bibliography
    Publication List
    Patent List
    Presentations in Conferences

    Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS

    GROMACS is a widely used package for biomolecular simulation, and over the last two decades it has evolved from small-scale efficiency to advanced heterogeneous acceleration and multi-level parallelism targeting some of the largest supercomputers in the world. Here, we describe some of the ways we have been able to realize this through the use of parallelization on all levels, combined with a constant focus on absolute performance. Release 4.6 of GROMACS uses SIMD acceleration on a wide range of architectures, GPU offloading acceleration, and both OpenMP and MPI parallelism within and between nodes, respectively. The recent work on acceleration made it necessary to revisit the fundamental algorithms of molecular simulation, including the concept of neighbor searching, and we discuss the present and future challenges we see for exascale simulation, in particular a very fine-grained task parallelism. We also discuss the software management, code peer review and continuous integration testing required for a project of this complexity. Comment: EASC 2014 conference proceedin