Multiprocessor Out-of-Core FFTs with Distributed Memory and Parallel Disks

Author: Cormen Thomas H
Nicol David M
Wegmann Jake
Publication venue: Dartmouth Digital Commons
Publication date: 01/01/1997

This paper extends an earlier out-of-core Fast Fourier Transform (FFT) method for a uniprocessor with the Parallel Disk Model (PDM) to use multiple processors. Four out-of-core multiprocessor methods are examined. Operationally, these methods differ in the size of the mini-butterfly computed in memory and in how the data are organized on the disks and in the distributed memory of the multiprocessor. The methods also perform differing amounts of I/O and communication. Two of them have the remarkable property that, even though they compute the FFT on a multiprocessor, all interprocessor communication occurs outside the mini-butterfly computations. Performance results on a small workstation cluster indicate that, except for unusual combinations of problem size and memory size, the methods that do not perform interprocessor communication during the mini-butterfly computations require approximately 86% of the time of those that do. Moreover, the faster methods are much easier to implement.

Dartmouth Digital Commons (Dartmouth College)

Determining an Out-of-Core FFT Decomposition Strategy for Parallel Disks by Dynamic Programming

Author: Cormen Thomas H
Publication venue: Dartmouth Digital Commons
Publication date: 01/09/1996

We present an out-of-core FFT algorithm based on the in-core FFT method developed by Swarztrauber. Our algorithm uses a recursive divide-and-conquer strategy, and each stage in the recursion presents several possibilities for how to split the problem into subproblems. We give a recurrence for the algorithm's I/O complexity on the Parallel Disk Model and show how to use dynamic programming to determine optimal splits at each recursive stage. The algorithm to determine the optimal splits takes only Theta(lg^2 N) time for an N-point FFT, and it is practical. The out-of-core FFT algorithm itself takes considerably longer.
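
The abstract does not give the I/O recurrence, so the following is only a minimal sketch of the dynamic-programming idea under an invented cost model. The function name `optimal_splits`, the parameters `n` (lg of the FFT size) and `m` (lg of the memory size), and the rule "one pass per in-memory subproblem plus one pass per split" are all assumptions for illustration, not the paper's recurrence. The table has lg N entries and each entry tries at most lg N splits, which is consistent with the Theta(lg^2 N) bound quoted above.

```python
def optimal_splits(n, m):
    """Choose how to split a 2**n-point FFT when memory holds 2**m points.

    HYPOTHETICAL cost model (not taken from the paper): a subproblem with
    j <= m is solved in memory at a cost of one pass over the data; splitting
    j into k and j - k costs both sub-costs plus one extra pass for the
    permutation between the two stages.
    Returns (cost, plan), where plan lists the in-memory subproblem sizes (in lg).
    """
    if m < 1 or n < 1:
        raise ValueError("n and m must be at least 1")

    cost = [0] * (n + 1)
    split = [0] * (n + 1)  # split[j] == 0 means "solve directly in memory"

    # Bottom-up table: lg N entries, each trying at most lg N split points.
    for j in range(1, n + 1):
        if j <= m:
            cost[j] = 1
            continue
        best = None
        for k in range(1, j):
            candidate = cost[k] + cost[j - k] + 1
            if best is None or candidate < best:
                best, split[j] = candidate, k
        cost[j] = best

    def plan(j):
        # Follow the recorded split points back down to in-memory pieces.
        if split[j] == 0:
            return [j]
        return plan(split[j]) + plan(j - split[j])

    return cost[n], plan(n)
```

Calling `optimal_splits(10, 4)` under this toy model yields a plan whose pieces each fit in memory and sum to 10; a real implementation would substitute the paper's own recurrence for the `candidate` expression.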

Dartmouth Digital Commons (Dartmouth College)

Wave modes of collective vortex gyration in dipolar-coupled-dot-array magnonic crystals

Author: A Barman
A Khitun
A Vogel
AA Serga
AA Thiele
AV Chumak
C Bayer
CT Chen
D Suess
H Jung
H Puszkarski
J Ding
J Lau
J Shibata
K-S Lee
LD Landau
M Kammerer
MP Kostylev
R Hertel
RP Cowburn
S Barman
S Choi
S Jain
S Neusser
S Sugimoto
S Tacchi
S-K Kim
SV Vasiliev
T Schneider
TL Gilbert
U Hansen
VV Kruglyak
Y Kobljanskyj
Y Yu
Y Zhu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2013

Lattice vibration modes are collective excitations in periodic arrays of atoms or molecules. These modes determine novel transport properties in solid crystals. Analogously, in periodic arrangements of magnetic vortex-state disks, collective vortex motions have been predicted. Here, we experimentally observe wave modes of collective vortex gyration in one-dimensional (1D) periodic arrays of magnetic disks using time-resolved scanning transmission x-ray microscopy. The observed modes are interpreted based on micromagnetic simulation and numerical calculation of coupled Thiele equations. Dispersion of the modes is found to be strongly affected by both vortex polarization and chirality ordering, as revealed by the explicit analytical form of 1D infinite arrays. A thorough understanding thereof is fundamental both for lattice vibrations and vortex dynamics, which we demonstrate for 1D magnonic crystals. Such magnetic disk arrays with vortex-state ordering, referred to as magnetic metastructure, offer potential implementation into information processing devices.

arXiv.org e-Print Archive

Optimizing the Dimensional Method for Performing Multidimensional, Multiprocessor, Out-of-Core FFTs

Author: Fineman Jeremy T
Publication venue: Dartmouth Digital Commons
Publication date: 01/06/2001

We present an improved version of the Dimensional Method for computing multidimensional Fast Fourier Transforms (FFTs) on a multiprocessor system when the data consist of too many records to fit into memory. Data are spread across parallel disks and processed in sections. We use the Parallel Disk Model for analysis. The simple Dimensional Method performs the 1-dimensional FFTs for each dimension in turn. Between each pair of dimensions, an out-of-core permutation is used to rearrange the data to contiguous locations. The improved Dimensional Method processes multiple dimensions at a time. We show that determining an optimal sequence and groupings of dimensions is NP-complete. We then analyze the effects of two modifications to the Dimensional Method independently: processing multiple dimensions at one time, and processing single dimensions in a different order. Finally, we show a lower bound on the I/O complexity of the Dimensional Method and present an algorithm that is approximately asymptotically optimal.
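
The simple Dimensional Method described above can be illustrated in two dimensions, entirely in memory. This is a sketch, not the paper's implementation: the function names are made up for illustration, and an ordinary in-memory transpose stands in for the out-of-core permutation that makes each dimension contiguous.

```python
import cmath


def fft1d(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft1d(x[0::2])
    odd = fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def transpose(a):
    """In-memory stand-in for the permutation between dimensions."""
    return [list(row) for row in zip(*a)]


def fft2d_dimensional(a):
    """2-D FFT computed one dimension at a time."""
    a = [fft1d(row) for row in a]   # 1-D FFTs along the contiguous dimension
    a = transpose(a)                # make the other dimension contiguous
    a = [fft1d(row) for row in a]   # 1-D FFTs along that dimension
    return transpose(a)             # restore the original layout


def dft2d_direct(a):
    """Direct 2-D DFT from the definition, used only as a reference."""
    r, c = len(a), len(a[0])
    return [
        [
            sum(
                a[j][k] * cmath.exp(-2j * cmath.pi * (u * j / r + v * k / c))
                for j in range(r)
                for k in range(c)
            )
            for v in range(c)
        ]
        for u in range(r)
    ]
```

Comparing `fft2d_dimensional` with `dft2d_direct` on a small array shows that transforming each dimension in turn reproduces the full multidimensional transform; in the out-of-core setting, the cost is dominated by the permutations that `transpose` stands in for here.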

Dartmouth Digital Commons (Dartmouth College)

Out-of-Core Hydrodynamic Simulations for Cosmological Applications

Author: Bagla
Bode
Couchman
Dubinski
Efstathiou
Fryxell
Harten
Hockney
Hy Trac
Merz
Pen
Seljak
Smith
Spergel
Springel
Strang
Trac
Ue-Li Pen
Warren
Xu
Publication venue: Elsevier BV
Publication date: 09/01/2006

We present an out-of-core hydrodynamic code for high resolution cosmological simulations that require terabytes of memory. Out-of-core computation refers to the technique of using disk space as virtual memory and transferring data in and out of main memory at high I/O bandwidth. The code is based on a two-level mesh scheme where short-range physics is solved on a high-resolution, localized mesh while long-range physics is captured on a lower resolution, global mesh. The two-level mesh gravity solver allows FFTs to operate on data stored entirely in memory, which is much faster than the alternative of computing the transforms out-of-core through non-sequential disk accesses. We also describe an out-of-core initial conditions generator that is used to prepare large data sets for cosmological simulations. The out-of-core code is accurate, cost-effective, and memory-efficient, and the current version is implemented to run in parallel on shared-memory machines. I/O overhead is reduced to less than 10% by performing disk operations concurrently with numerical calculations. The current computational setup, which includes a 32 processor Alpha server and a 3 TB striped SCSI disk array, allows us to run cosmological simulations with up to 4000^3 grid cells and 2000^3 dark matter particles. Comment: 19 pages, 10 figures; accepted by New Astronom
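
The idea of performing disk operations concurrently with numerical work can be sketched with a background reader thread and a bounded queue. This is a generic illustration, not the code described in the abstract: the file format (little-endian float64 values), the helper `write_doubles`, the function `sum_squares_out_of_core`, and the chunk size are all assumptions.

```python
import queue
import struct
import threading


def write_doubles(path, values):
    """Helper for the example: store values as little-endian float64."""
    with open(path, "wb") as f:
        for v in values:
            f.write(struct.pack("<d", float(v)))


def sum_squares_out_of_core(path, chunk_bytes=8 * 1024):
    """Sum the squares of all float64 values in a file too large to load at once.

    A reader thread fetches the next chunk from disk while the main thread
    computes on the current one. chunk_bytes must be a multiple of 8.
    """
    if chunk_bytes % 8 != 0:
        raise ValueError("chunk_bytes must be a multiple of 8")

    chunks = queue.Queue(maxsize=2)  # bounded: at most two chunks wait in memory

    def reader():
        with open(path, "rb") as f:
            while True:
                buf = f.read(chunk_bytes)
                chunks.put(buf)      # an empty buffer signals end of file
                if not buf:
                    break

    worker = threading.Thread(target=reader, daemon=True)
    worker.start()

    total = 0.0
    while True:
        buf = chunks.get()
        if not buf:
            break
        values = struct.unpack(f"<{len(buf) // 8}d", buf)
        total += sum(v * v for v in values)

    worker.join()
    return total
```

The bounded queue is the design choice that matters: it caps memory use regardless of file size while still letting the reader run ahead of the computation by a chunk or two.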

arXiv.org e-Print Archive

Crossref

ERS-1 SAR data processing

Author: Bicknell T.
Leung K.
Vines K.

To take full advantage of the synthetic aperture radar (SAR) to be flown on board the European Space Agency's Remote Sensing Satellite (ERS-1) (1989) and the Canadian Radarsat (1990), the implementation of a receiving station in Alaska is being studied to gather and process SAR data pertaining in particular to regions within the station's range of reception. The current SAR data processing requirement is estimated to be on the order of 5 minutes per day. The Interim Digital SAR Processor (IDP), which was under continual development through Seasat (1978) and SIR-B (1984), can process slightly more than 2 minutes of ERS-1 data per day. On the other hand, the Advanced Digital SAR Processor (ADSP), currently under development for the Shuttle Imaging Radar C (SIR-C, 1988) and the Venus Radar Mapper (VRM, 1988), is capable of processing ERS-1 SAR data at a real-time rate. To better suit the anticipated ERS-1 SAR data processing requirement, both a modified IDP and an ADSP derivative are being examined. For the modified IDP, a pipelined architecture is proposed for the mini-computer plus array processor arrangement to improve throughput. For the ADSP derivative, a simplified version is proposed to enhance ease of implementation and maintainability while maintaining real-time throughput rates. These processing systems are discussed and evaluated.

NASA Technical Reports Server

A study of the dynamic behavior of coupled magnetic vortices in magnetic disk arrays

Author: 조영준
Publication venue: Seoul National University Graduate School
Publication date: 01/02/2020

Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Materials Science and Engineering, 2020. 2. 김상국.

A magnetic vortex is a distinctive spin configuration that forms stably in ferromagnetic structures a few micrometers in size or smaller. It consists of a vortex core tens of nanometers across, magnetized perpendicular to the film plane, surrounded by spins arranged in an in-plane curling pattern. When an external magnetic field or current is applied to a magnetic vortex, the vortex core gyrates. Combining the two possible core magnetization directions with the two possible rotation senses of the surrounding spins gives four energetically equivalent ground states, and because the vortex is thermally very stable it can be applied as a non-volatile information storage element. In addition, the collective gyration of vortex cores that arises among several coupled magnetic vortices can serve as a new medium for signal transmission, so its applicability to information processing devices has been studied. This thesis focuses on the dynamic behavior of magnetic vortices and the dynamic interaction between them, using micromagnetic simulations and experiments. Its main subjects are the coupled vortex modes in magnetic disk arrays, methods of vortex-core reversal, and control of the propagation of vortex-core gyration signals. Using these methods of controlling vortex dynamics, a new-concept RS latch logic device and time-division and frequency-division demultiplexer devices are proposed and their operating characteristics studied. Devices based on magnetic vortices have many advantages: they are non-volatile, have an almost unlimited lifetime, and consume little energy. Moreover, the properties of magnetic vortices are very easy to control, so they have the potential to be applied in future spintronic devices. The results of this work show the feasibility of implementing logic devices and information processing devices based on magnetic vortices as a next-generation spintronics technology.

In a sub-micrometer-size ferromagnetic structure, the magnetic vortex is a strongly stable ground state characterized by an in-plane curling magnetization around the core and an out-of-plane magnetization in the central region. The magnetic vortex is characterized by clockwise (CW) or counter-clockwise (CCW) curling in-plane magnetizations around a single vortex core, in which the magnetizations are oriented perpendicularly, either upward or downward. In isolated disks, applied external forces induce vortex excitations, among which a translational mode exists in which the vortex core gyrates around its equilibrium position at a characteristic eigenfrequency. Vortex-core switching can be accomplished with low power consumption when vortex gyrations are resonantly excited. Moreover, the gyration modes of individual vortex cores in a periodic array of patterned vortex-state disks are coupled with each other, thus yielding collectively coupled motions of the individual cores.

On the basis of such novel dynamic characteristics, non-volatile memory and information processing devices using magnetic vortices have been proposed. This work focuses on the dynamic interaction between vortex-state ferromagnetic structures and its applications, utilizing micromagnetic simulations, analytical calculations, and experiments. The dynamic behaviors of vortex-gyration-coupled modes, vortex-core switching, and propagation of vortex-core gyration signals in magnetic-disk-network devices are investigated. Based on combinations of the novel dynamic characteristics of vortices in dipolar-coupled disks, the operations of a new-concept RS latch logic device and of time- and frequency-division demultiplexer devices are explored. Magnetic vortices have many advantages, such as non-volatility, almost unlimited endurance, and low-power operation. Furthermore, the rich tunability of magnetic vortices makes them suitable for future spintronic devices. This work can pave the way for possible implementation of logic gates and information processing devices based on coupled magnetic vortices.

1. Introduction
2. Research Background
2.1. Magnetization dynamics and micromagnetics
2.1.1. Landau-Lifshitz-Gilbert equation
2.1.2. Effective fields in the LLG equation
2.2. Vortices in magnetic microstructures and their dynamics
2.2.1. Vortex core gyration
2.2.2. Vortex core switching
2.2.3. Interaction between magnetic vortices
2.3. Experimental methods
2.3.1. Photo lithography
2.3.2. Electron beam lithography
2.3.3. Anisotropic magneto resistance in vortex
3. Vortex Core Switching by Propagation of a Gyration-Coupled Mode
3.1. Micromagnetic simulation conditions
3.2. Coupled modes of gyration for the two types of vortex-state configurations
3.3. Concept design of reset-set latch device
3.4. Magnitude of oscillating magnetic field and radius of disks dependent switching behavior
3.5. Reset-set latch logic operation
4. Control of Gyration Signal Propagation in Coupled Magnetic Vortices
4.1. Dynamics of the single and coupled disk array
4.2. Control of gyration signal propagation by in-plane bias field
4.3. Control of gyration signal propagation by vortex core switching
4.4. Concept design of time-division demultiplexer device and its operation
4.5. Concept design of frequency-division demultiplexer device and its operation
5. Electrical Measurement of the Gyrotropic Resonance of a Magnetic Vortex in Circular and Chopped Disks
5.1. Sample fabrication
5.2. DC AMR measurement
5.3. AC AMR measurement by rectification technique
6. Summary
Bibliography
Publication List
Patent List
Presentations in Conferences

SNU Open Repository and Archive

Tackling Exascale Software Challenges in Molecular Dynamics Simulations with GROMACS

Author: A Arnold
A Faradjian
B Hess
C Schütte
G Wilson
JA Anderson
JC Phillips
KJ Bowers
L Verlet
M Eleftheriou
M Shirts
MJ Abraham
P Eastman
R Yokota
S Pronk
S Páll
U Essmann
W Humphrey
WM Brown
Y Andoh
Y Sugita
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2015

GROMACS is a widely used package for biomolecular simulation, and over the last two decades it has evolved from small-scale efficiency to advanced heterogeneous acceleration and multi-level parallelism targeting some of the largest supercomputers in the world. Here, we describe some of the ways we have been able to realize this through the use of parallelization on all levels, combined with a constant focus on absolute performance. Release 4.6 of GROMACS uses SIMD acceleration on a wide range of architectures, GPU offloading acceleration, and both OpenMP and MPI parallelism within and between nodes, respectively. The recent work on acceleration made it necessary to revisit the fundamental algorithms of molecular simulation, including the concept of neighbor searching, and we discuss the present and future challenges we see for exascale simulation, in particular a very fine-grained task parallelism. We also discuss the software management, code peer review and continuous integration testing required for a project of this complexity. Comment: EASC 2014 conference proceedin

arXiv.org e-Print Archive

Publikationer från KTH

Crossref

Digitala Vetenskapliga Arkivet - Academic Archive On-line

MPG.PuRe