A GPU-Computing Approach to Solar Stokes Profile Inversion
We present a new computational approach to the inversion of solar
photospheric Stokes polarization profiles, under the Milne-Eddington model, for
vector magnetography. Our code, named GENESIS (GENEtic Stokes Inversion
Strategy), employs multi-threaded parallel-processing techniques to harness the
computing power of graphics processing units (GPUs), along with algorithms
designed to exploit the inherent parallelism of the Stokes inversion problem.
Using a genetic algorithm (GA) engineered specifically for use with a GPU, we
produce full-disc maps of the photospheric vector magnetic field from polarized
spectral line observations recorded by the Synoptic Optical Long-term
Investigations of the Sun (SOLIS) Vector Spectromagnetograph (VSM) instrument.
We show the advantages of pairing a population-parallel genetic algorithm with
data-parallel GPU-computing techniques, and present an overview of the Stokes
inversion problem, including a description of our adaptation to the
GPU-computing paradigm. Full-disc vector magnetograms derived by this method
are shown, using SOLIS/VSM data observed on 2008 March 28 at 15:45 UT.
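The GENESIS source is not reproduced here, but the pairing of a population-parallel GA with data-parallel evaluation can be sketched in miniature. The snippet below fits a toy Gaussian absorption profile, a stand-in for the Milne-Eddington model; all parameter names, population sizes, and settings are hypothetical choices, not the paper's. The vectorized `fitness` call over the whole population is the step that maps naturally onto a GPU:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "observed" profile: a Gaussian absorption line with known parameters.
x = np.linspace(-2.0, 2.0, 64)
true_params = np.array([0.7, 0.5])   # (depth, width): toy stand-ins for ME parameters
observed = 1.0 - true_params[0] * np.exp(-(x / true_params[1]) ** 2)

def fitness(pop):
    """Vectorized chi^2 over the whole population: the data-parallel GPU step."""
    depth = pop[:, 0][:, None]
    width = pop[:, 1][:, None]
    model = 1.0 - depth * np.exp(-(x[None, :] / width) ** 2)
    return np.sum((model - observed) ** 2, axis=1)

pop = rng.uniform([0.1, 0.1], [1.0, 1.0], size=(256, 2))
for gen in range(200):
    chi2 = fitness(pop)
    order = np.argsort(chi2)
    parents = pop[order[:64]]                 # selection: keep the best quarter
    pick = rng.integers(0, 64, size=(256, 2))
    children = 0.5 * (parents[pick[:, 0]] + parents[pick[:, 1]])  # blend crossover
    children += rng.normal(0.0, 0.02, size=children.shape)        # mutation
    pop = np.clip(children, 0.05, 1.5)
    pop[0] = parents[0]                       # elitism: carry the best individual over

best = pop[np.argmin(fitness(pop))]
print(best)   # converges toward the true (depth, width) = (0.7, 0.5)
```

On a GPU the same structure applies: each individual's model synthesis and chi-square reduction run in separate threads, so the population size sets the degree of parallelism.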
Extremely large scale simulation of a Kardar-Parisi-Zhang model using graphics cards
The octahedron model introduced recently has been implemented on graphics
cards, permitting extremely large-scale simulations via binary lattice gases
and bit-coded algorithms. We confirm scaling behaviour belonging to the 2d
Kardar-Parisi-Zhang universality class and find a surface growth exponent:
beta=0.2415(15) on 2^17 x 2^17 systems, ruling out beta=1/4 suggested by field
theory. The maximum speed-up with respect to a single CPU is 240. The steady
state has been analysed by finite size scaling and a growth exponent
alpha=0.393(4) is found. Correction-to-scaling exponents are computed and the
power-spectrum density of the steady state is determined. We calculate the
universal scaling functions, cumulants and show that the limit distribution can
be obtained at the sizes considered. We provide numerical fits for the small-
and large-tail behaviour of the steady-state scaling function of the interface
width.
Comment: 7 pages, 8 figures, slightly modified, accepted version for PR
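The paper's 2d octahedron implementation is GPU-specific, but the bit-coding idea can be illustrated with its simpler 1+1-dimensional relative, the single-step model (also in the KPZ class). One machine word stores many slope variables, and a whole deposition sweep reduces to a few word-wide Boolean operations. The system size and update count below are illustrative:

```python
import random

random.seed(2)
L = 256                       # sites; one bit per slope variable
MASK = (1 << L) - 1

# Single-step growth model, bit-coded: bit i stores s_i = (h_{i+1} - h_i + 1)/2,
# so a flat surface is the alternating pattern 0101...
w = int("01" * (L // 2), 2)

def heights(word):
    h, out = 0, []
    for i in range(L):
        out.append(h)
        h += 1 if (word >> i) & 1 else -1
    return out

def width(word):
    h = heights(word)
    m = sum(h) / L
    return (sum((v - m) ** 2 for v in h) / L) ** 0.5

w_start = width(w)
for _ in range(2000):
    # Local height minima are exactly the (s_i, s_{i+1}) = (0, 1) pairs; they are
    # never adjacent, so a sweep is three word-wide bit operations.
    minima = ~w & (w >> 1) & MASK
    accept = random.getrandbits(L) & minima    # deposit at a random subset (p = 1/2)
    w = (w ^ (accept | (accept << 1))) & MASK  # flip each accepted pair (0,1) -> (1,0)

print(w_start, width(w))  # the interface roughens as the surface grows
```

Because (0,1) minima pairs are never adjacent, all accepted deposits in a sweep commute, which is exactly what makes the update data-parallel on graphics hardware.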
Fast Monte Carlo Simulation for Patient-specific CT/CBCT Imaging Dose Calculation
Recently, X-ray imaging dose from computed tomography (CT) or cone beam CT
(CBCT) scans has become a serious concern. Patient-specific imaging dose
calculation has been proposed for the purpose of dose management. While Monte
Carlo (MC) dose calculation can be quite accurate for this purpose, it suffers
from low computational efficiency. In response to this problem, we have
successfully developed a MC dose calculation package, gCTD, on GPU architecture
under the NVIDIA CUDA platform for fast and accurate estimation of the X-ray
imaging dose received by a patient during a CT or CBCT scan. Techniques have
been developed particularly for the GPU architecture to achieve high
computational efficiency. Dose calculations using CBCT scanning geometry in a
homogeneous water phantom and a heterogeneous Zubal head phantom have shown
good agreement between gCTD and EGSnrc, indicating the accuracy of our code. In
terms of improved efficiency, it is found that gCTD attains a speed-up of ~400
times in the homogeneous water phantom and ~76.6 times in the Zubal phantom
compared to EGSnrc. As for absolute computation time, imaging dose calculation
for the Zubal phantom can be accomplished in ~17 sec with an average relative
standard deviation of 0.4%. Though our gCTD code has been developed and tested
in the context of CBCT scans, with simple modification of geometry it can be
used for assessing imaging dose in CT scans as well.
Comment: 18 pages, 7 figures, and 1 table
Solving the Ghost-Gluon System of Yang-Mills Theory on GPUs
We solve the ghost-gluon system of Yang-Mills theory using Graphics
Processing Units (GPUs). Working in Landau gauge, we use the Dyson-Schwinger
formalism for the mathematical description as this approach is well-suited to
directly benefit from the computing power of the GPUs. With the help of a
Chebyshev expansion for the dressing functions and subsequent application of a
Newton-Raphson method, the non-linear system of coupled integral equations is
linearized. The resulting Newton matrix is generated in parallel using OpenMPI
and CUDA(TM). Our results show that it is possible to cut down the run time by
two orders of magnitude as compared to a sequential version of the code. This
makes the proposed techniques well-suited for Dyson-Schwinger calculations on
more complicated systems where the Yang-Mills sector of QCD serves as a
starting point. In addition, the computation of Schwinger functions using GPU
devices is studied.
Comment: 19 pages, 7 figures; additional figure added, dependence on
block-size investigated in more detail; version accepted by CP
Fast Calculation of the Lomb-Scargle Periodogram Using Graphics Processing Units
I introduce a new code for fast calculation of the Lomb-Scargle periodogram
that leverages the computing power of graphics processing units (GPUs). After
establishing a background to the newly emergent field of GPU computing, I
discuss the code design and narrate key parts of its source. Benchmarking
calculations indicate no significant differences in accuracy compared to an
equivalent CPU-based code. However, the differences in performance are
pronounced; running on a low-end GPU, the code can match 8 CPU cores, and on a
high-end GPU it is faster by a factor approaching thirty. Applications of the
code include analysis of long photometric time series obtained by ongoing
satellite missions and upcoming ground-based monitoring facilities; and
Monte-Carlo simulation of periodogram statistical properties.
Comment: Accepted by ApJ. Accompanying program source (updated since
acceptance) can be downloaded from
http://www.astro.wisc.edu/~townsend/resource/download/code/culsp.tar.g
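The culsp source is linked above; as a reference point, the classic Lomb-Scargle formula itself is compact enough to state as a vectorized sketch. The test signal, noise level, and frequency grid below are arbitrary illustrative choices:

```python
import numpy as np

def lomb_scargle(t, y, freqs):
    """Classic Lomb-Scargle periodogram, vectorized over trial frequencies."""
    y = y - y.mean()
    omega = 2.0 * np.pi * freqs[:, None]             # (n_freq, 1) against times (n_obs,)
    tau = np.arctan2(np.sin(2 * omega * t).sum(axis=1),
                     np.cos(2 * omega * t).sum(axis=1)) / (2.0 * omega[:, 0])
    arg = omega * t - (omega[:, 0] * tau)[:, None]
    c, s = np.cos(arg), np.sin(arg)
    return 0.5 * ((c @ y) ** 2 / (c * c).sum(axis=1) +
                  (s @ y) ** 2 / (s * s).sum(axis=1))

rng = np.random.default_rng(3)
t = np.sort(rng.uniform(0.0, 30.0, 200))             # irregular sampling times
y = np.sin(2 * np.pi * 0.77 * t) + 0.3 * rng.normal(size=t.size)
freqs = np.linspace(0.05, 2.0, 400)
power = lomb_scargle(t, y, freqs)
best = freqs[np.argmax(power)]
print(best)   # peak lies near the injected 0.77 cycles per unit time
```

The GPU version parallelizes over trial frequencies in the same spirit; the problem is embarrassingly parallel because each frequency's sums are independent.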
Effectiveness of the Public Relations Function in Customer Complaint Services at PDAM Tirta Satria
The demand for clean water keeps rising, while natural supplies of clean, healthy water are starting to run short. This has led part of the community to use a provider of clean, healthy, potable water, one of the operators managing clean-water services in the Banyumas region. This study aims to determine the effectiveness of the public relations function in handling customer complaints at PDAM Tirta Satria, the obstacles the public relations staff encounter in serving customers, and the efforts they make to overcome those obstacles.
The study uses a descriptive qualitative method. Data were collected through observation, interviews, and documentation. The subjects of the study were customers of PDAM Tirta Satria. Data were analysed using the Miles and Huberman approach, in three stages: data reduction, data display, and conclusion drawing and verification. Validity and reliability were established through triangulation.
The results show that the quality of service provided by PDAM cannot yet be called good. Complaints are seldom resolved on the day they are filed, because PDAM staff must still work through many procedures; PDAM Tirta Satria typically acts on a complaint only the day after it is received. To overcome these obstacles, the public relations staff at PDAM Tirta Satria rely on evaluation, public relations training, and comparative studies.
Solving Lattice QCD systems of equations using mixed precision solvers on GPUs
Modern graphics hardware is designed for highly parallel numerical tasks and
promises significant cost and performance benefits for many scientific
applications. One such application is lattice quantum chromodynamics (lattice
QCD), where the main computational challenge is to efficiently solve the
discretized Dirac equation in the presence of an SU(3) gauge field. Using
NVIDIA's CUDA platform we have implemented a Wilson-Dirac sparse matrix-vector
product that performs at up to 40 Gflops, 135 Gflops and 212 Gflops for double,
single and half precision respectively on NVIDIA's GeForce GTX 280 GPU. We have
developed a new mixed precision approach for Krylov solvers using reliable
updates which allows for full double precision accuracy while using only single
or half precision arithmetic for the bulk of the computation. The resulting
BiCGstab and CG solvers run in excess of 100 Gflops and, in terms of iterations
until convergence, perform better than the usual defect-correction approach for
mixed precision.
Comment: 30 pages, 7 figures
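The paper's reliable-update scheme is interleaved inside the Krylov iteration itself; the coarser defect-correction idea it improves upon (do the bulk of the arithmetic in low precision, but recompute the true residual in double precision) is easy to sketch. The matrix, sizes, and tolerances below are illustrative, not the lattice QCD system:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
A = rng.normal(size=(n, n))
A = A @ A.T + n * np.eye(n)        # well-conditioned SPD test system
b = rng.normal(size=n)

def cg32(A, b, tol, maxit=500):
    """Plain conjugate gradients run entirely in single precision."""
    A = A.astype(np.float32)
    b = b.astype(np.float32)
    x = np.zeros_like(b)
    r = b.copy()
    p = r.copy()
    rr = r @ r
    for _ in range(maxit):
        Ap = A @ p
        alpha = rr / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rr_new = r @ r
        if np.sqrt(rr_new) < tol:
            break
        p = r + (rr_new / rr) * p
        rr = rr_new
    return x.astype(np.float64)

# Outer loop: the true residual is recomputed in double precision, and the cheap
# single-precision solver only ever sees the correction equation A e = r.
x = np.zeros(n)
for _ in range(10):
    r = b - A @ x                          # double-precision residual
    if np.linalg.norm(r) < 1e-12 * np.linalg.norm(b):
        break
    x += cg32(A, r, tol=1e-4 * np.linalg.norm(r))

print(np.linalg.norm(b - A @ x) / np.linalg.norm(b))  # relative residual at double-precision level
```

Reliable updates push this further by refreshing the high-precision residual inside the Krylov iteration itself, which the abstract reports converges in fewer iterations than this outer-loop defect-correction variant.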
GAMER: a GPU-Accelerated Adaptive Mesh Refinement Code for Astrophysics
We present the newly developed code, GAMER (GPU-accelerated Adaptive MEsh
Refinement code), which has adopted a novel approach to improve the performance
of adaptive mesh refinement (AMR) astrophysical simulations by a large factor
with the use of the graphics processing unit (GPU). The AMR implementation is
based on a hierarchy of grid patches with an oct-tree data structure. We adopt
a three-dimensional relaxing TVD scheme for the hydrodynamic solver, and a
multi-level relaxation scheme for the Poisson solver. Both solvers have been
implemented on the GPU, by which hundreds of patches can be advanced in parallel.
The computational overhead associated with the data transfer between CPU and
GPU is carefully reduced by exploiting the GPU's capability for asynchronous
memory copies, and the time spent computing ghost-zone values for each patch
is hidden by overlapping it with the GPU computations. We demonstrate
the accuracy of the code by performing several standard test problems in
astrophysics. GAMER is a parallel code that can be run on a multi-GPU cluster
system. We measure the performance of the code by performing purely-baryonic
cosmological simulations in different hardware implementations, in which
detailed timing analyses provide comparison between the computations with and
without GPU(s) acceleration. Maximum speed-up factors of 12.19 and 10.47 are
demonstrated using 1 GPU with 4096^3 effective resolution and 16 GPUs with
8192^3 effective resolution, respectively.
Comment: 60 pages, 22 figures, 3 tables. More accuracy tests are included.
Accepted for publication in ApJ
GPU-based fast Monte Carlo simulation for radiotherapy dose calculation
Monte Carlo (MC) simulation is commonly considered to be the most accurate
dose calculation method in radiotherapy. However, its efficiency still requires
improvement for many routine clinical applications. In this paper, we present
our recent progress towards the development of a GPU-based MC dose calculation
package, gDPM v2.0. It utilizes the parallel computation ability of a GPU to
achieve high efficiency, while maintaining the same particle transport physics
as in the original DPM code and hence the same level of simulation accuracy. In
GPU computing, divergence of execution paths between threads can considerably
reduce the efficiency. Since photons and electrons undergo different physics
and hence follow different execution paths, we use a simulation scheme where
photon transport and electron transport are separated to partially relieve the
thread divergence issue. A high-performance random number generator and
hardware linear interpolation are also utilized. We have also developed various
components to handle fluence map and linac geometry, so that gDPM can be used
to compute dose distributions for realistic IMRT or VMAT treatment plans. Our
gDPM package is tested for its accuracy and efficiency in both phantoms and
realistic patient cases. In all cases, the average relative uncertainties are
less than 1%. A statistical t-test is performed and the dose difference between
the CPU and the GPU results is found not statistically significant in over 96%
of the high dose region and over 97% of the entire region. Speed-up factors of
69.1~87.2 have been observed using an NVIDIA Tesla C2050 GPU card against a
2.27 GHz Intel Xeon CPU. For realistic IMRT and VMAT plans, MC dose
calculation can be completed with less than 1% standard deviation in 36.1~39.6
sec using gDPM.
Comment: 18 pages, 5 figures, and 3 tables
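The separation of photon and electron transport can be illustrated without any real physics: particles of one type are processed in a uniform batch while secondaries of the other type are banked for a later phase, so no batch mixes code paths. The energies and the interaction rule below are entirely toy values, not gDPM's transport physics:

```python
import random

random.seed(5)

photon_queue = [10.0] * 1000      # photons, each with a toy energy in arbitrary units
electron_queue = []
deposited = 0.0

# Phase 1: photon transport only. Interactions hand part of the energy to
# secondary electrons, which are banked rather than simulated inline.
for e in photon_queue:
    while e > 0.1:
        frac = random.uniform(0.1, 0.9)   # toy "Compton" energy transfer
        electron_queue.append(e * frac)   # bank the secondary electron
        e *= (1.0 - frac)
    deposited += e                        # sub-threshold remainder absorbed locally

# Phase 2: electron transport only, again one uniform code path.
for e in electron_queue:
    deposited += e                        # toy electrons deposit all energy locally

print(deposited)   # energy bookkeeping: everything initially carried is deposited
```

On a GPU each phase maps to a kernel launch in which all threads execute the same branch structure, which is the divergence reduction the abstract describes.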