3 research outputs found

Two-level compact implicit schemes for three-dimensional parabolic problems.

Authors: Karaa Samir, Othman Mohamed
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009

We derive a class of two-level high-order implicit finite difference schemes for solving three-dimensional parabolic problems with mixed derivatives. The schemes are fourth-order accurate in space and second- or lower-order accurate in time, depending on the choice of a weighted average parameter μ. Numerical results with μ=0.5 are presented to confirm the high accuracy of the derived scheme and to compare it with the standard second-order central difference scheme. It is shown that the improvement in accuracy does not come at a higher cost in computation and storage: the grid parameters can be chosen so that the present scheme requires less work and memory, and gives more accuracy, than the standard central difference scheme.
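The role of the weighted average parameter μ can be illustrated on a much simpler problem. The sketch below is not the paper's fourth-order compact 3D scheme: it assumes the 1D heat equation u_t = u_xx on (0, 1) with zero boundary values, standard second-order central differences in space, and the common convention in which μ weights the new time level (μ=0.5 is then Crank-Nicolson, μ=1 is backward Euler). The function names `weighted_matrices` and `heat_error` are hypothetical.

```python
import numpy as np

def weighted_matrices(n, r, mu):
    """Return (A, B) such that A @ u_new = B @ u_old for the weighted scheme.

    Assumed convention: mu weights the implicit (new) time level.
    r = dt / h**2 is the mesh ratio.
    """
    # Second-difference matrix for n interior points, zero Dirichlet boundaries.
    D = -2.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    I = np.eye(n)
    A = I - mu * r * D
    B = I + (1.0 - mu) * r * D
    return A, B

def heat_error(n, steps, mu, t_end=0.1):
    """Max-norm error at t_end for the initial condition u(x, 0) = sin(pi x)."""
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1.0 - h, n)
    dt = t_end / steps
    A, B = weighted_matrices(n, dt / h**2, mu)
    u = np.sin(np.pi * x)
    for _ in range(steps):
        u = np.linalg.solve(A, B @ u)
    exact = np.exp(-np.pi**2 * t_end) * np.sin(np.pi * x)
    return float(np.max(np.abs(u - exact)))

if __name__ == "__main__":
    for mu in (0.5, 1.0):
        print(f"mu={mu}: max error = {heat_error(31, 40, mu):.2e}")
```

On the same grid, the μ=0.5 run should give a visibly smaller error than μ=1, which is the 1D analogue of the "second- or lower-order accurate in time" behaviour the abstract describes.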

Elsevier - Publisher Connector

Universiti Putra Malaysia Institutional Repository

A high-order convergent finite difference method for equity swaps and heterogeneous computing using OpenCL

Author: 추형석
Publication venue: Seoul National University Graduate School
Publication date: 01/08/2013

Thesis (Ph.D.) -- Seoul National University Graduate School: Interdisciplinary Program in Computational Science, August 2013. 신동우.

This dissertation proposes a fourth-order convergent finite difference method for the equity swap model. Because the equity swap model has coefficients that depend on time and space, a special coordinate transformation is considered in order to derive the fourth-order scheme. The transformation removes the cross derivative from the partial differential equation, and its convergence is verified through several examples. Since most linear solvers are built on BLAS routines, we studied how to parallelize BLAS across the CPU and GPU. This reduces to the question of how to distribute work between the CPU and GPU, and the split point can be expressed as a min-max problem over the computation time required on each computing resource. By predicting the time needed to compute a given BLAS routine on the CPU and GPU in the form of polynomial functions, we compare the min-max problem with actual computational results.

A nine-point compact finite difference scheme with fourth-order convergence is proposed for an equity swap model. To derive a compact scheme for the equity swap model, the mixed derivative term must be removed so that the resulting scheme is both fourth-order convergent and compact. A suitable coordinate transformation is proposed that eliminates the mixed derivative term. The resulting algorithm is shown to be a fourth-order convergent scheme, and various examples confirm the validity of the proposed scheme. Since most linear solvers are built from basic linear algebra subroutines (BLAS), we optimize computational performance by distributing a subroutine between the CPU and GPU with some splitting ratio. We express this splitting ratio by means of a min-max problem concerning computational times. Computational times for both CPU and GPU are estimated as polynomial functions based on their capabilities. BLAS saxpy, sgemv, and sgemm are implemented in OpenCL, and we verified our min-max model against actual heterogeneous computing results.

Contents

I A Higher-Order Finite-Difference Scheme for Equity Swaps
  1 Introduction
    1.1 Previous Studies
    1.2 Equity Swaps
  2 Higher-order compact Finite difference scheme
    2.1 Seeking a higher-order scheme
    2.2 Coordinate transformation
    2.3 A nine-point compact scheme
  3 Stability analysis
  4 Numerical results
  5 Conclusions
II Heterogeneous Computing with OpenCL
  1 Introduction
  2 OpenCL
  3 Implementation Issues
    3.1 Concurrency in Heterogeneous Computing
    3.2 CPU Parking Protocol
  4 Modeling
    4.1 Performance Estimations
    4.2 PCI express Bandwidth
  5 Results
  6 Conclusions
A Hardware Parameters
Bibliography

Docto
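The min-max splitting idea in the abstract can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's model: it assumes each device's run time is a polynomial in the number of elements assigned to it, and it finds the CPU share by a simple grid search over the splitting ratio. The function name `best_split` and all coefficients are hypothetical; real coefficients would have to be fitted from timing measurements.

```python
import numpy as np

def best_split(cpu_coeffs, gpu_coeffs, n, samples=1001):
    """Grid-search the CPU share alpha in [0, 1] that minimizes
    max(t_cpu(alpha * n), t_gpu((1 - alpha) * n)).

    cpu_coeffs / gpu_coeffs are polynomial coefficients, highest degree
    first (numpy.polyval convention), mapping work size to time.
    Note: a constant term is charged even when a device gets no work.
    """
    alphas = np.linspace(0.0, 1.0, samples)
    t_cpu = np.polyval(cpu_coeffs, alphas * n)
    t_gpu = np.polyval(gpu_coeffs, (1.0 - alphas) * n)
    # Both devices run concurrently, so the slower one sets the total time.
    makespan = np.maximum(t_cpu, t_gpu)
    i = int(np.argmin(makespan))
    return float(alphas[i]), float(makespan[i])

if __name__ == "__main__":
    # Hypothetical linear models: the CPU costs 4 time units per element;
    # the GPU costs 1 unit per element plus a fixed overhead of 100 units
    # (for example, a stand-in for data transfer cost).
    alpha, t = best_split([4.0, 0.0], [1.0, 100.0], n=1000)
    print(f"CPU share = {alpha:.3f}, estimated time = {t:.1f}")
```

A grid search is used here for clarity; when both time models are monotone in the work size, the optimum sits where the two curves cross, so a root finder on t_cpu - t_gpu would be a natural alternative.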

SNU Open Repository and Archive