    Higher-Order Convergent Finite Difference Method for Equity Swaps and Heterogeneous Computing Using OpenCL

    Thesis (Ph.D.) -- Seoul National University Graduate School: Interdisciplinary Program in Computational Science, August 2013. 신동우.

    A nine-point compact finite difference scheme with fourth-order convergence is proposed for an equity swap model. Because the coefficients of the model depend on both time and space, a special treatment is needed to remove the mixed derivative term so that the resulting scheme is both compact and fourth-order convergent. A suitable coordinate transformation is proposed that eliminates the mixed derivative term from the partial differential equation. The resulting algorithm is shown to be fourth-order convergent, and various examples confirm the validity of the proposed scheme. Since most linear solvers are built on the Basic Linear Algebra Subprograms (BLAS), we optimize computational performance by splitting each subroutine between the CPU and GPU according to a splitting ratio. This ratio is determined by a min-max problem on the computational times of the two devices. The computational times for both CPU and GPU are estimated as polynomial functions based on their capabilities.
    The BLAS routines saxpy, sgemv, and sgemm are implemented in OpenCL, and the min-max model is verified against measured heterogeneous computing results.

    I A Higher-Order Finite-Difference Scheme for Equity Swaps
        1 Introduction
            1.1 Previous Studies
            1.2 Equity Swaps
        2 Higher-order compact Finite difference scheme
            2.1 Seeking a higher-order scheme
            2.2 Coordinate transformation
            2.3 A nine-point compact scheme
        3 Stability analysis
        4 Numerical results
        5 Conclusions
    II Heterogeneous Computing with OpenCL
        1 Introduction
        2 OpenCL
        3 Implementation Issues
            3.1 Concurrency in Heterogeneous Computing
            3.2 CPU Parking Protocol
        4 Modeling
            4.1 Performance Estimations
            4.2 PCI express Bandwidth
        5 Results
        6 Conclusions
    A Hardware Parameters
    Bibliography
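The min-max splitting described in the abstract can be illustrated with a small sketch. It is not the thesis's implementation. It assumes that each device's time is a non-decreasing polynomial in the number of elements it receives, and that the total time is the maximum of the two device times. The function names `poly_time` and `split_ratio` and all coefficients are hypothetical and chosen only for illustration.

```python
from typing import Callable, Sequence


def poly_time(coeffs: Sequence[float]) -> Callable[[float], float]:
    """Return t(m) = c0 + c1*m + c2*m**2 + ... (evaluated with Horner's rule)."""
    def t(m: float) -> float:
        acc = 0.0
        for c in reversed(coeffs):
            acc = acc * m + c
        return acc
    return t


def split_ratio(t_cpu: Callable[[float], float],
                t_gpu: Callable[[float], float],
                n: float,
                tol: float = 1e-9) -> float:
    """Fraction r of the n elements sent to the GPU that minimizes
    max(t_cpu((1 - r) * n), t_gpu(r * n)).

    Assumption: both time models are non-decreasing, so the maximum is
    minimized where the two times are equal, or at an end of [0, 1].
    """
    def gap(r: float) -> float:
        # Non-decreasing in r: GPU time grows, CPU time shrinks.
        return t_gpu(r * n) - t_cpu((1.0 - r) * n)

    if gap(0.0) >= 0.0:   # GPU overhead alone exceeds the all-CPU time
        return 0.0
    if gap(1.0) <= 0.0:   # GPU finishes everything before the CPU would start to help
        return 1.0

    lo, hi = 0.0, 1.0     # bisection on the sign change of gap
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gap(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Made-up linear models, e.g. for a saxpy-like routine:
    # CPU: 4 ns per element; GPU: 100 us fixed cost (launch and transfer) + 1 ns per element.
    t_cpu = poly_time([0.0, 4e-9])
    t_gpu = poly_time([1e-4, 1e-9])
    r = split_ratio(t_cpu, t_gpu, n=1_000_000)
    print(f"GPU share: {r:.2f}")  # 0.78 for these coefficients
```

With these made-up coefficients both devices finish at about 0.88 ms when the GPU takes 78% of the elements. For a very small problem the fixed GPU cost dominates and the sketch assigns all work to the CPU.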