28 research outputs found

    Batched Second-Order Adjoint Sensitivity for Reduced Space Methods

    Full text link
    This paper presents an efficient method for extracting second-order sensitivities from a system of implicit nonlinear equations on upcoming computer systems dominated by graphics processing units (GPUs). We design a custom automatic differentiation (AutoDiff) backend that targets highly parallel architectures by extracting the second-order information in batch. When the nonlinear equations are associated with a reduced-space optimization problem, we leverage parallel reverse-mode accumulation in a batched adjoint-adjoint algorithm to efficiently compute the reduced Hessian of the problem. We apply the method to extract the reduced Hessian associated with the balance equations of a power network, and show on the largest instances that a parallel GPU implementation is 30 times faster than a sequential CPU reference based on UMFPACK. Comment: SIAM-PP2
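    The batched adjoint-adjoint idea can be pictured as forward-over-reverse automatic differentiation applied to many tangent directions at once. The sketch below is an illustrative analogue only, not the authors' custom GPU AutoDiff backend: it uses JAX, a toy reduced objective standing in for F(u) = f(x(u), u) after the implicit state solve, and vmap to batch the reduced Hessian-vector products.

```python
import jax
import jax.numpy as jnp

# Toy stand-in for the reduced objective F(u) = f(x(u), u); the real problem
# would first solve the nonlinear balance equations g(x, u) = 0 for x.
def reduced_objective(u):
    return jnp.sum(jnp.sin(u) * u ** 2)

grad_F = jax.grad(reduced_objective)  # reverse-mode adjoint: u -> grad F(u)

def hvp(u, v):
    # Forward-over-reverse: differentiate the adjoint along direction v,
    # yielding one reduced Hessian-vector product H(u) @ v.
    return jax.jvp(grad_F, (u,), (v,))[1]

# Batch many tangent directions in a single call; seeding with the identity
# assembles the full dense reduced Hessian one row per direction.
batched_hvp = jax.vmap(hvp, in_axes=(None, 0))

u = jnp.arange(1.0, 5.0)
V = jnp.eye(u.size)
H = batched_hvp(u, V)  # H[i] == Hessian(u) @ V[i]
print(H)
```

    Batching the tangent directions is what maps well to GPUs: each Hessian-vector product reuses the same adjoint computation, so evaluating them together amortizes the reverse sweep across the whole seed matrix.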

    pszufe/OSMToolset.jl: v0.1.2

    No full text
    OSMToolset v0.1.2
    Diff since v0.1.1: https://github.com/pszufe/OSMToolset.jl/compare/v0.1.1...v0.1.2

    pszufe/OSMToolset.jl: v0.1.1

    No full text
    OSMToolset v0.1.1
    Diff since v0.1.0: https://github.com/pszufe/OSMToolset.jl/compare/v0.1.0...v0.1.1

    alan-turing-institute/MLJ.jl: v0.19.4

    No full text
    MLJ v0.19.4
    Diff since v0.19.3: https://github.com/alan-turing-institute/MLJ.jl/compare/v0.19.3...v0.19.4
    Merged pull requests:
    - Updating MLJBase.jl dep to last version (#1037) (@pebeto)

    JuliaGPU/GPUCompiler.jl: v0.24.2

    No full text
    GPUCompiler v0.24.2
    Diff since v0.24.0
    Merged pull requests:
    - Don't ICE on non-void return types. (#512) (@maleadt)
    - Improve error message on unsupported type usage. (#513) (@maleadt)
    - Verify IR after our own validation. (#514) (@maleadt)

    JuliaGPU/GPUCompiler.jl: v0.24.5

    No full text
    GPUCompiler v0.24.5
    Diff since v0.24.4
    Merged pull requests:
    - Only warn about missed transformations when debug output is enabled. (#521) (@maleadt)

    JuliaGPU/AMDGPU.jl: v0.7.4

    No full text
    AMDGPU v0.7.4
    Diff since v0.7.3: https://github.com/JuliaGPU/AMDGPU.jl/compare/v0.7.3...v0.7.4
    Merged pull requests:
    - Update preconditioners.jl (#533) (@amontoison)
    - [rocSPARSE] Interface the generic routines (#535) (@amontoison)
    - Defer freeing hostcall buffers & add 1.10 CI (#538) (@pxl-th)
    - Have separate free! method for hostcalls (#539) (@pxl-th)
    - Switch to artifact device libraries if ROCm 5.5+ is detected (#540) (@pxl-th)
    - Fix artifact discovery in global project (#541) (@pxl-th)
    Closed issues:
    - Investigate GPUArrays tests suite error (#515)
    - Multiple workers hang test suite on Julia 1.10 (#521)
    - [rocSPARSE] ILU(0) and IC(0) preconditioners are not working (#532)
    - Hostcall tests hang (#537)

    JuliaGPU/GPUCompiler.jl: v0.24.3

    No full text
    GPUCompiler v0.24.3
    Diff since v0.24.2
    Merged pull requests:
    - Metal: Support vector-valued intrinsics. (#515) (@maleadt)
    - Disable the vectorizer pipeline. (#516) (@maleadt)
    - Metal: Avoid duplicate intrinsics. (#517) (@maleadt)