Batched Second-Order Adjoint Sensitivity for Reduced Space Methods
This paper presents an efficient method for extracting second-order
sensitivities from a system of implicit nonlinear equations on upcoming
computer systems dominated by graphics processing units (GPUs). We design a custom
automatic differentiation (AutoDiff) backend that targets highly parallel
architectures by extracting the second-order information in batch. When the
nonlinear equations are associated with a reduced space optimization problem, we
leverage the parallel reverse-mode accumulation in a batched adjoint-adjoint
algorithm to efficiently compute the reduced Hessian of the problem. We apply
the method to extract the reduced Hessian associated with the balance equations
of a power network, and show on the largest instances that a parallel GPU
implementation is 30 times faster than a sequential CPU reference based on
UMFPACK.
Comment: SIAM-PP2
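To make the adjoint-adjoint idea in the abstract concrete, here is a minimal pure-Python sketch on a toy problem. It is not the paper's AutoDiff backend or its power-network model: the objective, the constraint `g(x, u) = x + x**3 - B u`, the data, and all function names are assumptions chosen so that the state Jacobian is diagonal and every linear solve reduces to an elementwise division. The loop over directions stands in for the batch that a GPU implementation would process in parallel; the state, the first-order adjoint, and the Jacobian diagonal are computed once and shared by all directions.

```python
# Toy reduced-space problem (hypothetical, for illustration only):
#   minimize   f(x, u) = 0.5*||x - t||^2 + 0.5*alpha*||u||^2
#   subject to g(x, u) = x + x**3 - B u = 0   (cubic applied elementwise)
# Here g_x = diag(1 + 3 x_i^2) and g_u = -B, so solves with g_x are divisions.

def matvec(B, u):
    """Return B u for a dense matrix stored as a list of rows."""
    return [sum(bij * uj for bij, uj in zip(row, u)) for row in B]

def rmatvec(B, w):
    """Return B^T w."""
    return [sum(B[i][j] * w[i] for i in range(len(B))) for j in range(len(B[0]))]

def solve_state(B, u, iters=60):
    """Solve x_i + x_i^3 = (B u)_i by Newton's method, one component at a time."""
    x = []
    for b in matvec(B, u):
        xi = 0.0
        for _ in range(iters):
            xi -= (xi + xi**3 - b) / (1.0 + 3.0 * xi**2)
        x.append(xi)
    return x

def first_order_adjoint(B, u, t, alpha):
    """Return state x, diagonal d of g_x, adjoint lam, and the reduced gradient."""
    x = solve_state(B, u)
    d = [1.0 + 3.0 * xi**2 for xi in x]
    # Adjoint equation: g_x^T lam = -grad_x f
    lam = [-(xi - ti) / di for xi, ti, di in zip(x, t, d)]
    # Reduced gradient: grad_u f + g_u^T lam, with g_u = -B
    grad = [alpha * uj - s for uj, s in zip(u, rmatvec(B, lam))]
    return x, d, lam, grad

def reduced_hessian_batch(B, u, t, alpha, directions):
    """Return [H v for v in directions], where H is the reduced Hessian."""
    x, d, lam, _ = first_order_adjoint(B, u, t, alpha)  # shared by the whole batch
    products = []
    for v in directions:
        # Tangent solve: g_x z = -g_u v = B v
        z = [bv / di for bv, di in zip(matvec(B, v), d)]
        # Second-order adjoint: g_x^T psi = -(L_xx z), with L_xx = diag(1 + 6 lam_i x_i)
        psi = [-(1.0 + 6.0 * li * xi) * zi / di
               for li, xi, zi, di in zip(lam, x, z, d)]
        # H v = L_uu v + g_u^T psi = alpha v - B^T psi
        products.append([alpha * vj - s for vj, s in zip(v, rmatvec(B, psi))])
    return products

# Example data (hypothetical): 3 states, 2 controls.
B = [[1.0, 0.5], [-0.3, 2.0], [0.7, -1.0]]
u, t, alpha = [0.4, -0.2], [0.1, 0.2, -0.3], 0.1
# Using the unit vectors as directions returns the full 2x2 reduced Hessian.
H = reduced_hessian_batch(B, u, t, alpha, [[1.0, 0.0], [0.0, 1.0]])
```

Each direction costs one tangent solve and one second-order adjoint solve against the same Jacobian, which is why the products batch well. In a realistic setting the elementwise divisions would be replaced by solves that reuse a single sparse factorization of the state Jacobian.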
pszufe/OSMToolset.jl: v0.1.2
<h2>OSMToolset v0.1.2</h2>
<p><a href="https://github.com/pszufe/OSMToolset.jl/compare/v0.1.1...v0.1.2">Diff since v0.1.1</a></p>
pszufe/OSMToolset.jl: v0.1.1
<h2>OSMToolset v0.1.1</h2>
<p><a href="https://github.com/pszufe/OSMToolset.jl/compare/v0.1.0...v0.1.1">Diff since v0.1.0</a></p>
alan-turing-institute/MLJ.jl: v0.19.4
MLJ v0.19.4
<p><a href="https://github.com/alan-turing-institute/MLJ.jl/compare/v0.19.3...v0.19.4">Diff since v0.19.3</a></p>
<p><strong>Merged pull requests:</strong></p>
<ul>
<li>Updating MLJBase.jl dep to last version (#1037) (@pebeto)</li>
</ul>
JuliaGPU/GPUCompiler.jl: v0.24.2
GPUCompiler v0.24.2
Diff since v0.24.0
Merged pull requests:
Don't ICE on non-void return types. (#512) (@maleadt)
Improve error message on unsupported type usage. (#513) (@maleadt)
Verify IR after our own validation. (#514) (@maleadt)
JuliaGPU/GPUCompiler.jl: v0.24.1
GPUCompiler v0.24.1
Diff since v0.24.0
Merged pull requests:
Don't ICE on non-void return types. (#512) (@maleadt)
JuliaGPU/GPUCompiler.jl: v0.24.4
GPUCompiler v0.24.4
Diff since v0.24.3
Merged pull requests:
Metal: Emit name metadata. (#518) (@maleadt)
JuliaGPU/GPUCompiler.jl: v0.24.5
GPUCompiler v0.24.5
Diff since v0.24.4
Merged pull requests:
Only warn about missed transformations when debug output is enabled. (#521) (@maleadt)
JuliaGPU/AMDGPU.jl: v0.7.4
<h2>AMDGPU v0.7.4</h2>
<p><a href="https://github.com/JuliaGPU/AMDGPU.jl/compare/v0.7.3...v0.7.4">Diff since v0.7.3</a></p>
<p><strong>Merged pull requests:</strong></p>
<ul>
<li>Update preconditioners.jl (#533) (@amontoison)</li>
<li>[rocSPARSE] Interface the generic routines (#535) (@amontoison)</li>
<li>Defer freeing hostcall buffers &amp; add 1.10 CI (#538) (@pxl-th)</li>
<li>Have separate <code>free!</code> method for hostcalls (#539) (@pxl-th)</li>
<li>Switch to artifact device libraries if ROCm 5.5+ is detected (#540) (@pxl-th)</li>
<li>Fix artifact discovery in global project (#541) (@pxl-th)</li>
</ul>
<p><strong>Closed issues:</strong></p>
<ul>
<li>Investigate GPUArrays tests suite error (#515)</li>
<li>Multiple workers hang test suite on Julia 1.10 (#521)</li>
<li>[rocSPARSE] ILU(0) and IC(0) preconditioners are not working (#532)</li>
<li>Hostcall tests hang (#537)</li>
</ul>
JuliaGPU/GPUCompiler.jl: v0.24.3
GPUCompiler v0.24.3
Diff since v0.24.2
Merged pull requests:
Metal: Support vector-valued intrinsics. (#515) (@maleadt)
Disable the vectorizer pipeline. (#516) (@maleadt)
Metal: Avoid duplicate intrinsics. (#517) (@maleadt)