
Programming the LU Factorization for a Multicore System with Accelerators

By Jakub Kurzak, Piotr Luszczek, Mathieu Faverge and Jack Dongarra

Abstract

LU factorization with partial pivoting is a canonical numerical procedure and the main component of the High Performance LINPACK benchmark. This article presents an implementation of the algorithm for a hybrid shared-memory system with standard CPU cores and GPU accelerators. Performance in excess of one TeraFLOPS is achieved using four AMD Magny Cours CPUs and four NVIDIA Fermi GPUs.
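
For readers unfamiliar with the procedure named in the abstract, the following is a minimal, sequential sketch of LU factorization with partial (row) pivoting in plain Python. It is the textbook algorithm only, not the hybrid CPU/GPU implementation the article describes; the function name `lu_partial_pivot` and the packed return layout are illustrative assumptions.

```python
def lu_partial_pivot(a):
    """Factor a square matrix A so that P*A = L*U (textbook algorithm).

    Hypothetical helper for illustration; not the paper's implementation.
    Returns (lu, perm):
      lu   -- L (unit diagonal, stored below the diagonal) and U (on and
              above the diagonal) packed into one n-by-n list of lists
      perm -- perm[i] is the index of the original row now stored in row i
    """
    n = len(a)
    lu = [[float(x) for x in row] for row in a]  # work on a copy
    perm = list(range(n))

    for k in range(n):
        # Partial pivoting: pick the row with the largest magnitude
        # entry in column k, at or below the diagonal.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]

        # Eliminate below the pivot, storing the multipliers in place of
        # the zeros they create (these form the columns of L).
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            m = lu[i][k]
            for j in range(k + 1, n):
                lu[i][j] -= m * lu[k][j]

    return lu, perm


if __name__ == "__main__":
    lu, perm = lu_partial_pivot([[2, 1, 1], [4, 3, 3], [8, 7, 9]])
    print(perm)
    for row in lu:
        print(row)
```

Because the pivot is the largest entry in its column, every stored multiplier has magnitude at most 1, which is what keeps the elimination numerically well behaved in practice.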

Year: 2013
OAI identifier: oai:CiteSeerX.psu:10.1.1.352.7267
Provided by: CiteSeerX
Download PDF:
Sorry, we are unable to provide the full text but you may find it at the following location(s):
  • http://citeseerx.ist.psu.edu/v... (external link)
  • http://web.eecs.utk.edu/~luszc... (external link)
