This paper presents a method that exploits powerful graphics hardware to perform fully affine-invariant image feature detection and matching. The chosen approach is the ASIFT algorithm, which is accurate but very computationally expensive. We have created a CUDA version of this algorithm that is up to 70 times faster than the original implementation, while keeping its accuracy close to that of the original ASIFT. Its matching performance is therefore much better than that of algorithms that are not fully affine-invariant. The approach was also adapted to the multi-GPU paradigm in order to assess the acceleration potential of modern GPU clusters.