We present a novel parallel approach, parallel nearest neigh-
bor unit (PNNU), for finding the nearest member in a learned dictionary of high-dimensional features. This is a computation fundamental to machine learning and data analytics algorithms such as sparse coding for feature extraction. PNNU achieves high performance by using three techniques: (1) PNNU employs a novel fast table look up scheme to identify a small number of atoms as candidates from which the nearest neighbor of a query data vector can be found; (2) PNNU reduces computation
cost by working with candidate atoms of reduced dimensionality; and
(3) PNNU performs computations in parallel over multiple cores with
low inter-core communication overheads. Based on e cient computation via techniques (1) and (2), technique (3) attains further speed up
via parallel processing. We have implemented PNNU on multi-core ma-
chines. We demonstrate its superior performance on three application
tasks in signal processing and computer vision. For an action recognition task, PNNU achieves 41x overall performance gains on a 16-core compute server against a conventional serial implementation of nearest
neighbor computation. Our PNNU software is available online as open source.Funded by Naval Postgraduate SchoolIntel CorporationAgreement no. N00244-15-0050 (NPS