59,549 research outputs found

FLASH: Randomized Algorithms Accelerated over CPU-GPU for Ultra-High Dimensional Similarity Search

Author: Andoni A.
Broder A. Z.
Li P.
Lv Q.
Shrivastava A.
Weber R.
Publication date: 03/07/2018

We present FLASH (Fast LSH Algorithm for Similarity search accelerated with HPC), a similarity search system for ultra-high dimensional datasets on a single machine that does not require similarity computations and is tailored for high-performance computing platforms. By leveraging an LSH-style randomized indexing procedure and combining it with several principled techniques, such as reservoir sampling, recent advances in one-pass minwise hashing, and count-based estimations, we reduce the computational and parallelization costs of similarity search while retaining sound theoretical guarantees. We evaluate FLASH on several real, high-dimensional datasets from different domains, including text, malicious URL, click-through prediction, and social networks. Our experiments shed new light on the difficulties associated with datasets having several million dimensions. Current state-of-the-art implementations either fail at the presented scale or are orders of magnitude slower than FLASH. FLASH is capable of computing an approximate k-NN graph, from scratch, over the full webspam dataset (1.3 billion nonzeros) in less than 10 seconds. Computing a full k-NN graph in less than 10 seconds on the webspam dataset using brute force ($n^2D$) will require at least 20 teraflops. We provide CPU and GPU implementations of FLASH for replicability of our results.
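The combination of ideas named in this abstract (minwise hashing, reservoir-sampled buckets, and count-based ranking instead of similarity computations) can be illustrated with a small sketch. This is not the authors' implementation: the function and class names, the linear hash family, and all parameter values below are assumptions chosen for illustration only.

```python
import random
from collections import defaultdict

PRIME = 2_147_483_647  # assumed modulus for the illustrative hash family


def minhash_signature(tokens, seeds):
    """Return one min-hash value per (a, b) seed over a set of integer tokens."""
    return tuple(min((a * t + b) % PRIME for t in tokens) for a, b in seeds)


class ReservoirBucket:
    """Fixed-capacity bucket filled by reservoir sampling (Algorithm R)."""

    def __init__(self, capacity, rng):
        self.capacity = capacity
        self.rng = rng
        self.items = []
        self.seen = 0

    def add(self, item):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(item)
        else:
            # Keep the new item with probability capacity / seen.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = item


def build_index(docs, num_tables=8, hashes_per_table=2, capacity=4, seed=0):
    """Hash every document into one reservoir bucket per table."""
    rng = random.Random(seed)
    table_seeds = [
        [(rng.randrange(1, PRIME), rng.randrange(PRIME))
         for _ in range(hashes_per_table)]
        for _ in range(num_tables)
    ]
    tables = [defaultdict(lambda: ReservoirBucket(capacity, rng))
              for _ in range(num_tables)]
    for doc_id, tokens in docs.items():
        for seeds, table in zip(table_seeds, tables):
            table[minhash_signature(tokens, seeds)].add(doc_id)
    return table_seeds, tables


def query(tokens, table_seeds, tables, k=1):
    """Rank candidates by how many tables they collide in (no similarity math)."""
    counts = defaultdict(int)
    for seeds, table in zip(table_seeds, tables):
        bucket = table.get(minhash_signature(tokens, seeds))
        if bucket is not None:
            for doc_id in bucket.items:
                counts[doc_id] += 1
    return sorted(counts, key=lambda d: (-counts[d], d))[:k]
```

In this sketch the bucket capacity bounds memory per hash key regardless of how skewed the data is, and the query step only counts collisions, so no pairwise similarity is ever evaluated.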

arXiv.org e-Print Archive

Crossref

Application of graphics processing units to search pipelines for gravitational waves from coalescing binaries of compact objects

Author: Blair David
Cannon Kipp
Chung Shin Kee
Datta Amitava
Wen Linqing
Publication venue: 'AIP Publishing'
Publication date: 07/07/2010

We report a novel application of a graphics processing unit (GPU) for accelerating the search pipelines for gravitational waves from coalescing binaries of compact objects. A 16-fold overall speed-up has been achieved with an NVIDIA GeForce 8800 Ultra GPU card compared with one core of a 2.5 GHz Intel Q9300 central processing unit (CPU). We show that substantial improvements are possible and discuss the reduction in CPU count required for the detection of inspiral sources afforded by the use of GPUs.

Caltech Authors

Implementation Aspects of a Transmitted-Reference UWB Receiver

Author: Bevilacqua
Blazquez
Carbonelli
Cassioli
Chen
Choi
Colavolpe
Cramer
Durisi
Foerster
Franz
Helal
Ho
Hoctor
IEEE 802.15 WPAN Low rate Alternative PHY Task Group 4a (TG4a)
Kunisch
Lottici
Mielczarek
Newaskar
O'Donnell
Orndorff
Proakis
Qiu
Rabbachin
Saleh
Souilmi
Stoica
Walden
Weisenhorn
Zhang
Publication venue: Wiley
Publication date: 01/01/2005

In this paper, we discuss the design issues of an ultra-wideband (UWB) receiver targeting a single-chip CMOS implementation for low data-rate applications such as ad hoc wireless sensor networks. A non-coherent transmitted-reference (TR) receiver is chosen because of its low complexity compared to other architectures. After a brief recapitulation of the UWB fundamentals and a short discussion of the major differences between coherent and non-coherent receivers, we discuss issues, challenges, and possible design solutions. Several simulation results obtained by means of a behavioral model are presented, together with an analysis of the trade-off between performance and complexity in an integrated circuit implementation.
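As a rough illustration of why a transmitted-reference receiver is simple, the sketch below models the generic TR idea: each frame carries an unmodulated reference pulse followed, after a fixed delay, by a polarity-modulated data pulse, and the receiver multiplies the signal by a delayed copy of itself and integrates. The abstract does not describe the paper's behavioral model, so the function names, pulse shape, and frame layout here are assumptions for illustration.

```python
def tr_transmit(bits, pulse, delay, frame_len):
    """Build a TR signal: reference pulse at frame start, data pulse `delay` later.

    Assumes len(pulse) <= delay and delay + len(pulse) <= frame_len,
    so the two pulses in a frame never overlap.
    """
    signal = [0.0] * (frame_len * len(bits))
    for i, bit in enumerate(bits):
        start = i * frame_len
        sign = 1.0 if bit else -1.0
        for n, p in enumerate(pulse):
            signal[start + n] += p                 # unmodulated reference
            signal[start + delay + n] += sign * p  # polarity carries the bit
    return signal


def tr_receive(signal, delay, frame_len, pulse_len):
    """Autocorrelation receiver: correlate each frame with its delayed copy."""
    bits = []
    for start in range(0, len(signal), frame_len):
        acc = 0.0
        for n in range(pulse_len):
            acc += signal[start + n] * signal[start + delay + n]
        bits.append(1 if acc > 0 else 0)
    return bits
```

Because the reference pulse passes through the same channel as the data pulse, the correlation sign survives an unknown gain: scaling the whole signal by any nonzero constant leaves the decoded bits unchanged in this noise-free sketch, which is why no channel estimate or pulse template is needed.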

Crossref

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

GPU-based Iterative Cone Beam CT Reconstruction Using Tight Frame Regularization

Author: Bin Dong
Cai J
Cho S
Dong B
Gu X
Han G Liang Z You J
Hestenes M R
Jacobs F
Jia X
Li M
Men C
Men C H
Meyer Y
NVIDIA
Sharp G C
Shen Z W
Shen Z W Toh K C Yun S
Sidky E Y
Steve B Jiang
Tang J
Xu F
Xun Jia
Yan G R
Yifei Lou
Publication venue: 'IOP Publishing'
Publication date: 05/05/2011

X-ray imaging dose from serial cone-beam CT (CBCT) scans raises a clinical concern in most image-guided radiation therapy procedures. The goal of this paper is to develop a fast GPU-based algorithm to reconstruct high-quality CBCT images from undersampled and noisy projection data so as to lower the imaging dose. For this purpose, we have developed an iterative tight frame (TF) based CBCT reconstruction algorithm. The condition that a real CBCT image has a sparse representation under a TF basis is imposed in the iteration process as regularization of the solution. To speed up the computation, a multi-grid method is employed. Our GPU implementation has achieved high computational efficiency, and a CBCT image of resolution 512×512×70 can be reconstructed in ~5 min. We have tested our algorithm on a digital NCAT phantom and a physical Catphan phantom. We find that our TF-based algorithm is able to reconstruct CBCT images in the context of undersampling and low mAs levels. We have also quantitatively analyzed the reconstructed CBCT image quality in terms of modulation transfer function and contrast-to-noise ratio under various scanning conditions. The results confirm the high CBCT image quality obtained from our TF algorithm. Moreover, our algorithm has also been validated in a real clinical context using a head-and-neck patient case. Comparisons of the developed TF algorithm and the current state-of-the-art TV algorithm have also been made in the various cases studied in terms of reconstructed image quality and computational efficiency. Comment: 24 pages, 8 figures, accepted by Phys. Med. Bio

arXiv.org e-Print Archive

Crossref

Anode-Coupled Readout for Light Collection in Liquid Argon TPCs

Author: Bugel L.
Collin G. H.
Conrad J. M.
Moss Z.
Toups M.
Publication venue: 'IOP Publishing'
Publication date: 01/07/2015

This paper discusses a new method of signal readout from photon detectors in ultra-large, underground liquid argon time projection chambers. In this design, the signal from the light collection system is coupled via capacitive plates to the TPC wire planes. This signal is then read out using the same cabling and electronics as the charge information. This greatly benefits light collection: it eliminates the need for an independent readout, substantially reducing cost; it reduces the number of cables in the vapor region of the TPC that can produce impurities; and it cuts down on the number of feed-throughs in the cryostat wall that can cause heat leaks and potential points of failure. We present experimental results that demonstrate the sensitivity of a LArTPC wire plane to photon detector signals. We also simulate the effect of a 1 μs shaping time and a 2 MHz sampling rate on these signals in the presence of noise, and find that a single photoelectron timing resolution of ~30 ns can be achieved. Comment: 16 pages, 15 figures

arXiv.org e-Print Archive

DSpace@MIT