Abstract-Basecalling is a core function in DNA sequencing. It is responsible for the conversion of measured data to a text representation of the DNA's molecular make-up. Recent advances in sequencing machinery have greatly accelerated the rate at which DNA data can be gathered using miniaturized platforms. To keep up, the basecalling function requires substantial computing power. To ease this burden, we demonstrate an FPGA-based hardware accelerator for basecalling.
I. INTRODUCTION
Sequencers are specialized machines for the analysis of biological molecules such as DNA, RNA, and proteins. In the case of DNA, sequencers extract the unique succession of base molecules, adenine (A), cytosine (C), guanine (G), and thymine (T), that constitute the DNA under test. In essence, sequencers are molecule-to-text converters. The extracted text sequence identifies the so-called primary structure of the measured molecule.
DNA sequencing has made phenomenal strides since Frederick Sanger's methods [1], [2] boosted sequencing speed from identifying about 10 base pairs (bp) per year to 100 bp/day. Today, large sequencing machines (e.g. the Illumina HiSeq 1000) costing hundreds of thousands of dollars can achieve throughputs in excess of 10^6 bp/s, the equivalent of an entire human genome per hour.
Recent sequencing innovations have achieved further advances through a combination of unique sensors and more effective integration of semiconductor technologies [3], [4], [5], [6]. In particular, stunning reductions in certain aspects of hardware have been achieved with the MinION device [7] pictured in Fig. 1. This hand-sized device interfaces with DNA samples and obtains raw measurements of them, with the ensuing data processing and signal-to-text translation done by external computing. Even this small device can ideally process over 500 separate DNA samples in parallel and measure the equivalent of a human genome in 3.5 hours. Impressively, this machine essentially operates in a real-time, streaming fashion, allowing continuous introduction of new DNA samples.
Such miniaturizations make the prospect of widely available rapid genome analysis much more realistic. The potential for applications such as personalized health care, environmental monitoring, and industrial process control greatly increases as a result. However, the advances achieved at the front-end (sample preparation, sensing, signal conditioning, etc.) need to be matched by improvements on the back-end, the part of sequencing dealing with data handling, efficient computing, and algorithm design. It is this aspect of the emerging sequencing technology that is addressed by the proposed demo. Our technology is elaborated below.
II. DNA BASECALLING
Our proposal is to demonstrate a hardware-accelerated real-time DNA basecalling engine. Basecalling is a critical step within the overall sequencing pipeline, an example of which is summarized in Fig. 2. As shown, basecalling is step 5 within the back-end phase of the sequencing procedure and involves a signal-to-text conversion of DNA fragments. Current sample handling methods require such a fragmentation of the original, intact DNA sample. As a result, computational effort is needed after basecalling to re-assemble the text snippets into a contiguous primary structure of the starting molecule given in step 1.
Arguably, among these steps, basecalling presents the most critical computing challenge. In converting from measured data to text, the basecaller serves not only as a translator but also as an information compressor, converting scores of samples into 2-bit characters. Unless the basecalling throughput matches that of the front-end, the memory burden increases, as measured data must be stored until the basecaller is free to process it. This requires increased memory, higher power consumption (e.g. via large compute clusters), and better communications (e.g. to cloud-based solutions), and hence compromises the economics of widespread deployment of such molecular sensors.
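To make the "2-bit characters" remark concrete, the following C++ sketch packs a called base sequence at four bases per byte. The code assignment (A=0, C=1, G=2, T=3), the bit ordering, and the function names are illustrative assumptions, not details taken from the demonstrated system.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Map a base character to a 2-bit code.
// Assumption: A=0, C=1, G=2, T=3 (chosen for illustration only).
inline std::uint8_t base_to_code(char base) {
    switch (base) {
        case 'A': return 0;
        case 'C': return 1;
        case 'G': return 2;
        case 'T': return 3;
        default: throw std::invalid_argument("base must be one of A, C, G, T");
    }
}

// Pack a base string into bytes, four bases per byte.
// Assumption: the first base of each group occupies the two lowest bits.
inline std::vector<std::uint8_t> pack_bases(const std::string& bases) {
    std::vector<std::uint8_t> packed((bases.size() + 3) / 4, 0);
    for (std::size_t i = 0; i < bases.size(); ++i) {
        const unsigned shifted = static_cast<unsigned>(base_to_code(bases[i])) << (2 * (i % 4));
        packed[i / 4] = static_cast<std::uint8_t>(packed[i / 4] | shifted);
    }
    return packed;
}

// Recover the first `count` bases from a packed buffer.
inline std::string unpack_bases(const std::vector<std::uint8_t>& packed, std::size_t count) {
    assert(count <= packed.size() * 4);  // caller must not ask for more than was stored
    static const char kAlphabet[] = {'A', 'C', 'G', 'T'};
    std::string bases;
    bases.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        bases.push_back(kAlphabet[(packed[i / 4] >> (2 * (i % 4))) & 0x3]);
    }
    return bases;
}
```

Under these assumptions, `pack_bases("ACGT")` yields the single byte `0xE4`, and `unpack_bases(pack_bases(s), s.size())` returns `s` for any string over the four-letter alphabet.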
The translation of sequencing measurements into their text equivalent presents a formidable computing challenge. The incoming data is a high-dimensional representation of the original DNA that is corrupted and distorted by the intermediate sensing and sampling circuitry. The measured data contains long range dependencies and thus requires a basecaller capable of analyzing the input on a sub-sequence basis (as opposed to symbol-by-symbol translation). We address this issue by presenting a real-time basecaller whose computational power is accelerated by custom FPGA hardware.
III. PROPOSED DEMONSTRATION
The proposed system, a major component of the Master's thesis work of Mr. ZhongPan Wu, is now described. The work is carried out with the assistance of Dr. Karim Hammad under the supervision of Drs. Ghafar-Zadeh and Magierowski.
A. System Overview
We plan to demonstrate a hardware-accelerated real-time basecalling system as presented in Fig. 3(a). A DNA sensor, or its emulated equivalent, will be connected to a PC-based implementation of the basecalling system. Multiple channels of measurement data will be streamed from the sensor block over a serial measurement channel and demultiplexed within the host PC before being passed to the basecaller. The basecaller output, together with its metrics (e.g. basecalling quality score), will be displayed in real time on a monitor attached to the basecaller. The system will also be able to regulate the sensor (e.g. engaging and disengaging measurement channels) via a control channel, depending on the needs of the basecaller.
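The demultiplexing step described above might look like the following C++ sketch. The `Frame` record (a channel index paired with one sample) and the function name `demultiplex` are hypothetical; the actual framing used on the serial measurement channel is not specified here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One record on the serial link.
// Assumption: each record carries a channel index and a single sample.
struct Frame {
    std::uint16_t channel;
    std::int16_t sample;
};

// Split an interleaved stream into per-channel sample vectors,
// preserving the arrival order within each channel.
inline std::vector<std::vector<std::int16_t>> demultiplex(const std::vector<Frame>& stream,
                                                          std::size_t num_channels) {
    std::vector<std::vector<std::int16_t>> channels(num_channels);
    for (const Frame& frame : stream) {
        assert(frame.channel < num_channels);  // reject frames for unknown channels
        channels[frame.channel].push_back(frame.sample);
    }
    return channels;
}
```

Each resulting per-channel vector can then be handed to a separate basecalling instance, which is one plausible way for the host to keep the channels independent.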
The entire demo is expected to occupy an area of 6 ft × 4 ft and require at most two PCs and a modern DNA sequencer already in our possession. The acceleration hardware will be based on Xilinx Virtex-7 FPGA development boards also in our possession. Given that commodity standalone CPU systems achieve a throughput of roughly 1000 bp/s, we will be seeking to boost this by one to two orders of magnitude through hardware acceleration.
B. Acceleration Strategy
As indicated in Fig. 3(b), the basecaller itself will consist of a C++ program running on a PC host's CPU and working in conjunction with an external FPGA accelerator device. The basecaller implemented for this demonstration will be a sequential detector solving a hidden Markov model (HMM) of the measurement process. A simple API will be used to seamlessly offload computationally intensive tasks, such as posterior probability calculations, to the accelerator hardware.
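As an illustration of the kind of posterior probability calculation that could be offloaded, the C++ sketch below computes per-step state posteriors for a generic discrete-state HMM using the standard scaled forward-backward recursion. It is a software reference under stated assumptions, not the accelerator's implementation: the state space, the precomputed `likelihood` table, and the function name `posteriors` are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Posterior P(state_t = s | all observations) via scaled forward-backward.
//   initial[s]       : P(state_0 = s)
//   transition[r][s] : P(state_{t+1} = s | state_t = r)
//   likelihood[t][s] : P(observation_t | state_t = s), assumed precomputed
inline Matrix posteriors(const std::vector<double>& initial,
                         const Matrix& transition,
                         const Matrix& likelihood) {
    const std::size_t T = likelihood.size();
    const std::size_t S = initial.size();
    assert(T > 0 && S > 0 && transition.size() == S);

    Matrix alpha(T, std::vector<double>(S, 0.0));
    Matrix beta(T, std::vector<double>(S, 1.0));

    // Forward pass; each step is normalized to avoid numerical underflow.
    for (std::size_t t = 0; t < T; ++t) {
        double norm = 0.0;
        for (std::size_t s = 0; s < S; ++s) {
            double prior = 0.0;
            if (t == 0) {
                prior = initial[s];
            } else {
                for (std::size_t r = 0; r < S; ++r) prior += alpha[t - 1][r] * transition[r][s];
            }
            alpha[t][s] = prior * likelihood[t][s];
            norm += alpha[t][s];
        }
        assert(norm > 0.0);  // the observations must be possible under the model
        for (std::size_t s = 0; s < S; ++s) alpha[t][s] /= norm;
    }

    // Backward pass, normalized in the same way (runs t = T-2 down to 0).
    for (std::size_t t = T - 1; t-- > 0;) {
        double norm = 0.0;
        for (std::size_t r = 0; r < S; ++r) {
            double sum = 0.0;
            for (std::size_t s = 0; s < S; ++s) {
                sum += transition[r][s] * likelihood[t + 1][s] * beta[t + 1][s];
            }
            beta[t][r] = sum;
            norm += sum;
        }
        assert(norm > 0.0);
        for (std::size_t r = 0; r < S; ++r) beta[t][r] /= norm;
    }

    // Combine both passes and renormalize so each row sums to one.
    Matrix gamma(T, std::vector<double>(S, 0.0));
    for (std::size_t t = 0; t < T; ++t) {
        double norm = 0.0;
        for (std::size_t s = 0; s < S; ++s) {
            gamma[t][s] = alpha[t][s] * beta[t][s];
            norm += gamma[t][s];
        }
        for (std::size_t s = 0; s < S; ++s) gamma[t][s] /= norm;
    }
    return gamma;
}
```

The inner sums over states are multiply-accumulate loops with a fixed structure, which is what makes this kind of computation a natural candidate for offloading to dedicated hardware.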
C. Interface Details
As shown in Fig. 3(c), the physical connection between the CPU host and the FPGA accelerator will be achieved through the PC's root complex over the PCIe bus. Custom fixed-point hardware for acceleration of certain HMM calculations will be placed within the FPGA. Given the high-performance requirements of our application, we will be facilitating this interface via main memory through a custom scatter-gather direct memory access (DMA) engine with an accompanying driver. The FPGA's connection to the PCIe bus will be made using an 8-lane endpoint IP block already available from Xilinx. To minimize FPGA memory requirements, the communications protocol between the CPU and FPGA will be configured to efficiently transmit both data and model parameters.
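For readers unfamiliar with fixed-point arithmetic, the C++ sketch below shows how a fixed-point multiply can be modelled in software on the host side, for example to check results against the hardware. The Q16.16 format (16 integer bits, 16 fractional bits) and all names are assumptions made for illustration; the word lengths actually used in the FPGA are not stated here.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: Q16.16 format stored in a signed 32-bit word.
constexpr int kFracBits = 16;
using Fixed = std::int32_t;

// Convert a real value to fixed point, rounding to the nearest step.
inline Fixed to_fixed(double x) {
    assert(x > -32768.0 && x < 32767.0);  // must fit in the 16 integer bits
    const double scaled = x * static_cast<double>(1 << kFracBits);
    return static_cast<Fixed>(scaled + (x >= 0.0 ? 0.5 : -0.5));
}

// Convert a fixed-point value back to a real value.
inline double to_double(Fixed x) {
    return static_cast<double>(x) / static_cast<double>(1 << kFracBits);
}

// Multiply two Q16.16 values. The product is formed in 64 bits so that the
// intermediate cannot overflow, then shifted back to Q16.16 (low bits are
// discarded; right-shifting a negative value is arithmetic on common
// compilers and guaranteed to be so from C++20 onward).
inline Fixed fixed_mul(Fixed a, Fixed b) {
    const std::int64_t wide = static_cast<std::int64_t>(a) * static_cast<std::int64_t>(b);
    return static_cast<Fixed>(wide >> kFracBits);
}
```

With this format, `fixed_mul(to_fixed(0.5), to_fixed(0.25))` equals `to_fixed(0.125)` exactly, since all three values are representable without rounding.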
