A user configurable data acquisition and signal processing system for high-rate, high channel count applications by Salim, Arwa et al.
Strathprints Institutional Repository
Salim, Arwa and Crockett, Louise Helen and McLean, John and Milne, Peter (2012) A user
configurable data acquisition and signal processing system for high-rate, high channel count
applications. Fusion Engineering and Design. ISSN 0920-3796 (In Press)
Please cite this article in press as: A. Salim, et al., A user configurable data acquisition and signal processing system for high-rate, high channel count applications, Fusion Eng. Des. (2012), http://dx.doi.org/10.1016/j.fusengdes.2012.03.047
ARTICLE IN PRESS
G Model FUSION-6341; No. of Pages 4
Fusion Engineering and Design xxx (2012) xxx–xxx
Contents lists available at SciVerse ScienceDirect
Fusion Engineering and Design
journal homepage: www.elsevier.com/locate/fusengdes
A user configurable data acquisition and signal processing system for high-rate, high channel count applications
Arwa Salim a,∗, Louise Crockett a, John McLean b, Peter Milne b
a University of Strathclyde, Scotland, UK
b D-TACQ Solutions, Scotland, UK
Article info
Article history:
Available online xxx
Keywords:
Data acquisition
DSP
FPGAs
Reconfigurable
Abstract
Real-time signal processing in plasma fusion experiments is required for control and for data reduction as plasma pulse times grow longer. The development time and cost for these high-rate, multichannel signal processing systems can be significant. This paper proposes a new digital signal processing (DSP) platform for the data acquisition system that will allow users to easily customize real-time signal processing systems to meet their individual requirements.
The D-TACQ reconfigurable user in-line DSP (DRUID) system carries out the signal processing tasks in hardware co-processors (CPs) implemented in an FPGA, with an embedded microprocessor (μP) for control. In the fully developed platform, users will be able to choose co-processors from a library and configure programmable parameters through the μP to meet their requirements.
The DRUID system is implemented on a Spartan 6 FPGA, on the new rear transition module (RTM-T), a field upgrade to existing D-TACQ digitizers.
As proof of concept, a multiply-accumulate (MAC) co-processor has been developed, which can be configured as a digital chopper-integrator for long pulse magnetic fusion devices. The DRUID platform allows users to set options for the integrator, such as the number of masking samples. Results from the digital integrator are presented for a data acquisition system with 96 channels simultaneously acquiring data at 500 kSamples/s per channel.
© 2012 Elsevier B.V. All rights reserved.
1. Introduction
Plasma fusion experiments contain a large number of diagnostics, requiring high channel count data acquisition systems. As the experiments are run for longer, the amount of data acquired increases beyond practical real-time storage limits. Processing this data while it is being acquired allows real-time control, and can lead to significant reductions in data storage requirements, allowing larger parts of the experiment to be diagnosed.
This real-time signal processing is usually carried out in host control central processing units (CPUs) or in field programmable gate arrays (FPGAs) on the digitizers. Carrying out the real-time processing in FPGAs, which are integrated into the data acquisition hardware, has the advantage of eliminating the additional latency involved in transferring data over host bus adaptors to a CPU. The parallel architecture of FPGAs is well suited to implementing digital signal processing (DSP) algorithms, which are inherently computationally parallel, resulting in efficient and high speed implementations. These characteristics have led to many DSP systems being developed in FPGAs, e.g. the coupler protection system on Alcator C-Mod [1].
∗ Corresponding author. Tel.: +44 7999642291; fax: +44 8724379777.
E-mail addresses: arwa.salim@eee.strath.ac.uk, arwa.salim@d-tacq.co.uk (A. Salim).
Although FPGAs are “field programmable” in the sense that their functionality can be defined (and redefined) after manufacture, their flexibility at run-time, or dynamic reconfigurability, is limited compared to microprocessors. From an end user’s perspective, this could be a disadvantage: the signal processing performed on the acquired data cannot readily be altered. The time and cost of re-designing FPGAs as application requirements evolve can be significant.
This paper describes the development of a new platform that will give end users of data acquisition DSP systems the ability to dynamically reconfigure the systems, without requiring the specialist engineering knowledge or time needed to re-design an FPGA. The DRUID system will allow users to ‘build’ customized signal processing systems, optimized for high channel counts, through simple software routines.
The architecture of the DRUID system is described in the next section. Section 3 describes the digital integrator that has been developed as a proof of concept of the architecture, and Section 4 concludes and looks forward to future developments.
Fig. 1. (a) The DRUID system architecture and (b) co-processor architecture.
2. The DRUID architecture
The DRUID system consists of an embedded microprocessor used for configuration and control, and hardware co-processors to perform the signal processing tasks. Fig. 1(a) illustrates the architecture with its separate control and data interconnects. Such “coarse-grain” reconfigurable architectures have been proposed before [2]. However, with modern FPGAs providing embedded arithmetic slices optimized for DSP operations, the DRUID system can be implemented in an FPGA. This allows the system to exploit all of the advantages of FPGAs (cost, speed, HW reconfigurability and parallel processing) while providing microprocessor-like flexibility, for the subset of signal processing functions commonly used in plasma fusion diagnostics.
The DRUID microprocessor (DP) is an embedded Xilinx MicroBlaze inside the FPGA. The DP is responsible for configuring the co-processors and sending instructions over the control interconnect. The data interconnect allows data transfers in any direction between the co-processors.
Fig. 1(b) illustrates the structure of the co-processors. Each co-processor has been optimized to perform a single processing function ‘simultaneously’ on a single sample data vector. Data in the DRUID system is processed on a sample-by-sample basis, where each sample consists of data from 96 channels. The co-processors can receive instructions and update their status after processing each sample. This approach allows great flexibility in implementing a DSP algorithm.
The DRUID microprocessor is not involved in the actual data transfer between co-processors. It simply manages the flow of data. This allows continuous execution of the DSP algorithms, with the DP sending instructions to the co-processors as required. The inline processing architecture avoids traditional real-time operating system difficulties, resulting in deterministic data flow.
To reconfigure the system, users simply need to modify the software running on the DRUID microprocessor. This will be done through a second, supervisor microprocessor (SP) via a shared memory interface. The SP will implement an operating system to provide a user interface, and will be capable of generating a DRUID system reset. On reset the DP will execute the new code from the shared memory. Options for implementing the SP (embedded or external) and the shared memory interface are still being investigated. Through this architecture, operating parameters can be modified, or a completely different system can be implemented by activating different co-processors, without the time, effort and complexity of re-designing an FPGA.
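The split between control and data paths described in this section can be illustrated with a small software model. This is only a sketch under stated assumptions: the class names `Gain` and `Accumulator`, the `configure`/`process` methods and the `run` helper are hypothetical and are not part of the DRUID platform. The only details taken from the text are the 96-channel sample vector, the sample-by-sample processing, and a controller that can issue instructions between samples without touching the data itself.

```python
N_CHANNELS = 96  # one sample vector holds one value per channel (from the text)


class Gain:
    """Hypothetical co-processor: multiplies every channel by one gain."""

    def __init__(self, gain=1):
        self.gain = gain
        self.samples_done = 0  # status that a controller could read back

    def configure(self, gain):
        # Control path: a parameter written by the controlling processor.
        self.gain = gain

    def process(self, sample):
        # Data path: the same operation is applied to all channels.
        self.samples_done += 1
        return [self.gain * x for x in sample]


class Accumulator:
    """Hypothetical co-processor: running sum per channel."""

    def __init__(self):
        self.total = [0] * N_CHANNELS
        self.samples_done = 0

    def process(self, sample):
        self.samples_done += 1
        self.total = [t + x for t, x in zip(self.total, sample)]
        return list(self.total)


def run(chain, samples, controller=None):
    """Push each sample vector through the chain of co-processors.

    The controller (standing in for the microprocessor) is called once per
    sample and may reconfigure co-processors, but never handles the data.
    """
    outputs = []
    for n, sample in enumerate(samples):
        if controller is not None:
            controller(n, chain)
        for cp in chain:
            sample = cp.process(sample)
        outputs.append(sample)
    return outputs


def controller(n, chain):
    # Illustrative instruction: invert the gain from the third sample on.
    if n == 2:
        chain[0].configure(gain=-1)


gain, acc = Gain(1), Accumulator()
samples = [[1] * N_CHANNELS for _ in range(4)]
out = run([gain, acc], samples, controller)
print(out[1][0], out[-1][0])  # running sum on channel 0 after 2 and 4 samples
```

In this model, changing the behaviour of the chain means changing only the `controller` function, which mirrors the idea that the system is reconfigured by modifying microprocessor software rather than the FPGA design.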
3. Digital integrator
3.1. Principle
The first application that has been developed for the DRUID system is a digital integrator for long pulse magnetic fusion experiments. It is based on the chopper integrator principle proposed in [3]. The magnetic diagnostics used in fusion experiments require continual, real-time integration of the measured signal to give magnetic flux readings, which may then be used for real-time control. Offsets induced in the measured signal by the first amplifying stage, when integrated, can have a significant effect on the readings. A ‘chopper-integrator’ [3,4], as illustrated in Fig. 2, is used to alleviate the offset problem.
The analogue modulation of the data is carried out by switching the polarity of the coils [3]. A problem with this method is that overshoots may appear in the signal at the switching points. Parts of the signal around these transitions may have to be discarded before integration.
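A short numerical example may help to show why chopping removes the offset. The sketch below is an idealized model, not the hardware described in this paper: it assumes a perfect ±1 chopper, a constant amplifier offset and no switching overshoot, and all names and values (`HALF_PERIOD`, `OFFSET`, the constant test signal) are chosen purely for illustration. With signal s, chopper c = ±1 and offset d, the amplifier output is c·s + d; demodulating gives s + c·d, and the c·d term sums to zero over whole chopper periods.

```python
def chopper(n, half_period):
    """Square wave of +1/-1 that flips every `half_period` samples."""
    return 1 if (n // half_period) % 2 == 0 else -1


HALF_PERIOD = 4   # samples per chopper half cycle (illustrative value)
N = 32            # a whole number of chopper periods
OFFSET = 5        # amplifier offset, arbitrary units

signal = [2] * N  # constant test input; its true integral is 2 * N = 64
chop = [chopper(n, HALF_PERIOD) for n in range(N)]

# Direct path: the offset is integrated together with the signal.
direct = sum(s + OFFSET for s in signal)

# Chopped path: modulate before the amplifier, demodulate after it.
measured = [c * s + OFFSET for c, s in zip(chop, signal)]
demodulated = [c * m for c, m in zip(chop, measured)]  # equals s + c * OFFSET
chopped = sum(demodulated)

print(direct, chopped)
```

The direct integral grows with the offset, while the chopped integral recovers the integral of the signal alone. In the demodulated sequence the offset appears as a square wave at the chopper frequency rather than as a constant.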
3.2. Implementation
The new DRUID platform has been implemented on a Xilinx Spartan 6 FPGA [6] on D-TACQ’s RTM-T module. As illustrated in Fig. 3, the data is digitized after amplification. The demodulation and integration of the data are carried out after digitization. Ref. [5] describes progress that has been made with a fixed implementation of such a digital integrator on D-TACQ’s ACQ196CPCI digitizers. A partial summation is carried out on the data between chopper transitions, discarding samples around the edges that may have become corrupted.
Fig. 2. The chopper-integrator principle [3–5].
Fig. 3. DRUID platform implementation hardware.
Fig. 4. Simplified algorithm for digital integrator operation.
This processing has now been moved to a multiply-accumulate (MAC) co-processor on the DRUID platform which is implemented on the RTM-T. The MAC unit can be configured through software to carry out the demodulation and the integration (accumulation) between chopper edges. The demodulation is carried out by feeding the same chopper signal into one of the digitizer’s digital inputs and embedding signatures into the data stream. The DP can decide how to deal with the data on a sample-by-sample basis: whether samples need to be discarded if they surround an edge, whether they need to be rectified before accumulation, etc. This allows users to control the number of samples that are thrown away and removes the difficulties with synchronizing a demodulation signal. The output frequency of the system can also be controlled. In the digital integrator implementation, piecewise integration is carried out between chopper edges and the accumulated result is output every time a new edge occurs.
A simplified algorithm in Fig. 4 illustrates how the integrator is implemented using sample-by-sample processing in the MAC co-processor.
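The sample-by-sample behaviour described above can be approximated in software as follows. This is a hedged sketch rather than a transcription of Fig. 4: the function name `digital_integrator`, its arguments and the test values are hypothetical, it handles a single channel for clarity, and it masks samples only after each edge (the text mentions samples surrounding an edge, which would also require discarding samples before it). The start of acquisition is treated as an edge for masking purposes.

```python
def digital_integrator(samples, chopper_bits, n_mask):
    """Return one partial sum per chopper edge for a single channel.

    samples      -- digitized values after the amplifier
    chopper_bits -- chopper state (0 or 1) recorded with each sample
    n_mask       -- number of samples discarded after each edge
    """
    outputs = []
    acc = 0
    since_edge = 0              # samples seen since the last edge
    prev = chopper_bits[0]
    for x, bit in zip(samples, chopper_bits):
        if bit != prev:         # new chopper edge detected
            outputs.append(acc)  # emit the accumulated result
            acc = 0
            since_edge = 0
            prev = bit
        if since_edge < n_mask:  # too close to an edge: discard
            since_edge += 1
            continue
        coeff = 1 if bit else -1  # rectify (demodulate) via the MAC coefficient
        acc += coeff * x
    return outputs


# Constant input of 3 units, chopped every 4 samples, with an offset of 1.
bits = [1, 1, 1, 1, 0, 0, 0, 0] * 2
measured = [(3 if b else -3) + 1 for b in bits]
partial_sums = digital_integrator(measured, bits, n_mask=1)
print(partial_sums)
```

Each pair of consecutive partial sums adds up to the signal integral over the unmasked samples, because the offset contributes with opposite sign in the two chopper phases. The last segment is not emitted since no further edge arrives.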
3.3. Resource utilization/statistics
Table 1 records the percentage of the total available resources utilized by the entire module (including the RTM-T communications and control interfaces), the DRUID platform and just the MAC co-processor. Block RAM utilization is high because block RAMs are used for the embedded microprocessor memory and because the architecture of the platform requires storage internal to the co-processors. Given the results obtained, there is clear scope in the FPGA to implement more co-processors.
Table 1
Resource utilization on the Spartan 6 (XC6SLX45T).
          Slice registers   Slice LUTs   RAM     DSP
Total     15%               37%          54%     1%
DRUID     6.4%              3.7%         34.5%   1%
MAC CP    1.9%              0.6%         3.5%    1%
Fig. 5. Chopped sine wave input to the digitizer.
The latency through the data acquisition system, measured from the analogue-to-digital conversion all the way through to transfer over PCIe (peripheral component interconnect express) on cable to a host computer, is less than 10 μs. Additional latency due to the DRUID platform will vary and depend upon the type of signal processing carried out. For the digital integrator operating in streaming mode, this additional latency was measured to be 1.2 μs.
3.4. Results
The DRUID-MAC system has been tested on the D-TACQ
ACQ196-RTM-T hardware [7] with simultaneous data acquisition
on 96 channels sampled at 500 kSamples/s per channel. For clarity,
the results in the next section only show a single channel.
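For scale, the acquisition figures quoted above imply the following sample budget. The calculation uses only the channel count and sample rate stated in the text; the variable names are illustrative.

```python
N_CHANNELS = 96           # channels acquired simultaneously (from the text)
SAMPLE_RATE_HZ = 500_000  # samples per second per channel (from the text)

# Time available to handle one 96-channel sample vector, in microseconds.
sample_period_us = 1_000_000 / SAMPLE_RATE_HZ

# Total number of channel samples entering the system each second.
aggregate_rate = N_CHANNELS * SAMPLE_RATE_HZ

print(sample_period_us, aggregate_rate)
```

A new 96-channel sample vector therefore arrives every 2 μs, for an aggregate input of 48 million samples per second.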
Fig. 5 shows the chopped sine wave input to the digitizers, along with the kind of signal corruption that may be caused by the switching at the chopper transitions. A DC offset can also be seen in the signal.
Fig. 6 demonstrates how the software routine can be modified to zero out the corrupted samples.
Fig. 6. Corrupted samples zeroed out.
Fig. 7. Piecewise integration.
Fig. 8. Final summation carried out in software.
Fig. 7 shows the piecewise integrated output. Compare this with Fig. 5. The DC offset has been removed: it shows up as high frequency ‘noise’ on the required signal, which can be filtered out. This signal has been output at twice the chopper frequency. Integration over many samples significantly reduces the output data rate. Piecewise integration is carried out to avoid overflow problems; the final summation can be carried out in software on a host computer.
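The host-side step can be sketched as a running sum over the partial results. This is an illustration under stated assumptions, not the authors' software: the function names, the example partial sums and the 1 kHz chopper frequency are hypothetical, and scaling to physical units is reduced to a single `sample_period` factor.

```python
from itertools import accumulate


def final_summation(partial_sums, sample_period=1):
    """Running integral built from the per-edge partial sums."""
    return [sample_period * total for total in accumulate(partial_sums)]


def rate_reduction(sample_rate_hz, chopper_freq_hz):
    """Ratio of input to output rate when one result is emitted per edge.

    There are two edges per chopper period, so the output rate is twice
    the chopper frequency.
    """
    return sample_rate_hz / (2 * chopper_freq_hz)


partial_sums = [12, 6, 12, 6]          # illustrative piecewise results
running = final_summation(partial_sums)
reduction = rate_reduction(500_000, 1_000)  # 1 kHz chopper is an assumption
print(running, reduction)
```

Each partial sum spans only one chopper half period, so a fixed-width hardware accumulator has a bounded range to cover, while the unbounded total is kept on the host, where wide or arbitrary-precision arithmetic is inexpensive.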
Fig. 8 shows the result after the final summation has been carried out. For comparison, integration of the input signal without using a chopper has also been plotted on the same graph. The result illustrates the significant effect that offsets can have on an integrated signal, and the effectiveness of the chopper integrator in alleviating the problem. Because additional information, such as the measured chopper duty cycle ratio, can be sent out by the DRUID system with each output sample, the software can compensate for any analogue signal distortions.
4. Conclusion
The DRUID system is being developed to provide a powerful digital signal processing platform for plasma fusion and other high energy physics experiments. This paper has described the architecture of the system and demonstrated a working prototype for the digital integrator application. Users can change the functionality of the system in a microprocessor-like way, while fast, simultaneous processing is carried out on a very large number of input signals, utilizing the parallel processing capabilities of the FPGA. With the addition of more CPs, users will be able to combine multiple processing units to implement more complex algorithms, which can be updated or modified through software as experiment requirements evolve.
References
[1] W. Burke, D.R. Terry, H. Kennedy, J. Stillerman, J. McLean, P. Milne, The coupler protection system upgrade for lower hybrid current drive on Alcator C-Mod, in: 22nd Symposium on Fusion Engineering, 2007, pp. 1–4.
[2] A. Abnous, J. Rabaey, Ultra-low-power domain-specific multimedia processors, in: Workshop on VLSI Signal Processing IX, 1996, pp. 461–470.
[3] A. Werner, W7-X magnetic diagnostics: performance of the digital integrator, Review of Scientific Instruments 77 (10E307) (2006) 1–5.
[4] C.C. Enz, G.C. Temes, Circuit techniques for reducing the effects of op-amp imperfections: autozeroing, correlated double sampling, and chopper stabilization, Proceedings of the IEEE 84 (11) (1996) 1584–1614.
[5] S.H. Seo, A. Werner, M. Marquardt, Development of a digital integrator for the KSTAR device, Review of Scientific Instruments 81 (123507) (2010) 1–4.
[6] Xilinx, http://www.xilinx.com/products/silicon-devices/fpga/spartan-6/lxt.htm.
[7] D-TACQ Solutions, http://www.d-tacq.com/rtm-t.shtml.
