Deployment and future prospects of high performance diagnostics featuring serial I/O (SIO) data acquisition (DAQ) at ASDEX Upgrade by Behler, K. et al.
Deployment and Future Prospects of High Performance Diagnostics featuring 
Serial I/O (SIO) Data Acquisition (DAQ) at ASDEX Upgrade
K. Behler1, H. Blank1, H. Eixenberger1, M. Fitzek2, A. Lohs1, K. Lüddecke2, R. Merkel1,
ASDEX Upgrade Team
1Max-Planck-Institut für Plasmaphysik, Boltzmannstr. 2, D-85748 Garching bei München
2Unlimited Computer Systems GmbH, Seeshaupterstr. 15, D-82393 Iffeldorf
Abstract
The SIO DAQ concept used at the ASDEX Upgrade fusion experiment features data acquisition  
from a modular front-end (a modular crate-and-interface-cards concept for analog and digital  
input and output) over standardized serial lines and via a serial input/output computer interface  
card (the SIO card) in real-time directly into the main memory of a host computer.
Deployment of a series of diagnostics  using SIO led to various solutions and configurations for the  
different requirements. Experience has been gained and lessons learned applying the SIO concept  
at its technical limits. Requirements for a further development of the SIO concept have been  
identified, and a performance improvement by a factor of 4-8 beyond its current limits seems  
achievable. An effort has been started to develop a SIO version 2 (SIO II) featuring upgraded serial  
links and a more powerful FPGA for merging and forwarding data streams to host computer  
memory. (Compatibility with the existing SIO (SIO I) front-end system has to be maintained.)
This paper presents results achieved and experiences gained in the deployment of SIO I, the status  
of SIO II development (currently in the prototype phase), and projected enhancements and updates  
to existing implementations.
Introduction
The SIO DAQ concept was developed to provide a simple and versatile method for connecting 
external measuring systems to computers via standard serial links. Its main features are four 
bidirectional HOTLink data ports and a central timing clock port (all fiber optics technology), a 
central FPGA containing the data and time merge logic, and a fast PCI or PCIexpress computer 
interface. The FPGA as the "heart" of the SIO card merges four incoming streams of samples and 
the internal stream of time stamps to time-stamped data frames. These are transferred frame by 
frame with lowest possible latency via the PCI interface into the shared memory of the host 
computer providing immediate access to the data by application programs. This concept together 
with the in-house developed analogue-to-digital conversion (ADC) front-end configuration was 
described earlier [1][2]. In 2009 this concept was chosen to become the new standard for ASDEX 
Upgrade diagnostics. (Similar concepts have been developed independently by others [3]. Differing 
DAQ I/O concepts like Ethernet based ADC devices or ADC carrying interface cards (PCI, PCIe, 
cPCI, PXI, ATCA or μTCA) shall not be discussed in this paper, since the justification of the 
decision to go the SIO direction with the described modular external front-end configuration for 
ASDEX Upgrade has been presented earlier [2].)
While a few bulk data diagnostics (up to 128 channels, 10-500 kHz sampling frequency) have been 
set up using the SIO concept, it was recognized that other, more challenging diagnostics could also 
benefit from the SIO technology but would need to exploit its data transport capabilities to the limit. 
The Electron Cyclotron Emission (ECE, internal name RMC) and the Doppler Reflectometry 
(internal name PRA) diagnostics have been deployed, pushing the current SIO I hardware to its technical 
limits. For the PRA diagnostic, an additional modification of the FPGA software was necessary.
In these highly tuned configurations, reliable operation is difficult to achieve. The complex 
multi-threaded RT (real-time) software and system configuration is highly sensitive to small deviations 
from the optimal set-up. Increasing the headroom by designing a significantly more powerful SIO 
device therefore seems desirable.
Considering the development of today's chip technology, enhancements are achievable by porting 
and refurbishing existing functionality to a new chip set, allowing for a dramatic increase in 
capabilities, features and performance. The PCI interface and the serial link handling, which were 
formerly located in separate chips, become integrated software blocks in the FPGA. The 
concomitant simplification of the board layout, together with the performance upgrades, justifies the 
higher price of an advanced FPGA. As the new "heart" of SIO II, we have selected a "mid-range" 
Altera Arria II GX [4]. This chip provides the desired data transfer capabilities for a faster and more 
flexible SIO engine (see footnote 1) and additionally incorporates the PCIe interface to the host and the serial 
connectivity to the front-ends in the form of IP (Intellectual Property) software blocks [5][6]. In this 
hardware configuration SIO II is projected to provide four bidirectional 2.5 Gbps serial optical links 
for input and a final data rate of up to 10 Gbps into the host.
Recently Deployed SIO I Diagnostics
SIO I as a new standard DAQ solution for diagnostics at ASDEX Upgrade (AUG) boosted the 
refurbishment, renewal and realization of old and new diagnostic projects. Table 1 gives a list of 
recently deployed AUG diagnostics realized with SIO I. In all cases DAQ is done by direct transfer 
of data into computer memory using DMA (direct memory access). The DMA transfers are 
supervised by a real-time process responsible for preparing, starting and restarting a chain of DMA 
operations as necessary. The “green” table rows mark diagnostics running additional RT processes 
sharing the DAQ memory read-only. These diagnostics provide capabilities for RT analysis and RT 
communication with plasma control computers (Control). These RT analysis results serve as input 
or feedback for advanced plasma scenarios like density profile control [7] or MHD mode 
stabilization [8][9]. While the other listed diagnostics perform the actual data transfer to memory in 
real-time, the final archiving as AUG shot files is done - as usual - off-line after the plasma 
discharge.
As a further enhancement of the ECE diagnostic, it is planned to raise the sampling rate from 1 
to 2 MHz and possibly the number of channels from 64 to 68 or 72. However, this 
upgrade is not possible with the current performance of the SIO I interface.
Footnote 1: While the SIO I engine was restricted to a simple parametrized operation, the advanced SIO II engine allows fully 
programmable merge and transfer operation.
Table 1: Recently deployed SIO I diagnostics at ASDEX Upgrade
Diagnostic                 CPU cores   bus   # SIO/links  pipelines x slots  DAQ modules          sample rate  data rate    data size
r-t n_e Interferom. (DCN)  1x sparcV9  PCI   1/1          1x12               5xFrgCnt+3x2chADCs   50 kHz       1.6 MB/s     17 M
r-t current profile (MSE)  2x4 amd64   PCIe  1/1          1x16               13x2chADCs           250 kHz      15 MB/s      180 M
ECE (RMC)                  2x4 amd64   PCIe  2/4+4        2x 4x 2            8x2x4chADC (wls)     1 MHz        2x 72 MB/s   1 G
XUV emission (XVR)         2x sparcV9  PCIe  2/4+4        2x 4x 16           128x XUV-ADCs        500 kHz      2x 68 MB/s   1 G
XUV emission (XVS)         1x sparcV9  PCIe  2/4+3        (4+3) x 16         112x XUV-ADCs        500 kHz      68+52 MB/s   960 M
VUV Spectr. (CCD)          1x sparcV9  PCI   4/1          -                  4x CCD-HL Adapter    1 kHz        -            24 M
Fast Ion Loss              1x sparcV9  PCI   1/3          3x 4               9x 2chADCs           1 MHz        -            352 M
Lithium Beam               2x4 amd64   PCIe  1/3          3x 16              48x 2chADCs          200 kHz      -            27 M
Reflectometry fluct.       2x4 amd64   PCIe  1/1          1x 4               2x 2chADCs           1 MHz        -            128 M
Doppler Reflectometry      2x4 amd64   PCIe  2/2          custom FE          2 bands (2x 2 ADCs)  20 MHz       2x 80 MB/s   1.2 G
(- = no value given)
SIO I Performance
A number of the above-mentioned diagnostics push the performance of SIO I to its limits. 
Electron Cyclotron Emission (ECE), XUV emission and Doppler Reflectometry have 
data rates very close to the nominal maximum speeds of their computer buses. In fact, ECE was initially 
designed as a split diagnostic on two hosts (internal names RM-A and -B - from Radio Meter), but 
by employing a more powerful computer, both SIO devices could be deployed in a single host 
named RMC. The XUV emission diagnostic (internal diagnostic names XVR and XVS) is still a 
distributed system with multiple SIO cards on multiple computer nodes. The Doppler Reflectometry 
features two SIO cards (one for each microwave band) with a modified SIO engine allowing a 20 
MHz sampling rate. (Instead of one time stamp per clock tick, only one time stamp per measuring 
interval is generated.)
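The per-card data rates listed in Table 1 can be cross-checked with a simple frame-size model. The following is a minimal sketch, assuming 16-bit samples and one 8-byte time stamp per frame; both sizes, the per-card channel counts and the function name are assumptions inferred from the tabulated numbers, not values stated in this paper.

```python
# Assumed sizes (not stated in the paper): 2 bytes per sample, 8 bytes per time stamp.
SAMPLE_BYTES = 2
TIMESTAMP_BYTES = 8


def sio_data_rate_mb_s(channels, sample_rate_hz, timestamp_per_sample=True):
    """Return the stream rate of one SIO card in MB/s (1 MB = 1e6 bytes)."""
    frame_bytes = channels * SAMPLE_BYTES
    if timestamp_per_sample:
        # Standard SIO engine: every frame carries its own time stamp.
        frame_bytes += TIMESTAMP_BYTES
    return frame_bytes * sample_rate_hz / 1e6


# ECE (RMC): assumed 32 of the 64 channels per card, 1 MHz sampling.
ece = sio_data_rate_mb_s(32, 1_000_000)
# XUV (XVR): assumed 64 of the 128 channels per card, 500 kHz sampling.
xuv = sio_data_rate_mb_s(64, 500_000)
# Doppler Reflectometry: assumed 2 channels per card, 20 MHz sampling,
# time stamp only once per measuring interval (neglected here).
doppler = sio_data_rate_mb_s(2, 20_000_000, timestamp_per_sample=False)
print(ece, xuv, doppler)
```

Under these assumptions the model yields 72, 68 and 80 MB/s per card, in line with the "2x 72", "2x 68" and "2x 80" MB/s entries of Table 1. It also illustrates why the modified SIO engine omits the per-sample time stamp at 20 MHz: with only two channels, the time stamp would otherwise dominate the frame.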
While the parallel set-up of "DAQ only" diagnostics with multiple host nodes is a simple way of 
managing high demands (the two XUV diagnostics just store data in two different shot-files), it 
would make things complex in cases where all channels of a diagnostic have to be processed in 
real-time to produce a result of relevance for Control. One would like to avoid the parallelism of 
distributed DAQ with subsequent distributed RT analysis applications. (The ECE hosts RMA & 
RMB were combined into one host RMC for this reason.) RMC (see Table 1, line 3) uses two SIO 
I devices, each managed by its own RT thread, to acquire all 64 radiometer channels into two shared 
memory segments on the same host. On that host, an RT analysis task keeps in step with the DMA 
by reading from both shared segments. All 64 ECE channels can thus be examined for MHD activity 
signatures without having to resort to a distributed software solution. 
On a single host computer, the RMC diagnostic system is able to acquire an incoming data stream at 
144 MB/s, analyze the data stream in 5 or 10 ms time slices, and send the results to the ASDEX 
Upgrade discharge control system (DCS) via the diagnostics-to-control communication protocol [10].
The data rates achieved with SIO I are very close to the technical limits of the communication 
channels used. With dual-device set-ups (two SIO I in different I/O slots of the same computer) even 
higher data rates of 144 MB/s (RMC diagnostic) or 160 MB/s (Doppler Ref.) can be achieved. Both 
diagnostics are used regularly with these data rates in experiment operation.
"Frame-drops" with SIO I at High Data Rates
SIO does not use large internal buffers (see next section). The philosophy is to send all 
data continuously by DMA, immediately and smoothly (as a continuous stream without pauses), into the main 
memory of the host computer. Under the demanding conditions that apply to some of our 
diagnostics, it is not easy to keep the DMA stream as smooth as required. Any interruption 
of the DMA - caused, for example, by reloading a virtual address table, activity of a high priority 
task or hardware device, or delayed interrupt request (IRQ) handling because of other system activity - 
leads to a short stoppage of the data flow. These stoppages, if they reduce the DMA duty cycle 
considerably, may degrade the overall transfer speed. If they become even longer, they may cause 
internal SIO buffers to overflow and, ultimately, the loss of SIO time frames. In the SIO context this 
is called a "frame-drop" or a series of frame-drops, and it has raised some discussion about the 
reliability and usefulness of the SIO concept. How frame-drops occur and what measures can be 
taken against them is discussed briefly below.
The term "frame" has already been described in [1]. To make it clearer, we look more closely 
at the FPGA software blocks once again. Frames are assembled by the FPGA from the five 
primary SIO inputs: four from its serial links and a fifth internal stream of time-stamps from its 
synchronized clock. This is done by the "time engine" and the "merge engine" working together in 
triggering the external front-end and managing the serial links. Together they form the "SIO 
engine". The most important responsibility of the SIO engine is to keep each data sample together with its 
corresponding time stamp. To ensure this, each cycle of the SIO engine generates exactly one frame from the 
right pieces in the time FIFO and the four serial link FIFOs in a deterministic way. The completed 
frames are placed one by one into the DMA output FIFO to be transferred to the host's memory by 
the "DMA engine", which concurrently transfers frames out of the FIFO into host memory as 
requested by the DAQ task. If frames enter the FIFO faster than they are transferred out, the FIFO 
fills up, reducing the space available for new input frames. If the available space is not sufficient to 
store the next full frame, nothing is delivered into the FIFO and that entire frame is discarded, 
resulting in a frame drop. Frames will be dropped until enough data has been transferred out of the 
FIFO, freeing enough space to hold at least one entire frame. This policy guarantees that no 
incomplete frames will be delivered to the host.
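The whole-frame drop policy described above can be illustrated with a small software model. This is only a sketch of the policy, not the FPGA implementation; the class name, the tiny capacity and the 72-byte frame size are hypothetical values chosen for demonstration.

```python
from collections import deque


class FrameFifo:
    """Byte-limited FIFO that accepts only whole frames (illustrative model)."""

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.used = 0
        self.frames = deque()
        self.dropped = 0

    def push(self, frame):
        """Store a complete frame, or discard it entirely if it does not fit."""
        if self.used + len(frame) > self.capacity:
            self.dropped += 1  # frame-drop: nothing of this frame is stored
            return False
        self.frames.append(frame)
        self.used += len(frame)
        return True

    def pop(self):
        """Hand the oldest frame to the DMA side, or None if the FIFO is empty."""
        if not self.frames:
            return None
        frame = self.frames.popleft()
        self.used -= len(frame)
        return frame


fifo = FrameFifo(capacity_bytes=200)  # tiny capacity, for demonstration only
frame = bytes(72)                     # hypothetical 72-byte frame
# DMA stalled: frames arrive but none are read out.
results = [fifo.push(frame) for _ in range(4)]
fifo.pop()                            # DMA resumes and frees one frame slot
results.append(fifo.push(frame))
print(results, fifo.dropped)
```

In this run the third and fourth frames are discarded completely, and acceptance resumes as soon as one frame has been read out, so the reader side only ever sees complete frames.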
How Frame-Drops are Affected by Data Rate and System Parameters
The DMA FIFO is small (32 KB) by design, to meet the design goal of short latency between the 
production of data and its arrival in host memory. A 
necessary condition for this mechanism to work well is to minimize interruptions to the DMA 
traffic to the host. The FIFO size divided by the targeted data rate directly yields the maximum 
allowed interruption time. In other words, the higher the data rate, the shorter the transfer 
interruptions may be.
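The relation between FIFO size, data rate and tolerable interruption time can be written down directly. The sketch below assumes an initially empty FIFO and takes 32 KB as 32 000 bytes and 1 MB as 10^6 bytes; the function name is hypothetical.

```python
def max_interruption_us(fifo_bytes, data_rate_bytes_per_s):
    """Longest DMA stall (in microseconds) an initially empty FIFO can absorb."""
    return fifo_bytes / data_rate_bytes_per_s * 1e6


FIFO_BYTES = 32_000  # 32 KB, taken here as 32 000 bytes (assumption)

# Per-card data rates taken from Table 1 (MSE, RMC, Doppler Reflectometry).
for rate_mb_s in (15, 72, 80):
    budget = max_interruption_us(FIFO_BYTES, rate_mb_s * 1e6)
    print(f"{rate_mb_s:3d} MB/s -> {budget:7.1f} us")
```

For a 72 MB/s stream this gives roughly 444 μs, consistent with the figure of about 440 μs quoted for the RMC diagnostic in the discussion of the dtrace measurements.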
To avoid DMA interruptions and frame-drops one could issue a single long DMA request for the 
full acquisition length. (This is in fact done for the XUV and Doppler 
Reflectometry diagnostics.) However, owing to synchronization issues between CPU and 
memory in the computer architecture that are not yet understood, this does not seem to be possible when an RT process 
needs to keep in step with the DMA progress concurrently. To make the data in shared memory available 
to other processes in real time, it appears necessary to break up the DMA into many short time 
slices. To avoid frame-drops, it is essential that the next DMA request is started immediately after 
the preceding one has finished. Careful tuning of the system configuration and the data stream 
parameters is necessary to achieve this. The measures taken to minimize delays in Solaris interrupt 
request (IRQ) handling are described below.
How Frame-Drops Affect DAQ
Because SIO drops only full frames, the delivered data structure can always be 
interpreted. Using the embedded time-stamps, the amount of lost data and the full time axis are 
always identifiable. Typically the drop-out time at the maximum data rates described above is less 
than 1 ms in 6-8 seconds of total acquisition time, so reliability is still better than 99.98 %.
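How lost frames can be identified from the embedded time stamps, and how the quoted reliability figure follows, is sketched below. The function names and the example time stamps are hypothetical; integer time stamps in units of the sampling period are assumed.

```python
def find_frame_drops(timestamps, period):
    """Return (index, missing_frames) for every gap in a list of integer time stamps.

    index is the position of the first delivered frame after the gap.
    """
    gaps = []
    for i in range(1, len(timestamps)):
        step = timestamps[i] - timestamps[i - 1]
        missing = step // period - 1
        if missing > 0:
            gaps.append((i, missing))
    return gaps


def availability(total_frames_expected, missing_frames):
    """Fraction of the acquisition interval covered by delivered frames."""
    return 1.0 - missing_frames / total_frames_expected


# Hypothetical time stamps in microseconds at 1 MHz sampling; frames 3..5 are lost.
stamps = [0, 1, 2, 6, 7, 8]
gaps = find_frame_drops(stamps, period=1)
print(gaps)

# 1 ms lost in a 6 s acquisition at 1 MHz sampling (figures from the text).
print(f"{availability(6_000_000, 1_000):.5%}")
```

With 1 ms of drop-out in 6 s the availability is about 99.983 %, which is the basis of the "better than 99.98 %" statement; for an 8 s acquisition the value is slightly higher.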
Often a characteristic frame-drop pattern over time can be observed when approaching 
performance limits. Specific to the RMC diagnostic, for example, are 1 or 2 frame-drop intervals of 
approximately 500 μs length during the first few milliseconds (the very beginning) of a plasma 
discharge. Any frame-drop leads to a non-equidistant time axis of the data channels. Sophisticated 
time base handling and special management of the missing signal intervals in the RT analysis 
task are therefore necessary. However, the overall usability of a diagnostic for the intended RT purpose is not 
compromised by rare frame-drops [11].
Besides frame-drops at the very beginning of DAQ, other time patterns of frame-drops (e.g. 
stochastic occurrences over the whole discharge) can be observed occasionally. This normally 
indicates a mistake in the priority set-up of the involved RT tasks or a mistake in the RT 
configuration parameters of the Solaris operating system. In cases where correcting such mistakes 
did not help, the cause was either a data rate higher than permitted by the constraints 
mentioned above or a computer platform and/or configuration inappropriate to the 
problem.
Figure 1 shows the occurrence of the last-described (stochastic) type of frame-drops over 700 AUG 
shot numbers during the lengthy (and partly cumbersome) tuning period of the RMC diagnostic. 
The reappearance of frame-drops on minor deviations from the optimal system parameter settings 
illustrates how sensitively a SIO I configuration operated near the performance limit (here RMC) 
depends on correctly tuned RT settings. On the other hand, the results demonstrate the stability 
of SIO over longer time periods when system parameters are well tuned.
Investigating Solaris OS Interrupt Latency with dtrace
When frame-drops first appeared, it soon became clear how important the right choice of 
Operating System (OS) settings is for tuning the RT behavior of the processes and threads 
involved in SIO DAQ and RT analysis.
As illustrated above, the most crucial parameter here is the latency the OS incurs in handling DMA 
interrupts, that is, the delay time during which the DMA is not running between the end of the 
previous and the beginning of the next partial DMA read operation.
Figure 1: RMC diagnostic - SIO frame-drops per shot
Figure 2: Solaris 10 real-time DMA interrupt handling latency measured for 8 seconds (1 irq/ms  
per device) for the RMC diagnostic with two SIO I cards (devices SIO0 and SIO1) operating at a  
data rate of 72 MB/s each. (Here in fact the first 1.5 s are shown. The full high resolution figure as  
well as the D-script to do the dtrace measurement can be downloaded from the online annex  
material of the digital journal. [URL to be completed during the editing process of the paper,  
please find the high resolution figure as a separate PDF attached to the submission])
In summary: the achievable overall DMA speed depends on the DMA duty cycle, whereas the 
occurrence of frame-drops depends on reliably reproducing a latency which is short 
enough to avoid FIFO overflows.
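The dependence of the average DMA speed on the duty cycle can be made concrete with a simple model in which every DMA slice is followed by an idle gap equal to the interrupt latency. This is a rough sketch under stated assumptions: the slice size of 72 kB corresponds to 1 ms of data at 72 MB/s, while the 160 MB/s peak bus rate and the function name are assumptions chosen for illustration.

```python
def effective_dma_rate(slice_bytes, peak_rate_bytes_per_s, latency_s):
    """Average throughput when every DMA slice is followed by an idle gap.

    Returns (average rate in bytes/s, duty cycle).
    """
    transfer_s = slice_bytes / peak_rate_bytes_per_s
    duty_cycle = transfer_s / (transfer_s + latency_s)
    return peak_rate_bytes_per_s * duty_cycle, duty_cycle


# 72 kB slices (1 ms of data at 72 MB/s); 160 MB/s is an assumed peak bus rate.
for latency_us in (25, 50, 150):
    rate, duty = effective_dma_rate(72_000, 160e6, latency_us * 1e-6)
    print(f"latency {latency_us:3d} us: duty cycle {duty:.3f}, {rate / 1e6:6.1f} MB/s")
```

In this model a longer latency lowers the sustainable average rate only gradually, whereas a frame-drop occurs as soon as a single gap exceeds what the FIFO can buffer; the two effects are therefore governed by different statistics of the same latency distribution.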
Without going too deeply into the details, we briefly discuss the results presented in Figure 2 
from dtrace measurements [http://www.dtrace.org] of the behavior of Solaris OS under SIO 
operation. As can be seen from this figure, DMA latency for both SIO devices is typically less than 
25 μs (which is in agreement with earlier results [1] sect. 3.6, fig. 7). Time intervals where DMA 
latency extends up to 50 μs are short, and one peak up to 150 μs near 2.9 s seems to be an isolated event. 
At the very beginning of the DMA chain a few long latency events are always observed. We 
suspect these delays to be caused by some OS or processor activity (loading caches, preparing 
MMU tables) which is difficult to anticipate.
In comparison to these latency results, the buffering capability of the 32 KB DMA FIFO in 
this case is enough to hold about 440 μs of data.
SIO II prospects
Requirements and Hardware Redesign
Up to this point, current and future diagnostic demands have been discussed in comparison to SIO I 
performance data. Diagnostic implementations, their deployment characteristics and difficulties 
have been presented. Performance demands beyond the current capabilities of SIO I have been 
identified.
Three main topics have to be considered to increase SIO performance:
1. More powerful FPGA
• SIO I is based on an Altera Cyclone II with a 50 MHz clock which leads to an 
internal data transfer speed limit at approximately 120-140 MB/s.
• Using an Arria II GX instead would provide multiple clock domains with a base 
clock of 125 MHz. At a width of 8 bytes, the internal data transfer rate could reach a 
maximum of 1 GB/s.
2. Faster PCI interface to host
• SIO I with a cPCI or cPCIexpress x1 adapter is limited by the theoretical speed of its 
bus interface. If additionally a PCI/cPCI or a PCIe/cPCIe bridge is inserted between 
SIO and host, further limitations apply. Therefore the nominal host interface speed 
varies from 78 to 160 MB/s. (N.b. further limits are imposed by the FPGA's internal 
limit.)
• Using the built-in PCIexpress-Hard-IP of the Arria II GX enables state of the art, 
high-speed, multi-lane PCIe data rates up to several Gb/s. Indeed, the FPGA's 
PCIexpress interface is expected no longer to create significant bottlenecks of its 
own.
3. Faster serial links
• SIO I provides four HOTlink II connections to the DAQ front-ends, each being able 
to transfer up to 40 MB/s. The HOTlink II protocol for these links is provided by a 
separate chip requiring additional considerations. Not only the demand for more 
speed but also the possible integration of the serial links as an internal FPGA 
function block justifies a more far-reaching design change. The advantage of 
simplifying the hardware design outweighs the loss of compatibility with the 
existing front-end controllers. In consequence, a new design of this front-end 
counterpart of the SIO card is probably unavoidable, but the modular design of most 
front-ends requires exchanging one card per crate only.
• The above-mentioned FPGA series provides its own serial link concept, the SerialLite II 
IP [5]. The nominal speed of these links is specified as 2.5 Gb/s, which is 
approximately six times faster than HOTlink II. Therefore, we have decided to 
change the serial protocol between future SIO cards and DAQ front-ends to 
SerialLite II.
The redesign of the front-end controller card with an Altera FPGA to implement the 
"pipe line" front-end (c.f. [1]) control logic was expected to be straightforward and 
has successfully been completed.
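The headline figures in the list above follow from simple arithmetic, sketched here. The 8b/10b line coding of the serial links (10 line bits per payload byte) is an assumption used for illustration, and the function names are hypothetical.

```python
def bus_rate_mb_s(clock_hz, width_bytes):
    """Peak internal transfer rate of a data path in MB/s (1 MB = 1e6 bytes)."""
    return clock_hz * width_bytes / 1e6


def serial_payload_mb_s(line_rate_bps, line_bits_per_byte=10):
    """Payload rate of a serial link in MB/s, assuming 8b/10b coding by default."""
    return line_rate_bps / line_bits_per_byte / 1e6


internal = bus_rate_mb_s(125e6, 8)   # Arria II GX data path: 125 MHz x 8 bytes
link = serial_payload_mb_s(2.5e9)    # one 2.5 Gb/s serial link
hotlink = 40.0                       # HOTlink II rate per link in MB/s (from the text)
print(internal, link, link / hotlink)
```

Under the coding assumption, one 2.5 Gb/s link carries 250 MB/s of payload, a factor of 6.25 over the 40 MB/s of a HOTlink II connection, which matches the "approximately six times faster" estimate; the 125 MHz, 8-byte-wide data path yields the stated 1 GB/s.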
Further requirements have been collected:
• functional compatibility to SIO I
◦ with the modular PIPE front-end system
◦ with the ASDEX Upgrade timing system
• extendability of the functions implemented in the FPGA
◦ implement further DAQ modes
◦ RT processing of data streams
• long-term market support of the chosen interfaces
• universal form factor, compact board size
• FPGA firmware programmable in place
• affordable price
Taking all the above considerations into account, an SIO II prototype was built as an XMC form 
factor mezzanine board. Figure 3 depicts the schematic hardware layout. The FPGA is flanked by 
the optical transmitters on the right side and the XMC connectors to the carrier board and further to 
the host on the left. While the FPGA flash memory is shown, all other supporting components have 
been omitted from this figure.
Figure 3: SIO II XMC board layout diagram
FPGA Software Redesign
The redesign decisions described above have also made a revision of the internal 
FPGA software layout necessary. An "SOPC" (System On Programmable Chip) approach has been chosen. 
Keeping all components within one chip avoids difficult high frequency PCB design issues. 
Together with flexible reprogrammability, this makes the system highly adaptable 
to future needs. The software layout features a central communication fabric and various function 
blocks arranged as depicted in Figure 4. The upper part displays the TDC timing module, 
four serial link modules, and in between the transfer module derived from the so-called "merge 
engine" of SIO I. Together these three implement the core SIO engine. Below the central 
"Interconnect Fabric" the host side modules are shown. Beside the PCIe controller block, the DMA 
engine, addressing table memory, and FPGA control & status memory and registers are shown. The 
"Service and Flash Module" implements the possibility to update the FPGA program in situ via the 
host interface.
Figure 4: SIO II FPGA "system on a programmable chip" layout diagram
First Prototype of SIO II
Together with the SIO II XMC mezzanine board a PCIe carrier board was prototyped. Figure 5 
shows both boards. At the top left, the component side view (mirrored with respect to the schematic in Figure 
3) shows the four optical transmitters for the serial links. These occupy the entire available XMC 
front-panel space. The FPGA in the middle is covered by a heat sink. Beside and above the FPGA 
the already mentioned flash memory and a couple of DC/DC power supply chips can be seen. To 
the right of the FPGA the XMC connectors providing high frequency links to the carrier board are 
located. On the very right edge of the mezzanine, diagnostic and programming connectors are visible.
The ASDEX Upgrade timing network receiver is not part of the SIO II card itself. Instead it is 
located on the PCIe carrier board, which is shown in the same figure behind the SIO II board. The 
carrier board merely consists of a transparent PCIexpress bridge and the optical AUG time receiver. 
The receiver converts the fiber optics time signal to the Manchester-coded AUG experiment 
clock [12]. This clock is provided to the SIO FPGA by the carrier board via the XMC connector. 
Conveniently, multiple SIO II XMCs may share the same external clock receiver. Furthermore, 
other timing networks could be used if the respective carrier board provides the timing clock via the 
XMC carrier connector and the FPGA's timing software is adapted to handle the corresponding time 
representation.
As an XMC mezzanine, SIO II is designed to be useable on various other carrier platforms if 
desired. Most likely these could be compact PCIexpress, PXI, or ATCA carrier boards aggregating 
one or more SIO II in one host slot. To adapt to future, faster PCIe versions an FPGA software 
upgrade or a slight redesign is expected to be sufficient.
The illustrated PCIe carrier bridge features four PCIexpress lanes, thus permitting up to 1 GB/s 
DMA data rates from SIO II to host memory. In first trials, the SIO II prototype achieved a data rate 
of 480 MB/s from FPGA to host memory.
Figure 5: SIO II XMC prototype (foreground) and XMC/PCIe carrier board (background)
Conclusions
The SIO concept of low-latency serial delivery of data directly into computer memory 
emerged as a DAQ concept a couple of years ago from previous ideas (see [1]). Since then, SIO has become a new 
diagnostics standard at ASDEX Upgrade. Deployed successfully in a series of diagnostics, the SIO 
approach demonstrates power and versatility for real-time and classical DAQ applications. This 
success, together with the demanding requirements of future diagnostics, has led to the 
definition and design of SIO II as a next-step serial I/O interface card. A first prototype is 
currently being tested with promising results, most notably a measured throughput of 480 MB/s.
We are planning to upgrade and consolidate a series of diagnostics which are either still distributed 
among several computer nodes (in the long run these will be the Soft-X-Ray and Mirnov Probe 
diagnostics; in the short term, several CAMAC-based diagnostics will be refurbished using the SIO 
solution), or have not yet been capable of handling the sampling rates we have been aspiring to for years 
(e.g. an upgrade of ECE from 1 to 2 MHz as well as an upgrade of at least a part of the Langmuir 
probe channels to 2-4 MHz sampling rate).
Acknowledgment
We want to thank our colleagues from the ASDEX Upgrade team for their encouraging 
contributions in testing and tuning the new diagnostics. M. Reich and G. Conway in particular had a 
lot of extra work to get things up and running and to adapt their preparation and analysis 
programs carefully to the new modus operandi that SIO brings.
We would also like to thank Constantin Gonzales from Oracle (formerly Sun Microsystems), who 
spent a day with us brooding over Solaris performance issues and how to unleash the power of 
dtrace for our investigations.
References
[1] K. Behler, H. Blank, H. Eixenberger, A. Lohs, K. Lüddecke, R. Merkel, G. Raupp, 
G. Schramm, W. Treutterer, M. Zilker, ASDEX Upgrade Team, Real-Time Diagnostics at ASDEX 
Upgrade - Architecture and Operation, Fusion Engineering and Design 83 (2008) 304-311, 
http://dx.doi.org/10.1016/j.fusengdes.2008.01.015
[2] K. Behler, H. Blank, A. Buhler, R. Cole, R. Drube, K. Engelhardt, H. Eixenberger, 
N.K. Hicks, A. Lohs, K. Lüddecke, A. Mlynek, U. Mszanowski, R. Merkel, G. Neu, G. Raupp, 
M. Reich, W. Suttrop, W. Treutterer, M. Zilker, ASDEX Upgrade Team, Real-Time diagnostics at 
ASDEX Upgrade - Architecture and operation, Fusion Engineering and Design 85 (2010) 313-320, 
http://dx.doi.org/10.1016/j.fusengdes.2010.03.001
[3] Daniel Charlet, P. Abbon, R. Ansari, C. Beigbeder, D. Breton, T. Cacérès, H. Deschamps, C. 
Flouzat, P. Kestener, C. Magneville, B. Mansoux, C. Pailler, M. Taurigna, and C. Yèche, The BAO 
Radio Acquisition System, IEEE Transactions on Nuclear Science 58 (2011) 1833-1837, 
http://dx.doi.org/10.1109/TNS.2011.2155085
[4] Altera Corporation, Arria II FPGAs: Cost-Optimized, Lowest Power 6G Transceiver 
FPGAs, http://www.altera.com/devices/fpga/arria-fpgas/arria-ii-gx/aiigx-index.jsp (visited 2012-08-22)
[5] Altera Corporation, Altera's Complete SerialLite II Solution, 
http://www.altera.com/technology/high_speed/protocols/seriallite/pro-seriallite.html (visited 2012-08-22)
[6] Altera Corporation, Altera's Intellectual Properties PCI Compiler, 
http://www.altera.com/products/ip/iup/pci/m-alt-pci_mt32.html (visited 2012-08-28)
[7] A. Mlynek, M. Reich, L. Giannone, W. Treutterer, K. Behler, H. Blank, A. Buhler, R. Cole, 
H. Eixenberger, R. Fischer, A. Lohs, K. Lüddecke, R. Merkel, G. Neu, F. Ryter, D. Zasche, ASDEX 
Upgrade Team, Real-time feedback control of the plasma density profile on ASDEX Upgrade, 
Nuclear Fusion 51 (2011) 043002, http://dx.doi.org/10.1088/0029-5515/51/4/043002
[8] N.K. Hicks, W. Suttrop, K. Behler, M. García-Muñoz, L. Giannone, M. Maraschek, G. 
Raupp, M. Reich, A.C.C. Sips, J. Stober, ASDEX Upgrade Team, S. Cirant, G. D'Antona, Fast 
Sampling Upgrade and Real-Time NTM Control Application of the ECE Radiometer on ASDEX 
Upgrade, Fusion Science and Technology 57 (2010) 1-9, 
http://www.new.ans.org/pubs/journals/fst/a_9263
[9] M. Reich, K. Behler, R. Drube, L. Giannone, A. Kallenbach, A. Mlynek, J. Stober, W. 
Treutterer, ASDEX Upgrade Team, Real-Time diagnostics and their Applications at ASDEX 
Upgrade, Fusion Science and Technology 58 (2010) 727-732, 
http://www.new.ans.org/pubs/journals/fst/a_10921
[10] Wolfgang Treutterer, Gregor Neu, Gerhard Raupp, Thomas Zehetbauer, Dieter Zasche, Klaus 
Lüddecke, Richard Cole, ASDEX Upgrade Team, Real-time signal communication between 
diagnostic and control in ASDEX Upgrade, Fusion Engineering and Design 85 (2010) 466-469, 
http://dx.doi.org/10.1016/j.fusengdes.2010.04.031
[11] M. Reich, A. Bock, M. Maraschek, ASDEX Upgrade Team, NTM Localization by 
Correlation of Te and dB/dt, Fusion Science and Technology 61 (2012) 309-313, 
[12] G. Raupp, R. Cole, K. Behler, M. Fitzek, P. Heimann, A. Lohs, K. Lüddecke, G. Neu, J. 
Schacht, W. Treutterer, D. Zasche, Th. Zehetbauer, M. Zilker, A "Universal Time" system for 
ASDEX Upgrade, Fusion Engineering and Design 66-68 (2003) 947-951, 
Annex material to be provided on-line
1. Full resolution version of Figure 2 provided with the submission as separate PDF-file.
2. dtrace script to do the DMA latency measurements
3. All figures and tables to be printed in the journal in b/w or gray scale should be provided on-




































Figure 2 (plot annotations): Real-time behaviour of SIO configurations under Solaris 10 x86 - dtrace measurement 
of the total latency spent between two DMA transfers to handle the interrupt from the last and prepare for the 
next 72 kB (1 ms) DMA transfer from each SIO card to memory. Diagnostic RMC real-time data acquisition with two 
devices (SIO 0 and SIO 1) delivering 72 MB/s each; real-time analysis running in parallel with the DAQ processes. 
Total acquisition time 8 seconds (2x 8000 1 ms DMA transfers). New time behaviour after introducing 64-bit 
addressing in the SIO driver (29.11.2010 by hjb). Axes: elapsed time [s] (horizontal), latency [μs] (vertical); 
traces: SIO 0, SIO 1.
