A common DAQ system is being developed within the CALICE collaboration. It provides a flexible and scalable architecture based on GigaEthernet and 8b/10b serial links to transmit slow control data, fast signals and read-out data. A detector interface (DIF) connects the detectors to the DAQ system; it is based on a single firmware shared across the collaboration but targeted at various physical implementations. The DIF builds, stores and queues data packets and controls the detectors, and it provides USB and serial link connectivity. The overall architecture is designed to manage several hundred thousand channels.
Introduction and overview
The breaking of the electroweak symmetry is the central subject of particle physics research today. The LHC (Large Hadron Collider at CERN) is expected to shed considerable light on the subject, but a thorough study will require a lepton collider. The International Linear Collider (ILC) [1] is expected to be, after the LHC, the next machine for exploring physics at the TeV scale. In recent years a global R&D effort has led to a number of improvements in acceleration, detection and data acquisition techniques.
Detector concept
Several detector concepts have been developed. In particular, the ILD proposal was published in the form of a letter of intent on 31 March 2009, and a second proposal, SiD, is also well advanced. The technique of "particle flow analysis" will be used to optimise the measurement of jet energy and momentum. The idea is to measure independently each of the particles that make up the jet: the charged particles with a magnetic tracker, the photons with an electromagnetic calorimeter, and the neutral hadrons with the full calorimeter, electromagnetic and hadronic sections together. The hadronic calorimeter resolution thus degrades only the neutral hadron fraction of the energy. With such a method, high spatial granularity has been identified as having a major impact on energy resolution.
The ILD detector has been designed for that purpose. The result is a large tracker in a high field (4 T) followed by a very granular calorimeter, christened "pictorial", "imaging" or "tracking" calorimeter. In terms of digital photography, the grain of the collision picture, about 100 Mpixels, is about 1000 times finer than in ALEPH or any LHC detector. The gain in jet resolution brought by this technique is close to a factor of two, and the impact on the effective luminosity is more than a factor of two.
CALICE: an R&D collaboration
Concerning the calorimeters, most of the R&D and prototyping effort has been carried out within a large international collaboration named CALICE, for CAlorimeter for the LInear Collider Experiment [2]. The CALICE collaboration is made up of about three hundred physicists and engineers from 57 institutes around the world. Part of the activity has been done through the European program EUDET and now continues within the E.U. 7th Framework program in a project called AIDA (Advanced European Infrastructures for Detectors at Accelerators). Following a first round of prototypes intended to evaluate the physics performance, a second generation of prototypes including both electromagnetic and hadronic calorimeters is being designed with dramatic technological improvements, close to what is expected for a detector at a linear collider experiment.
Highly granular, compact and power pulsed prototypes
The granularity and the compactness are dramatically increased. As an example, the density of individual channels for the prototype of a silicon-tungsten electromagnetic calorimeter should reach 10 000, the equivalent of the LHCb EM calorimeter, in only one dm³. A major change compared to the first prototypes will be the integration of the front-end electronics into the detector volume. This implies significant development in the miniaturisation of the ASICs, and the development of a printed circuit board (PCB) which will host the ASICs and the detectors. Because the detector modules are placed in a confined volume, the heat dissipated by the electronics must be extracted. The electronics will be run in power pulsed mode, which takes advantage of the ILC beam structure to switch the electronics off for ~99% of the time, minimizing heat dissipation. The power cycle has a repetition rate of 5 Hz. dissipated power per channel (integrated over time and including front-end electronics only). A dedicated, low mass cooling system based on copper sheets connected to cold water pipes is being designed.
Reading out so many channels in such a small volume is a challenge; designing a data acquisition (DAQ) system for the few hundred million channels of an ILD detector makes the challenge harder still.
DAQ concept for compact detectors
The large number of channels foreseen for a detector like ILD promotes the idea of a backplane-less system. In addition, the system should avoid the need for a large number of cables distributed within the detector. A significant constraint is that the initial engineering studies for the integration of the systems have shown that only a very small volume would be available for the DAQ system inside the detector envelope. The system should therefore be compact and have a reduced power consumption. For example, the detector interface, which is the terminal element of the DAQ chain, close to the detector, should have the size of a credit card and be only 6 mm thick, yet be able to control and read about 10 000 detector channels.
An initial version of the system has been established and prototyped by the CALICE institutes from the United Kingdom. It is based on standard protocols such as GigaEthernet (connection to computers) and 8b/10b encoded serial links, similar to FastEthernet. A particular feature is that three types of access are bundled on the same cable: control of the detector, read-out of data and fast control (clocks and synchronization).
The prototype architecture is very similar to a computing network. It is based on several levels of intermediate data concentrators, similar to hubs and switches. In this first approach the design addresses the scalability of the system more than the power consumption. All the components are based on programmable technologies in order to achieve maximum flexibility during the R&D steps. In the medium term, the production of ASICs may be envisaged.
DAQ prototype

Overall architecture
It is important to note that the system has been designed according to the expected beam structure of a future International Linear Collider. The bunch crossings are expected to be grouped in spills of a few thousand (2820 at the moment). During a spill the bunch crossings are separated by a few hundred ns (337 ns). The expected spill rate is 5 Hz.
The system is synchronous to a central clock delivered by a common Clock and Control Card (CCC). This clock must be synchronous to the bunch frequency in order to provide a bunch ID. The clock network relies on the DAQ components, which are in charge of distributing the reference clock down to the detectors. The CCC also delivers fast signals according to the beam structure to open and close the data acquisition gate, to stop or pause the read-out process and, if necessary, to reset the electronics.
These fast signals are fanned out from the highest level of data concentration, the so-called Local Data Aggregator (LDA). On one hand, the LDA is linked to a computer farm for the data read-out and the control of the detectors. On the other hand, it can be connected to up to ten lower-level data concentration cards (DCC). Finally, at leaf level, close to the detectors, a so-called Detector Interface Board (DIF) connects the detector modules to the DAQ and control system. For the first prototype, widely used and versatile GigaEthernet links have been chosen to connect the LDA to the computers, which allows standard LAN components to be used. Every other connection is based on a customized serial link described in the next section.
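The beam figures quoted above fix the time scales the DAQ has to deal with. As a rough illustration, the following sketch derives the spill length and the fraction of time during which bunch crossings occur, using only the three numbers given in the text; the function names are hypothetical and chosen for this example.

```python
# Beam-structure figures quoted in the text.
BUNCHES_PER_SPILL = 2820
BUNCH_SPACING_NS = 337
SPILL_RATE_HZ = 5


def spill_duration_us(bunches=BUNCHES_PER_SPILL, spacing_ns=BUNCH_SPACING_NS):
    """Length of one spill in microseconds (bunch count times bunch spacing)."""
    return bunches * spacing_ns / 1000


def spill_period_ms(rate_hz=SPILL_RATE_HZ):
    """Time between the starts of two consecutive spills, in milliseconds."""
    return 1000 / rate_hz


def beam_duty_factor():
    """Fraction of the time during which bunch crossings occur."""
    return (spill_duration_us() / 1000) / spill_period_ms()
```

With these inputs a spill lasts about 0.95 ms out of a 200 ms period, that is, about 0.5% of the time, which is the margin exploited by power pulsing.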
Customized 8b/10b serial link
The data concentrators (LDA and/or DCC) and the detector interfaces are connected together in a star topology. All the "logical" functions are embedded in the same cable and protocol: timing, fast control, slow control (configuration) and data read-out thus minimizing the number of cables.
This link is synchronized using the beam clock ranging from 40 MHz to 120 MHz. The clock is provided on a separate twisted pair in order to ease phase alignment while the data are sent on two pairs, one for each direction with LVDS electrical levels.
For compatibility with the test beam environment, two other signals are distributed isochronously: an external trigger and a detector busy signal. They should no longer be needed in an ILC experiment, provided the detector front-end electronics is self-triggered as currently foreseen.
Data are transmitted in both directions as 16-bit words encoded in two 8b/10b characters, so that the operation is very similar to FastEthernet. Apart from the clock, fast signals are encoded using dedicated control characters followed by a data character used as a command identifier. Slow control and read-out data are encapsulated in packets with a 4-word header, a data section and a CRC at the end. Each packet is preceded by a start of frame control character and followed by an end of file control character.
In addition, fast commands can be sent using dedicated 8b/10b control characters interleaved with regular data. This allows fast commands to be distributed over the whole detector with a time precision of one clock period (10 to 20 ns).
Stringent space constraints (particularly for the DIF) have led to the choice of HDMI connectors and the corresponding cables, as they provide 5 differential pairs and some power connections within a rather small cross section.
The same transceiver blocks (MAC layer) are used in the firmware of every component of the system.
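The packet layout described above can be illustrated with a minimal sketch. It is not the CALICE firmware format: the meaning of the header words, the big-endian byte order and the CRC polynomial (CRC-16/CCITT, as provided by Python's `binascii.crc_hqx`) are all assumptions, since the text does not specify them, and the two delimiting control characters are represented by placeholder strings.

```python
import binascii
import struct

# Hypothetical placeholders for the two delimiting 8b/10b control characters;
# their actual codes are not given in the text.
START_MARKER = "START"
END_MARKER = "END"


def build_packet(header_words, payload_words):
    """Return (start, body bytes, end) for a 4-word header, data section and CRC."""
    if len(header_words) != 4:
        raise ValueError("the header must contain exactly 4 words")
    words = list(header_words) + list(payload_words)
    # Each 16-bit word becomes two 8-bit characters (big-endian order assumed).
    body = struct.pack(f">{len(words)}H", *words)
    # CRC-16/CCITT is an assumption; the text does not name the polynomial.
    crc = binascii.crc_hqx(body, 0xFFFF)
    return START_MARKER, body + struct.pack(">H", crc), END_MARKER


def check_packet(frame):
    """Validate the delimiters and the CRC; return (header, payload) word lists."""
    start, body, end = frame
    if start != START_MARKER or end != END_MARKER:
        raise ValueError("missing frame delimiter")
    if len(body) < 10 or len(body) % 2:
        raise ValueError("truncated packet")
    data, (crc,) = body[:-2], struct.unpack(">H", body[-2:])
    if binascii.crc_hqx(data, 0xFFFF) != crc:
        raise ValueError("CRC mismatch")
    words = list(struct.unpack(f">{len(data) // 2}H", data))
    return words[:4], words[4:]
```

A receiver built this way rejects any packet in which a single bit has been corrupted, which is the role the CRC plays at the end of each packet.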
Higher level data aggregator (LDA)
The LDA prototype is based on a commercial demonstration board holding a GigaEthernet mezzanine with an SFP cage, allowing the use of optical fiber or CAT5 copper cables. This board acts both as a switch and as a transceiver that converts data between GigaEthernet and the lower speed customized serial links. A second mezzanine is used to connect up to 10 devices (either DCC or DIF). The internal architecture is rather simple and consists of 10 input/output buffers multiplexed with the GigaEthernet interface.
The data packets for the control and for the read-out are mixed together. Optionally, on the GigaEthernet side, two different MAC addresses can be used in order to send each of these data flows to a different system.
The LDA also handles the fast commands received from the CCC board or from the control computer through GigaEthernet and manages their interleaving with regular data. The fast commands are broadcast over all the ports.
Finally, the clock and the optional trigger and busy signals used for test beams are fanned out or fanned in as appropriate. The LDA thus provides a single, complete interface between hardware systems and software components intended for controlling, monitoring and reading a detector.
Lower level data concentrator (DCC)
The DCC prototype is a full custom board made at LLR, acting like a network hub with one upstream link and up to 9 downstream links. Upstream and downstream links run at the same speed, so the DCC includes large buffers to prevent data loss, provided the mean data rate of each link remains rather low. Indeed, the occupancy of detectors at an electron-positron collider is expected to be quite low (about 0.4% per channel), taking into account zero suppression at the level of the front-end electronics.
The board consists of a single low-cost Xilinx Spartan-3 FPGA with a minimum of external components. The board has the 6U VME format, which makes it easy to insert in a standard crate. It should be scaled down in the forthcoming development steps.
Functionalities are very similar to those of the LDA. The source code for the interface blocks, buffers and packet multiplexers is shared between the two designs.
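The buffering argument can be made concrete with a toy model. The sketch below is not a description of the DCC firmware; it assumes an idealised concentrator in which, on every clock tick, each downstream link delivers at most one word and the upstream link removes one word, and it uses hypothetical function names.

```python
def upstream_load(n_links, mean_utilisation):
    """Mean upstream load as a fraction of link speed, for n equal downstream links."""
    return n_links * mean_utilisation


def peak_buffer_words(n_links, burst_words):
    """Simulate one simultaneous burst and return the peak buffer occupancy.

    Per clock tick every downstream link delivers one word while it still has
    data, and the single upstream link removes one word from the buffer.
    """
    remaining = [burst_words] * n_links
    occupancy = peak = 0
    while occupancy > 0 or any(remaining):
        for i, left in enumerate(remaining):
            if left:
                remaining[i] -= 1
                occupancy += 1
        occupancy -= 1  # one word leaves on the upstream link
        peak = max(peak, occupancy)
    return peak
```

In this model, 9 links that each send a burst of 100 words at the same time fill the buffer to 800 words before it starts to drain, and the buffer can only empty between bursts if the combined mean load stays below the capacity of the single upstream link.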
Detector interface
At leaf level, close to the detector, a so-called Detector Interface Board connects the detector modules to the control and DAQ system. The shape of the DIF board can be adapted to the integration constraints of each sub-detector, but the firmware has been developed to be common to all detectors, as the interface to the detector front-end ASICs is the same. It is compatible with both Altera and Xilinx technologies. The DIF prototypes are based on low-cost FPGAs, but the design would probably fit in a full custom microchip for a final version, forming a common chipset together with the front-end electronics ASICs.
The DIF architecture is versatile and highly modular in order to ease any update of its functionalities. A specific internal bus interconnects all the components in both directions of data transfer. The architecture resembles a lightweight system-on-chip design with well separated communication ports, storage structure, peripherals (detector bus, external memory, ...) and a central supervisor (packet analysis, chip management, resource sharing, command handling, ...).
A DIF board is expected to handle several thousand detector channels. It can read and write configuration memories, send control commands and receive acknowledgements, and finally read physics data and send them through the DAQ system. Within the CALICE collaboration a DIF task force made up of 4 people is responsible for the specifications.
Pulsed power supply
The front-end electronics can be activated according to the beam structure. The acquisition gate is opened about 25 µs (wake-up time of the front-end ASICs) before the beginning of the spill of bunch crossings. The read-out of the detector can take place at the end of the spill, after the acquisition gate is closed.
Thanks to this behaviour, the front-end electronics needs to be powered on only during the time necessary to acquire the physics signal and to read out the digitized data. A duty cycle of less than a few % is expected.
For the prototypes the DAQ system itself is not power cycled, but the DIF has to manage the power cycling, enabling and disabling the front-end electronics. A more advanced version of the DIF should probably optimize its own power consumption by being power cycled itself.
Finally, the front-end electronics and the detector are seen as a variable load from the point of view of the power supply. The current drawn ranges from almost 0 up to 10 A at a few hertz (typically 5). As the space constraints are tight for all the services within the detector (a cross section of 1 or 2 cm² for a module of a few 100 000 channels), the idea is to bring in the main power supply through a small cable carrying a mean current of a few 100 mA. The current pulses can then be delivered from large capacitances acting as local batteries.
The appropriate capacitance technology is selected by minimizing the series resistance for an overall capacitance of several tens or hundreds of mF. AVX BESTcap components can achieve this. We selected a 400 mF component for its low series resistance. For the tests, a current-limited supply is set to deliver a constant 100 mA, which slowly charges the capacitance, while a power MOSFET generates the high current pulses.
We have succeeded in supplying the equivalent of about 6 000 detector channels with a duty cycle of 10% (current pulses from 0 to 6 A lasting 10 ms) with an output stability of 20 mV/ms.
The next step will be to validate the functionality in a magnetic field, which is expected to be as high as 4 T at the ILC.
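The orders of magnitude in this section can be checked with a simple estimate. The sketch below treats the storage capacitance as an ideal capacitor (dV/dt = I/C), ignoring its series resistance and the recharging current, and assumes rectangular current pulses; reading the quoted output stability as a droop rate during the pulse is also an assumption, and the function names are hypothetical.

```python
def mean_current_a(peak_current_a, duty_cycle):
    """Average current drawn from the supply for rectangular current pulses."""
    return peak_current_a * duty_cycle


def droop_rate_mv_per_ms(pulse_current_a, capacitance_f):
    """Voltage droop rate of an ideal capacitor, dV/dt = I / C.

    1 V/s equals 1 mV/ms, so the numerical value needs no conversion.
    """
    return pulse_current_a / capacitance_f


def pulse_droop_mv(pulse_current_a, capacitance_f, pulse_ms):
    """Total droop over one pulse, ignoring series resistance and recharging."""
    return droop_rate_mv_per_ms(pulse_current_a, capacitance_f) * pulse_ms
```

For a 6 A pulse drawn from 400 mF, this ideal model gives about 15 mV/ms, or about 150 mV over a 10 ms pulse, which is of the same order as the measured 20 mV/ms. Similarly, a 10 A peak at an illustrative 2% duty cycle averages 0.2 A, in line with the mean current of a few 100 mA foreseen for the supply cable.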
Conclusion
A prototype of a DAQ system has been developed within the CALICE collaboration, taking into account the scalability and compactness constraints of a detector for an ILC experiment. The essential hardware components and the custom serial link have been developed by UK groups, while the DCC and most of the firmware have been developed at LLR and LAPP (interface to front-end electronics).
The operation of the system relies on the small amount of data generated by the detectors. This is due to embedded zero suppression in the front-end electronics chips, a low background noise environment and a duty cycle of a few per cent. A scalable and compact design is proposed, based on standard protocols (GigaEthernet, 8b/10b). A prototype version is being deployed for the 400 000 channels of a digital hadronic calorimeter to be tested in beam at the beginning of autumn 2011. Laboratory tests have shown satisfactory behavior of the system, with an overall bit error rate below 10⁻¹³. The installation of the whole system, including power pulsed front-end electronics, is ongoing, and the system has been validated with cosmic events.
Future versions will include higher speed links and will optimize the power consumption of the DAQ devices, which could also be partly power cycled.
This DAQ system will be integrated with the EUDAQ system in the scope of the Advanced European Infrastructures for Detectors at Accelerators (AIDA) project [3], funded by the 7th Framework Programme of the European Union.
