Abstract: High availability and reliability are among the most desirable features of control systems in modern high-energy physics (HEP) and other large-scale scientific experiments. One recent development that has influenced this field is the emergence of the Advanced Telecommunications Computing Architecture (ATCA). Designed for the telecommunications industry, it has been successfully applied in other domains, such as accelerator control systems. A good example is the application of the ATCA standard to the design of the low-level RF (LLRF) control system for the X-Ray Free Electron Laser (XFEL) being developed at Deutsches Elektronen Synchrotron (DESY). Reliability and availability requirements play a crucial role for such a facility. Thus, the ATCA standard, with its five-nines availability, is considered one of the best candidates for this system. This paper focuses on the central management unit of every ATCA board, namely, the intelligent platform-management controller (IPMC), developed here for the LLRF ATCA carrier board. It is also argued that a fully functional IPMC can be created using only the base specifications, which is a more economical solution than acquiring such products from ATCA vendors. This work supports the concept of an open-source community solution for IPMC/MMC development under the xTCA for Physics collaboration and aims to contribute to it. The IPMC solution presented here is largely hardware independent: the code is organized so that low-level device drivers are separated from the high-level application logic dealing with the ATCA standard, which makes it portable to new carrier-board designs. It also follows the latest trends in xTCA development introduced by the xTCA for Physics initiative. A firmware upgrade mechanism for programmable devices (field-programmable gate arrays and digital signal processors) has been proposed. It is not currently included in the standard, but it is needed in HEP applications that use xTCA.
control subsystems of these activities is concerned [1]. For many years, this goal was not easy to accomplish. Pieces of equipment used in the experiments often came from different vendors and were not compatible with each other. This was caused by the lack of widely accepted standards on which the production and application of this equipment could be based. In mid-2004, however, a new standard emerged. The Advanced Telecommunications Computing Architecture (ATCA), although designed primarily for the telecommunications industry, has found use in other applications, including high-energy physics experiments and accelerator control systems [2]-[4]. At the heart of the coveted reliability and availability features of ATCA is the intelligent platform management interface (IPMI) specification [5]. It has been adopted by ATCA as the communication protocol of the standard. Each ATCA board must be equipped with an intelligent platform-management controller (IPMC), which is responsible for controlling the hot-swap activation and deactivation of the board, managing the onboard sensors, and verifying that the values they report stay within specified thresholds [6]. These controllers monitor the most crucial components of the entire system and play an important role in the early detection of any failures that might arise. This, in turn, enables the user to remove or replace a faulty module or take the necessary actions to prevent an impending failure. Any messages concerning these events that need to be delivered to the active Shelf Manager (ShM) are processed and sent by the IPMC. That is why it is a vital component of the system, and its proper operation is crucial for upholding the high levels of reliability and availability [7]. Implementations of the IPMC are hardware dependent and, thus, ready-made solutions are neither easily obtainable nor cheap.
Although reference designs offered by some vendors exist, their flexibility implies wasted resources when they are used in less-demanding applications. On the other hand, more demanding projects, which include specialized hardware, require drivers for these devices to be written separately from the available IPMC solutions. These drivers need to be integrated into the ready-made product, e.g., for advanced mezzanine card (AMC) sensor data record (SDR) management [8] or electronic keying (EK) purposes [9], [10]. This could be a difficult task, considering the limited access to the source code offered by some vendors.
II. XTCA FOR PHYSICS
Since many projects include the development of custom, but still standard-compliant, hardware, creating a custom IPMC as well is feasible. This paper focuses on one such application, developed for the low-level RF control system of the XFEL accelerator [11]. The application was designed in the Department of Microelectronics and Computer Science (DMCS) at the Technical University of Lodz (TUL), Poland. The paper describes the IPMC designed for the ATCA carrier boards (CB) used in the system.
0018-9499/$26.00 © 2011 IEEE
However, an increasing number of physics experiments in research centers around the world are interested in using xTCA systems for their purposes [12]-[15]. This led to the formation of the "xTCA for Physics Coordination Committee," whose main goal is to address the specific requirements of adapting AdvancedTCA and MicroTCA to physics experiments. The authors of this paper are involved in the work on the new specification, which should cover such features as the firmware upgrade of reprogrammable devices [field-programmable gate arrays (FPGAs)], electronic keying of high-voltage signals, and monitoring of clock distribution networks. The committee's other goals are standardization and simplification of the hardware and software. The implementation of the IPMC proposed by the authors complies with the general guidelines on modularity, portability, and manageability of the code.
Most physicists want to focus on the data provided by the system and not on its internal workings. The lack of an IPMC (or MMC) for their custom-made xTCA-based blades often creates a bottleneck in the development of the entire system. That is why the introduction of the xTCA for Physics specification extension is an important step toward eliminating this problem. Opening the source code of the implementation designed by the authors should also bring the community closer to that goal. The code still needs some organization and cleanup, but this should be accomplished in the near future.
III. ATCA CARRIER BOARD FOR THE LLRF SYSTEM
In order for the CB to be compatible with other ATCA boards as well as ShM cards controlling the shelf, it needs to comply with the standard as far as hardware and software are concerned [16] .
The CB communicates with the ShM over a redundant bus called IPMB-0 and with up to three AMCs over IPMB-L. Other peripheral devices are connected to the one remaining bus. These include voltage, temperature, and current sensors, I/O port expanders, an external EEPROM memory chip, an integrated clock generator, cross switches, and an integrated clock buffer (see Fig. 1 [9]).
IV. IPMC REQUIREMENTS IMPLEMENTED ON THE LLRF ATCA CARRIER BOARD
The custom IPMC designed by the authors needs to fulfill specific requirements that follow the general requirements of the LLRF system architecture as well as the hardware available on the CB itself. These features are:
• compliance with the basic IPMI commands required by the ATCA standard;
• compliance with the PICMG 3.0 extension commands;
• CB management, including hot-plug functionality, EK functionality for GbE, custom PCIe and LLL, blue light-emitting diode (LED) control, hardware-address (HA) recognition;
• management of three AMC modules, including hot-plug functionality, EK functionality, power-supply control;
• firmware upgrade of programmable devices (FPGAs and DSP);
• external sensor monitoring for stand-alone devices (e.g., MAX6626) and integrated devices (e.g., ATC210), including voltage, temperature, and current sensors;
• debugging and diagnostic functionality;
• economical, easy to implement, and low-area solution.

The messaging between the IPMC on the CB and the one on the ShM takes into account the requirements and structure given in version 2.0 of the IPMI specification and the PICMG 3.0 Revision 3.0 document [17]. All of the messages that are mandatory, as defined by those documents, have been implemented, allowing the CB to operate in all standard-compliant shelves. This implementation includes the EK feature, which allows dynamic configuration of links between the boards in one shelf [18] as well as between the AMC modules and other devices in the shelf.
There are four channels supported by the IPMC. Two of those are used for the redundant intelligent platform-management bus (IPMB-0) over which the CB communicates with the ShM; one is used for the local IPMB bus (IPMB-L) to which the AMC modules are connected; and the remaining one serves the onboard peripheral devices. A maximum of three single-width AMC modules can be controlled by the CB. This limit results from the physical design of the hardware and not from any inadequacy of the IPMC, which can be adapted to deal with more or fewer AMC modules. Again, the communication between these components is fully in accordance with the documents mentioned before. It also follows the AMC.0 Revision 2.0 specification [8], which describes in detail the communication patterns between CBs and AMCs.
The HA translates directly to an address that the board needs to use on the IPMB-0 bus. Its recognition is performed on the CB with the aid of the I/O expanders, which collect this information from the backplane, as detailed in the specification. Two more crucial ATCA-specific elements are the hot-swap handle and the blue light-emitting diode (LED). The former is required for the board to be safely removed and replaced on the fly without the need to stop the entire system, and the latter indicates the state of the board to the user, who can perform specific actions depending on this information. The CB IPMC is able to detect changes in the state of the hot-swap handle as well as control the blue LED as defined by the requirements.
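The HA-to-address translation mentioned above can be illustrated with a minimal sketch. It assumes, as we understand PICMG 3.0, that the backplane supplies a 7-bit hardware address plus an odd-parity bit on the eighth line, and that the 8-bit IPMB-0 address is the hardware address shifted left by one bit. The function names are hypothetical and are not taken from the authors' code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when the eight hardware-address lines carry odd parity.
 * Assumption: the top line (HA7) is an odd-parity bit over all eight lines. */
bool ha_parity_ok(uint8_t ha_pins)
{
    unsigned ones = 0;
    for (unsigned bit = 0; bit < 8; ++bit)
        ones += (ha_pins >> bit) & 1u;
    return (ones & 1u) == 1u;
}

/* Converts the raw value read from the I/O expander into the 8-bit
 * IPMB-0 address. Returns 0 when the parity check fails. */
uint8_t ha_to_ipmb_address(uint8_t ha_pins)
{
    if (!ha_parity_ok(ha_pins))
        return 0;
    uint8_t hw_addr = ha_pins & 0x7Fu;   /* drop the parity bit */
    return (uint8_t)(hw_addr << 1);      /* 7-bit HA -> 8-bit bus address */
}
```

Under these assumptions, a board reading 0xC1 on its address lines (hardware address 0x41 with the parity bit set) would respond at IPMB-0 address 0x82.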
In order to eliminate errors in the IPMC development, it has been designed with the support of a debugging feature. It can work with the IPMI analyzer, developed in parallel, or send plain-text messages over the serial interface directly to the user without the need for extra equipment [19] .
The IPMC reads data from all of the sensors present on the CB and is capable of sending event messages to the ShM whenever any of the thresholds specified for any of the sensors is crossed. These messages are then interpreted by the ShM and stored in the system event log (SEL) where the system manager can spot unusual behavior and take actions accordingly. An example of this action would be increasing the speed of fans in the shelf for overtemperature alerts. This mechanism greatly contributes to the overall stability and reliability of the system.
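The threshold mechanism described above can be sketched as follows. This is a simplified illustration and not the authors' implementation: it uses a single lower and upper limit per sensor and no hysteresis, whereas IPMI defines several threshold levels. All type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    SENSOR_OK,
    SENSOR_LOWER_CROSSED,
    SENSOR_UPPER_CROSSED
} sensor_state_t;

typedef struct {
    int16_t lower_limit;
    int16_t upper_limit;
    sensor_state_t state;   /* state reported most recently */
} sensor_t;

/* Classifies a raw reading against the sensor's two limits. */
sensor_state_t sensor_classify(const sensor_t *s, int16_t reading)
{
    if (reading > s->upper_limit)
        return SENSOR_UPPER_CROSSED;
    if (reading < s->lower_limit)
        return SENSOR_LOWER_CROSSED;
    return SENSOR_OK;
}

/* Stores the new state and returns true when it differs from the previous
 * one, i.e., when an event message for the ShM should be queued. */
bool sensor_update(sensor_t *s, int16_t reading)
{
    sensor_state_t next = sensor_classify(s, reading);
    bool changed = (next != s->state);
    s->state = next;
    return changed;
}
```

Reporting only state changes keeps the bus quiet while a reading stays beyond a limit; one message is sent when the threshold is crossed and one when the reading returns to the normal range.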
V. IPMC HARDWARE AND SOFTWARE DETAILS
Taking into consideration the requirements set for the IPMC, suitable hardware and software solutions needed to be found. Some of the requirements force the designer to use specific components, such as those for power distribution. Still, there is a wide range of possibilities to choose from as far as the sensors, the microprocessor for IPMC, or the software implementation are concerned.
The IPMC code can be easily divided into several parts which correspond directly to the LLRF system requirements for this solution. The major software components, which will be described in more detail in the following sections, are:
• IPMI library, including functions dealing with all of the base IPMI and PICMG 3.0 commands;
• communication section, including functions dealing with messaging between the IPMC and the ShM or AMCs;
• AMC-management section for proper AMC activation, deactivation, and EK operation;
• sensor-management section for initializing and controlling the onboard sensors and detecting their events;
• firmware upgrade section for programmable devices and the DSP;
• debugging section for sending human-readable output over a serial interface for diagnostic and verification purposes.
A. Microprocessor Selection
The IPMC on the CB is implemented using an Atmel AVR ATmega1281 microprocessor. It comes with 128 kB of Flash program memory, 8 kB of SRAM, and 4 kB of internal EEPROM. This chip was chosen so that its resources would be used as fully as possible: about 70% of the program memory and just under 90% of the SRAM are occupied. The internal EEPROM is used for storing the debug messages sent over the serial interface. More importantly, it contains the field replaceable unit (FRU) information that each IPMC is required to implement. The data stored there are supplied to the ShM and describe all of the crucial features of the CB, such as power consumption, the number of supported AMCs, and the supported interfaces used in EK negotiations. Although this information may be modified at runtime, this is a very rare occurrence and, thus, storing it in nonvolatile memory is justifiable. This is also a requirement of the standard.
This microprocessor was chosen as the best compromise between price and efficiency. The usage of memory and peripheral resources shows that a smaller chip might not have been able to handle the IPMC functionality, while a bigger one would waste resources. Also, the tools used to write the source code are easily obtainable and free. These include the GNU C toolchain adapted for the AVR architecture and an Integrated Development Environment available from the Atmel website. The chip itself is inexpensive compared with more powerful microprocessors.
B. Code Organization
As mentioned before, the internal operation of the IPMC is hardware dependent. It needs to know what kind of sensors and other peripheral devices are present on the board and how to communicate with them. Also, the hardware implementation of the bus controllers may vary from board to board, and the IPMC has to be able to read and send the IPMI messages over the IPMB buses. The ATmega1281 present on the CB has only one built-in two-wire interface (TWI) controller, which is used for communication with the AMC modules; all of the other buses are managed by external chips connected to the IPMC by a parallel bus. Other implementations may use microprocessors with a higher number of built-in controllers or use other devices altogether, such as field-programmable gate arrays (FPGAs).
The IPMC source code, written in C, is unaffected by some of these variations because of the way it has been structured. Most of the interactions between the IPMC and the low-level devices have been wrapped in high-level functions, with an API documented in Doxygen-generated documents. Thus, the device-driver part of the design is separated from the logic of the program. Similarly, the reception and transmission of the IPMI messages is largely independent of the logic processing this information. The low-level device driver monitors the bus, collects the data, and inserts them into a receive queue, from which a high-level function reads and analyzes the entire packet. Likewise, the response message is placed in a send queue, and the low-level device driver deals with sending it over the interface. If the hardware changes, only those drivers need to be changed, while the core logic remains untouched. The structure of the code is presented in Fig. 2.
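One common way to obtain this kind of driver/logic separation in C is a table of function pointers that the high-level code calls without knowing which hardware sits behind it. The sketch below is only an illustration of that idea under our own assumptions; the interface, the loopback driver, and all names are hypothetical and do not reproduce the authors' API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IPMB_MAX_FRAME 32u

/* Interface that every low-level bus driver has to provide. */
typedef struct {
    int (*send)(const uint8_t *frame, size_t len);
    int (*recv)(uint8_t *frame, size_t max_len);
} ipmb_driver_t;

/* Loopback driver standing in for a TWI or external-controller driver. */
static uint8_t loop_frame[IPMB_MAX_FRAME];
static size_t  loop_len;

int loopback_send(const uint8_t *frame, size_t len)
{
    if (len > IPMB_MAX_FRAME)
        return -1;
    memcpy(loop_frame, frame, len);
    loop_len = len;
    return 0;
}

int loopback_recv(uint8_t *frame, size_t max_len)
{
    size_t n = loop_len < max_len ? loop_len : max_len;
    memcpy(frame, loop_frame, n);
    loop_len = 0;
    return (int)n;   /* number of bytes delivered */
}

const ipmb_driver_t loopback_driver = { loopback_send, loopback_recv };

/* High-level logic: validates the frame length and delegates to whichever
 * driver was passed in. It contains no hardware-specific code. */
int ipmc_send_message(const ipmb_driver_t *drv,
                      const uint8_t *frame, size_t len)
{
    if (len == 0 || len > IPMB_MAX_FRAME)
        return -1;
    return drv->send(frame, len);
}
```

Porting such a design to a new carrier board would then amount to supplying another `ipmb_driver_t` instance, while `ipmc_send_message` and the code above it stay unchanged.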
For debugging purposes, an additional application has been developed that gives the user control over most of the onboard peripherals. Through its graphical user interface, the user can read the values of the sensors and the I/O port expanders, reconfigure the sensors, check and change the LED states, etc. [20]. Such a tool is invaluable in the early stages of system development, enabling verification of the hardware and software components. This is even more important in HEP applications where ATCA is used and completely new systems are being designed. Some features are not defined by the standards, and good debugging and diagnostic facilities help overcome initial design issues. The functions that communicate with this application can be switched on or off when the IPMC is compiled and, if on, they do not interfere with the IPMC functionality.
C. Real-Time Pseudokernel
A system controlling high-energy physics experiments, of which high reliability and availability are required, should respond to events as quickly as possible. The external events include the arrival of IPMI messages, sensor interrupts, and user hot-swap interaction. Fast message processing maximizes the utilization of the bus because no timeout-driven message repetitions occur if the original message is handled and answered quickly. A fully featured real-time operating system would consume too many resources and be too complicated to implement and analyze. That is why a hybrid pseudokernel has been implemented to perform real-time actions for this application. It employs event-driven cyclic executives in conjunction with external device interrupts [21]. The interrupt service routines (ISRs) have been reduced to a minimum in order to minimize the response time of the application. Nested interrupts are disabled by default in the ATmega1281 and are not enabled in the proposed solution, in order to avoid the overhead associated with pushing register values onto the stack and popping them from it. Thus, the faster one interrupt is serviced, the sooner the next one will be. The major role of the ISRs in this solution is to feed the event-driven part of the code with specific events, at which point the cyclic-executive nature of the application takes over. This approach involves executing short processes in a continuous loop, as depicted in Table I.
The CB-related events include opening or closing of the hot-swap handle by the user, alerts from external sensors, requests for EK link reconfiguration, bus-stuck alerts, watchdog alerts, and requests for IPMI message transmission to the ShM. The AMC-related events include requests for TWI reconfiguration, requests for IPMI message transmission to an AMC, AMC insertion/removal alerts, and requests for sending a ping message to an AMC. This way, all of the major events that influence the operation of the IPMC are dealt with in one place, where it is easy to control and modify them. For example, adding or removing a sensor only requires adding an event to or removing it from the event list; no changes to the core loop have to be made.
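The combination of short ISRs and an event-driven cyclic executive can be sketched as shown below. This is a minimal illustration under our own assumptions, not the authors' pseudokernel: the event set is reduced to three entries, the handlers only count invocations, and all names are hypothetical. On the real target, the flag update in the main loop would additionally have to be protected against concurrent interrupts.

```c
#include <stdint.h>

typedef enum {
    EVT_HANDLE_CHANGED,    /* hot-swap handle opened or closed */
    EVT_SENSOR_ALERT,      /* external sensor crossed a threshold */
    EVT_IPMI_TX_REQUEST,   /* a message is waiting to be sent */
    EVT_COUNT
} event_id_t;

/* One pending bit per event; written by ISRs, read by the main loop. */
static volatile uint8_t pending_events;

/* Invocation counters, used here only to make the example observable. */
unsigned handled_count[EVT_COUNT];

/* Called from an ISR: record the event and return immediately. */
void event_raise(event_id_t id)
{
    pending_events |= (uint8_t)(1u << id);
}

static void on_handle_changed(void)  { handled_count[EVT_HANDLE_CHANGED]++; }
static void on_sensor_alert(void)    { handled_count[EVT_SENSOR_ALERT]++; }
static void on_ipmi_tx_request(void) { handled_count[EVT_IPMI_TX_REQUEST]++; }

typedef void (*event_handler_t)(void);

/* The event list: supporting a new event means adding an entry here. */
static const event_handler_t handlers[EVT_COUNT] = {
    on_handle_changed,
    on_sensor_alert,
    on_ipmi_tx_request
};

/* One pass of the cyclic executive. Returns the number of events serviced. */
unsigned executive_run_once(void)
{
    unsigned serviced = 0;
    for (int id = 0; id < EVT_COUNT; ++id) {
        uint8_t mask = (uint8_t)(1u << id);
        if (pending_events & mask) {
            /* On an AVR this clear would be done with interrupts disabled. */
            pending_events &= (uint8_t)~mask;
            handlers[id]();
            ++serviced;
        }
    }
    return serviced;
}
```

In firmware, `executive_run_once()` would be called from an endless loop in `main`, so each handler has to be short enough not to delay the handlers that follow it.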
D. IPMI Message Processing
IPMI message processing is a crucial activity of the main loop of the IPMC. The communication between the CB and the ShM or AMCs should be carried out as smoothly as possible and, although the IPMI standard allows unanswered requests to be resent, such occurrences should be kept to a minimum in order not to overload the buses. At the level of the high-level functions, the whole process can be subdivided into four parts: 1) read an IPMI message from the receive buffer; 2) process the message; 3) formulate the response and put it in the transmit buffer; 4) indicate a message-ready event. Before the first action and after the last one, the low-level drivers are responsible for putting the received message into the input buffer and transmitting the response out of the output buffer, respectively. The input and output buffers are implemented in the form of first-in first-out (FIFO) queues, eight messages deep, where each message can hold up to 32 B, a limit that is also defined by the standard.
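A queue with the dimensions given above (eight messages of up to 32 B each) can be sketched as a fixed-size ring buffer. The code is an illustration under our own assumptions rather than the authors' implementation; the type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define FIFO_DEPTH   8u    /* messages per queue */
#define MSG_MAX_LEN 32u    /* bytes per message */

typedef struct {
    uint8_t len;
    uint8_t data[MSG_MAX_LEN];
} ipmi_msg_t;

typedef struct {
    ipmi_msg_t slots[FIFO_DEPTH];
    uint8_t rd;      /* index of the oldest message */
    uint8_t wr;      /* index of the next free slot */
    uint8_t count;   /* number of stored messages */
} msg_fifo_t;

/* Stores one message. Fails when the queue is full or the length is
 * outside 1..MSG_MAX_LEN. */
bool fifo_put(msg_fifo_t *q, const uint8_t *data, size_t len)
{
    if (q->count == FIFO_DEPTH || len == 0 || len > MSG_MAX_LEN)
        return false;
    ipmi_msg_t *slot = &q->slots[q->wr];
    memcpy(slot->data, data, len);
    slot->len = (uint8_t)len;
    q->wr = (uint8_t)((q->wr + 1u) % FIFO_DEPTH);
    q->count++;
    return true;
}

/* Copies the oldest message to *out. Fails when the queue is empty. */
bool fifo_get(msg_fifo_t *q, ipmi_msg_t *out)
{
    if (q->count == 0)
        return false;
    *out = q->slots[q->rd];
    q->rd = (uint8_t)((q->rd + 1u) % FIFO_DEPTH);
    q->count--;
    return true;
}
```

The whole structure occupies a little over 260 bytes, so two such queues per channel take a noticeable share of an 8-kB SRAM; this is one reason for keeping the depth small.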
The first function merely copies the message indicated by the read pointer of the FIFO queue to a local buffer and passes control to the processing function. This procedure verifies the checksums of the message and analyzes the frame itself according to the IPMI core specification as well as the PICMG 3.0 documentation. The messages are divided into several groups following the division of the latter document (see Table II). The first five categories represent the core IPMI specification commands, while the last two cover the PICMG 3.0 command extensions.
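The checksum verification step can be illustrated as follows. An IPMB frame carries two checksums, each the two's complement of the 8-bit sum of the bytes it covers: the first protects the two-byte connection header, and the second protects the rest of the message. The sketch below checks both; the function names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Two's-complement checksum over a block of bytes. */
uint8_t ipmb_checksum(const uint8_t *bytes, size_t len)
{
    uint8_t sum = 0;
    for (size_t i = 0; i < len; ++i)
        sum = (uint8_t)(sum + bytes[i]);
    return (uint8_t)(0u - sum);
}

/* Checks both checksums of a frame laid out as
 *   rsSA, netFn/rsLUN, chk1, rqSA, rqSeq/rqLUN, cmd, data..., chk2.
 * A block that includes its own checksum sums to zero modulo 256, so
 * the checksum computed over such a block is zero. */
bool ipmb_frame_valid(const uint8_t *frame, size_t len)
{
    if (len < 7)   /* shortest frame: header, chk1, rqSA, rqSeq, cmd, chk2 */
        return false;
    return ipmb_checksum(frame, 3) == 0 &&
           ipmb_checksum(frame + 3, len - 3) == 0;
}
```

For example, the frame `20 18 C8 82 04 01 79` passes the check: 0x20 + 0x18 + 0xC8 and 0x82 + 0x04 + 0x01 + 0x79 both equal 0x100, which is zero modulo 256.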
Appropriate actions are taken when the message is processed. Some are visible to the user (a blinking LED) while some are not (an exchange of sensor information). In the case of request messages, a reply is formed and sent using a FIFO scheme similar to the one used for input buffering.
E. Low Latency Protocol Special Considerations
The application of PCIe in the LLRF ATCA-based system is a nonstandard one. Usually, a PCIe switch is inserted into one of the slots and the other blades connect to it in a star configuration. However, this setup was not suitable here because a PCIe switch introduces additional, undesirable latency. That is why the blades need to be connected in a daisy chain. This means, however, that if a board is inserted into a working system, some connections may need to be reconfigured. Consider a scenario where two boards are present in a shelf. One occupies slot number 1 and the other occupies slot number 5. During activation, the shelf manager issues "Set Port State (Enabled)" commands to both blades, and a connection is made. When a third board is inserted into slot number 3, the shelf manager sends more commands, which would enable all possible connections. The implemented algorithm, however, causes the connection 1-5 to be broken and only two connections to be made, that is, 1-3 and 3-5. A 5-1 connection cannot be and is not established. This way, a daisy chain is formed, and the unwanted connections are discarded. The IPMC configures the PCIe cross-switch present on each blade in order to achieve that goal.
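The rule behind this scenario can be stated compactly: a link between two slots is kept only if both slots are occupied and no occupied slot lies between them. The following sketch expresses that rule under our own assumptions; it is not the authors' algorithm, the slot count and the occupancy bitmap are hypothetical, and the real IPMC acts on "Set Port State" commands rather than on a global view of the shelf.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_SLOTS 16u   /* assumed shelf size for this illustration */

/* Bit (slot - 1) of `occupied` is set when a board sits in that slot. */
bool slot_occupied(uint16_t occupied, unsigned slot)
{
    return slot >= 1 && slot <= MAX_SLOTS &&
           ((occupied >> (slot - 1)) & 1u) != 0;
}

/* A link between slots a and b belongs to the daisy chain only when both
 * slots are occupied and no occupied slot lies strictly between them. */
bool daisy_link_allowed(uint16_t occupied, unsigned a, unsigned b)
{
    if (a == b || !slot_occupied(occupied, a) || !slot_occupied(occupied, b))
        return false;
    unsigned lo = a < b ? a : b;
    unsigned hi = a < b ? b : a;
    for (unsigned s = lo + 1; s < hi; ++s)
        if (slot_occupied(occupied, s))
            return false;
    return true;
}
```

With boards in slots 1 and 5 only, the 1-5 link is allowed; once slot 3 is populated, 1-5 is rejected while 1-3 and 3-5 are accepted, which reproduces the scenario described above.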
This specific algorithm is based on the event mechanism described previously. Whenever the IPMC receives a command from the ShMC telling it to enable or disable a connection, an event is indicated. This enables easy implementation of various algorithms suitable for a given application.
Another interface used on the LLRF CB is the low-latency link (LLL). The boards need to be configured in a master-slave setup as far as this interface is concerned. In order to do that, the IPMC needs to communicate with the FPGA, which uses built-in transceivers to implement the LLL. Again, IPMC events are used to control the configuration algorithm.
VI. FUTURE PLANS: UPGRADE OF PROGRAMMABLE DEVICES FIRMWARE
Computation blades used in HEP experiments require high processing power, often provided by field-programmable gate arrays (FPGAs) or digital signal processors (DSPs). Moreover, their functionality does not stay constant throughout the duration of the experiment. This means that the firmware running on them needs to be changed, which can prove difficult in high-radiation environments or in systems with limited human accessibility. In any case, manually reprogramming several dozen FPGA devices with physical programmers would require a lot of time and effort. This is one of the problems xTCA for Physics is planning to deal with by developing a standard for firmware transmission using the IPMC. This can be based on the HPM.1 specification for the IPMC firmware upgrade. The authors of the proposed solution tested the FPGA firmware upgrade by using an Ethernet connection to a PC, which uploaded the generated bitstream to the IPMC. It was then sent to the FPGA using the JTAG interface.
The FPGA firmware upgrade is very important for the LLRF system for which this particular IPMC solution has been developed. The implementation of the LLRF controller may change during operation of the machine, and it is not possible to reprogram the devices on many ATCA boards manually. However, the IPMB-0 bus is not suitable for this because of its very low speed (100 kb/s). Uploading a 4-MB FPGA bitstream would take between 5 and 10 min. Considering that many of these devices often need to be reprogrammed, such long periods of time are unacceptable. Besides IPMI-over-Ethernet, with speeds around 100 Mb/s, the links in the FPGA itself (PCIe, GbE, RapidIO) can be used. This kind of firmware upgrade is currently under development by the authors. The IPMC would then be responsible for switching the firmware version, since backup copies of working, older versions have to be available. The FPGA would use a built-in bootloader to copy the firmware into its structure. The HPM.1 algorithms are well suited for this purpose. Only the medium needs to be changed to achieve faster data rates.
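The upload-time estimate quoted above can be reproduced with simple arithmetic. The sketch below assumes that 4 MB means 4 000 000 bytes and introduces a payload fraction to account for protocol overhead; the value of 0.6 used in the example that follows is purely illustrative and is not taken from the source.

```c
#include <stdint.h>

/* Time in seconds needed to move `bytes` over a link running at
 * `bit_rate_bps`, of which only `payload_fraction` (0..1] carries
 * bitstream data; the rest is assumed to be protocol overhead. */
double transfer_seconds(uint64_t bytes, double bit_rate_bps,
                        double payload_fraction)
{
    return ((double)bytes * 8.0) / (bit_rate_bps * payload_fraction);
}
```

With no overhead, `transfer_seconds(4000000, 100e3, 1.0)` gives 320 s, or about 5.3 min, which matches the lower end of the quoted range. With the illustrative payload fraction of 0.6, the result is about 533 s (8.9 min). Over a 100-Mb/s link the same bitstream would need roughly 0.3 s before overhead.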
The development of EK for LLRF CB indicated some problems that are not solved in the available PICMG specifications and should be standardized in xTCA for Physics. These include communication between the IPMC and other onboard devices, usually FPGAs.
VII. SUMMARY
The implemented IPMC has performed well as the heart of the CB used in the ATCA-based LLRF system for FLASH tests. A few problems were encountered in the first sessions, all of which have since been eliminated. The IPMC has been shown to work on CBs equipped with different sensors, taking advantage of the driver/logic separation. The activation and deactivation procedures defined in the PICMG 3.0 specification have been followed directly, as have the interactions with AMC modules and the EK processes. Both PCIe and GbE links have been established.
By demonstrating all of the aforementioned functionality, it has been shown that it is possible to develop and apply custom IPMC code that is able to work in conditions requiring the high availability and reliability levels promised by the ATCA standard. This solution is economically sound when compared to those offered by major vendors dealing with this architecture. At the same time, it gives the firmware developer full control of the features that need to be implemented for a given application. This, in turn, allows the system resources to be used efficiently. Also, as shown by the firmware upgrade example, future improvement of the IPMC features is possible and easy. This makes it possible to follow the developments of xTCA for Physics in order to provide a solution well suited for systems working in HEP experiments. One of these developments is linking the IPMC with other programmable devices on the same blade. The firmware upgrade of these devices, advanced EK for clock signals, and a dedicated, standardized IPMC-to-FPGA bus are among the topics that need to be discussed and introduced in the new specification. Accepting suggestions from the physics community and making the source code open enables these improvements to be implemented and tested, which should greatly reduce the time needed to set up an xTCA-based experiment. It will also pave the way for further evolution of the IPMC specification.
