A Modular and Fault-Tolerant Data Transport Framework
The High Level Trigger (HLT) of the future ALICE heavy-ion experiment has to
reduce its input data rate of up to 25 GB/s to at most 1.25 GB/s before the
data is written to permanent storage. To cope with these data rates, a large
PC cluster system is being designed to scale to several thousand nodes
connected by a fast network. For the software that will run on these nodes, a
flexible data transport and distribution software framework, described in
this thesis, has been developed. The framework consists of a set of separate
components that can be connected via a common interface. This makes it
possible to construct different configurations for the HLT, which can even be
changed at runtime. To ensure fault-tolerant operation of the HLT, the
framework includes a basic fail-over mechanism that allows whole nodes to be
replaced after a failure. This mechanism will be further expanded in the
future, utilizing the runtime reconnection feature of the framework's
component interface. To connect cluster nodes, a communication class library
is used that abstracts from the actual network technology and protocol, so
that flexibility in the choice of hardware is retained. The library already
contains two working prototype versions, one for the TCP protocol and one for
SCI network adapters. Extensions can be added to it without modifications to
other parts of the framework. Extensive tests and measurements have been
performed with the framework; their results, as well as the conclusions drawn
from them, are also presented in this thesis. Performance tests show very
promising results, indicating that the system can fulfill ALICE's
requirements concerning data transport.
Comment: Ph.D. Thesis, Ruprecht-Karls-University Heidelberg, 251 pages
The ALICE Dimuon Spectrometer High Level Trigger
The ALICE Dimuon Spectrometer High Level Trigger (dHLT) is an on-line processing stage whose primary function is to select interesting events that contain distinct physics signals from heavy resonance decays, such as J/psi and Upsilon particles, amidst unwanted background events. It forms part of the High Level Trigger of the ALICE experiment, whose goal is to reduce the large data rate of about 25 GB/s from the ALICE detectors by an order of magnitude, without losing interesting physics events. The dHLT has been implemented as a software trigger within a high-performance and fault-tolerant data transportation framework, which is run on a large cluster of commodity compute nodes. To reach the required processing speeds, the system is built as a concurrent system with a hierarchy of processing steps. The main algorithms perform partial event reconstruction, starting with hit reconstruction on the level of the raw data received from the spectrometer. A tracking algorithm then finds track candidates from the reconstructed hit points. Physical parameters such as momentum are extracted from the track candidates, and finally a dHLT decision to read out the event is made based on certain trigger criteria. Various simulations and commissioning tests have shown that the dHLT can expect a background rejection factor of at least 5 compared to hardware triggering alone, with little impact on the signal detection efficiency.
Towards automation of computing fabrics using tools from the fabric management workpackage of the EU DataGrid project
This article describes the architecture behind the designed fabric management system and the status of the different developments. It also covers the experience with an existing tool for automated configuration and installation that has been adapted and used from the beginning to manage the EU DataGrid testbed, which is now used for LHC data challenges.