Installing, Running and Maintaining Large Linux Clusters at CERN
Having built up Linux clusters to more than 1000 nodes over the past five
years, we already have practical experience confronting some of the LHC scale
computing challenges: scalability, automation, hardware diversity, security,
and rolling OS upgrades. This paper describes the tools and processes we have
implemented, working in close collaboration with the EDG project [1],
especially with the WP4 subtask, to improve the manageability of our clusters,
in particular in the areas of system installation, configuration, and
monitoring. In addition to the purely technical issues, providing shared
interactive and batch services which can adapt to meet the diverse and changing
requirements of our users is a significant challenge. We describe the
developments and tuning that we have introduced on our LSF based systems to
maximise both responsiveness to users and overall system utilisation. Finally,
this paper will describe the problems we are facing in enlarging our
heterogeneous Linux clusters, the progress we have made in dealing with the
current issues and the steps we are taking to gridify the clusters.
Comment: 5 pages, Proceedings for the CHEP 2003 conference, La Jolla,
California, March 24 - 28, 2003
Reducing the Barrier to Entry of Complex Robotic Software: a MoveIt! Case Study
Developing robot agnostic software frameworks involves synthesizing the
disparate fields of robotic theory and software engineering while
simultaneously accounting for a large variability in hardware designs and
control paradigms. As the capabilities of robotic software frameworks increase,
the setup difficulty and learning curve for new users also increase. If the
entry barriers for configuring and using the software on robots are too high,
even the most powerful of frameworks are useless. A growing need exists in
robotic software engineering to aid users in getting started with, and
customizing, the software framework as necessary for particular robotic
applications. In this paper a case study is presented for the best practices
found for lowering the barrier of entry in the MoveIt! framework, an
open-source tool for mobile manipulation in ROS, that allows users to 1)
quickly get basic motion planning functionality with minimal initial setup, 2)
automate its configuration and optimization, and 3) easily customize its
components. A graphical interface that assists the user in configuring MoveIt!
is the cornerstone of our approach, coupled with the use of an existing
standardized robot model for input, automatically generated robot-specific
configuration files, and a plugin-based architecture for extensibility. These
best practices are summarized into a set of barrier-to-entry design principles
applicable to other robotic software. The approaches for lowering the entry
barrier are evaluated through usage statistics and a user survey, and compared
against our design objectives for their effectiveness to users.
A configuration system for the ATLAS trigger
The ATLAS detector at CERN's Large Hadron Collider will be exposed to
proton-proton collisions from beams crossing at 40 MHz that have to be reduced
to the few 100 Hz allowed by the storage systems. A three-level trigger system
has been designed to achieve this goal. We describe the configuration system
under construction for the ATLAS trigger chain. It provides the trigger system
with all the parameters required for decision taking and to record its history.
The same system configures the event reconstruction, Monte Carlo simulation and
data analysis, and provides tools for accessing and manipulating the
configuration data in all contexts.
Comment: 4 pages, 2 figures, contribution to the Conference on Computing in
High Energy and Nuclear Physics (CHEP06), 13.-17. Feb 2006, Mumbai, India
Planning through Automatic Portfolio Configuration: The PbP Approach
In the field of domain-independent planning, several powerful planners implementing different techniques have been developed. However, none of these systems outperforms all others in every known benchmark domain. In this work, we propose a multi-planner approach that automatically configures a portfolio of planning techniques for each given domain. The configuration process for a given domain uses a set of training instances to: (i) compute and analyze some alternative sets of macro-actions for each planner in the portfolio identifying a (possibly empty) useful set, (ii) select a cluster of planners, each one with the identified useful set of macro-actions, that is expected to perform best, and (iii) derive some additional information for configuring the execution scheduling of the selected planners at planning time. The resulting planning system, called PbP (Portfolio-based Planner), has two variants focusing on speed and plan quality. Different versions of PbP entered and won the learning track of the sixth and seventh International Planning Competitions. In this paper, we experimentally analyze PbP considering planning speed and plan quality in depth. We provide a collection of results that help to understand PbP's behavior, and demonstrate the effectiveness of our approach to configuring a portfolio of planners with macro-actions.
Prototype of Fault Adaptive Embedded Software for Large-Scale Real-Time Systems
This paper describes a comprehensive prototype of large-scale fault adaptive
embedded software developed for the proposed Fermilab BTeV high energy physics
experiment. Lightweight self-optimizing agents embedded within Level 1 of the
prototype are responsible for proactive and reactive monitoring and mitigation
based on specified layers of competence. The agents are self-protecting,
detecting cascading failures using a distributed approach. Adaptive,
reconfigurable, and mobile objects for reliability are designed to be
self-configuring to adapt automatically to dynamically changing environments.
These objects provide a self-healing layer with the ability to discover,
diagnose, and react to discontinuities in real-time processing. A generic
modeling environment was developed to facilitate design and implementation of
hardware resource specifications, application data flow, and failure mitigation
strategies. Level 1 of the planned BTeV trigger system alone will consist of
2500 DSPs, so the number of components and intractable fault scenarios involved
make it impossible to design an `expert system' that applies traditional
centralized mitigative strategies based on rules capturing every possible
system state. Instead, a distributed reactive approach is implemented using the
tools and methodologies developed by the Real-Time Embedded Systems group.
Comment: 2nd Workshop on Engineering of Autonomic Systems (EASe), in the 12th
Annual IEEE International Conference and Workshop on the Engineering of
Computer Based Systems (ECBS), Washington, DC, April, 200