20 research outputs found

    Evolution in Computing Hardware

    Brief Description: This lecture will review key aspects of CERN’s central computing from mid-1970 until now; more than four decades corresponding to the speaker’s involvement with CERN. Lecturer's Short Bio: Sverre Jarp worked in the IT Department at CERN for over 40 years and held multiple managerial and technical positions promoting advanced but cost-effective, large-scale computing and data management solutions for the Laboratory. Today, as honorary staff at CERN, he retains an unabated interest in several IT areas, in particular processor architecture related to the domains of Big Data and High Throughput Computing, as well as application scalability based on vector and parallel programming. S. Jarp holds a degree in Theoretical Physics from the Norwegian University of Science and Technology (NTNU) in Trondheim.

    25 years ago: the strategic move to PCs in high-energy physics

    In September 1995, 25 years ago, a paper entitled “The PC as Physics Computer for LHC?” was presented at the CHEP-95 conference. It paved the way to a new era in high-energy physics computing. One of its authors, Sverre Jarp, recalls.

    The future of commodity computing and many-core versus the interests of HEP software

    As the mainstream computing world has shifted from multi-core to many-core platforms, the situation for software developers has changed as well. With the numerous hardware and software options available, making choices that balance programmability and performance is becoming a significant challenge. The expanding multiplicative dimensions of performance offer a growing number of possibilities that need to be assessed and addressed on several levels of abstraction. This paper reviews the major trade-offs forced upon the software domain by the changing landscape of parallel technologies, hardware and software alike. Recent developments, paradigms and techniques are considered with respect to their impact on the rather traditional HEP programming models. Other considerations addressed include aspects of efficiency and reasonably achievable targets for the parallelization of large-scale HEP workloads.
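    The "multiplicative dimensions of performance" mentioned in the abstract can be made concrete with back-of-the-envelope arithmetic; the hardware figures below are illustrative assumptions, not numbers from the paper.

    ```cpp
    #include <cassert>
    #include <cstdio>

    // Sketch of why the performance dimensions are "multiplicative":
    // a hypothetical server (figures are illustrative assumptions) where
    // leaving any one dimension unexploited divides the attainable peak.
    int main() {
        const int sockets        = 2; // socket-level parallelism (e.g. one MPI rank per socket)
        const int coresPerSocket = 8; // thread-level parallelism (e.g. OpenMP or TBB)
        const int simdLanes      = 8; // data-level parallelism (e.g. 8 floats per AVX vector)

        // The theoretical peak speed-up over scalar single-core code is the
        // product of the dimensions, not their sum.
        const int peakSpeedup = sockets * coresPerSocket * simdLanes;
        std::printf("theoretical peak speed-up over scalar single-core: %dx\n", peakSpeedup);
        assert(peakSpeedup == 128);
        return 0;
    }
    ```

    A purely sequential, scalar program on such a machine would thus reach at most 1/128 of the theoretical peak, which is the core of the trade-off the abstract describes.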

    Efficient, non-constraining allocation of computing grid resources using virtual environments

    Computing grids aggregate computing centers that autonomously manage their own resources. Grid users are scientists who group into user collaborations according to their project or application. Every collaboration has its own resource requirements. An application that executes on a grid is divided into tasks. Resource allocation consists in mapping tasks to servers for their execution, and it involves multiple interests across multiple administrative domains. Can grids efficiently orchestrate an allocation that involves multiple user collaborations and resource administrative domains with diverging interests? In order to answer this question, we present a formal model that allows us to reason on resource allocations, an architectural design pattern that separates the responsibilities of the different parties, a software solution for its deployment, and two novel cache-miss prediction techniques to help optimize allocation performance.
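    The task-to-server mapping problem described above can be sketched in a few lines; the greedy least-loaded heuristic below is an illustrative stand-in, not the thesis's actual allocation mechanism.

    ```cpp
    #include <algorithm>
    #include <cassert>
    #include <cstddef>
    #include <vector>

    // Minimal sketch of resource allocation as task-to-server mapping
    // (hypothetical example, not the thesis's model): each task has a cost,
    // and we greedily place it on the currently least-loaded server.
    std::vector<int> allocate(const std::vector<int>& taskCosts, int nServers) {
        std::vector<int> load(nServers, 0);          // accumulated cost per server
        std::vector<int> assignment(taskCosts.size());
        for (std::size_t t = 0; t < taskCosts.size(); ++t) {
            // index of the least-loaded server so far
            int s = static_cast<int>(
                std::min_element(load.begin(), load.end()) - load.begin());
            assignment[t] = s;
            load[s] += taskCosts[t];
        }
        return assignment;
    }

    int main() {
        // Three tasks, two servers: the first two spread out, the third
        // lands on whichever server is lighter after those placements.
        std::vector<int> a = allocate({4, 2, 3}, 2);
        assert((a == std::vector<int>{0, 1, 1}));
        return 0;
    }
    ```

    A real grid allocator must additionally respect the diverging interests of collaborations and administrative domains, which is precisely what the formal model and design pattern in the thesis address.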

    Many-core experience with HEP software at CERN openlab

    The continued progression of Moore's law has led to many-core platforms becoming easily accessible commodity equipment. The new opportunities that arose from this change have also brought new challenges: harnessing the raw computational potential of such a platform is not always a straightforward task. This paper describes practical experience from the work with many-core systems at CERN openlab and the differences observed with respect to their predecessors. We provide the latest results for a set of parallelized HEP benchmarks running on several classes of many-core platforms.

    Comparison of Software Technologies for Vectorization and Parallelization

    This paper demonstrates how modern software development methodologies can give an existing sequential application a considerable performance speed-up on modern x86 server systems. Whereas, in the past, speed-up was directly linked to the increase in clock frequency when moving to a more modern system, current x86 servers present a plethora of “performance dimensions” that need to be harnessed with great care. The application we used is a real-life data-analysis example in C++ analyzing High Energy Physics data. The key software methods used are OpenMP, Intel Threading Building Blocks (TBB), Intel Cilk Plus, and the auto-vectorization capability of the Intel compiler (Composer XE). Somewhat surprisingly, the Message Passing Interface (MPI) was also successfully added, although our focus is on single-node rather than multi-node performance optimization. The paper underlines the importance of algorithmic redesign in order to optimize each performance dimension and links this to close control of the memory layout in a thread-safe environment. The data-fitting algorithm at the heart of the application is very floating-point intensive, so the paper also discusses how to ensure optimal performance of mathematical functions (in our case, the exponential function) as well as numerical correctness and reproducibility. The test runs on single-, dual-, and quad-socket servers show, first of all, that vectorization of the algorithm (with either auto-vectorization by the compiler or the use of Intel Cilk Plus Array Notation) gives a speed-up of more than a factor of two when the data layout in memory is properly optimized. Using coarse-grained parallelism, all three approaches (OpenMP, Cilk Plus, and TBB) showed good parallel speed-up on the available CPU cores. The best result was obtained with OpenMP, but by combining Cilk Plus and TBB with MPI in order to tie processes to sockets, these two software methods closed the gap, and TBB came out with a slight advantage in the end.
    Overall, we conclude that the best implementation, in terms of both ease of implementation and the resulting performance, is a combination of the Intel Cilk Plus Array Notation for vectorization and a hybrid TBB and MPI approach for parallelization.
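    The memory-layout point made above can be illustrated with a minimal sketch; the structure-of-arrays layout and the exponential-heavy kernel below are hypothetical examples in the spirit of the paper, not its actual code.

    ```cpp
    #include <cassert>
    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Structure-of-arrays (SoA) layout: each attribute lives in its own
    // contiguous array, so the hot loop reads stride-1 streams that an
    // auto-vectorizing compiler can turn into SIMD code. An array-of-structs
    // layout would interleave the attributes and defeat this.
    struct EventsSoA {
        std::vector<double> x;       // one contiguous array per attribute
        std::vector<double> weight;
    };

    // Exponential-heavy kernel, reminiscent of the fitting code the paper
    // discusses: a straight stride-1 loop over both arrays.
    double weightedExpSum(const EventsSoA& ev) {
        double sum = 0.0;
        for (std::size_t i = 0; i < ev.x.size(); ++i)
            sum += ev.weight[i] * std::exp(-ev.x[i]);
        return sum;
    }

    int main() {
        // exp(-0) == 1 exactly, so the result is just the sum of the weights
        EventsSoA ev{{0.0, 0.0, 0.0}, {1.0, 2.0, 3.0}};
        assert(weightedExpSum(ev) == 6.0);
        return 0;
    }
    ```

    On top of such a vectorized kernel, coarse-grained parallelism (OpenMP, TBB, or Cilk Plus, as compared in the paper) would then split the event range across cores.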