893 research outputs found
Study of the data acquisition network for the triggerless data acquisition of the LHCb experiment and new particle track reconstruction strategies for the LHCb upgrade
The LHCb experiment will receive a major upgrade by the end of February 2021. This upgrade will allow the recording of proton-proton collision data at with an instantaneous luminosity of , making possible measurements of unprecedented precision in the and -quark flavour sectors.
For taking advantage of the increased luminosity provided, the data acquisition system will receive a substantial upgrade. The upgraded system will be capable of processing the full collision rate of , without any low-level hardware preselection. This new design constraint poses a non-trivial technological challenge, both from a networking and computing point of view.
A possible design of a data acquisition network is presented, and low-level network simulations are used to validate the design. Those simulations use an accurate behavioural model developed and optimised for this specific purpose.
It is mandatory to optimise the reconstruction algorithms using a computing and physics approach, to perform the online reconstruction of the full collisions rate. A new parametrisation of the charged particles' bending generated by the dipole of the LHCb experiment is presented. The accuracy of the model is tested against Monte Carlo data. This strategy can reduce by a factor four the size of the search windows needed in the SciFi sub-detector. The LookingForward algorithm in the Allen framework uses this model
An empirical evaluation of techniques for parallel simulation of message passing networks
209 p.[EN]In the field of computer design, simulation is an essential tool to validate and evaluate architectural proposals. Conventional simulation techniques, designed for their use in sequential computers, are too slow if the system to simulate is large or complex. The aim of this work is to search for techniques to accelerate simulations exploiting the parallelism available in current, commercial multicomputers, and to use these techniques to study a model of a message router. This router has been designed to constitute the communication infrastructure of a (hypothetical) massively parallel computer.
Three parallel simulation techniques have been considered: synchronous, asynchronous-conservative and asynchronous-optimistic. These algorithms have been implemented in three multicomputers: a transputer-based Supernode, an Intel Paragon and a network of workstations. The influence that factors such as the characteristics of the simulated models, the organization of the simulators and the characteristics of the target multicomputers have in the performance of the simulations has been measured and characterized.
It is concluded that optimistic parallel simulation techniques are not suitable for the considered kind of models, although they may provide good performance in other environments. A network of workstations is not the right platform for our experiments, because the communication demands of the parallel simulators surpass the abilities of local area networks—the granularity is too fine. Synchronous and conservative parallel simulation techniques perform very well in the Supernode and in the Paragon, specially if the model to simulate is complex or large—precisely the worst case for traditional, sequential simulators. This way, studies previously considered as unrealizable, due to their exceedingly high computational cost, can be performed in reasonable times. Additionally, the spectrum of possibilities of using multicomputers can be broadened to execute more than numeric applications.[ES]En el ámbito del diseño de computadores, la simulación es una herramienta imprescindible para la validación y evaluación de cualquier propuesta arquitectónica. Las ténicas convencionales de simulación, diseñadas para su utilización en computadores secuenciales, son demasiado lentas si el sistema a simular es grande o complejo. El objetivo de esta tesis es buscar técnicas para acelerar estas simulaciones, aprovechando el paralelismo disponible en multicomputadores comerciales, y usar esas técnicas para el estudio de un modelo de encaminador de mensajes. Este encaminador está diseñado para formar infraestructura de comunicaciones de un hipotético computador masivamente paralelo.
En este trabajo se consideran tres técnicas de simulación paralela: síncrona, asíncrona-conservadora y asíncrona-optimista. Estos algoritmos se han implementado en tres multicomputadores: un Supernode basado en Transputers, un Intel Paragon y una red de estaciones de trabajo. Se caracteriza la influencia que tienen en las prestaciones de los simuladores aspectos tales como los parámetros del modelo simulado, la organización del simulador y las características del multicomputador utilizado.
Se concluye que las técnicas de simulación paralela optimista no resultan adecuadas para trabajar con el modelo considerado, aunque pueden ofrecer un buen rendimiento en otros entornos. La red de estaciones de trabajo no resulta una plataforma apropiada para estas simulaciones, ya que una red local no reúne condiciones para la ejecución de aplicaciones paralelas de grano fino. Las técnicas de simulación paralela síncrona y conservadora dan muy buenos resultados en el Supernode y en el Paragon, especialmente si el modelo a simular es complejo o grande—precisamente el peor caso para los algoritmos secuenciales. De esta forma, estudios previamente considerados inviables, por ser demasiado costosos computacionalmente, pueden realizarse en tiempos razonables. Además, se amplía el espectro de posibilidades de los multicomputadores, utilizándolos para algo más que aplicaciones numéricas.Este trabajo ha sido parcialmente subvencionado por la Comisión Interministerial de Ciencia y Tecnología, bajo contrato TIC95-037
Performance analysis of wormhole routing in multicomputer interconnection networks
Perhaps the most critical component in determining the ultimate performance potential of a multicomputer is its interconnection network, the hardware fabric supporting communication among individual processors. The message latency and throughput of such a network are affected by many factors of which topology, switching method, routing algorithm and traffic load are the most significant. In this context, the present study focuses on a performance analysis of k-ary n-cube networks employing wormhole switching, virtual channels and adaptive routing, a scenario of especial interest to current research.
This project aims to build upon earlier work in two main ways: constructing new analytical models for k-ary n-cubes, and comparing the performance merits of cubes of different dimensionality. To this end, some important topological properties of k-ary n-cubes are explored initially; in particular, expressions are derived to calculate the number of nodes at/within a given distance from a chosen centre. These results are important in their own right but their primary significance here is to assist in the construction of new and more realistic analytical models of wormhole-routed k-ary n-cubes.
An accurate analytical model for wormhole-routed k-ary n-cubes with adaptive routing and uniform traffic is then developed, incorporating the use of virtual channels and the effect of locality in the traffic pattern. New models are constructed for wormhole k-ary n-cubes, with the ability to simulate behaviour under adaptive routing and non-uniform communication workloads, such as hotspot traffic, matrix-transpose and digit-reversal permutation patterns. The models are equally applicable to unidirectional and bidirectional k-ary n-cubes and are significantly more realistic than any in use up to now. With this level of accuracy, the effect of each important network parameter on the overall network performance can be investigated in a more comprehensive manner than before.
Finally, k-ary n-cubes of different dimensionality are compared using the new models. The comparison takes account of various traffic patterns and implementation costs, using both pin-out and bisection bandwidth as metrics. Networks with both normal and pipelined channels are considered. While previous similar studies have only taken account of network channel costs, our model incorporates router costs as well thus generating more realistic results. In fact the results of this work differ markedly from those yielded by earlier studies which assumed deterministic routing and uniform traffic, illustrating the importance of using accurate models to conduct such analyses
Parameterizable network-on-chip emulation framework
Networks-on-Chip (NoCs) have been proposed as a promising solution to complex on-chip communication problems. But there is no public accessible HDL synthesizable NoC framework which connects industrial level cores and runs real applications on them. Moreover, many challenging research problems remain unsolved at all levels of design abstraction; design exploration of NoC architecture for applications, scheduling and mapping algorithms, evaluation of switching, topology or routing algorithm for efficient execution of application and optimizing communication cost, area, energy etc Solution to solve the above problem calls for the development of synthesizable, parameterizable NoC Framework that would evaluate and implement the above outstanding research problems and algorithms with minimum ease and flexibility.
The proposed NoC Framework has been used to specifically evaluate the following algorithms or variations in architecture: i) Evaluate Switching Algorithms compare latency, congestion, area and power of Wormhole (WH) and Store and Forward (SF) switching, ii) Efficient Router Architecture: Proposed an efficient Virtual Channel architecture with loopback for SF routing is introduced to improve throughput, latency and area, iii) Static routing algorithm: Proposed a simple and efficient routing algorithm called “Mirror Routing” for Torus architectures. This helps in reducing congestion and the routing algorithm is also deadlock free, iv) Adaptive Routing Algorithm: Proposed and evaluated an adaptive routing algorithm for WK topology.
The simulation results show Wormhole Routing with better latency than Store and Forward. Area and Power usage is also relatively less for Wormhole Routing. Study on different traffic scenarios with different Virtual Channel architectures in Store and Forward routing shows considerable improvement in latency in Virtual Channel architecture with loopback. Also it is proved that the proposed Mirror Routing algorithm is able to handle a single congestion or fault in routing path. The latency increases with increase in size of Torus structure. The Adaptive routing algorithm proposed for WK Topology results in increase in latency but can be considered in scenarios where the receiver node at the congested link is comparatively slow or when the fault in link is permanent
Distributed Simulation of High-Level Algebraic Petri Nets
In the field of Petri nets, simulation is an essential tool to validate and evaluate models. Conventional simulation techniques, designed for their use in sequential computers, are too slow if the system to simulate is large or complex. The aim of this work is to search for techniques to accelerate simulations exploiting the parallelism available in current, commercial multicomputers, and to use these techniques to study a class of Petri nets called high-level algebraic nets. These nets exploit the rich theory of algebraic specifications for high-level Petri nets: Petri nets gain a great deal of modelling power by representing dynamically changing items as structured tokens whereas algebraic specifications turned out to be an adequate and flexible instrument for handling structured items. In this work we focus on ECATNets (Extended Concurrent Algebraic Term Nets) whose most distinctive feature is their semantics which is defined in terms of rewriting logic. Nevertheless, ECATNets have two drawbacks: the occultation of the aspect of time and a bad exploitation of the parallelism inherent in the models. Three distributed simulation techniques have been considered: asynchronous conservative, asynchronous optimistic and synchronous. These algorithms have been implemented in a multicomputer environment: a network of workstations. The influence that factors such as the characteristics of the simulated models, the organisation of the simulators and the characteristics of the target multicomputer have in the performance of the simulations have been measured and characterised. It is concluded that synchronous distributed simulation techniques are not suitable for the considered kind of models, although they may provide good performance in other environments. Conservative and optimistic distributed simulation techniques perform well, specially if the model to simulate is complex or large - precisely the worst case for traditional, sequential simulators. This way, studies previously considered as unrealisable, due to their exceedingly high computational cost, can be performed in reasonable times. Additionally, the spectrum of possibilities of using multicomputers can be broadened to execute more than numeric applications
Contributions to the deadlock problem in multithreaded software applications observed as Resource Allocation Systems
Desde el punto de vista de la competencia por recursos compartidos sucesivamente reutilizables, se dice que un sistema concurrente compuesto por procesos secuenciales está en situación de bloqueo si existe en él un conjunto de procesos que están indefinidamente esperando la liberación de ciertos recursos retenidos por miembros del mismo conjunto de procesos. En sistemas razonablemente complejos o distribuidos, establecer una política de asignación de recursos que sea libre de bloqueos puede ser un problema muy difícil de resolver de forma eficiente. En este sentido, los modelos formales, y particularmente las redes de Petri, se han ido afianzando como herramientas fructíferas que permiten abstraer el problema de asignación de recursos en este tipo de sistemas, con el fin de abordarlo analíticamente y proveer métodos eficientes para la correcta construcción o corrección de estos sistemas. En particular, la teoría estructural de redes de Petri se postula como un potente aliado para lidiar con el problema de la explosión de estados inherente a aquéllos. En este fértil contexto han florecido una serie de trabajos que defienden una propuesta metodológica de diseño orientada al estudio estructural y la correspondiente corrección física del problema de asignación de recursos en familias de sistemas muy significativas en determinados contextos de aplicación, como el de los Sistemas de Fabricación Flexible. Las clases de modelos de redes de Petri resultantes asumen ciertas restricciones, con significado físico en el contexto de aplicación para el que están destinadas, que alivian en buena medida la complejidad del problema. En la presente tesis, se intenta acercar ese tipo de aproximación metodológica al diseño de aplicaciones software multihilo libres de bloqueos. A tal efecto, se pone de manifiesto cómo aquellas restricciones procedentes del mundo de los Sistemas de Fabricación Flexible se muestran demasiado severas para aprehender la versatilidad inherente a los sistemas software en lo que respecta a la interacción de los procesos con los recursos compartidos. En particular, se han de resaltar dos necesidades de modelado fundamentales que obstaculizan la mera adopción de antiguas aproximaciones surgidas bajo el prisma de otros dominios: (1) la necesidad de soportar el anidamiento de bucles no desplegables en el interior de los procesos, y (2) la posible compartición de recursos no disponibles en el arranque del sistema pero que son creados o declarados por un proceso en ejecución. A resultas, se identifica una serie de requerimientos básicos para la definición de un tipo de modelos orientado al estudio de sistemas software multihilo y se presenta una clase de redes de Petri, llamada PC2R, que cumple dicha lista de requerimientos, manteniéndose a su vez respetuosa con la filosofía de diseño de anteriores subclases enfocadas a otros contextos de aplicación. Junto con la revisión e integración de anteriores resultados en el nuevo marco conceptual, se aborda el estudio de propiedades inherentes a los sistemas resultantes y su relación profunda con otros tipos de modelos, la confección de resultados y algoritmos eficientes para el análisis estructural de vivacidad en la nueva clase, así como la revisión y propuesta de métodos de resolución de los problemas de bloqueo adaptadas a las particularidades físicas del dominio de aplicación. Asimismo, se estudia la complejidad computacional de ciertas vertientes relacionadas con el problema de asignación de recursos en el nuevo contexto, así como la traslación de los resultados anteriormente mencionados sobre el dominio de la ingeniería de software multihilo, donde la nueva clase de redes permite afrontar problemas inabordables considerando el marco teórico y las herramientas suministradas para subclases anteriormente explotadas
Parallel simulation techniques for telecommunication network modelling
In this thesis, we consider the application of parallel simulation to the performance modelling of telecommunication networks. A largely automated approach was first explored using a parallelizing compiler to speed up the simulation of simple models of circuit-switched networks. This yielded reasonable results for relatively little effort compared with other approaches. However, more complex simulation models of packet- and cell-based telecommunication networks, requiring the use of discrete event techniques, need an alternative approach. A critical review of parallel discrete event simulation indicated that a distributed model components approach using conservative or optimistic synchronization would be worth exploring. Experiments were therefore conducted using simulation models of queuing networks and Asynchronous Transfer Mode (ATM) networks to explore the potential speed-up possible using this approach. Specifically, it is shown that these techniques can be used successfully to speed-up the execution of useful telecommunication network simulations. A detailed investigation has demonstrated that conservative synchronization performs very well for applications with good look ahead properties and sufficient message traffic density and, given such properties, will significantly outperform optimistic synchronization. Optimistic synchronization, however, gives reasonable speed-up for models with a wider range of such properties and can be optimized for speed-up and memory usage at run time. Thus, it is confirmed as being more generally applicable particularly as model development is somewhat easier than for conservative synchronization. This has to be balanced against the more difficult task of developing and debugging an optimistic synchronization kernel and the application models
- …