Single-layer bus routing for high-speed boards

Author: Chang Yun Wei
Publication date: 01/12/2011

As the clock frequencies used in industry increase, the timing requirements on high-speed boards become very tight. Since the delay of each wire in the buses that connect the chips on a high-speed board is directly proportional to its length, every wire in a bus has to be tightly bounded by maximum and minimum lengths during routing. These rigid requirements pose challenges for automatic routing, so more aggressive routing algorithms are required for current industrial circuits. This thesis aims to improve Ozdal and Wong's previous work, an algorithmic study of single-layer bus routing on high-speed boards. Their routing algorithm assumes that there are no boundaries in the grid during routing and that the maximum-length bound for each net is always met. This thesis modifies their code so that it no longer makes those assumptions. As a result, the program can now handle boundaries, use wire snaking to meet the minimum-length bound, and use diagonal wires if the Manhattan distance between the two terminal pins cannot satisfy the maximum-length bound.
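
The length-bound reasoning in this abstract can be illustrated with a minimal sketch. It is not the thesis's or Ozdal and Wong's algorithm; the function name `classify_net`, the unit grid, and the use of 45-degree segments for the diagonal case are all assumptions made for illustration.

```python
import math

def classify_net(src, dst, min_len, max_len):
    """Hypothetical helper: decide how a two-pin net might meet its length bounds.

    src and dst are (x, y) pin coordinates on an assumed unit grid.
    Returns a (strategy, value) pair.
    """
    dx = abs(src[0] - dst[0])
    dy = abs(src[1] - dst[1])
    manhattan = dx + dy

    if manhattan > max_len:
        # Even the shortest axis-aligned route is too long, so try the
        # shortest route that may also use 45-degree diagonal segments.
        diag = min(dx, dy)
        octilinear = (max(dx, dy) - diag) + math.sqrt(2) * diag
        if octilinear <= max_len:
            return ("diagonal", octilinear)
        return ("infeasible", octilinear)

    if manhattan < min_len:
        # The direct route is too short: report the extra length
        # that snaking would have to add.
        return ("snake", min_len - manhattan)

    return ("direct", manhattan)
```

For example, `classify_net((0, 0), (3, 4), 10, 12)` reports that 3 extra units of snaking are needed, since the Manhattan distance is 7.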

Illinois Digital Environment for Access to Learning and Scholarship Repository

Scalable Interactive Volume Rendering Using Off-the-shelf Components

Author: Breen David
Heirich Alan
Lombeyda Santiago
Moll Laurent
Shand Mark
Publication venue: California Institute of Technology Library
Publication date: 01/01/2001

This paper describes an application of a second generation implementation of the Sepia architecture (Sepia-2) to interactive volumetric visualization of large rectilinear scalar fields. By employing pipelined associative blending operators in a sort-last configuration, a demonstration system with 8 rendering computers sustains 24 to 28 frames per second while interactively rendering large data volumes (1024x256x256 voxels, and 512x512x512 voxels). We believe interactive performance at these frame rates and data sizes is unprecedented. We also believe these results can be extended to other types of structured and unstructured grids and a variety of GL rendering techniques including surface rendering and shadow mapping. We show how to extend our single-stage crossbar demonstration system to multi-stage networks in order to support much larger data sizes and higher image resolutions. This requires solving a dynamic mapping problem for a class of blending operators that includes Porter-Duff compositing operators.
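
To show why associativity matters for sort-last blending, here is a minimal sketch of the standard Porter-Duff "over" operator on premultiplied-alpha colors. It is not Sepia-2's implementation; the function names and the tuple representation are assumptions for illustration only.

```python
def over(front, back):
    """Porter-Duff 'over' for premultiplied (r, g, b, a) tuples."""
    k = 1.0 - front[3]  # fraction of the back layer that remains visible
    return tuple(f + k * b for f, b in zip(front, back))

def composite(layers):
    """Blend a list of layers ordered from front to back."""
    result = (0.0, 0.0, 0.0, 0.0)  # fully transparent start
    for layer in layers:
        result = over(result, layer)
    return result

# Three partial images, as if produced by three separate renderers.
a = (0.5, 0.0, 0.0, 0.5)
b = (0.0, 0.25, 0.0, 0.25)
c = (0.0, 0.0, 1.0, 1.0)

# Associativity: the grouping of the blends does not change the result,
# as long as the front-to-back order is preserved.
left = over(over(a, b), c)
right = over(a, over(b, c))
```

Because `left` and `right` agree, partial images can be merged pairwise in whatever grouping the network topology makes convenient, provided their depth order is kept.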

CiteSeerX

Caltech Authors

A Multifunctional Processing Board for the Fast Track Trigger of the H1 Experiment

Author: Meer D.
Muller D.
Muller J.
Schoning A.
Wissing Ch.
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2001

The electron-proton collider HERA is being upgraded to provide higher luminosity from the end of the year 2001. In order to enhance the selectivity on exclusive processes, a Fast Track Trigger (FTT) with high momentum resolution is being built for the H1 Collaboration. The FTT will perform a 3-dimensional reconstruction of curved tracks in a magnetic field of 1.1 Tesla down to 100 MeV in transverse momentum. It is able to reconstruct up to 48 tracks within 23 µs in a high track multiplicity environment. The FTT consists of two hardware levels, L1 and L2, and a third software level. Analog signals of 450 wires are digitized at the first level stage, followed by a quick lookup of valid track segment patterns. For the main processing tasks at the second level, such as linking, fitting and deciding, a multifunctional processing board has been developed by ETH Zurich in collaboration with Supercomputing Systems (Zurich). It integrates a high-density FPGA (Altera APEX 20K600E) and four floating point DSPs (Texas Instruments TMS320C6701). This presentation will mainly concentrate on second trigger level hardware aspects and on the implementation of the algorithms used for linking and fitting. Emphasis is especially put on the integrated CAM (content addressable memory) functionality of the FPGA, which is ideally suited for implementing fast search tasks like track segment linking. Comment: 6 pages, 4 figures, submitted to TN

arXiv.org e-Print Archive

CiteSeerX

Crossref

CERN Document Server

The Design and Implementation of a PCIe-based LESS Label Switch

Author: Williams Amy C
Publication venue: Scholars Crossing
Publication date: 01/04/2017

With the explosion of the Internet of Things, the number of smart, embedded devices has grown exponentially in the last decade, with growth projected to continue at a commensurate rate. These devices strain the existing infrastructure of the Internet, creating challenges with scalability of routing tables and reliability of packet delivery. Various schemes based on Location-Based Forwarding and ID-based routing have been proposed to solve these problems, but thus far no complete solution has been achieved. This thesis seeks to improve currently proposed LORIF routers by designing, implementing, and testing a PCIe-based LESS switch to process unrouteable packets under the current LESS forwarding engine.

Liberty University Digital Commons

HERO: High-speed Enhanced Routing Operation in Software Routers NICs

Author: Bianco Andrea
Birke Robert Rene' Maria
Petracca Michele
Publication venue: IEEE
Publication date: 01/01/2008

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

NaNet: a Low-Latency, Real-Time, Multi-Standard Network Interface Card with GPUDirect Features

Author: Ameli F.
Ammendola R.
Biagioni A.
Frezza O.
Lamanna G.
Lo Cicero F.
Lonardo A.
Martinelli M.
Nicolau C.
Paolucci P.S.
Pastorelli E.
Pontisso L.
Rossetti D.
Simeone F.
Simula F.
Sozzi M.
Tosoratto L.
Vicini P.
Publication date: 13/06/2014

While the GPGPU paradigm is widely recognized as an effective approach to high performance computing, its adoption in low-latency, real-time systems is still in its early stages. Although GPUs typically show deterministic behaviour in terms of latency in executing computational kernels as soon as data is available in their internal memories, assessing the real-time features of a standard GPGPU system requires careful characterization of all subsystems along the data stream path. The networking subsystem turns out to be the most critical one in terms of the absolute value and fluctuations of its response latency. Our envisioned solution to this issue is NaNet, an FPGA-based PCIe Network Interface Card (NIC) design featuring a configurable and extensible set of network channels with direct access through GPUDirect to NVIDIA Fermi/Kepler GPU memories. The NaNet design currently supports both standard channels, GbE (1000BASE-T) and 10GbE (10GBASE-R), and custom channels, 34 Gbps APElink and 2.5 Gbps deterministic latency KM3link, but its modularity allows for a straightforward inclusion of other link technologies. To avoid host OS intervention on the data stream and remove a possible source of jitter, the design includes a network/transport layer offload module with cycle-accurate, upper-bound latency, supporting UDP, KM3link Time Division Multiplexing and APElink protocols. After a description of the NaNet architecture and its latency/bandwidth characterization for all supported links, two real world use cases will be presented: the GPU-based low level trigger for the RICH detector in the NA62 experiment at CERN and the on-/off-shore data link for the KM3 underwater neutrino telescope.

arXiv.org e-Print Archive

CERN Document Server

Construction and commissioning of a technological prototype of a high-granularity semi-digital hadronic calorimeter

A large prototype of 1.3 m³ was designed and built as a demonstrator of the semi-digital hadronic calorimeter (SDHCAL) concept proposed for the future ILC experiments. The prototype is a sampling hadronic calorimeter of 48 units. Each unit is built of an active layer made of a 1 m² Glass Resistive Plate Chamber (GRPC) detector placed inside a cassette whose walls are made of stainless steel. The cassette also contains the electronics used to read out the GRPC detector. The lateral granularity of the active layer is provided by the electronics pick-up pads of 1 cm² each. The cassettes are inserted into a self-supporting mechanical structure, also built of stainless steel plates, which, together with the cassette walls, plays the role of the absorber. The prototype was designed to be very compact, and significant efforts were made to minimize the number of service cables in order to optimize the efficiency of the Particle Flow Algorithm techniques to be used in the future ILC experiments. The different components of the SDHCAL prototype were studied individually and strict criteria were applied for the final selection of these components. Basic calibration procedures were performed after the prototype assembly. The prototype is the first of a series of new-generation detectors equipped with a power-pulsing mode intended to reduce the power consumption of this highly granular detector. A dedicated acquisition system was developed to deal with the output of more than 440000 electronics channels in both trigger and triggerless modes. After its completion in 2011, the prototype was commissioned using cosmic rays and particle beams at CERN. Comment: 49 pages, 41 figures

arXiv.org e-Print Archive

HAL-IN2P3

Hal - Université Grenoble Alpes

Ghent University Academic Bibliography

HAL Université de Savoie

CERN Document Server

HAL-Polytechnique