5,342 research outputs found
Single-layer bus routing for high-speed boards
As the clock frequencies used in industry increase, the timing requirements on high-speed boards become very tight. Since wire length is directly proportional to wire delay of the buses that connect each chip on high-speed
boards, each wire in the bus has to be tightly bounded by the maximum and minimum lengths during routing. These rigid requirements cause challenges for automatic routing. Therefore, more aggressive routing algorithms are required for current industrial circuits.
This thesis intends to improve Ozdal and Wong's previous work, which is an algorithmic study of single-layer bus routing on high-speed boards. Their
routing algorithm assumes that there are no boundaries in the grid during routing, and the maximum-length bound for each net is always met. This thesis modifies their code so that it does not make those assumptions. As a result, the program can now handle boundaries with wire snaking to meet the minimum-length bound and use diagonal wires if the Manhattan distance between the two terminal pins cannot satisfy the maximum-length bound
Scalable Interactive Volume Rendering Using Off-the-shelf Components
This paper describes an application of a second generation implementation of the Sepia architecture (Sepia-2) to interactive volu-metric visualization of large rectilinear scalar fields. By employingpipelined associative blending operators in a sort-last configuration a demonstration system with 8 rendering computers sustains 24 to 28 frames per second while interactively rendering large data volumes (1024x256x256 voxels, and 512x512x512 voxels). We believe interactive performance at these frame rates and data sizes is unprecedented. We also believe these results can be extended to other types of structured and unstructured grids and a variety of GL rendering techniques including surface rendering and shadow map-ping. We show how to extend our single-stage crossbar demonstration system to multi-stage networks in order to support much larger data sizes and higher image resolutions. This requires solving a dynamic mapping problem for a class of blending operators that includes Porter-Duff compositing operators
A Multifunctional Processing Board for the Fast Track Trigger of the H1 Experiment
The electron-proton collider HERA is being upgraded to provide higher
luminosity from the end of the year 2001. In order to enhance the selectivity
on exclusive processes a Fast Track Trigger (FTT) with high momentum resolution
is being built for the H1 Collaboration. The FTT will perform a 3-dimensional
reconstruction of curved tracks in a magnetic field of 1.1 Tesla down to 100
MeV in transverse momentum. It is able to reconstruct up to 48 tracks within 23
mus in a high track multiplicity environment. The FTT consists of two hardware
levels L1, L2 and a third software level. Analog signals of 450 wires are
digitized at the first level stage followed by a quick lookup of valid track
segment patterns.
For the main processing tasks at the second level such as linking, fitting
and deciding, a multifunctional processing board has been developed by the ETH
Zurich in collaboration with Supercomputing Systems (Zurich). It integrates a
high-density FPGA (Altera APEX 20K600E) and four floating point DSPs (Texas
Instruments TMS320C6701). This presentation will mainly concentrate on second
trigger level hardware aspects and on the implementation of the algorithms used
for linking and fitting. Emphasis is especially put on the integrated CAM
(content addressable memory) functionality of the FPGA, which is ideally suited
for implementing fast search tasks like track segment linking.Comment: 6 pages, 4 figures, submitted to TN
The Design and Implementation of a PCIe-based LESS Label Switch
With the explosion of the Internet of Things, the number of smart, embedded devices has grown exponentially in the last decade, with growth projected at a commiserate rate. These devices create strain on the existing infrastructure of the Internet, creating challenges with scalability of routing tables and reliability of packet delivery. Various schemes based on Location-Based Forwarding and ID-based routing have been proposed to solve the aforementioned problems, but thus far, no solution has completely been achieved. This thesis seeks to improve current proposed LORIF routers by designing, implementing, and testing and a PCIe-based LESS switch to process unrouteable packets under the current LESS forwarding engine
NaNet: a Low-Latency, Real-Time, Multi-Standard Network Interface Card with GPUDirect Features
While the GPGPU paradigm is widely recognized as an effective approach to
high performance computing, its adoption in low-latency, real-time systems is
still in its early stages.
Although GPUs typically show deterministic behaviour in terms of latency in
executing computational kernels as soon as data is available in their internal
memories, assessment of real-time features of a standard GPGPU system needs
careful characterization of all subsystems along data stream path.
The networking subsystem results in being the most critical one in terms of
absolute value and fluctuations of its response latency.
Our envisioned solution to this issue is NaNet, a FPGA-based PCIe Network
Interface Card (NIC) design featuring a configurable and extensible set of
network channels with direct access through GPUDirect to NVIDIA Fermi/Kepler
GPU memories.
NaNet design currently supports both standard - GbE (1000BASE-T) and 10GbE
(10Base-R) - and custom - 34~Gbps APElink and 2.5~Gbps deterministic latency
KM3link - channels, but its modularity allows for a straightforward inclusion
of other link technologies.
To avoid host OS intervention on data stream and remove a possible source of
jitter, the design includes a network/transport layer offload module with
cycle-accurate, upper-bound latency, supporting UDP, KM3link Time Division
Multiplexing and APElink protocols.
After NaNet architecture description and its latency/bandwidth
characterization for all supported links, two real world use cases will be
presented: the GPU-based low level trigger for the RICH detector in the NA62
experiment at CERN and the on-/off-shore data link for KM3 underwater neutrino
telescope
Construction and commissioning of a technological prototype of a high-granularity semi-digital hadronic calorimeter
A large prototype of 1.3m3 was designed and built as a demonstrator of the
semi-digital hadronic calorimeter (SDHCAL) concept proposed for the future ILC
experiments. The prototype is a sampling hadronic calorimeter of 48 units. Each
unit is built of an active layer made of 1m2 Glass Resistive Plate
Chamber(GRPC) detector placed inside a cassette whose walls are made of
stainless steel. The cassette contains also the electronics used to read out
the GRPC detector. The lateral granularity of the active layer is provided by
the electronics pick-up pads of 1cm2 each. The cassettes are inserted into a
self-supporting mechanical structure built also of stainless steel plates
which, with the cassettes walls, play the role of the absorber. The prototype
was designed to be very compact and important efforts were made to minimize the
number of services cables to optimize the efficiency of the Particle Flow
Algorithm techniques to be used in the future ILC experiments. The different
components of the SDHCAL prototype were studied individually and strict
criteria were applied for the final selection of these components. Basic
calibration procedures were performed after the prototype assembling. The
prototype is the first of a series of new-generation detectors equipped with a
power-pulsing mode intended to reduce the power consumption of this highly
granular detector. A dedicated acquisition system was developed to deal with
the output of more than 440000 electronics channels in both trigger and
triggerless modes. After its completion in 2011, the prototype was commissioned
using cosmic rays and particles beams at CERN.Comment: 49 pages, 41 figure
- âŠ