660 research outputs found
Design of testbed and emulation tools
The research summarized was concerned with the design of testbed and emulation tools suitable to assist in projecting, with reasonable accuracy, the expected performance of highly concurrent computing systems on large, complete applications. Such testbed and emulation tools are intended for the eventual use of those exploring new concurrent system architectures and organizations, either as users or as designers of such systems. While a range of alternatives was considered, a software based set of hierarchical tools was chosen to provide maximum flexibility, to ease in moving to new computers as technology improves and to take advantage of the inherent reliability and availability of commercially available computing systems
FieldPlacer - A flexible, fast and unconstrained force-directed placement method for heterogeneous reconfigurable logic architectures
The field of placement methods for components of integrated circuits, especially in the domain of reconfigurable chip architectures, is mainly dominated by a handful of concepts. While some of these are easy to apply but difficult to adapt to new situations, others are more flexible but rather complex to realize.
This work presents the FieldPlacer framework, a flexible, fast and unconstrained force-directed placement method for heterogeneous reconfigurable logic architectures, in particular for the ever important heterogeneous FPGAs.
In contrast to many other force-directed placers, this approach is called ‘unconstrained’ as it does not require a priori fixed logic elements in order to calculate a force equilibrium as the solution to a system of equations. Instead, it is based on a free spring embedder simulation of a graph representation which includes all logic block types of a design simultaneously. The FieldPlacer framework offers a huge amount of flexibility in applying different distance norms (e. g., the Manhattan distance) for the force-directed layout and aims at creating adapted layouts for various objective functions, e. g., highest performance or improved routability. Depending on the individual situation, a runtime-quality trade-off can be considered to either produce a decent placement in a very short time or to generate an exceptionally good placement, which takes longer.
An extensive comparison with the latest simulated annealing placement method from the well-known Versatile Place and Route (VPR) framework shows that the FieldPlacer approach can create placements of comparable quality much faster than VPR or, alternatively, generate better placements in the same time. The flexibility in defining arbitrary objective functions and the intuitive adaptability of the method, which, among others, includes different concepts from the field of graph drawing, should facilitate further developments with this framework, e. g., for new upcoming optimization targets like the energy consumption of an implemented design
The 1991 3rd NASA Symposium on VLSI Design
Papers from the symposium are presented from the following sessions: (1) featured presentations 1; (2) very large scale integration (VLSI) circuit design; (3) VLSI architecture 1; (4) featured presentations 2; (5) neural networks; (6) VLSI architectures 2; (7) featured presentations 3; (8) verification 1; (9) analog design; (10) verification 2; (11) design innovations 1; (12) asynchronous design; and (13) design innovations 2
A microprocessor based high speed packet switch for satellite communications
The architectures of a single processor, a three processor, and a multiple processor system are described. The hardware circuits, and software routines required for implementing the three and multiple processor designs are presented. A bit-slice microprocessor was designed and microprogrammed. Maximum throughput was calculated for all three designs. Queue theoretic models for these three designs were developed and utilized to obtain analytical expressions for the average waiting times, overall average response times and average queue sizes. From these expressions, graphs were obtained showing the effect on the system performance of a number of design parameters
On-board B-ISDN fast packet switching architectures. Phase 1: Study
The broadband integrate services digital network (B-ISDN) is an emerging telecommunications technology that will meet most of the telecommunications networking needs in the mid-1990's to early next century. The satellite-based system is well positioned for providing B-ISDN service with its inherent capabilities of point-to-multipoint and broadcast transmission, virtually unlimited connectivity between any two points within a beam coverage, short deployment time of communications facility, flexible and dynamic reallocation of space segment capacity, and distance insensitive cost. On-board processing satellites, particularly in a multiple spot beam environment, will provide enhanced connectivity, better performance, optimized access and transmission link design, and lower user service cost. The following are described: the user and network aspects of broadband services; the current development status in broadband services; various satellite network architectures including system design issues; and various fast packet switch architectures and their detail designs
Recommended from our members
Testability considerations for implementing an embedded memory subsystem
textThere are a number of testability considerations for VLSI design,
but test coverage, test time, accuracy of test patterns and
correctness of design information for DFD (Design for debug) are
the most important ones in design with embedded memories. The goal
of DFT (Design-for-Test) is to achieve zero defects. When it comes
to the memory subsystem in SOCs (system on chips), many flavors of
memory BIST (built-in self test) are able to get high test
coverage in a memory, but often, no proper attention is given to
the memory interface logic (shadow logic). Functional testing and
BIST are the most prevalent tests for this logic, but functional
testing is impractical for complicated SOC designs. As a result,
industry has widely used at-speed scan testing to detect delay
induced defects. Compared with functional testing, scan-based
testing for delay faults reduces overall pattern generation
complexity and cost by enhancing both controllability and
observability of flip-flops. However, without proper modeling of
memory, Xs are generated from memories. Also, when the design has
chip compression logic, the number of ATPG patterns is increased
significantly due to Xs from memories. In this dissertation, a
register based testing method and X prevention logic are presented
to tackle these problems.
An important design stage for scan based testing with memory
subsystems is the step to create a gate level model and verify
with this model. The flow needs to provide a robust ATPG netlist
model. Most industry standard CAD tools used to analyze fault
coverage and generate test vectors require gate level models.
However, custom embedded memories are typically designed using a
transistor-level flow, there is a need for an abstraction step to
generate the gate models, which must be equivalent to the actual
design (transistor level). The contribution of the research is a
framework to verify that the gate level representation of custom
designs is equivalent to the transistor-level design.
Compared to basic stuck-at fault testing, the number of patterns
for at-speed testing is much larger than for basic stuck-at fault
testing. So reducing test and data volume are important. In this
desertion, a new scan reordering method is introduced to reduce
test data with an optimal routing solution. With in depth
understanding of embedded memories and flows developed during the
study of custom memory DFT, a custom embedded memory Bit Mapping
method using a symbolic simulator is presented in the last chapter
to achieve high yield for memories.Electrical and Computer Engineerin
Interconnect technologies for very large spiking neural networks
In the scope of this thesis, a neural event communication architecture has been developed for use in an accelerated neuromorphic computing system and with a packet-based high performance interconnection network. Existing neuromorphic computing systems mostly use highly customised interconnection networks, directly routing single spike events to their destination. In contrast, the approach of this thesis uses a general purpose packet-based interconnection network and accumulates multiple spike events at the source node into larger network packets destined to common destinations. This is required to optimise the payload efficiency, given relatively large packet headers as compared to the size of neural spike events.
Theoretical considerations are made about the efficiency of different event aggregation strategies. Thereby, important factors are the number of occurring event network-destinations and their relative frequency, as well as the number of available accumulation buffers. Based on the concept of Markov Chains, an analytical method is developed and used to evaluate these aggregation strategies. Additionally, some of these strategies are stochastically simulated in order to verify the analytical method and evaluate them beyond its applicability. Based on the results of this analysis, an optimisation strategy is proposed for the mapping of neural populations onto interconnected neuromorphic chips, as well as the joint assignment of event network-destinations to a set of accumulation buffers.
During this thesis, such an event communication architecture has been implemented on the communication FPGAs in the BrainScaleS-2 accelerated neuromorphic computing system. Thereby, its usability can be scaled beyond single chip setups. For this, the EXTOLL network technology is used to transport and route the aggregated neural event packets with high bandwidth and low latency. At the FPGA, a network bandwidth of up to 12 Gbit/s is usable at a maximum payload efficiency of 94 %. The latency has been measured in the scope of this thesis to a range between 1.6 μs and 2.3 μs across the network between two neuron circuits on separate chips. This latency is thereby mostly dominated by the path from the neuromorphic chip across the communication FPGA into the network and back on the receiving side. As the EXTOLL network hardware itself is clocked at a much higher frequency than the FPGAs, the latency is expected to scale in the order of only approximately 75 ns for each additional hop through the network.
For being able to globally interpret the arrival timestamps that are transmitted with every spike event, the system time counters on the FPGAs are synchronised across the network. For this, the global interrupt mechanism implemented in the EXTOLL hardware is characterised and used within this thesis. With this, a synchronisation accuracy of ±40ns could be measured.
At the end of this thesis, the successful emulation of a neural signal propagation model, distributed across two BrainScaleS-2 chips and FPGAs is demonstrated using the implemented event communication architecture and the described synchronisation mechanism
Evolutionary algorithms for synthesis and optimisation of sequential logic circuits
Considerable progress has been made recently 1n the understanding of combinational logic optimization. Consequently a large number of university and industrial Electric Computing Aided Design (ECAD) programs are now available for optimal logic synthesis of combinational circuits. The progress with sequential logic synthesis and optimization, on the other hand, is considerably less mature. In recent years, evolutionary algorithms have been found to be remarkably effective way of using computers for solving difficult problems. This thesis is, in large part, a concentrated effort to apply this philosophy to the synthesis and optimization of sequential circuits. A state assignment based on the use of a Genetic Algorithm (GA) for the optimal synthesis of sequential circuits is presented. The state assignment determines the structure of the sequential circuit realizing the state machine and therefore its area and performances. The synthesis based on the GA approach produced designs with the smallest area to date. Test results on standard fmite state machine (FS:M) benchmarks show that the GA could generate state assignments, which required on average 15.44% fewer gates and 13.47% fewer literals compared with alternative techniques. Hardware evolution is performed through a succeSSlOn of changes/reconfigurations of elementary components, inter-connectivity and selection of the fittest configurations until the target functionality is reached. The thesis presents new approaches, which combine both genetic algorithm for state assignment and extrinsic Evolvable Hardware (EHW) to design sequential logic circuits. The implemented evolutionary algorithms are able to design logic circuits with size and complexity, which have not been demonstrated in published work. There are still plenty of opportunities to develop this new line of research for the synthesis, optimization and test of novel digital, analogue and mixed circuits. This should lead to a new generation of Electronic Design Automation tools.EThOS - Electronic Theses Online ServiceGBUnited Kingdo
- …