8 research outputs found
Studies of inspection algorithms and associated microprogrammable hardware implementations
This work is concerned with the design and development of real-time algorithms for industrial inspection applications. Rather than implement algorithms in dedicated hardware, microprogrammable machines were considered essential in order to maintain flexibility. After a survey of image pattern recognition where algorithms applicable to real-time use are cited, this thesis presents industrial inspection algorithms that locate and scrutinise actual manufactured products. These are fast and robust - a necessary requirement in industrial environments. The National Physical Laboratory have developed a Linear Array Processor (LAP) specifically designed for industrial recognition work. As with most array processors, the LAP has a greater performance than conventional processors, yet is strictly limited to parallel algorithms for optimum performance. It was therefore necessary to incorporate sequentialism into the design of a multiprocessor system. A microcoded bit-slice Sequential Image Processor (SIP) has been designed and built at RHBNC in conjunction with the NPL. This was primarily intended as a post-processor for the LAP based on the VMEbus but in fact has proved its usefulness as a stand-alone processor. This is described along with an assembler written for SIP which translates assembly language mnemonics to microcode. This work, which includes a review of current architectures, leads to the specification of a hybrid (SIMD/NIMD) architecture consisting of multiple autonomous sequential processors. This involves an analysis of various configurations and entails an investigation of the source of bottlenecks within each design. Such systems require a significant amount of interprocessor communication: methods for achieving this are discussed, some of which have only become practical with the decrease incost of electronic components. This eventually leads to a system for which algorithm execution speed increases approximately linearly with the number of processors. The algorithms described in earlier chapters are examined on the system and the practicalities of such a design are analysed in detail. Overall, this thesis has arrived at designs of programmable real-time inspection systems, and has obtained guidelines which will help with the implementation of future inspection systems.<p
Decompose and Conquer: Addressing Evasive Errors in Systems on Chip
Modern computer chips comprise many components, including microprocessor cores, memory modules, on-chip networks, and accelerators. Such system-on-chip (SoC) designs are deployed in a variety of computing devices: from internet-of-things, to smartphones, to personal computers, to data centers. In this dissertation, we discuss evasive errors in SoC designs and how these errors can be addressed efficiently. In particular, we focus on two types of errors: design bugs and permanent faults. Design bugs originate from the limited amount of time allowed for design verification and validation. Thus, they are often found in functional features that are rarely activated. Complete functional verification, which can eliminate design bugs, is extremely time-consuming, thus impractical in modern complex SoC designs. Permanent faults are caused by failures of fragile transistors in nano-scale semiconductor manufacturing processes. Indeed, weak transistors may wear out unexpectedly within the lifespan of the design. Hardware structures that reduce the occurrence of permanent faults incur significant silicon area or performance overheads, thus they are infeasible for most cost-sensitive SoC designs.
To tackle and overcome these evasive errors efficiently, we propose to leverage the principle of decomposition to lower the complexity of the software analysis or the hardware structures involved. To this end, we present several decomposition techniques, specific to major SoC components. We first focus on microprocessor cores, by presenting a lightweight bug-masking analysis that decomposes a program into individual instructions to identify if a design bug would be masked by the program's execution. We then move to memory subsystems: there, we offer an efficient memory consistency testing framework to detect buggy memory-ordering behaviors, which decomposes the memory-ordering graph into small components based on incremental differences. We also propose a microarchitectural patching solution for memory subsystem bugs, which augments each core node with a small distributed programmable logic, instead of including a global patching module. In the context of on-chip networks, we propose two routing reconfiguration algorithms that bypass faulty network resources. The first computes short-term routes in a distributed fashion, localized to the fault region. The second decomposes application-aware routing computation into simple routing rules so to quickly find deadlock-free, application-optimized routes in a fault-ridden network. Finally, we consider general accelerator modules in SoC designs. When a system includes many accelerators, there are a variety of interactions among them that must be verified to catch buggy interactions. To this end, we decompose such inter-module communication into basic interaction elements, which can be reassembled into new, interesting tests.
Overall, we show that the decomposition of complex software algorithms and hardware structures can significantly reduce overheads: up to three orders of magnitude in the bug-masking analysis and the application-aware routing, approximately 50 times in the routing reconfiguration latency, and 5 times on average in the memory-ordering graph checking. These overhead reductions come with losses in error coverage: 23% undetected bug-masking incidents, 39% non-patchable memory bugs, and occasionally we overlook rare patterns of multiple faults. In this dissertation, we discuss the ideas and their trade-offs, and present future research directions.PHDComputer Science & EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttps://deepblue.lib.umich.edu/bitstream/2027.42/147637/1/doowon_1.pd
Algorithm to layout (ATL) systems for VLSI design
PhD ThesisThe complexities involved in custom VLSI design together with the
failure of CAD techniques to keep pace with advances in the fabrication
technology have resulted in a design bottleneck. Powerful tools are
required to exploit the processing potential offered by the densities now
available. Describing a system in a high level algorithmic notation
makes writing, understanding, modification, and verification of a design
description easier. It also removes some of the emphasis on the physical
issues of VLSI design, and focus attention on formulating a correct and
well structured design. This thesis examines how current trends in CAD
techniques might influence the evolution of advanced Algorithm To Layout
(ATL) systems. The envisaged features of an example system are
specified. Particular attention is given to the implementation of one
its features COPTS (Compilation Of Occam Programs To Schematics).
COPTS is capable of generating schematic diagrams from which an
actual layout can be derived. It takes a description written in a subset
of Occam and generates a high level schematic diagram depicting its
realisation as a VLSI system. This diagram provides the designer with
feedback on the relative placement and interconnection of the operators
used in the source code. It also gives a visual representation of the
parallelism defined in the Occam description. Such diagrams are a
valuable aid in documenting the implementation of a design.
Occam has also been selected as the input to the design system that
COPTS is a feature of. The choice of Occam was made on the assumption
that the most appropriate algorithmic notation for such a design system
will be a suitable high level programming language. This is in contrast
to current automated VLSI design systems, which typically use a hardware
des~ription language for input. These special purpose languages
currently concentrate on handling structural/behavioural information and
have limited ability to express algorithms. Using a language such as
Occam allows a designer to write a behavioural description which can be
compiled and executed as a simulator, or prototype, of the system. The
programmability introduced into the design process enables designers to
concentrate on a design's underlying algorithm. The choice of this
algorithm is the most crucial decision since it determines the
performance and area of the silicon implementation.
The thesis is divided into four sections, each of several chapters.
The first section considers VLSI design complexity, compares the expert
systems and silicon compilation approaches to tackling it, and examines
its parallels with software complexity. The second section reviews the
advantages of using a conventional programming language for VLSI system
descriptions. A number of alternative high level programming languages
are considered for application in VLSI design. The third section defines
the overall ATL system COPTS is envisaged to be part of, and considers
the schematic representation of Occam programs. The final section
presents a summary of the overall project and suggestions for future work
on realising the full ATL system
MULTI-OBJECTIVE DESIGN AUTOMATION FOR RECONFIGURABLE MULTI-PROCESSOR SYSTEMS
Ph.DDOCTOR OF PHILOSOPH