26 research outputs found
Application specific instruction set processor design for embedded application using the coware tool
An Application Specific Instruction Set Processor (ASIP) is widely used as a System on a Chip(SoC) Component. ASIPs possess an instruction set which is tai-lored to benefit a specific application. Such specialization allows ASIPs to serve as an intermediate between two dominant processor design styles- ASICs which has high processing abilities at the cost of limited programmability and Programmable solu-tions such as FPGAs that provide programming exibility at the cost of less energy eficiency. In this dissertation the goal is to design ASIP, keeping in mind a temper-ature sensor system. The platform used for processor design is LISA 2.0 description language and processor designing environment from CoWare. Coware processor de-signer allows processor architecture to be defined at an abstract level and automatic generation of chain of software tools like assembler, linker and simulator for functional verification followed by RTL level description. RTL level description is used to gen-erate synthesized report of the design using RTL compiler and finally the layout is created using Cadence encounter
Design of an Application Specific Instruction Set Processor Using LISA
A Digital Signal Processor with specific instruction sets and meant for a specific application is called as Application Specific Instruction set Processor(ASIP). To design an ASIP many approaches are available. However optimization of an ASIP becomes handy if it is designed in a higher level of abstraction that is higher than Register Transfer Level (RTL). Application Description Languages (ADLs) are becoming popular recently because of its quick and optimal design convergence achievement capability during the design of ASIPs. Several stages are required to design a processor which are architecture design implementation, software development, instruction and system verification. Verification of such ASIPs at various design stages is a tedious job to do. This thesis presents the architecture description of a simple DSP processor using ADL based instruction set description. The design process is more consistent after allowing maximum flexibility here. Further more, it enables the design process in both instruction and cycle accurate modes. The design process of a three stage pipelined FIR Filter processor is demonstrated as a case study. Further optimization can be done with respect to resources, memory size and power consumption by changing the LISA code written in CoWare platform
Increasing the efficacy of automated instruction set extension
The use of Instruction Set Extension (ISE) in customising embedded processors for a specific
application has been studied extensively in recent years. The addition of a set of complex
arithmetic instructions to a baseline core has proven to be a cost-effective means of meeting
design performance requirements. This thesis proposes and evaluates a reconfigurable ISE
implementation called “Configurable Flow Accelerators” (CFAs), a number of refinements to
an existing Automated ISE (AISE) algorithm called “ISEGEN”, and the effects of source form
on AISE.
The CFA is demonstrated repeatedly to be a cost-effective design for ISE implementation.
A temporal partitioning algorithm called “staggering” is proposed and demonstrated on average
to reduce the area of CFA implementation by 37% for only an 8% reduction in acceleration.
This thesis then turns to concerns within the ISEGEN AISE algorithm. A methodology
for finding a good static heuristic weighting vector for ISEGEN is proposed and demonstrated.
Up to 100% of merit is shown to be lost or gained through the choice of vector. ISEGEN
early-termination is introduced and shown to improve the runtime of the algorithm by up to
7.26x, and 5.82x on average. An extension to the ISEGEN heuristic to account for pipelining
is proposed and evaluated, increasing acceleration by up to an additional 1.5x. An energyaware
heuristic is added to ISEGEN, which reduces the energy used by a CFA implementation
of a set of ISEs by an average of 1.6x, up to 3.6x. This result directly contradicts the frequently
espoused notion that “bigger is better” in ISE.
The last stretch of work in this thesis is concerned with source-level transformation: the effect
of changing the representation of the application on the quality of the combined hardwaresoftware
solution. A methodology for combined exploration of source transformation and ISE
is presented, and demonstrated to improve the acceleration of the result by an average of 35%
versus ISE alone. Floating point is demonstrated to perform worse than fixed point, for all
design concerns and applications studied here, regardless of ISEs employed
Customising compilers for customisable processors
The automatic generation of instruction set extensions to provide application-specific acceleration
for embedded processors has been a productive area of research in recent years. There
have been incremental improvements in the quality of the algorithms that discover and select
which instructions to add to a processor. The use of automatic algorithms, however, result in
instructions which are radically different from those found in conventional, human-designed,
RISC or CISC ISAs. This has resulted in a gap between the hardware’s capabilities and the
compiler’s ability to exploit them.
This thesis proposes and investigates the use of a high-level compiler pass that uses graph-subgraph
isomorphism checking to exploit these complex instructions. Operating in a separate
pass permits techniques to be applied that are uniquely suited for mapping complex instructions,
but unsuitable for conventional instruction selection. The existing, mature, compiler
back-end can then handle the remainder of the compilation. With this method, the high-level
pass was able to use 1965 different automatically produced instructions to obtain an initial average
speed-up of 1.11x over 179 benchmarks evaluated on a hardware-verified cycle-accurate
simulator.
This result was improved following an investigation of how the produced instructions were
being used by the compiler. It was established that the models the automatic tools were using to
develop instructions did not take account of how well the compiler could realistically use them.
Adding additional parameters to the search heuristic to account for compiler issues increased
the speed-up from 1.11x to 1.24x. An alternative approach using a re-designed hardware interface
was also investigated and this achieved a speed-up of 1.26x while reducing hardware and
compiler complexity.
A complementary, high-level, method of exploiting dual memory banks was created to increase
memory bandwidth to accommodate the increased data-processing bandwidth provided
by extension instructions. Finally, the compiler was considered for use in a non-conventional
role where rather than generating code it is used to apply source-level transformations prior to
the generation of extension instructions and thus affect the shape of the instructions that are
generated
Doctor of Philosophy
dissertationFatigue cracks typically occur at stress risers such as geometry changes and holes. This type of failure has serious safety and economic repercussions affecting structures such as aircraft. The need to prevent catastrophic failure due to fatigue cracks and other discontinuities has led to durability and damage tolerant methodologies influencing the design of aircraft structures. Holes in a plate or sheet filled with a fastener are common fatigue critical locations in aircraft structure requiring damage tolerance analysis (DTA). Often, the fastener is transferring load which leads to a loading condition involving both far-field stresses such as tension and bending, and localized bearing at the hole. The difference between the bearing stress and the tensile field at the hole is known as load transfer. The ratio of load transfer as well as the magnitude of the stresses plays a significant part in how quickly a crack will progress to failure. Unfortunately, the determination of load transfer in a complex joint is far from trivial. Many methods exist in the open literature regarding the analysis of splices, doublers and attachment joints to determine individual fastener loads. These methods work well for static analyses but greater refinement is needed for crack growth analysis. The first fastener in a splice or joint is typically the most critical but different fastener flexibility equations will all give different results. The constraint of the fastener head and shop end, along with the type of fastener, affects the stiffness or flexibility of the fastener
Exploration architecturale de communications-sur-puce au niveau système
Système sur puce multiprocesseur -- Le besoin grandissant -- Le logiciel -- Le matériel -- Méthodologies et plateformes de conception -- Les communication-sur-puce -- Les différentes architectures -- Réseau sur puce -- Tchniques d'analyse -- Méthodes d'exploration architecturale -- Exploration architecturale des communications sur puce -- La plateforme Space -- Méthodologie d'exploration -- Les composants au niveau TF -- Les composants au niveau BCA -- Méthode des fenêtres dans les ponts -- Composants annexes pour aider à améliorer le réseau multibus -- Analyse de l'exploration et des performances -- Outis de mesure -- Comparaison des estimations de simulation au niveau TF et BCA -- Performance à travers la méthodologie dexploration -- Risques liés à l'utilisation du pont direct
Instruction and data cache modeling for timing analysis in real-time systems
Master'sMASTER OF ENGINEERIN
Instrumenting and analyzing platform-independent communication in applications
The performance of microprocessors is limited by communication. This limitation, sometimes alluded to as the memory wall, refers to the hardware-level cost of communicating with memory. Recent studies have found that the promise of speedup from transistor scaling, or employing heterogeneous processors, such as GPUs, is diminished when such hardware communication costs are included. Based on the insight that hardware communication at run-time is a manifestation of communication in software, this dissertation proposes that automatically capturing and classifying software-level communication is the first step in performing fast, early-stage design space exploration of future multicore systems. Software-level communication refers to the exchange of data between software entities such as functions, threads or basic blocks. Communication classification helps differentiate the first-time use from the reuse of communicated data, and distinguishes between communication external to a software entity and local communication within a software entity. We present Sigil, a novel tool that automatically captures and classifies software-level communication in an efficient way. Due to its platform-independent nature, software-level communication can be useful during the early-stage design of future multicore systems. Using the two different representations of output data that Sigil produces, we show that the measurement of software-level communication can be used to analyze i) function-level interaction in single-threaded programs to determine which specialized logic can be included in future heterogeneous multicore systems, and ii) thread-level interaction in multi-threaded programs to aid in chip multi-processor(CMP) design space exploration.Ph.D., Electrical Engineering -- Drexel University, 201
Kommunikation und Bildverarbeitung in der Automation
In diesem Open-Access-Tagungsband sind die besten Beiträge des 9. Jahreskolloquiums "Kommunikation in der Automation" (KommA 2018) und des 6. Jahreskolloquiums "Bildverarbeitung in der Automation" (BVAu 2018) enthalten. Die Kolloquien fanden am 20. und 21. November 2018 in der SmartFactoryOWL, einer gemeinsamen Einrichtung des Fraunhofer IOSB-INA und der Technischen Hochschule Ostwestfalen-Lippe statt. Die vorgestellten neuesten Forschungsergebnisse auf den Gebieten der industriellen Kommunikationstechnik und Bildverarbeitung erweitern den aktuellen Stand der Forschung und Technik. Die in den Beiträgen enthaltenen anschaulichen Beispiele aus dem Bereich der Automation setzen die Ergebnisse in den direkten Anwendungsbezug
A cumulative index to the 1977 issues of a continuing bibliography on aerospace medicine and biology
This publication is a cumulative index to the abstracts contained in the Supplements 164 through 175 of Aerospace Medicine and Biology: A Continuing Bibliography. It includes three indexes-- subject, personal author, and corporate source