26 research outputs found

Application specific instruction set processor design for embedded application using the coware tool

Author: Samal Lopamudra
Publication date: 04/06/2012

An Application Specific Instruction Set Processor (ASIP) is widely used as a System on a Chip (SoC) component. ASIPs possess an instruction set which is tailored to benefit a specific application. Such specialization allows ASIPs to serve as an intermediate between two dominant processor design styles: ASICs, which have high processing abilities at the cost of limited programmability, and programmable solutions such as FPGAs, which provide programming flexibility at the cost of lower energy efficiency. In this dissertation the goal is to design an ASIP with a temperature sensor system in mind. The platform used for processor design is the LISA 2.0 description language and the processor design environment from CoWare. CoWare Processor Designer allows the processor architecture to be defined at an abstract level and automatically generates a chain of software tools, such as an assembler, linker and simulator, for functional verification, followed by an RTL description. The RTL description is used to generate a synthesis report of the design using RTL Compiler, and finally the layout is created using Cadence Encounter.

ethesis@nitr

Design of an Application Specific Instruction Set Processor Using LISA

Author: Nanda Umakanta
Publication date: 28/05/2010

A Digital Signal Processor with a specific instruction set, meant for a specific application, is called an Application Specific Instruction Set Processor (ASIP). Many approaches are available for designing an ASIP. However, optimizing an ASIP is easier if it is designed at a level of abstraction higher than Register Transfer Level (RTL). Architecture Description Languages (ADLs) have recently become popular because they allow quick and optimal design convergence during ASIP design. Designing a processor requires several stages: architecture design implementation, software development, and instruction and system verification. Verifying such ASIPs at the various design stages is tedious. This thesis presents the architecture description of a simple DSP processor using an ADL-based instruction set description. This makes the design process more consistent while allowing maximum flexibility. Furthermore, it enables the design process in both instruction-accurate and cycle-accurate modes. The design process of a three-stage pipelined FIR filter processor is demonstrated as a case study. Further optimization with respect to resources, memory size and power consumption can be done by changing the LISA code written on the CoWare platform.

ethesis@nitr

Increasing the efficacy of automated instruction set extension

Author: Bennett Richard Vincent
Publication venue: The University of Edinburgh
Publication date: 24/11/2011

The use of Instruction Set Extension (ISE) in customising embedded processors for a specific application has been studied extensively in recent years. The addition of a set of complex arithmetic instructions to a baseline core has proven to be a cost-effective means of meeting design performance requirements. This thesis proposes and evaluates a reconfigurable ISE implementation called “Configurable Flow Accelerators” (CFAs), a number of refinements to an existing Automated ISE (AISE) algorithm called “ISEGEN”, and the effects of source form on AISE. The CFA is demonstrated repeatedly to be a cost-effective design for ISE implementation. A temporal partitioning algorithm called “staggering” is proposed and demonstrated on average to reduce the area of CFA implementation by 37% for only an 8% reduction in acceleration. This thesis then turns to concerns within the ISEGEN AISE algorithm. A methodology for finding a good static heuristic weighting vector for ISEGEN is proposed and demonstrated. Up to 100% of merit is shown to be lost or gained through the choice of vector. ISEGEN early-termination is introduced and shown to improve the runtime of the algorithm by up to 7.26x, and 5.82x on average. An extension to the ISEGEN heuristic to account for pipelining is proposed and evaluated, increasing acceleration by up to an additional 1.5x. An energy-aware heuristic is added to ISEGEN, which reduces the energy used by a CFA implementation of a set of ISEs by an average of 1.6x, up to 3.6x. This result directly contradicts the frequently espoused notion that “bigger is better” in ISE. The last stretch of work in this thesis is concerned with source-level transformation: the effect of changing the representation of the application on the quality of the combined hardware-software solution. A methodology for combined exploration of source transformation and ISE is presented, and demonstrated to improve the acceleration of the result by an average of 35% versus ISE alone.
Floating point is demonstrated to perform worse than fixed point, for all design concerns and applications studied here, regardless of ISEs employed.

Edinburgh Research Archive

Customising compilers for customisable processors

Author: Murray Alastair Colin
Publication venue: The University of Edinburgh
Publication date: 29/11/2012

The automatic generation of instruction set extensions to provide application-specific acceleration for embedded processors has been a productive area of research in recent years. There have been incremental improvements in the quality of the algorithms that discover and select which instructions to add to a processor. The use of automatic algorithms, however, results in instructions which are radically different from those found in conventional, human-designed, RISC or CISC ISAs. This has resulted in a gap between the hardware’s capabilities and the compiler’s ability to exploit them. This thesis proposes and investigates the use of a high-level compiler pass that uses graph-subgraph isomorphism checking to exploit these complex instructions. Operating in a separate pass permits techniques to be applied that are uniquely suited for mapping complex instructions, but unsuitable for conventional instruction selection. The existing, mature, compiler back-end can then handle the remainder of the compilation. With this method, the high-level pass was able to use 1965 different automatically produced instructions to obtain an initial average speed-up of 1.11x over 179 benchmarks evaluated on a hardware-verified cycle-accurate simulator. This result was improved following an investigation of how the produced instructions were being used by the compiler. It was established that the models the automatic tools were using to develop instructions did not take account of how well the compiler could realistically use them. Adding additional parameters to the search heuristic to account for compiler issues increased the speed-up from 1.11x to 1.24x. An alternative approach using a re-designed hardware interface was also investigated and this achieved a speed-up of 1.26x while reducing hardware and compiler complexity.
A complementary, high-level, method of exploiting dual memory banks was created to increase memory bandwidth to accommodate the increased data-processing bandwidth provided by extension instructions. Finally, the compiler was considered for use in a non-conventional role where rather than generating code it is used to apply source-level transformations prior to the generation of extension instructions and thus affect the shape of the instructions that are generated.

Edinburgh Research Archive

Doctor of Philosophy

Author: Whitman Zachary Layne
Publication venue: University of Utah
Publication date: 01/05/2012

Fatigue cracks typically occur at stress risers such as geometry changes and holes. This type of failure has serious safety and economic repercussions affecting structures such as aircraft. The need to prevent catastrophic failure due to fatigue cracks and other discontinuities has led to durability and damage tolerant methodologies influencing the design of aircraft structures. Holes in a plate or sheet filled with a fastener are common fatigue critical locations in aircraft structure requiring damage tolerance analysis (DTA). Often, the fastener is transferring load which leads to a loading condition involving both far-field stresses such as tension and bending, and localized bearing at the hole. The difference between the bearing stress and the tensile field at the hole is known as load transfer. The ratio of load transfer as well as the magnitude of the stresses plays a significant part in how quickly a crack will progress to failure. Unfortunately, the determination of load transfer in a complex joint is far from trivial. Many methods exist in the open literature regarding the analysis of splices, doublers and attachment joints to determine individual fastener loads. These methods work well for static analyses but greater refinement is needed for crack growth analysis. The first fastener in a splice or joint is typically the most critical but different fastener flexibility equations will all give different results. The constraint of the fastener head and shop end, along with the type of fastener, affects the stiffness or flexibility of the fastener.

The University of Utah: J. Willard Marriott Digital Library

Exploration architecturale de communications-sur-puce au niveau système [System-level architectural exploration of on-chip communications]

Author: Migliorini Cédric
Publication date: 01/01/2008

Multiprocessor system-on-chip -- The growing need -- Software -- Hardware -- Design methodologies and platforms -- On-chip communications -- The different architectures -- Network-on-chip -- Analysis techniques -- Architectural exploration methods -- Architectural exploration of on-chip communications -- The Space platform -- Exploration methodology -- Components at the TF level -- Components at the BCA level -- The window method in bridges -- Auxiliary components to help improve the multibus network -- Analysis of the exploration and of performance -- Measurement tools -- Comparison of simulation estimates at the TF and BCA levels -- Performance through the exploration methodology -- Risks associated with using the direct bridge

PolyPublie

Instruction and data cache modeling for timing analysis in real-time systems

Author: LI YANHUI
Publication date: 09/02/2009

Master's, Master of Engineering

ScholarBank@NUS

Instrumenting and analyzing platform-independent communication in applications

Author: Nilakantan Siddharth
Publication venue: Drexel University

The performance of microprocessors is limited by communication. This limitation, sometimes alluded to as the memory wall, refers to the hardware-level cost of communicating with memory. Recent studies have found that the promise of speedup from transistor scaling, or employing heterogeneous processors, such as GPUs, is diminished when such hardware communication costs are included. Based on the insight that hardware communication at run-time is a manifestation of communication in software, this dissertation proposes that automatically capturing and classifying software-level communication is the first step in performing fast, early-stage design space exploration of future multicore systems. Software-level communication refers to the exchange of data between software entities such as functions, threads or basic blocks. Communication classification helps differentiate the first-time use from the reuse of communicated data, and distinguishes between communication external to a software entity and local communication within a software entity. We present Sigil, a novel tool that automatically captures and classifies software-level communication in an efficient way. Due to its platform-independent nature, software-level communication can be useful during the early-stage design of future multicore systems. Using the two different representations of output data that Sigil produces, we show that the measurement of software-level communication can be used to analyze i) function-level interaction in single-threaded programs to determine which specialized logic can be included in future heterogeneous multicore systems, and ii) thread-level interaction in multi-threaded programs to aid in chip multi-processor (CMP) design space exploration. Ph.D., Electrical Engineering -- Drexel University, 201

Drexel Libraries E-Repository and Archives

Kommunikation und Bildverarbeitung in der Automation [Communication and Image Processing in Automation]

Publication venue: Springer Science and Business Media LLC

This open-access conference volume contains the best contributions from the 9th annual colloquium "Kommunikation in der Automation" (KommA 2018) and the 6th annual colloquium "Bildverarbeitung in der Automation" (BVAu 2018). The colloquia took place on 20 and 21 November 2018 at the SmartFactoryOWL, a joint facility of Fraunhofer IOSB-INA and the Technische Hochschule Ostwestfalen-Lippe. The latest research results presented in the fields of industrial communication technology and image processing extend the current state of research and technology. The illustrative examples from the field of automation contained in the contributions place the results in a direct application context.

OAPEN Library

A cumulative index to the 1977 issues of a continuing bibliography on aerospace medicine and biology

This publication is a cumulative index to the abstracts contained in Supplements 164 through 175 of Aerospace Medicine and Biology: A Continuing Bibliography. It includes three indexes: subject, personal author, and corporate source.

NASA Technical Reports Server