1,398 research outputs found

    A CAD Tool for Synthesizing Optimized Variants of Altera\u27s Nios II Soft-Core Processor

    Get PDF
    Soft-core processors offer embedded system designers the benefits of customization, flexibility and reusability. Altera\u27s NIOS II soft-core processor is a popular, commercially available soft-core processor that can be implemented on a variety of Altera FPGAs. In this thesis, the Nios II soft-core processor from Altera Corporation was studied and a VHDL implementation, called UW_Nios II, was developed. UW_Nios II was developed to enable us to perform design space exploration (DSE) for the Nios II processor. It was evaluated and compared with Altera Nios II and shown to be competitive. SCBuild is an existing CAD tool that was developed to enable DSE of soft-core processors. We modified SCBuild to automatically explore the design space of the UW_Nios II using a genetic algorithm. This tool can accurately estimate the area and critical path delay of different variants of the UW_Nios II on a field programmable gate array. Through experiments conducted using SCBuild, it was shown that employing a genetic algorithm to explore the design space of parameterized Nios II core, with a large design space, helps designers find optimized variants of UW_Nios II

    SEPARATING INSTRUCTION FETCHES FROM MEMORY ACCESSES : ILAR (INSTRUCTION LINE ASSOCIATIVE REGISTERS)

    Get PDF
    Due to the growing mismatch between processor performance and memory latency, many dynamic mechanisms which are “invisible” to the user have been proposed: for example, trace caches and automatic pre-fetch units. However, these dynamic mechanisms have become inadequate due to implicit memory accesses that have become so expensive. On the other hand, compiler-visible mechanisms like SWAR (SIMD Within A Register) and LARs (Line Associative Registers) are potentially more effective at improving data access performance. This thesis investigates applying the same ideas to improve instruction access. ILAR (Instruction LARs) store instructions in wide registers. Instruction blocks are explicitly loaded into ILAR, using block compression to enhance memory bandwidth. The control flow of the program then refers to instructions directly by their position within an ILAR, rather than by lengthy memory addresses. Because instructions are accessed directly from within registers, there is no implicit instruction fetch from memory. This thesis proposes an instruction set architecture for ILAR, investigates a mechanism to load ILAR using the best available block compression algorithm and also develop hardware descriptions for both ILAR and a conventional memory cache model so that performance comparisons could be made on the instruction fetch stage

    MURAC: A unified machine model for heterogeneous computers

    Get PDF
    Includes bibliographical referencesHeterogeneous computing enables the performance and energy advantages of multiple distinct processing architectures to be efficiently exploited within a single machine. These systems are capable of delivering large performance increases by matching the applications to architectures that are most suited to them. The Multiple Runtime-reconfigurable Architecture Computer (MURAC) model has been proposed to tackle the problems commonly found in the design and usage of these machines. This model presents a system-level approach that creates a clear separation of concerns between the system implementer and the application developer. The three key concepts that make up the MURAC model are a unified machine model, a unified instruction stream and a unified memory space. A simple programming model built upon these abstractions provides a consistent interface for interacting with the underlying machine to the user application. This programming model simplifies application partitioning between hardware and software and allows the easy integration of different execution models within the single control ow of a mixed-architecture application. The theoretical and practical trade-offs of the proposed model have been explored through the design of several systems. An instruction-accurate system simulator has been developed that supports the simulated execution of mixed-architecture applications. An embedded System-on-Chip implementation has been used to measure the overhead in hardware resources required to support the model, which was found to be minimal. An implementation of the model within an operating system on a tightly-coupled reconfigurable processor platform has been created. This implementation is used to extend the software scheduler to allow for the full support of mixed-architecture applications in a multitasking environment. Different scheduling strategies have been tested using this scheduler for mixed-architecture applications. The design and implementation of these systems has shown that a unified abstraction model for heterogeneous computers provides important usability benefits to system and application designers. These benefits are achieved through a consistent view of the multiple different architectures to the operating system and user applications. This allows them to focus on achieving their performance and efficiency goals by gaining the benefits of different execution models during runtime without the complex implementation details of the system-level synchronisation and coordination

    The formal verification of generic interpreters

    Get PDF
    The task assignment 3 of the design and validation of digital flight control systems suitable for fly-by-wire applications is studied. Task 3 is associated with formal verification of embedded systems. In particular, results are presented that provide a methodological approach to microprocessor verification. A hierarchical decomposition strategy for specifying microprocessors is also presented. A theory of generic interpreters is presented that can be used to model microprocessor behavior. The generic interpreter theory abstracts away the details of instruction functionality, leaving a general model of what an interpreter does

    Automatic Application-Specific Customization of Softcore Processor Microarchitecture, Masters Thesis, May 2006

    Get PDF
    Applications for constrained embedded systems are subject to strict runtime and resource utilization bounds. With soft core processors, application developers can customize the processor for their application, constrained by available hardware resources but aimed at high application performance. The more reconfigurable the processor is, the more options the application developers will have for customization and hence increased potential for improving application performance. However, such customization entails developing in-depth familiarity with all the parameters, in order to configure them effectively. This is typically infeasible, given the tight time-to-market pressure on the developers. Alternatively, developers could explore all possible configurations, but being exponential, this is infeasible even given only tens of parameters. This thesis presents an approach based on an assumption of parameter independence, for automatic microarchitecture customization. This approach is linear with the number of parameter values and hence, feasible and scalable. For the dimensions that we customize, namely application runtime and hardware resources, we formulate their costs as a constrained binary integer nonlinear optimization program. Though the results are not guaranteed to be optimal, we find they are near-optimal in practice. Our technique itself is general and can be applied to other design-space exploration problems

    Computer vision algorithms on reconfigurable logic arrays

    Full text link

    Program synthesis and vulnerability injection using a Grammar VAE

    Full text link
    The ability to automatically detect and repair vulnerabilities in code before deployment has become the subject of increasing attention. Some approaches to this problem rely on machine learning techniques, however the lack of datasets–code samples labeled as containing a vulnerability or not–presents a barrier to performance. We design and implement a deep neural network based on the recently developed Grammar Variational Autoencoder (VAE) architecture to generate an arbitrary number of unique C functions labeled in the aforementioned manner. We make several improvements on the original Grammar VAE: we guarantee that every vector in the neural network’s latent space decodes to a syntactically valid C function; we extend the Grammar VAE into a context-sensitive environment; and we implement a semantic repair algorithm that transforms syntactically valid C functions into fully semantically valid C functions that compile and execute. Users can control the semantic qualities of output functions with our constraint system. Our constraints allow users to modify the return type, change control flow structures, inject vulnerabilities into generated code, and more. We demonstrate the advantages of our model over other program synthesis models targeting similar applications. We also explore alternative applications for our model, including code plagiarism detection and compiler fuzzing, testing, and optimization
    • …
    corecore