140 research outputs found

    An Early Real-Time Checker for Retargetable Compile-Time Analysis

    With the demand for energy-efficient embedded computing and the rise of heterogeneous architectures, automatically retargetable techniques are likely to grow in importance. On the one hand, retargetable compilers do not handle real-time constraints properly. On the other hand, conventional worst-case execution time (WCET) approaches are not automatically retargetable: measurement-based methods require time-consuming dynamic characterization of target processors, whereas static program analysis and abstract interpretation are performed in a post-compilation phase and are therefore restricted to the set of supported targets. This paper proposes a retargetable technique that provides early real-time checking (ERTC) capabilities for design space exploration. The technique provides a general (minimum, maximum and exact-delay) timing analysis at compile time. It allows the early detection of inconsistent time-constraint combinations prior to the generation of binary executables, thereby promising higher design productivity. ERTC is a complement to state-of-the-art design flows, which could benefit from early infeasibility detection and exploration of alternative target processors before the binary executables are submitted to tight-bound BCET and WCET analyses for the selected target processor.
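To make the idea of detecting inconsistent minimum, maximum and exact-delay constraints concrete, here is a minimal sketch. It is not the ERTC algorithm from the paper; it assumes the constraints can be written as difference constraints between event times and uses a textbook Bellman-Ford negative-cycle check. The names `Constraint` and `is_consistent` and the tuple layout are hypothetical.

```python
from typing import List, Tuple

# Hypothetical layout: (kind, src, dst, delay) constrains t[dst] - t[src],
# where kind is "min", "max" or "exact" and events are numbered from 0.
Constraint = Tuple[str, int, int, int]


def is_consistent(num_events: int, constraints: List[Constraint]) -> bool:
    """Return True if some assignment of event times meets every constraint."""
    edges = []
    for kind, a, b, d in constraints:
        if kind not in ("min", "max", "exact"):
            raise ValueError(f"unknown constraint kind: {kind}")
        if kind in ("max", "exact"):
            edges.append((a, b, d))    # t[b] - t[a] <= d
        if kind in ("min", "exact"):
            edges.append((b, a, -d))   # t[a] - t[b] <= -d
    if num_events == 0:
        return True
    # Starting every distance at 0 acts like a virtual source node.
    dist = [0] * num_events
    for _ in range(num_events):
        changed = False
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            return True    # converged, so the constraints are satisfiable
    return False           # still relaxing: a negative cycle exists


# Two 5-cycle minimum delays cannot fit inside an 8-cycle maximum.
too_tight = [("min", 0, 1, 5), ("min", 1, 2, 5), ("max", 0, 2, 8)]
feasible = [("min", 0, 1, 5), ("min", 1, 2, 5), ("max", 0, 2, 12)]
```

An inconsistent combination shows up as a negative-weight cycle in the constraint graph, which is why the check needs no target-specific timing data beyond the delays themselves.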

    Scheduling Analysis from Architectural Models of Embedded Multi-Processor Systems

    As embedded systems need more and more computing power, many products require hardware platforms based on multiple processors. For systems with real-time constraints, scheduling analysis tools are mandatory to validate the design choices and to make better use of the processing capacity of the system. To this end, this paper presents an extension of the scheduling analysis tool Cheddar to deal with multi-processor scheduling. In a Model Driven Engineering approach, useful information about the scheduling of the application is extracted from a model expressed in an architectural language called AADL. We also define how the AADL model must be written to express the standard policies for multi-processor scheduling.

    A configurable vector processor for accelerating speech coding algorithms

    The growing demand for voice-over-packet (VoIP) services and multimedia-rich applications has made the efficient, real-time implementation of low-bit-rate speech coders on embedded VLSI platforms increasingly important. Such speech coders are designed to substantially reduce bandwidth requirements, thus enabling dense multichannel gateways in a small form factor. This, however, comes at a high computational cost, which mandates the use of very high performance embedded processors. This thesis investigates the potential acceleration of two major ITU-T speech coding algorithms, namely G.729A and G.723.1, through their efficient implementation on a configurable, extensible vector embedded CPU architecture. New scalar and vector ISAs were introduced, which resulted in up to 80% reduction in the dynamic instruction count of both workloads. These instructions were subsequently encapsulated into a parametric, hybrid SISD (scalar processor)–SIMD (vector) processor. This work presents the research and implementation of the vector datapath of this vector coprocessor, which is tightly coupled to a Sparc-V8 compliant CPU, the optimization and simulation methodologies employed, and the use of Electronic System Level (ESL) techniques to rapidly design SIMD datapaths.

    From High Level Architecture Descriptions to Fast Instruction Set Simulators

    As computer systems become increasingly complex and diverse, so too do the architectures they implement. This leads to an increase in complexity in the tools used to design new hardware and software. One particularly important tool in hardware and software design is the Instruction Set Simulator, which is used to prototype new architectures and hardware features, verify hardware, and test and debug software. Many Architecture Description Languages exist which facilitate the description of new architectural or hardware features and generate tools such as simulators. However, these typically suffer from poor performance, are difficult to test effectively, and may be limited in functionality. This thesis considers three objectives when developing Instruction Set Simulators: performance, correctness, and completeness, and presents techniques which contribute to each of these. Performance is obtained by combining Dynamic Binary Translation techniques with a novel analysis of high level architecture descriptions. This makes use of partial evaluation techniques in order to improve both the translation system and the quality of the translated code, leading to a performance improvement of over 2.5x compared to a naïve implementation. This thesis also presents techniques which contribute to the correctness objective. Each possible behaviour of each described instruction is used to guide the generation of a test case. Constraint satisfaction techniques are used to determine the necessary instruction encoding and context for each behaviour to be produced. It is shown that this is a significant improvement over benchmark-driven testing, and this technique has led to the discovery of several bugs and inconsistencies in multiple state of the art instruction set simulators. Finally, several challenges in ‘Full System’ simulation are addressed, contributing to both the performance and completeness objectives.
Full System simulation generally carries significant performance costs compared with other simulation strategies. Crucially, instructions which access memory require virtual to physical address translation and can now cause exceptions. Both of these processes must be correctly and efficiently handled by the simulator. This thesis presents novel techniques to address this issue which provide up to a 1.65x speedup over a state of the art solution.

    A new finite element based parameter to predict bone fracture

    Dual Energy X-Ray Absorptiometry (DXA) is currently the most widely adopted non-invasive clinical technique to assess bone mineral density and bone mineral content in human research and represents the primary tool for the diagnosis of osteoporosis. DXA measures areal bone mineral density (BMD), which does not account for the three-dimensional structure of the vertebrae or for the distribution of bone mass. The result is that longitudinal DXA can only predict about 70% of vertebral fractures. This study proposes a complementary tool, based on Finite Element (FE) models, to improve DXA accuracy. Bone is simulated as an elastic and inhomogeneous material, with stiffness distribution derived from DXA greyscale images of density. The numerical procedure simulates a compressive load on each vertebra to evaluate the local minimum principal strain values. From these values, both the local average and the maximum strains are computed over the cross sections and along the height of the analysed bone region, to provide a parameter, named Strain Index of Bone (SIB), which could be considered as a bone fragility index. The procedure is initially validated on 33 cylindrical trabecular bone samples obtained from porcine lumbar vertebrae, experimentally tested under static compressive loading. Comparing the experimental mechanical parameters with the SIB, we found a higher correlation of the ultimate stress, σ_ULT, with the SIB values (R²_adj = 0.63) than that observed with the conventional DXA-based clinical parameters, i.e. Bone Mineral Density, BMD (R²_adj = 0.34) and Trabecular Bone Score, TBS (R²_adj = -0.03). The paper finally presents a few case studies of numerical simulations carried out on human lumbar vertebrae. If our results are confirmed in prospective studies, SIB could be used, together with BMD and TBS, to improve fracture risk assessment and support the clinical decision to take specific drugs for metabolic bone diseases.
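A reader may wonder how an adjusted R² can be negative, as reported for TBS. The sketch below uses the standard definition of adjusted R²; the abstract does not state the regression model, so the single-predictor default is an assumption, and `adjusted_r2` is a hypothetical helper name.

```python
def adjusted_r2(r2: float, n_samples: int, n_predictors: int = 1) -> float:
    """Standard adjusted R^2: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Assumption: 33 samples (as in the abstract) and one predictor.
# A raw R^2 of zero is then penalised to slightly below zero.
no_fit = adjusted_r2(0.0, n_samples=33)
```

Under these assumptions the penalty factor is 32/31, so any predictor with essentially no explanatory power ends up with a small negative adjusted value.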

    Eine adaptive Architekturbeschreibung für eingebettete Multicoresysteme (An Adaptive Architecture Description for Embedded Multicore Systems)

    Increasing the efficacy of automated instruction set extension

    The use of Instruction Set Extension (ISE) in customising embedded processors for a specific application has been studied extensively in recent years. The addition of a set of complex arithmetic instructions to a baseline core has proven to be a cost-effective means of meeting design performance requirements. This thesis proposes and evaluates a reconfigurable ISE implementation called “Configurable Flow Accelerators” (CFAs), a number of refinements to an existing Automated ISE (AISE) algorithm called “ISEGEN”, and the effects of source form on AISE. The CFA is demonstrated repeatedly to be a cost-effective design for ISE implementation. A temporal partitioning algorithm called “staggering” is proposed and demonstrated on average to reduce the area of CFA implementation by 37% for only an 8% reduction in acceleration. This thesis then turns to concerns within the ISEGEN AISE algorithm. A methodology for finding a good static heuristic weighting vector for ISEGEN is proposed and demonstrated. Up to 100% of merit is shown to be lost or gained through the choice of vector. ISEGEN early-termination is introduced and shown to improve the runtime of the algorithm by up to 7.26x, and 5.82x on average. An extension to the ISEGEN heuristic to account for pipelining is proposed and evaluated, increasing acceleration by up to an additional 1.5x. An energy-aware heuristic is added to ISEGEN, which reduces the energy used by a CFA implementation of a set of ISEs by an average of 1.6x, up to 3.6x. This result directly contradicts the frequently espoused notion that “bigger is better” in ISE. The last stretch of work in this thesis is concerned with source-level transformation: the effect of changing the representation of the application on the quality of the combined hardware-software solution. A methodology for combined exploration of source transformation and ISE is presented, and demonstrated to improve the acceleration of the result by an average of 35% versus ISE alone.
Floating point is demonstrated to perform worse than fixed point, for all design concerns and applications studied here, regardless of ISEs employed.

    Embedded DSP Processor Design using Coware Processor Designer and Magma Layout Tool

    A Digital Signal Processing (DSP) application can be implemented in a variety of ways. The objective of this project is to design an embedded DSP processor driven by its own instruction set. Such a processor is called an Application Specific Instruction Set Processor (ASIP). ASIPs are becoming essential to convergent System on Chip (SoC) design. There are usually two approaches to designing an ASIP: one at Register Transfer Level (RTL), and another one level above RTL, known as Electronic System Level (ESL). Architecture Description Languages (ADLs) have recently become popular because they allow quick and optimal design convergence during ASIP design. In this project we first concentrate on the implementation and optimization of an ASIP using an ADL known as Language for Instruction Set Architecture (LISA) and the CoWare Processor Designer environment. We have written a LISA 2.0 description of the processor. Given the LISA code, the CoWare Processor Designer (PD) then generates software development tools such as an assembler, disassembler, linker and compiler. A particular application in assembly language, which computes a convolution using an FIR filter, is then run on the processor. Provided that the functionality of the processor is correct, synthesizable RTL for the processor can be generated using the CoWare Processor Generator. Using the generated RTL, we implemented our processor in the following IC design technologies:
    • Semi-Custom IC Design Technology: the RTL is synthesized using the Magma Blast Create tool and the final layout is drawn using the Magma Blast Fusion tool.
    • Programmable Logic Device IC Design Technology: the processor is programmed onto a Field Programmable Gate Array (FPGA), a Xilinx Virtex II Pro.
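For readers unfamiliar with the test application, the following is a minimal sketch of FIR filtering as a direct-form convolution. It is plain Python rather than the project's assembly code, and the function name `fir_filter` and the example coefficients are hypothetical.

```python
from typing import List, Sequence


def fir_filter(samples: Sequence[float], taps: Sequence[float]) -> List[float]:
    """Direct-form FIR: y[n] = sum over k of taps[k] * samples[n - k]."""
    output = []
    for n in range(len(samples)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:             # samples before the start count as zero
                acc += h * samples[n - k]
        output.append(acc)
    return output


# A two-tap moving average smooths a ramp input.
smoothed = fir_filter([1, 2, 3, 4], [0.5, 0.5])
```

The inner loop is a chain of multiply-accumulate steps, which is the kind of operation a DSP-oriented instruction set is typically designed to execute efficiently.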

    A Review of Natural Joint Systems and Numerical Investigation of Bio-Inspired GFRP-to-Steel Joints.

    There is a great variety of joint types in nature which can inspire engineering joints. In order to design such biomimetic joints, it is first important to understand how biological joints work. A comprehensive literature review, considering natural joints from a mechanical point of view, was undertaken. This was used to develop a taxonomy based on the different methods/functions that nature successfully uses to attach dissimilar tissues. One of the key methods that nature uses to join dissimilar materials is a transitional zone of stiffness at the insertion site. This method was used to propose bio-inspired solutions with a transitional zone of stiffness at the joint site for several glass fibre reinforced plastic (GFRP) to steel adhesively bonded joint configurations. The transition zone was used to reduce the material stiffness mismatch of the joint parts. A numerical finite element model was used to identify the optimum variation in material stiffness that minimises potential failure of the joint. The best bio-inspired joints showed a 118% increase in joint strength compared to the standard joints. The authors acknowledge the financial support provided by the Engineering and Physical Sciences Research Council (EPSRC) and Dowty Propellers (part of GE Aviation) via an industrial CASE studentship. This is the final version of the article. It first appeared from MDPI via http://dx.doi.org/10.3390/ma9070566