140 research outputs found
An Early Real-Time Checker for Retargetable Compile-Time Analysis
ABSTRACT With the demand for energy-efficient embedded computing and the rise of heterogeneous architectures, automatically retargetable techniques are likely to grow in importance. On the one hand, retargetable compilers do not handle real-time constraints properly. On the other hand, conventional worst-case execution time (WCET) approaches are not automatically retargetable: measurement-based methods require time-consuming dynamic characterization of target processors, whereas static program analysis and abstract interpretation are performed in a post-compilation phase and are therefore restricted to the set of supported targets. This paper proposes a retargetable technique to grant early real-time checking (ERTC) capabilities for design space exploration. The technique provides a general (minimum, maximum and exact-delay) timing analysis at compile time. It allows the early detection of inconsistent time-constraint combinations prior to the generation of binary executables, thereby promising higher design productivity. ERTC is a complement to state-of-the-art design flows, which could benefit from early infeasibility detection and exploration of alternative target processors, before the binary executables are submitted to tight-bound BCET and WCET analyses for the selected target processor.
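A minimal sketch of what a compile-time min/max-delay analysis can look like (an illustration of the general idea, not the paper's actual ERTC algorithm; the cycle bounds and composition rules are assumptions). Each IR operation carries target-specific [min, max] cycle bounds, and a deadline check can report infeasibility before any binary is generated:

```python
# Interval-based min/max timing analysis sketch (illustrative; not the
# paper's ERTC algorithm). Each operation has [min, max] cycle bounds.

def seq(*bounds):
    # Sequential composition: lower and upper bounds add up.
    return (sum(b[0] for b in bounds), sum(b[1] for b in bounds))

def branch(then_b, else_b):
    # Either path may execute: widen the interval to cover both.
    return (min(then_b[0], else_b[0]), max(then_b[1], else_b[1]))

def check_deadline(bounds, deadline):
    # Early infeasibility: even the best case exceeds the deadline.
    lo, hi = bounds
    if lo > deadline:
        return "infeasible"
    if hi <= deadline:
        return "guaranteed"
    return "undecided"   # needs a tight-bound WCET analysis later

body = seq((2, 2), branch((3, 5), (1, 8)), (4, 6))
print(body)                      # (7, 16)
print(check_deadline(body, 6))   # infeasible
print(check_deadline(body, 20))  # guaranteed
```

The "undecided" outcome is exactly where ERTC would hand over to a conventional BCET/WCET analysis for the selected target.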
Scheduling Analysis from Architectural Models of Embedded Multi-Processor Systems
As embedded systems need more and more computing power, many products require hardware platforms based on multiple processors. In the case of real-time constrained systems, the use of scheduling analysis tools is mandatory to validate the design choices and to make better use of the processing capacity of the system. To this end, this paper presents the extension of the scheduling analysis tool Cheddar to deal with multi-processor scheduling. In a Model Driven Engineering approach, useful information about the scheduling of the application is extracted from a model expressed in an architectural language called AADL. We also define how the AADL model must be written to express the standard policies for multi-processor scheduling.
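As a textbook-level sketch of the kind of test such a tool automates (Cheddar itself supports many more policies and far tighter analyses; the task set here is invented), consider partitioned rate-monotonic scheduling with first-fit assignment and the Liu & Layland utilisation bound per processor:

```python
# Partitioned rate-monotonic schedulability sketch (illustrative; not
# Cheddar's implementation). Tasks are (wcet, period) pairs; a processor
# passes the Liu & Layland test when its total utilisation is at most
# n * (2^(1/n) - 1) for its n assigned tasks.

def ll_bound(n):
    return n * (2 ** (1 / n) - 1)

def fits(tasks, extra):
    n = len(tasks) + 1
    u = sum(c / t for c, t in tasks) + extra[0] / extra[1]
    return u <= ll_bound(n)

def first_fit(tasks, n_procs):
    # Assign each task to the first processor where the test still passes.
    procs = [[] for _ in range(n_procs)]
    for task in tasks:
        for p in procs:
            if fits(p, task):
                p.append(task)
                break
        else:
            return None  # unschedulable with this heuristic
    return procs

taskset = [(1, 4), (2, 5), (2, 10), (1, 8)]
print(first_fit(taskset, 2))
```

A `None` result does not prove infeasibility, only that this sufficient test with this partitioning heuristic fails, which is why exact response-time analyses matter.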
A configurable vector processor for accelerating speech coding algorithms
The growing demand for voice-over-packet (VoIP) services and multimedia-rich
applications has made the efficient, real-time implementation of low-bit-rate
speech coders on embedded VLSI platforms increasingly important. Such speech
coders are designed to substantially reduce bandwidth requirements, thus enabling
dense multichannel gateways in a small form factor. This, however, comes at a high
computational cost, which mandates the use of very high-performance embedded processors.
This thesis investigates the potential acceleration of two major ITU-T speech coding
algorithms, namely G.729A and G.723.1, through their efficient implementation on a
configurable extensible vector embedded CPU architecture. New scalar and vector ISAs
were introduced which resulted in up to 80% reduction in the dynamic instruction count
of both workloads. These instructions were subsequently encapsulated into a parametric,
hybrid SISD (scalar processor)–SIMD (vector) processor. This work presents the research
and implementation of the vector datapath of this vector coprocessor, which is tightly
coupled to a SPARC-V8-compliant CPU; the optimization and simulation methodologies
employed; and the use of Electronic System Level (ESL) techniques to rapidly design
SIMD datapaths.
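The dynamic-instruction-count reductions quoted above come from amortising per-element work across vector lanes. A back-of-the-envelope cost model for a dot-product kernel, the core operation of CELP-style coders (all figures here are illustrative assumptions, not the thesis's measurements):

```python
import math

# Toy cost model of scalar vs vector execution of a length-n dot product
# (illustrative; per-instruction costs are assumptions). A scalar loop
# spends roughly 4 instructions per element (loads, MAC, loop overhead);
# a width-w vector unit amortises those over w elements per instruction.

def scalar_count(n, per_elem=4):
    return n * per_elem

def vector_count(n, w, per_vec=4):
    return math.ceil(n / w) * per_vec

n = 40                      # e.g. a typical speech-coder subframe length
s = scalar_count(n)         # 160 dynamic instructions
v = vector_count(n, w=8)    # 20 dynamic instructions
print(f"reduction: {100 * (1 - v / s):.0f}%")
```

Even this crude model shows why reductions of the magnitude reported (up to 80%) are plausible for loop-dominated DSP workloads.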
From High Level Architecture Descriptions to Fast Instruction Set Simulators
As computer systems become increasingly complex and diverse, so too do the architectures
they implement. This leads to an increase in complexity in the tools used to design
new hardware and software. One particularly important tool in hardware and software
design is the Instruction Set Simulator, which is used to prototype new architectures and
hardware features, verify hardware, and test and debug software. Many Architecture
Description Languages exist which facilitate the description of new architectural or
hardware features, and generate tools such as simulators. However, these typically
suffer from poor performance, are difficult to test effectively, and may be limited in
functionality.
This thesis considers three objectives when developing Instruction Set Simulators:
performance, correctness, and completeness, and presents techniques which contribute
to each of these. Performance is obtained by combining Dynamic Binary Translation
techniques with a novel analysis of high level architecture descriptions. This makes use
of partial evaluation techniques in order to both improve the translation system, and to
improve the quality of the translated code, leading to a performance improvement of over
2.5x compared to a naïve implementation.
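The partial-evaluation idea can be sketched in miniature (an illustration of the general technique, not the thesis's translation system; the toy ISA and function names are invented). A generic interpreter re-decodes an instruction on every dispatch; specialising it with respect to one concrete encoding folds the decode away, leaving residual code that only touches the registers:

```python
# Partial evaluation of an instruction interpreter (illustrative sketch).

def execute(insn, regs):
    # Generic interpreter: re-inspects the instruction on every call.
    op, rd, ra, rb = insn
    if op == "add":
        regs[rd] = regs[ra] + regs[rb]
    elif op == "sub":
        regs[rd] = regs[ra] - regs[rb]

def specialise(insn):
    # Partial evaluation w.r.t. the static input `insn`: decode once,
    # return residual code with opcode and operands baked in.
    op, rd, ra, rb = insn
    if op == "add":
        return lambda regs: regs.__setitem__(rd, regs[ra] + regs[rb])
    if op == "sub":
        return lambda regs: regs.__setitem__(rd, regs[ra] - regs[rb])

regs = [0, 7, 5, 0]
translated = specialise(("sub", 3, 1, 2))
translated(regs)           # no decode on the hot path
print(regs[3])             # 2
```

A dynamic binary translator applies the same move at the level of generated machine code, which is where both the translation-speed and code-quality gains come from.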
This thesis also presents techniques which contribute to the correctness objective.
Each possible behaviour of each described instruction is used to guide the generation
of a test case. Constraint satisfaction techniques are used to determine the necessary
instruction encoding and context for each behaviour to be produced. It is shown that
this is a significant improvement over benchmark-driven testing, and this technique
has led to the discovery of several bugs and inconsistencies in multiple state-of-the-art
instruction set simulators.
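The behaviour-driven test-generation idea can be illustrated on a toy instruction (a sketch only: the thesis uses real constraint satisfaction, whereas this stand-in simply enumerates a tiny encoding space, and the instruction semantics are invented):

```python
from itertools import product

# Behaviour-driven test generation sketch (illustrative). For each
# behaviour path of a toy 4-bit shift instruction, find an encoding
# (value, amount) whose execution exercises exactly that path.

def behaviour(value, amount):
    # Toy semantics with three distinct behaviour paths.
    if amount == 0:
        return "identity"
    if (value << amount) & 0xF != (value << amount):
        return "overflow"
    return "shift"

def find_encoding(target):
    # Stand-in for a constraint solver: brute-force the encoding space.
    for value, amount in product(range(16), range(4)):
        if behaviour(value, amount) == target:
            return value, amount
    return None

tests = {b: find_encoding(b) for b in ("identity", "shift", "overflow")}
print(tests)
```

The payoff over benchmark-driven testing is coverage: every behaviour path gets at least one directed test, including paths (like overflow) that benchmarks may never exercise.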
Finally, several challenges in ‘Full System’ simulation are addressed, contributing
to both the performance and completeness objectives. Full System simulation generally
carries significant performance costs compared with other simulation strategies. Crucially,
instructions which access memory require virtual-to-physical address translation
and can now cause exceptions. Both of these processes must be correctly and efficiently
handled by the simulator. This thesis presents novel techniques to address this issue
which provide up to a 1.65x speedup over a state-of-the-art solution.
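A minimal sketch of why address translation dominates full-system simulation cost, and of the classic mitigation (illustrative only; this is not the thesis's technique, and the class and field names are invented): cache page-granular translations so the hot path is a single lookup, with a slow walk on a miss that may raise a fault.

```python
# Software-TLB sketch for a full-system simulator (illustrative).

PAGE_BITS = 12

class SimMMU:
    def __init__(self, page_table):
        self.page_table = page_table   # vpage -> ppage: the "slow" source
        self.cache = {}                # cached translations (software TLB)
        self.walks = 0                 # count of slow page-table walks

    def translate(self, vaddr):
        vpage = vaddr >> PAGE_BITS
        offset = vaddr & (1 << PAGE_BITS) - 1
        ppage = self.cache.get(vpage)
        if ppage is None:              # miss: slow walk, may fault
            self.walks += 1
            if vpage not in self.page_table:
                raise MemoryError(f"page fault at {vaddr:#x}")
            ppage = self.cache[vpage] = self.page_table[vpage]
        return ppage << PAGE_BITS | offset

mmu = SimMMU({0x10: 0x80})
print(hex(mmu.translate(0x10ABC)))   # via a page-table walk
print(hex(mmu.translate(0x10DEF)))   # served from the cache
print(mmu.walks)                     # 1
```

Real simulators must also invalidate such caches on page-table updates and privilege changes, which is where much of the engineering difficulty lies.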
A new finite element based parameter to predict bone fracture
Dual Energy X-Ray Absorptiometry (DXA) is currently the most widely adopted non-invasive clinical technique to assess bone mineral density and bone mineral content in human research and represents the primary tool for the diagnosis of osteoporosis. DXA measures areal bone mineral density, BMD, which does not account for the three-dimensional structure of the vertebrae or for the distribution of bone mass. The result is that longitudinal DXA can only predict about 70% of vertebral fractures. This study proposes a complementary tool, based on Finite Element (FE) models, to improve the DXA accuracy. Bone is simulated as an elastic and inhomogeneous material, with stiffness distribution derived from DXA greyscale images of density. The numerical procedure simulates a compressive load on each vertebra to evaluate the local minimum principal strain values. From these values, both the local average and the maximum strains are computed over the cross sections and along the height of the analysed bone region, to provide a parameter, named Strain Index of Bone (SIB), which could be considered a bone fragility index. The procedure is initially validated on 33 cylindrical trabecular bone samples obtained from porcine lumbar vertebrae, experimentally tested under static compressive loading. Comparing the experimental mechanical parameters with the SIB, we found a higher correlation of the ultimate stress, σULT, with the SIB values (R2adj = 0.63) than that observed with the conventional DXA-based clinical parameters, i.e. Bone Mineral Density, BMD (R2adj = 0.34), and Trabecular Bone Score, TBS (R2adj = -0.03). The paper finally presents a few case studies of numerical simulations carried out on human lumbar vertebrae. If our results are confirmed in prospective studies, the SIB could be used, together with BMD and TBS, to improve fracture risk assessment and support the clinical decision to prescribe specific drugs for metabolic bone diseases.
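A schematic of the post-processing step the abstract describes, reducing per-cross-section strain samples to a single scalar index (illustrative only: the actual SIB definition comes from the paper's FE pipeline, which this does not reproduce, and the sample values are invented):

```python
# Schematic strain-field reduction (illustrative; not the paper's SIB).
# Minimum principal strains are negative in compression; summarise each
# cross-section by its average, then take the most-compressive section.

def strain_index(sections):
    # sections: per-cross-section lists of minimum principal strain samples.
    averages = [sum(s) / len(s) for s in sections]
    return min(averages)   # worst (most compressive) sectional average

sections = [
    [-0.001, -0.002, -0.0015],
    [-0.004, -0.0035, -0.003],   # weakest cross-section
    [-0.002, -0.001, -0.0012],
]
print(strain_index(sections))
```

The point of such a reduction is that, unlike areal BMD, it is sensitive to *where* along the vertebral height the bone is weakest.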
Increasing the efficacy of automated instruction set extension
The use of Instruction Set Extension (ISE) in customising embedded processors for a specific
application has been studied extensively in recent years. The addition of a set of complex
arithmetic instructions to a baseline core has proven to be a cost-effective means of meeting
design performance requirements. This thesis proposes and evaluates a reconfigurable ISE
implementation called “Configurable Flow Accelerators” (CFAs), a number of refinements to
an existing Automated ISE (AISE) algorithm called “ISEGEN”, and the effects of source form
on AISE.
The CFA is demonstrated repeatedly to be a cost-effective design for ISE implementation.
A temporal partitioning algorithm called “staggering” is proposed and demonstrated on average
to reduce the area of CFA implementation by 37% for only an 8% reduction in acceleration.
This thesis then turns to concerns within the ISEGEN AISE algorithm. A methodology
for finding a good static heuristic weighting vector for ISEGEN is proposed and demonstrated.
Up to 100% of merit is shown to be lost or gained through the choice of vector. ISEGEN
early-termination is introduced and shown to improve the runtime of the algorithm by up to
7.26x, and 5.82x on average. An extension to the ISEGEN heuristic to account for pipelining
is proposed and evaluated, increasing acceleration by up to an additional 1.5x. An energy-aware
heuristic is added to ISEGEN, which reduces the energy used by a CFA implementation
of a set of ISEs by an average of 1.6x, up to 3.6x. This result directly contradicts the frequently
espoused notion that “bigger is better” in ISE.
The last stretch of work in this thesis is concerned with source-level transformation: the effect
of changing the representation of the application on the quality of the combined hardware-software
solution. A methodology for combined exploration of source transformation and ISE
is presented, and demonstrated to improve the acceleration of the result by an average of 35%
versus ISE alone. Floating point is demonstrated to perform worse than fixed point, for all
design concerns and applications studied here, regardless of the ISEs employed.
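A toy illustration of why the heuristic weighting vector matters so much in automated ISE selection (an invented merit function and candidate set, not ISEGEN's actual heuristic): each candidate pattern trades speedup against area and energy, and different weight vectors pick different candidates.

```python
# Weighted-merit candidate selection sketch (illustrative; not ISEGEN).

def merit(cand, w):
    # Higher speedup is good; area and energy are penalised.
    return w[0] * cand["speedup"] - w[1] * cand["area"] - w[2] * cand["energy"]

candidates = [
    {"name": "big_mac4",  "speedup": 3.0, "area": 8.0, "energy": 6.0},
    {"name": "small_add", "speedup": 1.4, "area": 1.0, "energy": 0.8},
]

def pick(w):
    return max(candidates, key=lambda c: merit(c, w))["name"]

print(pick((1.0, 0.1, 0.1)))   # performance-weighted -> big_mac4
print(pick((1.0, 0.1, 0.5)))   # energy-aware -> small_add
```

This is the mechanism behind the "bigger is better" contradiction above: raising the energy weight flips the choice from the large fused pattern to the small one.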
Embedded DSP Processor Design using Coware Processor Designer and Magma Layout Tool
A Digital Signal Processing (DSP) application can be implemented in a variety of ways. The objective of this project is to design an embedded DSP processor. The desired processor is run by an instruction set; such a processor is called an Application Specific Instruction Set Processor (ASIP). ASIPs are becoming essential to convergent System on Chip (SoC) design. There are usually two approaches to designing an ASIP: one at Register Transfer Level (RTL), and another at a level just above RTL, known as Electronic System Level (ESL). Architecture Description Languages (ADLs) have recently become popular because of their capability to achieve quick and optimal design convergence during the design of ASIPs.
In this project we first concentrate on the implementation and optimization of an ASIP using an ADL known as the Language for Instruction Set Architecture (LISA) and the CoWare Processor Designer environment. We have written a LISA 2.0 description of the processor. Given the LISA code, the CoWare Processor Designer (PD) then generates software development tools such as an assembler, disassembler, linker and compiler. A particular application in assembly language, computing a convolution using an FIR filter, is then run on the processor. Provided that the functionality of the processor is correct, synthesizable RTL for the processor can be generated using the CoWare Processor Generator.
Using the RTL generated, we implemented our processor in the following IC Design technologies:
• Semi-Custom IC Design Technology
Here, the RTL is synthesized using the Magma Blast Create tool and the final layout is drawn using the Magma Blast Fusion tool.
• Programmable Logic Device IC Design Technology
Here, the processor is implemented on a Field Programmable Gate Array (FPGA). The FPGA used for this purpose is a Xilinx Virtex-II Pro.
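As a reference model of the FIR convolution that the assembly test program computes (the project itself runs this in the processor's own assembly; this high-level sketch only fixes the expected numerical behaviour, y[n] = Σₖ h[k]·x[n−k]):

```python
# Golden reference for the FIR convolution test program (illustrative).

def fir(x, h):
    # Direct-form FIR: y[n] = sum over k of h[k] * x[n - k].
    y = []
    for n in range(len(x)):
        acc = 0
        for k in range(len(h)):
            if n - k >= 0:
                acc += h[k] * x[n - k]
        y.append(acc)
    return y

print(fir([1, 2, 3, 4], [1, 1, 1]))   # [1, 3, 6, 9]
```

Such a golden model is typically what the processor's assembly output is checked against when verifying functionality before RTL generation.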
A Review of Natural Joint Systems and Numerical Investigation of Bio-Inspired GFRP-to-Steel Joints.
There are a great variety of joint types used in nature which can inspire engineering joints. In order to design such biomimetic joints, it is first important to understand how biological joints work. A comprehensive literature review, considering natural joints from a mechanical point of view, was undertaken. This was used to develop a taxonomy based on the different methods/functions that nature successfully uses to attach dissimilar tissues. One of the key methods that nature uses to join dissimilar materials is a transitional zone of stiffness at the insertion site. This method was used to propose bio-inspired solutions with a transitional zone of stiffness at the joint site for several glass fibre reinforced plastic (GFRP) to steel adhesively bonded joint configurations. The transition zone was used to reduce the material stiffness mismatch of the joint parts. A numerical finite element model was used to identify the optimum variation in material stiffness that minimises potential failure of the joint. The best bio-inspired joints showed a 118% increase of joint strength compared to the standard joints. The authors acknowledge the financial support provided by the Engineering and Physical Sciences Research Council (EPSRC) and Dowty Propellers (part of GE Aviation) via an industrial CASE studentship. This is the final version of the article. It first appeared from MDPI via http://dx.doi.org/10.3390/ma9070566