81 research outputs found

An efficient global resource constrained technique for exploiting instruction level parallelism

Author: Nicolau Alexandru
Novack Steven
Publication venue: eScholarship, University of California
Publication date: 21/01/1992

A new Global Resource-constrained Percolation (GRiP) scheduling technique is presented for exploiting instruction level parallelism. Other techniques that have been proposed either have been prohibitively expensive in terms of computation or have limited parallelism. The GRiP technique has been implemented and simulation results are presented

eScholarship - University of California

An LMS Adaptive Filter Using Distributed Arithmetic - Algorithms and Architectures

Author: Honma Naoki
Takahashi Kyo
Tsunekawa Yoshitaka
Publication venue: 'IntechOpen'
Publication date: 06/09/2011

IntechOpen

Crossref

Custom architectures for fuzzy and neural networks controllers

Author: Acosta Nelson
Tosini Marcelo Alejandro
Publication date: 06/02/2004

Standard hardware, dedicated microcontrollers, or application-specific circuits can implement fuzzy logic or neural network controllers. This paper presents efficient architectural approaches for developing controllers using specific circuits. A generator uses several tools that translate the initial problem specification into a specific circuit implementation by means of HDL descriptions. These HDL description files can be synthesized to obtain the FPGA configuration bit-stream. Facultad de Informátic

Servicio de Difusión de la Creación Intelectual

Hardware/Software Cost Analysis of Interrupt Processing Strategies

Author: Garalnabi Loai E.
Publication venue: 'Oklahoma State University Library'
Publication date: 01/05/1997

The purpose of this work was to study and analyze interrupt processing strategies from a cost point of view. The cost referred to is the architectural cost of implementing a particular interrupt processing strategy. The scope of this study included five strategies. All the strategies under investigation were originally designed to make it possible for pipelined processors to support precise interrupts. To analyze the cost of each strategy, its design and implementation were carefully studied. Based on that, it was possible to determine or closely estimate the amount of hardware and the complexity of software needed to implement each strategy. On pipelined processors, interrupt processing can be broken down into six phases. Some phases, such as detecting the interrupt, running the interrupt handler, and resuming the interrupted process (for precise interrupts), are common to all strategies. The strategies differ in whether they finish the pending instructions once an interrupt has occurred or just flush the pipeline. They also differ in whether they undo state changes or maintain a precise state at all times. Hardware dominates the cost of all but one of the strategies, namely Checkpoint Repair, for which the cost varies from being mainly composed of hardware costs to being mainly composed of software costs according to the strategy's implementation

SHAREOK repository

Instruction Set Simulator for Transport Triggered Architectures

Author: Jääskeläinen Pekka
Publication date: 01/09/2005

Due to the specific requirements of some embedded system applications, general-purpose processors are usually not optimal for the task at hand. Thus, there is a need for application-specific processors, which are tailored for the application and its requirements. However, processor design is a demanding task. Therefore, the processor design flow needs to be automated as completely as possible. TTA Codesign Environment (TCE) is a toolset that provides a semi-automated processor design flow, which includes "design space exploration", a process that helps to find an optimal processor architecture for the given application semi-automatically. The processor paradigm utilized in the TCE design flow is called transport triggered architecture (TTA). TTA is a relatively simple and highly modularized processor architecture which allows easy customization. One of the leading ideas of TTA is to move complexity from the processor hardware to the compiler. Consequently, the most complicated tool in TCE is the compiler. Instruction set simulation is mainly needed in verifying the compiler output and in design space exploration. The project completed for this thesis consisted of the design, implementation, and verification of an instruction set simulator for TCE. The thesis describes the main requirements and most important software design decisions of the TCE instruction set simulator. In addition, the verification of simulation correctness is described and performance benchmarks are presented. Finally, several improvement ideas and brief plans for implementing them are presented.

Trepo - Institutional Repository of Tampere University

Vector Operation Support for Transport Triggered Architectures

Author: Järvelä Mikko
Publication date: 04/06/2014

High performance and low power consumption requirements usually restrict the design process of embedded processors. Traditional design solutions do not meet today's requirements, which instead demand exploiting varying levels of parallelism. In order to reduce design time and effort, a powerful toolset is required to design new parallel processors effectively. TTA-based Co-design Environment (TCE) is a toolset developed at Tampere University of Technology for designing customized parallel processors. It is based on a modular Transport Triggered Architecture (TTA) processor template, which provides easy customization and allows exploiting instruction-level parallelism for high-performance execution. The Single Instruction, Multiple Data (SIMD) paradigm provides powerful data-level parallel vector computation for many applications in embedded processing. It is one of the most common ways to exploit parallelism in today's processor designs in order to gain greater execution efficiency and, therefore, to meet the performance requirements. This work describes how data-level parallel SIMD support is introduced and integrated into the TCE design flow for more diverse parallelism support. The support allows designers to customize and program processors with wide vector operations. The work presents the required modification points along with the new tools that were added to the toolset. Much weight is given to the retargetable compiler, which must be able to adapt to all resources on TTA machines. The added tools were required to provide as much automatic behavior as possible to maintain an effective design flow. In addition, the thesis presents how the modifications and new features were verified

University of Debrecen Electronic Archive

Trepo - Institutional Repository of Tampere University

Applications in Electronics Pervading Industry, Environment and Society

Publication venue: 'MDPI AG'
Publication date: 11/01/2022

This book features the manuscripts accepted for the Special Issue “Applications in Electronics Pervading Industry, Environment and Society—Sensing Systems and Pervasive Intelligence” of the MDPI journal Sensors. Most of the papers come from a selection of the best papers of the 2019 edition of the “Applications in Electronics Pervading Industry, Environment and Society” (APPLEPIES) Conference, which was held in November 2019. All these papers have been significantly enhanced with novel experimental results. The papers give an overview of the trends in research and development activities concerning the pervasive application of electronics in industry, the environment, and society. The focus of these papers is on cyber physical systems (CPS), with research proposals for new sensor acquisition and ADC (analog to digital converter) methods, high-speed communication systems, cybersecurity, big data management, and data processing including emerging machine learning techniques. Physical implementation aspects are discussed as well as the trade-off found between functional performance and hardware/system costs

Directory of Open Access Books (DOAB)

Hardware/software co-design of fractal features based fall detection system

Author: Gibson Ryan M.
Morison Gordon
Skelton Dawn A.
Tahir Ahsen
Publication venue: 'MDPI AG'
Publication date: 18/04/2020

Falls are a leading cause of death in older adults and result in high levels of mortality, morbidity and immobility. Fall Detection Systems (FDS) are imperative for timely medical aid and have been known to reduce the death rate by 80%. We propose a novel wearable sensor FDS which exploits fractal dynamics of fall accelerometer signals. Fractal dynamics can be used as an irregularity measure of signals and our work shows that it is a key discriminant for classification of falls from other activities of life. We design, implement and evaluate a hardware feature accelerator for computation of fractal features through multi-level wavelet transform on a reconfigurable embedded System on Chip, a Zynq device, for evaluating wearable accelerometer sensors. The proposed FDS utilises a hardware/software co-design approach with a hardware accelerator for fractal features and a software implementation of Linear Discriminant Analysis on an embedded ARM core for high accuracy and energy efficiency. The proposed system achieves 99.38% fall detection accuracy, 7.3× speed-up and 6.53× improvement in power consumption compared to the software-only execution, with an overall performance per Watt advantage of 47.6×, while consuming low reconfigurable resources at 28.67%
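
The performance-per-Watt figure in the abstract above can be cross-checked with a short calculation. This is a minimal sketch, assuming that the "6.53× improvement in power consumption" means the software-only power divided by the co-design power, so that the performance-per-Watt gain is the product of the two reported factors. The function name is hypothetical and not taken from the paper.

```python
# Cross-check of the performance-per-Watt gain quoted in the abstract.
# Assumption: gain = (time_sw / time_hw) * (power_sw / power_hw).
SPEEDUP = 7.3             # reported speed-up over software-only execution
POWER_IMPROVEMENT = 6.53  # reported power improvement, read as power_sw / power_hw


def perf_per_watt_gain(speedup, power_improvement):
    """Performance per Watt is (1 / time) / power, so the two ratios multiply."""
    return speedup * power_improvement


gain = perf_per_watt_gain(SPEEDUP, POWER_IMPROVEMENT)
print(f"performance-per-Watt gain: about {gain:.2f}x")
```

Under this reading the product is close to the reported 47.6×; the small remaining gap is presumably due to rounding of the published factors.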

Multidisciplinary Digital Publishing Institute

ResearchOnline@GCU

Implementing carrier recovery for LTE 20 MHz on transport triggered architecture

Author: Mubashir Ali
Publication date: 08/05/2013

Synchronization is a critical function in digital communications. Its failure may cause catastrophic effects on the transmission system performance. It is very important that the receiver is synchronised with the transmitter because it is not possible to correct frequencies/phases without any control mechanisms. Synchronization is different in Third Generation Partnership Project (3GPP) Long Term Evolution (LTE) for uplink and downlink because of the choice of multiple access scheme. The multiple access scheme for the LTE downlink is Orthogonal Frequency Division Multiple Access (OFDMA), and Single Carrier-Frequency Division Multiple Access (SC-FDMA) is used for the uplink. OFDMA is susceptible to Carrier Frequency Offset (CFO). In the case of a typical LTE system with a carrier frequency of 2.1 GHz, a frequency drift of 10 ppm (10×10⁻⁶) of the local oscillator can cause an offset of 21 kHz. The LTE system employs a fixed subcarrier spacing of 15 kHz, so this offset caused by the local oscillator corresponds to 1.40 subcarrier spacings. The receiver extracts the information from the received signal to synchronise and compensate for any carrier frequency/phase offset. Increasing demand for data-driven applications has put stress on communication systems to provide high data rates and increased bandwidth. This demand has been ever increasing and requires new standards to evolve along with efficient hardware. It has been difficult to develop hardware at the pace at which new communication standards are developing. It also increases the cost of deploying a technology for a brief period of time without covering the huge capital invested in the network. In order to keep pace with evolving standards and cover the huge network costs, industry needs Software-Defined Radio (SDR). SDR is a radio communication technology that is based on software-defined wireless communication protocols instead of hardwired implementations. System components that are usually implemented in hardware are implemented by means of software on a computer or embedded system. An LTE carrier recovery algorithm for the LTE downlink with 20 MHz system bandwidth has been implemented in this thesis. The architecture chosen for implementation is Transport Triggered Architecture (TTA), with the goal of meeting real-time constraints along with the flexibility and power consumption needed for an SDR platform. The target programming language is C with TTA-specific extensions instead of hand-optimized assembly, with the aim of reducing the whole design time while still achieving the required optimizations and throughput. This design cycle time is also one of the important aspects of product development in the industry
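
The carrier frequency offset figures quoted in the abstract above can be reproduced with a short calculation. This is a minimal sketch using only the numbers stated in the abstract (2.1 GHz carrier, 10 ppm drift, 15 kHz subcarrier spacing); the function and constant names are hypothetical and are not taken from the thesis.

```python
# Worked check of the carrier frequency offset (CFO) figures in the abstract.
CARRIER_HZ = 2.1e9             # LTE carrier frequency (2.1 GHz)
DRIFT_PPM = 10                 # local oscillator drift (10 ppm)
SUBCARRIER_SPACING_HZ = 15e3   # fixed LTE subcarrier spacing (15 kHz)


def cfo_hz(carrier_hz, drift_ppm):
    """Absolute offset in Hz: carrier frequency times drift in parts per million."""
    return carrier_hz * drift_ppm * 1e-6


def cfo_in_subcarriers(offset_hz, spacing_hz):
    """Offset normalized to the subcarrier spacing."""
    return offset_hz / spacing_hz


offset = cfo_hz(CARRIER_HZ, DRIFT_PPM)
normalized = cfo_in_subcarriers(offset, SUBCARRIER_SPACING_HZ)
print(f"CFO = {offset / 1e3:.1f} kHz = {normalized:.2f} subcarrier spacings")
```

An offset larger than one subcarrier spacing means the receiver has to estimate both an integer and a fractional part of the offset, which is why the abstract treats a 10 ppm drift as significant.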

Trepo - Institutional Repository of Tampere University

Generation of Customized RISC-V Implementations

Author: Hepola Kari
Publication date: 15/02/2022

Processor customization has become increasingly important for achieving better performance and energy efficiency in embedded systems. However, customizing processors is time-consuming and error-prone work. The design effort is reduced by describing the processor architecture with high-level languages that are then used to generate the processor implementation. In addition to processor customization, open source hardware and standardization have become increasingly popular. RISC-V, a relatively new open standard instruction set architecture, has gained traction in both academia and industry. This thesis work added a RISC-V extension to the OpenASIP toolset that is developed at Tampere University. OpenASIP has wide support for customizing and generating transport triggered architectures. Transport triggered architectures have an exposed datapath that is visible to the programmer, which allows a lower-level programming interface. The hardware generation and customization features in OpenASIP were reused by utilizing a transport triggered architecture as the internal microarchitecture together with a microcode unit. The extension generates the RISC-V implementations from an architecture description, which reduces the design effort of customizing the implementation. The RISC-V generator developed in this thesis has customization points for the bypass network, the number of pipeline stages, operation latencies, and an optional addition of the standard M extension. The generator was evaluated by generating RISC-V cores with different customization points and comparing their performance and post-synthesis properties with open source implementations. The generated cores with a bypass network achieved better performance while consuming slightly more area than the smallest reference design. The microcode hardware utilized only 3.6% of the design area and did not affect the maximum clock frequency

Trepo - Institutional Repository of Tampere University