625 research outputs found

    Design methodologies for instruction-set extensible processors

    Ph.D (Doctor of Philosophy)

    Automated application-specific instruction set generation

    Master's (Master of Engineering)

    CHIPS: Custom Hardware Instruction Processor Synthesis


    Rapid evaluation of custom instruction selection approaches with FPGA estimation


    Instruction-set customization for multi-tasking embedded systems

    Ph.D (Doctor of Philosophy)

    Efficient design-space exploration of custom instruction-set extensions

    Customization of processors with instruction set extensions (ISEs) is a technique that improves performance through parallelization with a reasonable area overhead, in exchange for additional design effort. This thesis presents a collection of novel techniques that reduce the design effort and cost of generating ISEs by advancing automation and reconfigurability. In addition, these techniques maximize the performance gained as a function of the additional committed resources. Including ISEs in a processor design implies development at many levels. Most prior work on ISEs solves separate stages of the design: identification, selection, and implementation. However, the interactions between these stages also hold important design trade-offs. In particular, this thesis addresses the lack of interaction between the hardware implementation stage and the two previous stages.

    Interaction with the implementation stage has mostly been limited to accurately measuring the area and timing requirements of each ISE candidate implemented as a separate hardware module. However, the need to independently generate a hardware datapath for each ISE limits the flexibility of the design and the performance gains. Hence, resource sharing is essential in order to create a customized unit with multi-function capabilities. Previously proposed resource-sharing techniques aggressively share resources amongst the ISEs, minimizing the area of the solution at any cost. However, it is shown that aggressive resource sharing leads to large ISE datapath latency. This thesis therefore presents an original heuristic that can be parameterized to control the degree of resource sharing amongst a given set of ISEs, thereby permitting exploration of the implementation trade-offs between instruction latency and area savings. In addition, this thesis introduces a predictive model that quickly exposes the optimal trade-offs of this design space. Compared to an exhaustive exploration, the predictive model is shown to reduce by two orders of magnitude the number of executions of the resource-sharing algorithm required to find the optimal trade-offs.

    This thesis also presents the first technique to combine the design spaces of ISE selection and resource sharing in ISE datapath synthesis, in order to offer the designer solutions that achieve maximum speedup and maximum resource utilization within the available area. Optimal trade-offs in the design space are found by guiding the selection process to favour ISE combinations that are likely to share resources with low speedup losses. Experimental results show that this combined approach unveils new trade-offs between speedup and area that are not identified by previous selection techniques; speedups of up to 238% over previous selection techniques were obtained.

    Finally, multi-cycle ISEs can be pipelined in order to increase their throughput. However, it is shown that traditional ISE identification techniques do not allow this optimization due to control-flow overhead. In order to obtain the benefits of overlapping loop executions, this thesis proposes to carefully insert loop control-flow statements into the ISEs, thus allowing the ISE to control the iterations of the loop. The proposed ISEs broaden the scope of instruction-level parallelism and obtain higher speedups compared to traditional ISEs, primarily through pipelining, the exploitation of spatial parallelism, and a reduction in the overhead of control-flow statements and branches. A detailed case study of a real application shows that the proposed method achieves 91% higher speedups than the state of the art, with an area overhead of less than 8% in the hardware implementation.
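    The thesis's actual heuristic is not reproduced in the abstract; the sketch below is a minimal illustration, under assumed toy cost models (the 30% area saving on a merge, the 0.5 ns multiplexer penalty, and all names such as share_resources and theta are hypothetical), of how a single parameter can sweep the latency/area trade-off that such a parameterized resource-sharing scheme exposes:

from dataclasses import dataclass
from itertools import combinations

@dataclass
class Datapath:
    name: str
    area: float      # e.g. LUT count
    latency: float   # critical-path delay in ns

def merge(a: Datapath, b: Datapath) -> Datapath:
    # Hypothetical cost model: sharing instantiates common operators once,
    # saving area, but inserts multiplexers on the critical path.
    shared_area = 0.7 * (a.area + b.area)          # assumed 30% area saving
    mux_latency = max(a.latency, b.latency) + 0.5  # assumed mux penalty (ns)
    return Datapath(f"{a.name}+{b.name}", shared_area, mux_latency)

def share_resources(ises: list[Datapath], theta: float) -> list[Datapath]:
    # Greedily merge ISE datapaths while the latency penalty per unit of
    # area saved stays below the sharing parameter theta. theta = 0 means
    # no sharing; a large theta approaches fully aggressive sharing.
    pool = list(ises)
    changed = True
    while changed:
        changed = False
        best = None
        for a, b in combinations(pool, 2):
            m = merge(a, b)
            area_saved = a.area + b.area - m.area
            latency_cost = m.latency - max(a.latency, b.latency)
            if area_saved > 0 and latency_cost / area_saved <= theta:
                if best is None or area_saved > best[3]:
                    best = (a, b, m, area_saved)
        if best:
            a, b, m, _ = best
            pool.remove(a); pool.remove(b); pool.append(m)
            changed = True
    return pool

# Sweeping theta traces the latency/area trade-off curve.
ises = [Datapath("ise0", 120, 3.1), Datapath("ise1", 95, 2.8),
        Datapath("ise2", 140, 3.6)]
for theta in (0.0, 0.01, 0.05, 0.5):
    sol = share_resources(ises, theta)
    print(theta, sum(d.area for d in sol), max(d.latency for d in sol))

    A predictive model such as the one the thesis describes would replace the exhaustive sweep over theta in the last loop, invoking share_resources only at parameter values predicted to lie on the optimal frontier.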

    Accelerated V2X provisioning with Extensible Processor Platform

    With the burgeoning of Vehicle-to-Everything (V2X) communication, security and privacy concerns are paramount. Such concerns are usually mitigated by combining cryptographic mechanisms with a suitable key management architecture. However, cryptographic operations may be quite resource-intensive, placing a considerable burden on the vehicle’s V2X computing unit. To assuage this issue, it is reasonable to use hardware acceleration for common cryptographic primitives, such as block ciphers, digital signature schemes, and key exchange protocols. In this scenario, custom extension instructions can be a plausible option, since they achieve fine-tuned hardware acceleration with low to moderate logic overhead, while also reducing code size. In this article, we apply this method along with dual-data memory banks for the hardware acceleration of the PRESENT block cipher, as well as for the F_{2^{255}-19} finite field arithmetic employed in cryptographic primitives based on Curve25519 (e.g., EdDSA and X25519). As a result, when compared with a state-of-the-art software-optimized implementation, the performance of PRESENT is improved by a factor of 17 to 34 and code size is reduced by 70%, with only a 4.37% increase in FPGA logic overhead. In addition, we improve the performance of operations over Curve25519 by a factor of ~2.5 when compared to an Assembly implementation on a comparable processor, with moderate logic overhead (namely, 9.1%). Finally, we achieve significant performance gains in the V2X provisioning process by leveraging our hardware-accelerated cryptographic primitives.
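    As a rough illustration of the workload being accelerated, the following is a plain-Python sketch of F_{2^{255}-19} arithmetic and one Montgomery-ladder doubling step. This is textbook X25519 mathematics, not the article's implementation; all function names here are our own, and the multiply-and-reduce steps are the operations that custom extension instructions would move into hardware:

# Arithmetic in F_{2^255 - 19}, the field underlying Curve25519.
P = 2**255 - 19

def fadd(a: int, b: int) -> int:
    return (a + b) % P

def fmul(a: int, b: int) -> int:
    # The 255x255-bit multiply plus modular reduction is the hot spot
    # that a custom instruction would accelerate.
    return (a * b) % P

def finv(a: int) -> int:
    # Inversion via Fermat's little theorem: a^(p-2) mod p.
    return pow(a, P - 2, P)

def x25519_double(x: int) -> int:
    # One Montgomery doubling step in X-only projective form, showing how
    # the field operations compose; a24 = (486662 + 2) / 4 = 121666 comes
    # from the Curve25519 constant A = 486662.
    a = fadd(x, 1); aa = fmul(a, a)        # (x + z)^2 with z = 1
    b = (x - 1) % P; bb = fmul(b, b)       # (x - z)^2
    c = (aa - bb) % P
    x3 = fmul(aa, bb)
    z3 = fmul(c, fadd(bb, fmul(121666, c)))
    return fmul(x3, finv(z3))              # affine x-coordinate of [2]P

    For example, x25519_double(9) computes the x-coordinate of twice the Curve25519 base point (whose affine x-coordinate is 9). A scalar multiplication runs a ladder of such steps, so accelerating fmul in hardware speeds up the entire X25519 and EdDSA stack.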
