93 research outputs found

    Performance and Limitation Review of Secure Hash Function Algorithm

    A cryptographic hash function is a special class of hash function with properties that make it suitable for use in cryptography. It is a mathematical algorithm that maps data of arbitrary size to a bit string of a fixed size (a hash) and is designed to be a one-way function, that is, a function which is infeasible to invert. Hash functions are an important tool in information security over the internet. The hash functions used in security-related applications are called cryptographic hash functions. The one-way property is also useful in many other applications, for example, the creation of digital signatures and random number generation. Most of these hash functions are based on the Merkle-Damgård construction, for example, MD-2, MD-4, MD-5, SHA-1, SHA-2, SHA-3 and so on, and are not one hundred percent safe from attacks. The paper discusses some of the secure hash functions and the attacks that are possible on this construction, which these hash functions consequently also face.
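
    The fixed-size output property described in this abstract can be illustrated with a minimal sketch. It uses SHA-256 from Python's standard `hashlib` module; the choice of SHA-256 and the helper name `sha256_hex` are assumptions made for illustration and are not taken from the paper.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of `data` as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


# Inputs of very different sizes (1 byte and 1,000,000 bytes).
short_digest = sha256_hex(b"a")
long_digest = sha256_hex(b"a" * 1_000_000)

# Both digests have the same length: 64 hex characters, i.e. 256 bits.
digest_lengths = (len(short_digest), len(long_digest))

# Changing the input changes the digest.
changed = sha256_hex(b"b") != short_digest

print(digest_lengths, changed)
```

    The sketch only demonstrates the mapping from arbitrary-size input to a fixed-size digest; it says nothing about resistance to the attacks the paper reviews.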

    HAL-ASOS accelerator model: evolutive elasticity by design

    To address the integration of software threads and hardware accelerators into the Linux Operating System (OS) programming models, an accelerator architecture is proposed, based on micro-programmable hardware system calls, which fully export these resources into the Linux OS user-space through a design-specific virtual file system. The proposed HAL-ASOS accelerator model is split into a user-defined Hardware Task and a parameterizable Hardware Kernel with three differentiated transfer channels, aiming to explore distinct bus technology interfaces and promote the accelerator to a first-class computing unit. This paper focuses on the Hardware Kernel and mainly its microcode control unit, which will leverage the elasticity to naturally evolve with Linux OS through key differentiating capabilities of field programmable gate arrays (FPGAs) when compared to the state of the art. To comply with the evolutive nature of Linux OS, or any Hardware Task incremental features, the proposed model generates page-faults signaling runtime errors that are handled at the kernel level as part of the virtual file system runtime. To evaluate the accelerator model’s programmability and its performance, a client-side application based on the AES 128-bit algorithm was implemented. Experiments demonstrate a flexible design approach in terms of hardware and software reconfiguration and significant performance increases consistent with rising processing demands or clock design frequencies. This work has been supported by FCT-Fundação para a Ciência e Tecnologia within the R&D Units Project Scope: UIDB/00319/2020.

    Identification of dynamic circuit specialization opportunities in RTL code

    Dynamic Circuit Specialization (DCS) optimizes a Field-Programmable Gate Array (FPGA) design by assuming a set of its input signals are constant for a reasonable amount of time, leading to a smaller and faster FPGA circuit. When the signals actually change, a new circuit is loaded into the FPGA through runtime reconfiguration. The signals the design is specialized for are called parameters. For certain designs, parameters can be selected so the DCS implementation is both smaller and faster than the original implementation. However, DCS also introduces an overhead that is difficult for the designer to take into account, making it hard to determine whether a design is improved by DCS or not. This article presents extensive results on a profiling methodology that analyses Register-Transfer Level (RTL) implementations of applications to check if DCS would be beneficial. It proposes to use the functional density as a measure for the area efficiency of an implementation, as this measure contains both the overhead and the gains of a DCS implementation. The first step of the methodology is to analyse the dynamic behaviour of signals in the design, to find good parameter candidates. The overhead of DCS is highly dependent on this dynamic behaviour. The second step calculates the functional density for each candidate and compares it to the functional density of the original design. The profiling methodology resulted in three implementations of a profiling tool, the DCS-RTL profiler. The execution time, accuracy, and quality of each implementation are assessed based on data from 10 RTL designs. All designs, except for the two 16-bit adaptable Finite Impulse Response (FIR) filters, are analysed in 1 hour or less.
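
    The comparison in the second step can be sketched as follows. The sketch assumes one common definition of functional density (operations per unit area per unit time) and charges the reconfiguration overhead to the total time; the article's exact formula may differ. The function names and all numbers are hypothetical.

```python
def functional_density(operations: int, area_luts: int,
                       exec_time_s: float, reconfig_time_s: float = 0.0) -> float:
    """Operations per LUT per second (assumed definition).

    Reconfiguration time is added to the execution time, so a DCS design
    pays for its overhead in the same measure that rewards its gains.
    """
    total_time_s = exec_time_s + reconfig_time_s
    return operations / (area_luts * total_time_s)


def dcs_is_beneficial(original_fd: float, dcs_fd: float) -> bool:
    """A parameter candidate is worthwhile only if it raises the density."""
    return dcs_fd > original_fd


# Hypothetical original design: 1000 operations, 2000 LUTs, 1 ms.
original = functional_density(1000, 2000, 1e-3)

# Hypothetical DCS candidates: smaller and faster, but with overhead.
rarely_changing = functional_density(1000, 1200, 0.8e-3, reconfig_time_s=0.4e-3)
often_changing = functional_density(1000, 1200, 0.8e-3, reconfig_time_s=2.0e-3)

print(dcs_is_beneficial(original, rarely_changing))  # parameter changes rarely
print(dcs_is_beneficial(original, often_changing))   # parameter changes often
```

    With these made-up numbers the same specialized circuit is an improvement when reconfiguration is rare and a regression when it is frequent, which is why the signals' dynamic behaviour is profiled first.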

    HAL-ASOS - Linux com aceleração em hardware para sistemas operativos dedicados à aplicação (HAL-ASOS: hardware-accelerated Linux for application-specific operating systems)

    Doctoral Programme in Electronics and Computer Engineering (PDEEC), specialization in Industrial Informatics and Embedded Systems. Today’s embedded systems ecosystem has become huge, covering many different computer-based systems that demand performance and complete mobility along with ever longer battery life. But the rising clock frequency that resulted in faster devices began hitting a wall even before transistors stopped shrinking. Field Programmable Gate Array (FPGA) platforms are an alternative solution for implementing complete reconfigurable systems. They provide computational power and efficiency in a lightweight solution to serve the application requirements and increase overall system performance. Several FPGA-assisted Operating Systems (OS) have been proposed, but by narrowing their focus to datapath synthesis of the hardware accelerator, they completely ignore the deep semantic integration of these accelerators into the OS. State-of-the-art High-Level Synthesis (HLS) environments have raised the level of abstraction beyond the Register-Transfer Level (RTL) by following a domain-specific approach while mixing ad hoc software and hardware abstractions, which makes performance optimizations harder. Furthermore, the programming models for software and reconfigurable hardware lack commonalities, which in time will hinder Design Space Exploration (DSE) and lower the potential for code reuse.
To overcome these issues, we propose HAL-ASOS, a framework to implement Linux-based embedded systems which provides (1) elasticity by design to comply with the evolutive nature of Linux, (2) deep semantic integration of the hardware tasks in the Linux programming models, (3) easy complexity management using methodology and tools to fully support design, verification and deployment, and (4) hybrid and efficiency-oriented design principles. To evaluate the framework functionalities, a cryptographic application was implemented, demonstrating the performance achieved while using the promoted application-driven design methodology. To demonstrate the new levels of performance that can be achieved, a Computer Vision application explores several mixed asynchronous-synchronous programming features. Experiments demonstrate a flexible design approach in terms of hardware and software reconfiguration, and performance that increases consistently as processing resources or clock frequencies rise. Financial support was received from the Portuguese Foundation for Science and Technology (FCT) through PhD grant SFRH/BD/82732/2011.

    The Development of TIGRA: A Zero Latency Interface For Accelerator Communication in RISC-V Processors

    Field programmable gate arrays (FPGA) give developers the ability to design application-specific hardware by means of software, providing a method of accelerating algorithms with higher power efficiency when compared to CPU or GPU accelerated applications. FPGA accelerated applications tend to follow either a loosely coupled or tightly coupled design. Loosely coupled designs often use OpenCL to utilize the FPGA as an accelerator much like a GPU, which provides a simplified design flow with the trade-off of increased overhead and latency due to bus communication. Tightly coupled designs modify an existing CPU to introduce instruction set extensions to provide a minimal latency accelerator at the cost of higher programming effort to include the custom design. This dissertation details the design of the Tightly Integrated, Generic RISC-V Accelerator (TIGRA) interface which provides the benefits of both loosely and tightly coupled accelerator designs. TIGRA enabled designs incur zero latency with a simple-to-use interface that reduces programming effort when implementing custom logic within a processor. This dissertation shows the incorporation of TIGRA into the simple PicoRV32 processor, the highly customizable Rocket Chip generator, and the FPGA optimized Taiga processor. Each processor design is tested with AES 128-bit encryption and posit arithmetic to demonstrate TIGRA functionality. After a one-time programming cost to incorporate a TIGRA interface into an existing processor, new functional units can be added with up to a 75% reduction in the lines of code required when compared to non-TIGRA enabled designs. Additionally, each functional unit created is compatible with each processor, as the TIGRA interface remains constant between designs. The results prove that using the TIGRA interface introduces no latency and is capable of incorporating existing custom logic designs without modification for all three processors tested. 
When compared to the PicoRV32 coprocessor interface (PCPI), TIGRA coupled designs complete one clock cycle faster. Similarly, TIGRA outperforms the Rocket Chip custom coprocessor (RoCC) interface by an average of 6.875 clock cycles per instruction. The Taiga processor's decoupled execution units allow instructions to execute concurrently, and the processor uses a tag management system similar to that of out-of-order processors. The inclusion of the TIGRA interface within this processor abstracts the tag management from the user and demonstrates that the TIGRA interface can be applied to out-of-order processors. When coupled with partial reconfiguration, the flexibility and modularity of TIGRA increase drastically. By creating a reprogrammable region for the custom logic connected via TIGRA, users can swap out the connected design at runtime to customize the processor for a given application. Further, partial reconfiguration allows users to compile only the custom logic design as opposed to the entire CPU, resulting in an 18.1% average reduction in compilation time during the design process in the case studies. Paired with the programming effort saved by using TIGRA, partial reconfiguration shortens the time needed to design and test new functionality for a processor.

    A Framework and Protocol for Dynamic Management of Fault Tolerant Systems in Harsh Environments

    Robots can be used to deal with hazardous materials like nuclear waste. Unfortunately, electronic components are also susceptible to radiation effects. Current proposals to tackle this issue solve only parts of the problem, for specific scenarios and specific types of radiation. At the same time, current computational devices should provide run-time capabilities to monitor and adapt to different situations. In this paper, we target a possible solution by presenting a framework which provides the flexibility to employ fault-tolerant techniques on distributed systems. As proof of concept, we target a fault-tolerant technique to extend the operating time of systems in harsh environments. Results show a very low overhead, of a few microseconds, to execute a majority voter with replicated tasks.
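
    A majority voter over replicated tasks, the technique named at the end of this abstract, can be sketched in a few lines. This is a generic illustration and not the paper's framework or protocol; the names `majority_vote` and `run_replicated`, the sequential execution, and the default of three replicas are all assumptions.

```python
from collections import Counter
from typing import Callable, Hashable, Sequence


def majority_vote(results: Sequence[Hashable]) -> Hashable:
    """Return the value produced by a strict majority of the replicas."""
    if not results:
        raise ValueError("no replica results to vote on")
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        raise RuntimeError("no strict majority among replica results")
    return value


def run_replicated(task: Callable[[int], Hashable], arg: int,
                   replicas: int = 3) -> Hashable:
    """Run `task` once per replica (sequentially here) and vote on the outputs."""
    return majority_vote([task(arg) for _ in range(replicas)])


# Two healthy replicas outvote one corrupted result.
voted = majority_vote([42, 42, 17])

# Replicating a deterministic task returns its normal result.
squared = run_replicated(lambda x: x * x, 3)

print(voted, squared)
```

    In a real distributed system the replicas would run on separate nodes; the voter itself stays the same, and when no strict majority exists the sketch raises an error instead of guessing.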

    Design of secure and trustworthy system-on-chip architectures using hardware-based root-of-trust techniques

    Cyber-security is now a critical concern in a wide range of embedded computing modules, communications systems, and connected devices. These devices are used in medical electronics, automotive systems, power grid systems, robotics, and avionics. The general consensus today is that conventional approaches and software-only schemes are not sufficient to provide desired security protections and trustworthiness. Comprehensive hardware-software security solutions so far have remained elusive. One major challenge is that in current system-on-chip (SoC) designs, processing elements (PEs) and executable codes with varying levels of trust are all integrated on the same computing platform to share resources. This interdependency of modules creates a fertile attack ground and represents the Achilles’ heel of heterogeneous SoC architectures. The salient research question addressed in this dissertation is “can one design a secure computer system out of non-secure or untrusted computing IP components and cores?”. In response to this question, we establish a generalized, user/designer-centric set of design principles which intend to advance the construction of secure heterogeneous multi-core computing systems. We develop algorithms, models of computation, and hardware security primitives to integrate secure and non-secure processing elements into the same chip design while aiming for: (a) maintaining individual core’s security; (b) preventing data leakage and corruption; (c) promoting data and resource sharing among the cores; and (d) tolerating malicious behaviors from untrusted processing elements and software applications. The key contributions of this thesis are:
    1. The introduction of a new architectural model for integrating processing elements with different security and trust levels, i.e., secure and non-secure cores with trusted and untrusted provenances;
    2. A generalized process isolation design methodology for the new architecture model that covers both the software and hardware layers to (i) create hardware-assisted virtual logical zones, and (ii) perform both static and runtime security, privilege level and trust authentication checks;
    3. A set of secure protocols and hardware root-of-trust (RoT) primitives to support the process isolation design and to provide the following functionalities: (i) hardware immutable identities – using physical unclonable functions, (ii) core hijacking and impersonation resistance – through a blind signature scheme, (iii) threshold-based data access control – with a robust and adaptive secure secret sharing algorithm, (iv) privacy-preserving authorization verification – by proposing a group anonymous authentication algorithm, and (v) denial of resource or denial of service attack avoidance – by developing an interconnect network routing algorithm and a memory access mechanism according to user-defined security policies.
    4. An evaluation of the security of the proposed hardware primitives in the post-quantum era, and possible extensions and algorithmic modifications for their post-quantum resistance.
    In this dissertation, we advance the practicality of secure-by-construction methodologies in SoC architecture design. The methodology allows for the use of unsecured or untrusted processing elements in the construction of these secure architectures and tries to extend their effectiveness into the post-quantum computing era.

    Dataflow Computing with Polymorphic Registers

    Heterogeneous systems are becoming increasingly popular for data processing. They improve the performance of simple kernels applied to large amounts of data. However, sequential data loads may have a negative impact. Data parallel solutions such as Polymorphic Register Files (PRFs) can potentially accelerate applications by facilitating high-speed, parallel access to performance-critical data. Furthermore, through PRF customization, specific data path features are exposed to the programmer in a very convenient way. PRFs allow additional control over the register dimensions and the number of elements which can be simultaneously accessed by computational units. This paper shows how PRFs can be integrated in dataflow computational platforms. In particular, starting from annotated source code, we present a compiler-based methodology that automatically generates the customized PRFs and the enhanced computational kernels that efficiently exploit them.