32 research outputs found
A First Practical Fully Homomorphic Crypto-Processor Design: The Secret Computer is Nearly Here
Following a sequence of hardware designs for a fully homomorphic
crypto-processor - a general purpose processor that natively runs encrypted
machine code on encrypted data in registers and memory, resulting in encrypted
machine states - proposed by the authors in 2014, we discuss a working
prototype of the first of those, a so-called `pseudo-homomorphic' design. This
processor is in principle safe against physical or software-based attacks by
the owner/operator of the processor on user processes running in it. The
processor is intended as a more secure option for those emerging computing
paradigms that require trust to be placed in computations carried out in remote
locations or overseen by untrusted operators.
The prototype has a single-pipeline superscalar architecture that runs
OpenRISC standard machine code in two distinct modes. The processor runs in the
encrypted mode (the unprivileged, `user' mode, with a long pipeline) at 60-70%
of the speed in the unencrypted mode (the privileged, `supervisor' mode, with a
short pipeline), emitting a completed encrypted instruction every 1.67-1.8
cycles on average in real trials.Comment: 6 pages, draf
An IoT Endpoint System-on-Chip for Secure and Energy-Efficient Near-Sensor Analytics
Near-sensor data analytics is a promising direction for IoT endpoints, as it
minimizes energy spent on communication and reduces network load - but it also
poses security concerns, as valuable data is stored or sent over the network at
various stages of the analytics pipeline. Using encryption to protect sensitive
data at the boundary of the on-chip analytics engine is a way to address data
security issues. To cope with the combined workload of analytics and encryption
in a tight power envelope, we propose Fulmine, a System-on-Chip based on a
tightly-coupled multi-core cluster augmented with specialized blocks for
compute-intensive data processing and encryption functions, supporting software
programmability for regular computing tasks. The Fulmine SoC, fabricated in
65nm technology, consumes less than 20mW on average at 0.8V achieving an
efficiency of up to 70pJ/B in encryption, 50pJ/px in convolution, or up to
25MIPS/mW in software. As a strong argument for real-life flexible application
of our platform, we show experimental results for three secure analytics use
cases: secure autonomous aerial surveillance with a state-of-the-art deep CNN
consuming 3.16pJ per equivalent RISC op; local CNN-based face detection with
secured remote recognition in 5.74pJ/op; and seizure detection with encrypted
data collection from EEG within 12.7pJ/op.Comment: 15 pages, 12 figures, accepted for publication to the IEEE
Transactions on Circuits and Systems - I: Regular Paper
Special Session: AutoSoC - A Suite of Open-Source Automotive SoC Benchmarks
The current demands for autonomous driving generated momentum for an increase in research in the different technologies required for these applications. Nonetheless, the limited access to representative designs and industrial methodologies poses a challenge to the research community. Considering this scenario, there is a high demand for an open-source solution that could support development of research targeting automotive applications. This paper presents the current status of AutoSoC, an automotive SoC benchmark suite that includes hardware and software elements and is entirely open-source. The objective is to provide researchers with an industrial-grade automotive SoC that includes all essential components, is fully customizable, and enables analysis of functional safety solutions and automotive SoC configurations. This paper describes the available configurations of the benchmark including an initial assessment for ASIL B to D configurations
Secure Execution Architecture based on PUF-driven Instruction Level Code Encryption
A persistent problem with program execution, despite numerous mitigation attempts, is its inherent vulnerability to the injection of malicious code. Equally unsolved is the susceptibility of firmware to reverse engineering, which undermines the manufacturer\u27s code confidentiality. We propose an approach that solves both kinds of security problems employing instruction-level code encryption combined with the use of a physical unclonable function (PUF). Our novel Secure Execution PUF-based Processor (SEPP) architecture is designed to minimize the attack surface, as well as performance impact, and requires no significant changes to the development process. This is possible based on a tight integration of a PUF directly into the processor\u27s instruction pipeline. Furthermore, cloud scenarios and distributed embedded systems alike inherently depend on remote execution; our approach supports this, as the secure execution environment needs not to be locally available at the developers site. We implemented an FPGA-based prototype based on the OpenRISC Reference Platform. To assess our results, we performed a security analysis of the processor and evaluated the performance impact of the encryption. We show that the attack surface is significantly reduced compared to previous approaches while the performance penalty is at a reasonable factor of about 1.5
A novel architecture for secure database processing in cloud computing
Security, particularly data privacy, is one of the biggest barriers to the adoption of Database-as-a-Service (DBaaS) in Cloud Computing. Recent security breaches demonstrate that a more powerful protection mechanism is needed to protect data confidentiality from any honest-but-curious administrator. Typical prior effort on addressing this security problem is either prohibitively slow or highly restrictive in operation.
In this thesis, a novel cloud system architecture CypherDB, which makes use of a secure processor, is proposed to protect the confidentiality of outsourced database processing. To achieve this, a framework is developed to use these secure processors in the cloud for secure database processing. This framework allows distributed and parallel processing of the encrypted data and exhibits virtualization features in Cloud Computing. The CypherDB architecture also relies on two major components to protect the privacy of an outsourced database against any honest-but-curious administrator of high performance.
Firstly, a novel database encryption scheme is developed to protect the outsourced database which can be executed under a CypherDB secure processor with high performance. Our proposed scheme makes use of custom instructions to hide the encryption latency from the program execution. This scheme is extensively validated through an integration with SQLite, a practical database application program.
Secondly, a novel secure processor architecture is also developed to provide architectural support to our proposed database encryption scheme and efficient protection mechanism to secure all intermediate data generated on-the-fly during query execution. The efficiency, robustness and the cost of our novel processor architecture are validated and evaluated through extensive simulations and implementation on a FPGA platform.
A fully-functional Field-Programmable Gate Array (FPGA) implementation of our CypherDB secure processor and simulation studies demonstrate that our proposed architecture is cost-effective and of high performance. Our experiment of running the TPC-H database benchmark on SQLite demonstrates 10 to 14 percent performance overhead on average. The security components in CypherDB consume about 21K Logic Elements and 54 Block RAMs on the FPGA. The modification of SQLite only consists of 208 lines of code (LOC).Open Acces
Unlocking Hardware Security Assurance: The Potential of LLMs
System-on-Chips (SoCs) form the crux of modern computing systems. SoCs enable
high-level integration through the utilization of multiple Intellectual
Property (IP) cores. However, the integration of multiple IP cores also
presents unique challenges owing to their inherent vulnerabilities, thereby
compromising the security of the entire system. Hence, it is imperative to
perform hardware security validation to address these concerns. The efficiency
of this validation procedure is contingent on the quality of the SoC security
properties provided. However, generating security properties with traditional
approaches often requires expert intervention and is limited to a few IPs,
thereby resulting in a time-consuming and non-robust process. To address this
issue, we, for the first time, propose a novel and automated Natural Language
Processing (NLP)-based Security Property Generator (NSPG). Specifically, our
approach utilizes hardware documentation in order to propose the first hardware
security-specific language model, HS-BERT, for extracting security properties
dedicated to hardware design. To evaluate our proposed technique, we trained
the HS-BERT model using sentences from RISC-V, OpenRISC, MIPS, OpenSPARC, and
OpenTitan SoC documentation. When assessedb on five untrained OpenTitan
hardware IP documents, NSPG was able to extract 326 security properties from
1723 sentences. This, in turn, aided in identifying eight security bugs in the
OpenTitan SoC design presented in the hardware hacking competition, Hack@DAC
2022
High-Speed Performance, Power and Thermal Co-simulation For SoC Design
This dissertation presents a multi-faceted effort at developing standard System Design Language based tools that allow designers to the model power and thermal behavior of SoCs, including heterogeneous SoCs that include non-digital components. The research contributions made in this dissertation include:
• SystemC-based power/performance co-simulation for the Intel XScale microprocessor. We performed detailed characterization of the power dissipation patterns of a variety of system components and used these results to build detailed power models, including a highly accurate, validated instruction-level power model of the XScale processor. We also proposed a scalable, efficient and validated methodology for incorporating fast, accurate power modeling capabilities into system description languages such as SystemC. This was validated against physical measurements of hardware power dissipation.
• Modeling the behavior of non-digital SoC components within standard System Design Languages. We presented an approach for modeling the functionality, performance, power, and thermal behavior of a complex class of non-digital components — MEMS microhotplate-based gas sensors — within a SystemC design framework. The components modeled include both digital components (such as microprocessors, busses and memory) and MEMS devices comprising a gas sensor SoC. The first SystemC models of a MEMS-based SoC and the first SystemC models of MEMS thermal behavior were described. Techniques for significantly improving simulation speed were proposed, and their impact quantified.
• Vertically Integrated Execution-Driven Power, Performance and Thermal Co-Simulation For SoCs. We adapted the above techniques and used numerical methods to model the system of differential equations that governs on-chip thermal diffusion. This allows a single high-speed simulation to span performance, power and thermal modeling of a design. It also allows feedback behaviors, such as the impact of temperature on power dissipation or performance, to be modeled seamlessly. We validated the thermal equation-solving engine on test layouts against detailed low-level tools, and illustrated the power of such a strategy by demonstrating a series of studies that designers can perform using such tools. We also assessed how simulation and accuracy are impacted by spatial and temporal resolution used for thermal modeling
DIVAS: An LLM-based End-to-End Framework for SoC Security Analysis and Policy-based Protection
Securing critical assets in a bus-based System-On-Chip (SoC) is imperative to
mitigate potential vulnerabilities and prevent unauthorized access, ensuring
the integrity, availability, and confidentiality of the system. Ensuring
security throughout the SoC design process is a formidable task owing to the
inherent intricacies in SoC designs and the dispersion of assets across diverse
IPs. Large Language Models (LLMs), exemplified by ChatGPT (OpenAI) and BARD
(Google), have showcased remarkable proficiency across various domains,
including security vulnerability detection and prevention in SoC designs. In
this work, we propose DIVAS, a novel framework that leverages the knowledge
base of LLMs to identify security vulnerabilities from user-defined SoC
specifications, map them to the relevant Common Weakness Enumerations (CWEs),
followed by the generation of equivalent assertions, and employ security
measures through enforcement of security policies. The proposed framework is
implemented using multiple ChatGPT and BARD models, and their performance was
analyzed while generating relevant CWEs from the SoC specifications provided.
The experimental results obtained from open-source SoC benchmarks demonstrate
the efficacy of our proposed framework.Comment: 15 pages, 7 figures, 8 table
Microarchitectural Low-Power Design Techniques for Embedded Microprocessors
With the omnipresence of embedded processing in all forms of electronics today, there is a strong trend towards wireless, battery-powered, portable embedded systems which have to operate under stringent energy constraints. Consequently, low power consumption and high energy efficiency have emerged as the two key criteria for embedded microprocessor design. In this thesis we present a range of microarchitectural low-power design techniques which enable the increase of performance for embedded microprocessors and/or the reduction of energy consumption, e.g., through voltage scaling. In the context of cryptographic applications, we explore the effectiveness of instruction set extensions (ISEs) for a range of different cryptographic hash functions (SHA-3 candidates) on a 16-bit microcontroller architecture (PIC24). Specifically, we demonstrate the effectiveness of light-weight ISEs based on lookup table integration and microcoded instructions using finite state machines for operand and address generation. On-node processing in autonomous wireless sensor node devices requires deeply embedded cores with extremely low power consumption. To address this need, we present TamaRISC, a custom-designed ISA with a corresponding ultra-low-power microarchitecture implementation. The TamaRISC architecture is employed in conjunction with an ISE and standard cell memories to design a sub-threshold capable processor system targeted at compressed sensing applications. We furthermore employ TamaRISC in a hybrid SIMD/MIMD multi-core architecture targeted at moderate to high processing requirements (> 1 MOPS). A range of different microarchitectural techniques for efficient memory organization are presented. Specifically, we introduce a configurable data memory mapping technique for private and shared access, as well as instruction broadcast together with synchronized code execution based on checkpointing. We then study an inherent suboptimality due to the worst-case design principle in synchronous circuits, and introduce the concept of dynamic timing margins. We show that dynamic timing margins exist in microprocessor circuits, and that these margins are to a large extent state-dependent and that they are correlated to the sequences of instruction types which are executed within the processor pipeline. To perform this analysis we propose a circuit/processor characterization flow and tool called dynamic timing analysis. Moreover, this flow is employed in order to devise a high-level instruction set simulation environment for impact-evaluation of timing errors on application performance. The presented approach improves the state of the art significantly in terms of simulation accuracy through the use of statistical fault injection. The dynamic timing margins in microprocessors are then systematically exploited for throughput improvements or energy reductions via our proposed instruction-based dynamic clock adjustment (DCA) technique. To this end, we introduce a 6-stage 32-bit microprocessor with cycle-by-cycle DCA. Besides a comprehensive design flow and simulation environment for evaluation of the DCA approach, we additionally present a silicon prototype of a DCA-enabled OpenRISC microarchitecture fabricated in 28 nm FD-SOI CMOS. The test chip includes a suitable clock generation unit which allows for cycle-by-cycle DCA over a wide range with fine granularity at frequencies exceeding 1 GHz. Measurement results of speedups and power reductions are provided