159 research outputs found

A Touch of Evil: High-Assurance Cryptographic Hardware from Untrusted Components

Author: Cerulli Andrea
Cvrcek Dan
Danezis George
Klinec Dusan
Mavroudis Vasilios
Svenda Petr
Publication date: 28/10/2017

The semiconductor industry is fully globalized and integrated circuits (ICs) are commonly defined, designed and fabricated in different premises across the world. This reduces production costs, but also exposes ICs to supply chain attacks, where insiders introduce malicious circuitry into the final products. Additionally, despite extensive post-fabrication testing, it is not uncommon for ICs with subtle fabrication errors to make it into production systems. While many systems may be able to tolerate a few byzantine components, this is not the case for cryptographic hardware, storing and computing on confidential data. For this reason, many error and backdoor detection techniques have been proposed over the years. So far all attempts have been either quickly circumvented, or come with unrealistically high manufacturing costs and complexity. This paper proposes Myst, a practical high-assurance architecture, that uses commercial off-the-shelf (COTS) hardware, and provides strong security guarantees, even in the presence of multiple malicious or faulty components. The key idea is to combine protective-redundancy with modern threshold cryptographic techniques to build a system tolerant to hardware trojans and errors. To evaluate our design, we build a Hardware Security Module that provides the highest level of assurance possible with COTS components. Specifically, we employ more than a hundred COTS secure crypto-coprocessors, verified to FIPS140-2 Level 4 tamper-resistance standards, and use them to realize high-confidentiality random number generation, key derivation, public key decryption and signing. Our experiments show a reasonable computational overhead (less than 1% for both Decryption and Signing) and an exponential increase in backdoor-tolerance as more ICs are added
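The abstract's claim of exponentially growing backdoor tolerance can be illustrated with a toy model. The sketch below assumes, purely for illustration, that a secret is split by XOR into n shares held by n separate ICs, and that each IC is compromised independently with probability p. The function names and the independence model are assumptions, not Myst's actual protocol.

```python
import secrets
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def split_secret(secret: bytes, n: int) -> list:
    """Split `secret` into n XOR shares; all n are needed to rebuild it."""
    shares = [secrets.token_bytes(len(secret)) for _ in range(n - 1)]
    # The last share is the secret XORed with every random share.
    last = reduce(xor_bytes, shares, secret)
    return shares + [last]


def combine(shares: list) -> bytes:
    """Rebuild the secret by XORing all shares together."""
    return reduce(xor_bytes, shares)


def p_all_compromised(p: float, n: int) -> float:
    """Probability that every one of n independent ICs is compromised."""
    return p ** n
```

Under these assumptions an attacker must control every share-holding IC, so the chance of a full compromise shrinks as p to the power n when ICs are added.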

arXiv.org e-Print Archive

UCL Discovery

Manticore: Hardware-Accelerated RTL Simulation with Static Bulk-Synchronous Parallelism

Author: Emami Mahyar
Kamahori Keisuke
Kashani Sahand
Larus James R.
Pourghannad Mohammad Sepehr
Raj Ritik
Publication date: 23/01/2023

The demise of Moore's Law and Dennard Scaling has revived interest in specialized computer architectures and accelerators. Verification and testing of this hardware heavily uses cycle-accurate simulation of register-transfer-level (RTL) designs. The best software RTL simulators can simulate designs at 1–1000 kHz, i.e., more than three orders of magnitude slower than hardware. Faster simulation can increase productivity by speeding design iterations and permitting more exhaustive exploration. One possibility is to use parallelism as RTL exposes considerable fine-grain concurrency. However, state-of-the-art RTL simulators generally perform best when single-threaded since modern processors cannot effectively exploit fine-grain parallelism. This work presents Manticore: a parallel computer designed to accelerate RTL simulation. Manticore uses a static bulk-synchronous parallel (BSP) execution model to eliminate runtime synchronization barriers among many simple processors. Manticore relies entirely on its compiler to schedule resources and communication. Because RTL code is practically free of long divergent execution paths, static scheduling is feasible. Communication and synchronization no longer incur runtime overhead, enabling efficient fine-grain parallelism. Moreover, static scheduling dramatically simplifies the physical implementation, significantly increasing the potential parallelism on a chip. Our 225-core FPGA prototype running at 475 MHz outperforms a state-of-the-art RTL simulator on an Intel Xeon processor running at ≈3.3 GHz by up to 27.9× (geomean 5.3×) in nine Verilog benchmarks
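The bulk-synchronous model named in the abstract can be sketched in software. The example below is a minimal illustration, not Manticore's compiler or hardware: it assumes a fixed (statically known) list of processes, each of which reads only the previous cycle's state, followed by a single commit step that plays the role of the barrier. The three-register ring and all names are hypothetical.

```python
def bsp_step(state: dict, processes: list) -> dict:
    """One BSP superstep, standing in for one simulated clock cycle."""
    # Compute phase: every process reads only the old state.
    updates = [proc(state) for proc in processes]
    # Communication phase: commit all writes at once (the barrier).
    new_state = dict(state)
    for update in updates:
        new_state.update(update)
    return new_state


# Hypothetical 3-register ring: each register copies its left neighbour.
PROCESSES = [
    lambda s: {"r0": s["r2"]},
    lambda s: {"r1": s["r0"]},
    lambda s: {"r2": s["r1"]},
]


def simulate(state: dict, cycles: int) -> dict:
    """Run the fixed process list for the given number of cycles."""
    for _ in range(cycles):
        state = bsp_step(state, PROCESSES)
    return state
```

Because every process sees the same snapshot, the result does not depend on the order in which the processes run, which is the property that makes a fixed schedule possible.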

arXiv.org e-Print Archive

Design Space Exploration and Resource Management of Multi/Many-Core Systems

Publication venue: 'MDPI AG'
Publication date: 11/01/2022

The increasing demand of processing a higher number of applications and related data on computing platforms has resulted in reliance on multi-/many-core chips as they facilitate parallel processing. However, there is a desire for these platforms to be energy-efficient and reliable, and they need to perform secure computations for the interest of the whole community. This book provides perspectives on the aforementioned aspects from leading researchers in terms of state-of-the-art contributions and upcoming trends

Directory of Open Access Books (DOAB)

Error-efficient computing systems

Author: Rinard M
Stanley-Marbell P
Publication venue: Foundations and Trends in Electronic Design Automation
Publication date: 01/01/2017

This survey explores the theory and practice of techniques to make computing systems faster or more energy-efficient by allowing them to make controlled errors. In the same way that systems which only use as much energy as necessary are referred to as being energy-efficient, you can think of the class of systems addressed by this survey as being error-efficient: They only prevent as many errors as they need to. The definition of what constitutes an error varies across the parts of a system. And the errors which are acceptable depend on the application at hand. In computing systems, making errors, when behaving correctly would be too expensive, can conserve resources. The resources conserved may be time: By making some errors, systems may be faster. The resource may also be energy: A system may use less power from its batteries or from the electrical grid by only avoiding certain errors while tolerating benign errors that are associated with reduced power consumption. The resource in question may be an even more abstract quantity such as consistency of ordering of the outputs of a system. This survey is for anyone interested in an end-to-end view of one set of techniques that address the theory and practice of making computing systems more efficient by trading errors for improved efficiency

Apollo (Cambridge)

New Logic Synthesis As Nanotechnology Enabler (invited paper)

Author: Amarù Luca
De Micheli Giovanni
Gaillardon Pierre-Emmanuel
Mitra Subhasish
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 27/07/2015

Nanoelectronics comprises a variety of devices whose electrical properties are more complex as compared to CMOS, thus enabling new computational paradigms. The potentially large space for innovation has to be explored in the search for technologies that can support large-scale and high-performance circuit design. Within this space, we analyze a set of emerging technologies characterized by a similar computational abstraction at the design level, i.e., a binary comparator or a majority voter. We demonstrate that new logic synthesis techniques, natively supporting this abstraction, are the technology enablers. We describe models and data-structures for logic design using emerging technologies and we show results of applying new synthesis algorithms and tools. We conclude that new logic synthesis methods are required to both evaluate emerging technologies and to achieve the best results in terms of area, power and performance
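The majority-voter abstraction mentioned in the abstract can be made concrete with a short sketch. It shows the three-input majority function and the standard identities that recover AND and OR by tying one input to a constant. The function names are illustrative and are not taken from the paper's tools.

```python
def maj(a: int, b: int, c: int) -> int:
    """Three-input majority: 1 when at least two inputs are 1."""
    return (a & b) | (a & c) | (b & c)


def and_via_maj(a: int, b: int) -> int:
    """AND is majority with the third input tied to 0."""
    return maj(a, b, 0)


def or_via_maj(a: int, b: int) -> int:
    """OR is majority with the third input tied to 1."""
    return maj(a, b, 1)
```

Together with inversion, these identities mean any Boolean function can be expressed using majority gates alone.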

Infoscience - École polytechnique fédérale de Lausanne

Business analytics in industry 4.0: a systematic review

Author: Abdelmaguid T.
Abdirad M.
Abdous M.‐A.
Akhtari S.
Alasali F.
Albers A.
Alberto Sala D.
Ali S.
Ansari F.
Antomarioni S.
Apiletti D.
Armbrust M.
Arnott D.
Aydemir G.
Bagheri M.
Bakar N.
Banerjee A.
Barton D.
Birglen L.
Bordel B.
Bordeleau F.‐E.
Borgi T.
Bose S. K.
Bousdekis A.
Brik B.
Bruneo D.
Bányai T.
Calabrese M.
Candanedo I.
Canizo M.
Cao G.
Cao Q.
Charest M.
Chen H.
Chen Y.
Chen Y.‐J.
Chiang L.
Chi‐Hsien K.
Choi W.
Chong D.
Cicconi P.
Cisotto S.
Clegg D.
Costa C.
Costa R.
Davenport T. H.
Diez‐Olivan A.
Duan L.
Durakbasa N.
Dutta R.
Dwaraka R.
ESRTC
Essien A.
European Commission
Fu Y.
Gomes M.
Goodfellow I.
Guo Z.
Haffner O.
He M.
Hesser D. F.
Jugulum R.
Kabugo J. C.
Karakose M.
Kaupp L.
Kharwar P.
Khatri V.
Khayyam H.
Kiangala K.
Kim S.
Kirchen I.
Kitchenham B.
Klement N.
Koch R.
Kohlert M.
Krishnamoorthi S.
Krishnan K.
Kumar A.
Kuo C.‐J.
Kuo H.
Lasi H.
Lee J.
Lee J.
Lee W. J.
Lee Y.‐M.
Leite M.
Lenz J.
Li H.
Li S. C.
Li Y.
Li Z.
Liang Y.
Lin C.
Lin C.
Lin T.
Liulys K.
Lu Y.
Ma C.
Maggipinto M.
Martinek P.
Masoudinejad M.
Massaro A.
Massaro A.
Milošević M.
Miškuf M.
Mohanty S.
Mozgova I.
Muhuri P.
Mulrennan K.
Naskos A.
Negri E.
Neuböck T.
Nikolic B.
Niño M.
Nuzzi C.
O'Donovan P.
Packianather M. S.
Pane Y.
Park C. Y.
Peralta G.
Pierezan J.
Pinto R.
Plehiers P.
Ploennigs J.
Pradhan K.
Proto S.
Qi Q.
Qin J.
Qin J.
Qu S.
Rahman H.
Rendall R.
Richter J.
Rogier J.
Romeo L.
Rosli N.
Rousopoulou V.
Ruiz‐Sarmiento J.
Russom P.
Russom P.
Saldivar A. A. F.
Saldivar A. A. F.
Saldivar A. A. F.
Sanz E.
Saxena V. K.
Sellami C.
Senkerik R.
Sezer E.
Sharp M.
Shrouf F.
Silva D.
Soto J. A. C.
Spendla L.
Stein B. V.
Straus P.
Stürmlinger T.
Subakti H.
Subramaniyan M.
Sun I.
Susto G. A.
Swamy A. K.
Sá A.
Tan Y.
Tang D.
Teschemacher U.
Tieng H.
Tiwari K.
Tjahjono B.
Trunzer E.
Tsai S.
Tsourma M.
Uhlmann E.
Uriarte A. G.
Vathoopan M.
Vazan P.
Ventura F.
Wan J.
Wang Y.
Wang Y.‐M.
Wu W.
Xia F.
Xu L. D.
Xu X.
Xu X.
Yan H.
Yan J.
Yang H.
Yang J.
Yeh W.
Zenisek J.
Zhang Q.
Zhang T.
Zheng M.
Zhong R. Y.
Zhou H.
Öchsner A.
Publication venue: 'Wiley'
Publication date: 01/01/2021

Recently, the term “Industry 4.0” has emerged to characterize several Information Technology and Communication (ICT) adoptions in production processes (e.g., Internet-of-Things, implementation of digital production support information technologies). Business Analytics is often used within the Industry 4.0, thus incorporating its data intelligence (e.g., statistical analysis, predictive modelling, optimization) expert system component. In this paper, we perform a Systematic Literature Review (SLR) on the usage of Business Analytics within the Industry 4.0 concept, covering a selection of 169 papers obtained from six major scientific publication sources from 2010 to March 2020. The selected papers were first classified in three major types, namely, Practical Application, Reviews and Framework Proposal. Then, we analysed in more detail the practical application studies, which were further divided into three main categories of the Gartner analytical maturity model: Descriptive Analytics, Predictive Analytics and Prescriptive Analytics. In particular, we characterized the distinct analytics studies in terms of the industry application and data context used, impact (in terms of their Technology Readiness Level) and selected data modelling method. Our SLR analysis provides a mapping of how data-based Industry 4.0 expert systems are currently used, disclosing also research gaps and future research opportunities. The work of P. Cortez was supported by FCT - Fundação para a Ciência e Tecnologia within the R&D Units Project Scope: UIDB/00319/2020. We would like to thank the three anonymous reviewers for their helpful suggestions

Universidade do Minho: RepositoriUM

Crossref

Techniques of Energy-Efficient VLSI Chip Design for High-Performance Computing

Author: Zhao Zhou
Publication venue: LSU Digital Commons
Publication date: 13/09/2018

How to implement quality computing with the limited power budget is the key factor to move very large scale integration (VLSI) chip design forward. This work introduces various techniques of low power VLSI design used for state of art computing. From the viewpoint of power supply, conventional in-chip voltage regulators based on analog blocks bring the large overhead of both power and area to computational chips. Motivated by this, a digital based switchable pin method to dynamically regulate power at low circuit cost has been proposed to make computing to be executed with a stable voltage supply. For one of the widely used and time consuming arithmetic units, multiplier, its operation in logarithmic domain shows an advantageous performance compared to that in binary domain considering computation latency, power and area. However, the introduced conversion error reduces the reliability of the following computation (e.g. multiplication and division.). In this work, a fast calibration method suppressing the conversion error and its VLSI implementation are proposed. The proposed logarithmic converter can be supplied by dc power to achieve fast conversion and clocked power to reduce the power dissipated during conversion. Going out of traditional computation methods and widely used static logic, neuron-like cell is also studied in this work. Using multiple input floating gate (MIFG) metal-oxide semiconductor field-effect transistor (MOSFET) based logic, a 32-bit, 16-operation arithmetic logic unit (ALU) with zipped decoding and a feedback loop is designed. The proposed ALU can reduce the switching power and has a strong driven-in capability due to coupling capacitors compared to static logic based ALU. Besides, recent neural computations bring serious challenges to digital VLSI implementation due to overload matrix multiplications and non-linear functions. 
An analog VLSI design which is compatible to external digital environment is proposed for the network of long short-term memory (LSTM). The entire analog based network computes much faster and has higher energy efficiency than the digital one
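The conversion error that the abstract attributes to logarithmic-domain multiplication can be seen in a small sketch. The example uses Mitchell's classic piecewise-linear approximation log2(1 + f) ≈ f as a stand-in; the thesis's own converter and calibration method are not described in the abstract and are not reproduced here. All function names are illustrative.

```python
def approx_log2(x: int) -> float:
    """Mitchell approximation: for x = 2**k * (1 + f), return k + f."""
    k = x.bit_length() - 1           # position of the leading one
    return k + (x - (1 << k)) / (1 << k)


def approx_exp2(y: float) -> float:
    """Inverse of the approximation: 2**k * (1 + f) for y = k + f, y >= 0."""
    k = int(y)
    return (1 << k) * (1 + (y - k))


def log_multiply(a: int, b: int) -> float:
    """Multiply two positive integers by adding approximate logarithms."""
    return approx_exp2(approx_log2(a) + approx_log2(b))
```

For powers of two the result is exact, for example `log_multiply(8, 8)` gives 64.0, while `log_multiply(3, 3)` gives 8.0 instead of 9. This is the kind of uncorrected conversion error a calibration step would aim to suppress.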

Louisiana State University

Dynamic task scheduling and binding for many-core systems through stream rewriting

Author: Middendorf Lars (gnd: 1071804979)
Publication venue: Universität Rostock Rostock

This thesis proposes a novel model of computation, called stream rewriting, for the specification and implementation of highly concurrent applications with a large number of dynamic tasks. The active tasks of an application and their dependencies are encoded as a token stream, which is iteratively modified at runtime by a set of rewriting rules through repeated search and replace. To estimate the performance and scalability of stream rewriting, a large number of experiments were run on many-core systems, and the task management was implemented in both software and hardware

Rostocker Dokumentenserver

REED: Chiplet-Based Scalable Hardware Accelerator for Fully Homomorphic Encryption

Author: Ahmet Can Mert
Aikata Aikata
Maxim Deryabin
Sujoy Sinha Roy
Sunmin Kwon
Publication venue: International Association for Cryptologic Research (IACR)
Publication date: 05/08/2023

Fully Homomorphic Encryption (FHE) has emerged as a promising technology for processing encrypted data without the need for decryption. Despite its potential, its practical implementation has faced challenges due to substantial computational overhead. To address this issue, we propose the first chiplet-based FHE accelerator design 'REED', which enables scalability and offers high throughput, thereby enhancing homomorphic encryption deployment in real-world scenarios. It incorporates well-known wafer yield issues during fabrication which significantly impacts production costs. In contrast to state-of-the-art approaches, we also address data exchange overhead by proposing a non-blocking inter-chiplet communication strategy. We incorporate novel pipelined Number Theoretic Transform and automorphism techniques, leveraging parallelism and providing high throughput. Experimental results demonstrate that REED 2.5D integrated circuit consumes 177 mm² chip area, 82.5 W average power in 7nm technology, and achieves an impressive speedup of up to 5,982× compared to a CPU (24-core 2× Intel X5690), and 2× better energy efficiency and 50% lower development cost than state-of-the-art ASIC accelerator. To evaluate its practical impact, we are the first to benchmark an encrypted deep neural network training. Overall, this work successfully enhances the practicality and deployability of fully homomorphic encryption in real-world scenarios
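The Number Theoretic Transform named in the abstract can be illustrated with a toy example. The sketch below is a naive quadratic-time transform over a tiny field, not the pipelined hardware design. The modulus 17, length 4 and root 4 are assumptions chosen only so the arithmetic is easy to check by hand, and it computes a cyclic product, whereas FHE schemes typically use a negacyclic variant with far larger parameters.

```python
Q = 17       # toy prime modulus (assumption for illustration)
N = 4        # transform length; N divides Q - 1
OMEGA = 4    # primitive N-th root of unity mod Q: 4**2 = -1, 4**4 = 1


def ntt(a: list, omega: int = OMEGA) -> list:
    """Naive forward transform: A[i] = sum_j a[j] * omega**(i*j) mod Q."""
    return [
        sum(a[j] * pow(omega, i * j, Q) for j in range(N)) % Q
        for i in range(N)
    ]


def intt(A: list) -> list:
    """Inverse transform: use omega**-1, then scale by N**-1 mod Q."""
    inv_n = pow(N, -1, Q)            # modular inverse (Python 3.8+)
    inv_omega = pow(OMEGA, -1, Q)
    return [(x * inv_n) % Q for x in ntt(A, inv_omega)]


def cyclic_mul(a: list, b: list) -> list:
    """Multiply polynomials mod (x**N - 1, Q) via pointwise products."""
    return intt([(x * y) % Q for x, y in zip(ntt(a), ntt(b))])
```

For instance, `cyclic_mul([1, 2, 0, 0], [3, 4, 0, 0])` corresponds to (1 + 2x)(3 + 4x) = 3 + 10x + 8x², with coefficients small enough that no modular wrap-around occurs.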

Cryptology ePrint Archive

SSP: Eliminating Redundant Writes in Failure-Atomic NVRAMs via Shadow Sub-Paging

Author: Bittman Daniel
Coburn Joel
Hitz Dave
Kolli Aasheesh
Kwon Youngjin
Lee Changman
Lee Se Kwon
Minh Chi Cao
Ni Yuanjiang
Pelley Steven
Talluri Madhusudhan
Venkataraman Shivaram
Volos Haris
Xu Jian
Yang Jun
Zhao Jishen
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Crossref

eScholarship - University of California