Real-time encryption and authentication of medical video streams on FPGA
This work presents an FPGA-based solution for the
encryption and authentication of video streams of surgeries. The
most important requirement is minimal latency. To achieve this, a block cipher
with an authenticated mode of operation is used. We choose
to use AES-128 with Galois/Counter Mode (GCM), because
this mode of operation is patent-free and it allows for random
read access. This solution minimizes the overhead on the existing
critical path to a single XOR operation.
Our solution supports the broadcasting of the video stream.
When a new receiver announces itself, it should receive the active
keys of the sender. Therefore, a key transport protocol is used to
establish a key between the sender and the announcing receiver.
A proof-of-concept implementation of the proposed solution
has been implemented and tested. While the complete video
stream is encrypted and authenticated, the demonstrator confirms
that the added latency, which is around 23 µs, could not
be noticed by the human eye. Random read access and the key
establishment protocol provide a flexible solution.
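The counter-mode property behind the random read access and the single-XOR critical path can be illustrated with a minimal sketch. It is not the authors' design: SHA-256 stands in for the AES-128 block cipher (Python's standard library has no AES), the GCM authentication tag is omitted, and the function names are hypothetical. The point is only that the keystream depends on the key, nonce and block counter, never on the data, so it can be computed ahead of time and any block can be decrypted on its own.

```python
import hashlib

BLOCK_BYTES = 16  # 128-bit blocks, as in AES


def keystream_block(key: bytes, nonce: bytes, counter: int) -> bytes:
    """Stand-in for E_K(nonce || counter). ASSUMPTION: SHA-256 replaces AES-128."""
    data = key + nonce + counter.to_bytes(4, "big")
    return hashlib.sha256(data).digest()[:BLOCK_BYTES]


def xor_block(block: bytes, keystream: bytes) -> bytes:
    """The only operation that touches the data path."""
    return bytes(b ^ k for b, k in zip(block, keystream))


def encrypt(key: bytes, nonce: bytes, plaintext: bytes) -> bytes:
    """Counter-mode encryption; decryption is the same operation."""
    out = bytearray()
    for start in range(0, len(plaintext), BLOCK_BYTES):
        ks = keystream_block(key, nonce, start // BLOCK_BYTES)
        out += xor_block(plaintext[start:start + BLOCK_BYTES], ks)
    return bytes(out)


def decrypt_block(key: bytes, nonce: bytes, ciphertext: bytes, index: int) -> bytes:
    """Random read access: recover block `index` without touching earlier blocks."""
    start = index * BLOCK_BYTES
    ks = keystream_block(key, nonce, index)
    return xor_block(ciphertext[start:start + BLOCK_BYTES], ks)
```

A receiver that joins mid-stream could call `decrypt_block` on whichever block it sees first, provided it holds the key and nonce.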
Efficient Pipelining for Modular Multiplication Architectures in Prime Fields
This paper presents a pipelined architecture of a modular Montgomery multiplier, which is suitable to be used in public key coprocessors. Starting from a baseline implementation of the Montgomery algorithm, a more compact pipelined version is derived. The design makes use of 16-bit integer multiplication blocks that are available on recently manufactured FPGAs. The critical path is optimized by omitting the exact computation of intermediate results in the Montgomery algorithm using a 6-2 carry-save notation. This results in a high-speed architecture, which outperforms previously designed Montgomery multipliers. Because a very popular application of Montgomery multiplication is public key cryptography, we compare our implementation to the state of the art in Montgomery multipliers on the basis of performance results for 1024-bit RSA.
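For readers unfamiliar with the underlying algorithm, the textbook Montgomery multiplication that such an architecture computes can be sketched as follows. This is a plain integer version under stated assumptions (odd modulus `n`, `R = 2**k > n`, Python 3.8+ for the modular inverse); it does not model the pipelining, the 16-bit multiplier blocks or the carry-save representation described in the abstract, and all function names are hypothetical.

```python
def montgomery_setup(n: int, k: int):
    """Return (R, n_prime) with R = 2**k and n * n_prime ≡ -1 (mod R). n must be odd."""
    r = 1 << k
    n_prime = (-pow(n, -1, r)) % r
    return r, n_prime


def mont_mul(a_bar: int, b_bar: int, n: int, k: int, n_prime: int) -> int:
    """Return a_bar * b_bar * R^-1 mod n, using shifts and masks instead of division by n."""
    mask = (1 << k) - 1
    t = a_bar * b_bar
    m = ((t & mask) * n_prime) & mask   # chosen so that t + m*n is divisible by R
    u = (t + m * n) >> k                # exact division by R
    return u - n if u >= n else u       # single conditional subtraction


def to_mont(x: int, n: int, k: int) -> int:
    """Map x to the Montgomery domain: x * R mod n."""
    return (x << k) % n


def from_mont(x_bar: int, n: int, k: int, n_prime: int) -> int:
    """Map back to the ordinary domain by multiplying with 1."""
    return mont_mul(x_bar, 1, n, k, n_prime)
```

In an RSA exponentiation the operands are converted once with `to_mont`, all multiplications run through `mont_mul`, and the result is converted back once with `from_mont`.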
An FPGA Implementation of a Montgomery Multiplier Over GF(2^m)
This paper describes an efficient FPGA implementation for modular multiplication in the finite field GF(2^m) that is suitable for implementing Elliptic Curve Cryptosystems. We have developed a systolic array implementation of a Montgomery modular multiplication. Our solution is efficient for large finite fields (m=160-193), which offer a high security level, and it can be scaled easily to larger values of m. The clock frequency of the implementation is independent of the field size. In contrast to earlier work, the design is not restricted to field representations using irreducible trinomials, all-one polynomials or equally spaced polynomials.
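A bit-serial software sketch of Montgomery multiplication in GF(2^m) shows what each cell of such an array has to compute per step. It is an illustration under stated assumptions, not the systolic design from the paper: polynomials are stored as Python integers (bit i is the coefficient of x^i), `f` is any irreducible polynomial of degree `m` (so its constant term is 1), and the function names are hypothetical.

```python
def gf2m_mont_mul(a: int, b: int, f: int, m: int) -> int:
    """Return a(x) * b(x) * x^(-m) mod f(x) over GF(2)."""
    c = 0
    for i in range(m):
        if (a >> i) & 1:
            c ^= b          # add a_i * b(x); addition in GF(2) is XOR
        if c & 1:
            c ^= f          # make c divisible by x by adding a multiple of f
        c >>= 1             # exact division by x
    return c


def gf2m_mul(a: int, b: int, f: int, m: int) -> int:
    """Reference shift-and-add multiplication a(x) * b(x) mod f(x)."""
    r = 0
    for i in reversed(range(m)):
        r <<= 1
        if (r >> m) & 1:
            r ^= f
        if (b >> i) & 1:
            r ^= a
    return r
```

Because the loop body only ever inspects the lowest bit of `c`, no step depends on the field size, and nothing in it assumes a particular shape of `f` such as a trinomial.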
On-chip jitter measurement for true random number generators
Applications of true random number generators (TRNGs) range from art to numerical computing and system security. In cryptographic applications, TRNGs are used for generating new keys, nonces and masks. For this reason, a TRNG is an essential building block and often a point of failure for embedded security systems. Ring oscillators are one type of primitive widely used as a source of randomness. For a ring-oscillator-based TRNG, the true randomness originates from its timing jitter. Therefore, determining the jitter strength is essential to estimate the quality of a TRNG. In this paper, we propose a method to measure the jitter strength of a ring oscillator implemented on an FPGA. The fast tapped delay chain is utilized to perform the on-chip measurement with a high resolution. The proposed method is implemented on both a Xilinx FPGA and an Intel FPGA. Fast carry logic components on different FPGAs are used to implement the fast delay line. This carry logic component is designed to be fast and has dedicated routing, which enables a precise measurement. The differential structure of the delay chain is used to suppress the influence of undesirable noise on the measurement. The proposed methodology can be applied to other FPGA families and ASIC designs.
Secure remote reconfiguration of FPGAs
This paper presents a solution for secure remote reconfiguration of FPGAs. Communicating the bitstream has to be done in a secure manner to prevent an attacker from reading or altering the bitstream. We propose a setup in which the FPGA is the single device in the system's zone-of-trust. The result is an FPGA architecture that is divided into a static and a dynamic region. The static region holds the communication, security and reconfiguration facilities, while the dynamic region contains the targeted application.
The Monte Carlo PUF
Physically unclonable functions are used for IP protection, hardware authentication and supply chain security. While many PUF constructions have been put forward in the past decade, only a few of them are applicable to FPGA platforms. Strict constraints on the placement and routing are the main disadvantages of the existing PUFs on FPGAs, because they demand a high effort from the designer. In this paper we propose a new delay-based PUF construction called Monte Carlo PUF, which does not require low-level placement and routing control. This construction relies on the on-chip Monte Carlo method that is applied for measuring the delays of logic elements in order to extract a unique device fingerprint. The proposed construction allows a trade-off between the evaluation time and the error rate.
The Monte Carlo PUF is implemented and evaluated on Xilinx Spartan-6 FPGAs.
Quantization-aware Neural Architectural Search for Intrusion Detection
Deploying machine learning-based intrusion detection systems (IDSs) on
hardware devices is challenging due to their limited computational resources,
power consumption, and network connectivity. Hence, there is a significant need
for robust, deep learning models specifically designed with such constraints in
mind. In this paper, we present a design methodology that automatically trains
and evolves quantized neural network (NN) models that are a thousand times
smaller than state-of-the-art NNs but can efficiently analyze network data for
intrusion at high accuracy. In this regard, the number of LUTs utilized by this
network when deployed to an FPGA is between 2.3x and 8.5x smaller with
performance comparable to prior work.
ALBUS: a Probabilistic Monitoring Algorithm to Counter Burst-Flood Attacks
Modern DDoS defense systems rely on probabilistic monitoring algorithms to
identify flows that exceed a volume threshold and should thus be penalized.
Commonly, classic sketch algorithms are considered sufficiently accurate for
usage in DDoS defense. However, as we show in this paper, these algorithms
achieve poor detection accuracy under burst-flood attacks, i.e., volumetric
DDoS attacks composed of a swarm of medium-rate sub-second traffic bursts.
Under this challenging attack pattern, traditional sketch algorithms can only
detect a high share of the attack bursts by incurring a large number of false
positives.
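To make the baseline concrete, here is a minimal Count-Min sketch, one common example of the classic sketch algorithms referred to above. It is not ALBUS and not the evaluation setup of the paper; the class name, the flow identifiers and the use of salted BLAKE2b as the row hash are assumptions made for illustration.

```python
import hashlib


class CountMinSketch:
    """Fixed-size counter array that over-estimates, but never under-estimates, flow volume."""

    def __init__(self, width: int, depth: int):
        self.width = width
        self.depth = depth
        self.rows = [[0] * width for _ in range(depth)]

    def _index(self, row: int, flow_id: str) -> int:
        # One independent hash per row, derived by salting with the row number.
        digest = hashlib.blake2b(
            flow_id.encode(), digest_size=8, salt=bytes([row])
        ).digest()
        return int.from_bytes(digest, "big") % self.width

    def add(self, flow_id: str, size: int) -> None:
        for row in range(self.depth):
            self.rows[row][self._index(row, flow_id)] += size

    def estimate(self, flow_id: str) -> int:
        # Taking the minimum over rows limits the damage of hash collisions.
        return min(
            self.rows[row][self._index(row, flow_id)] for row in range(self.depth)
        )

    def exceeds(self, flow_id: str, threshold: int) -> bool:
        return self.estimate(flow_id) >= threshold
```

Collisions are the source of false positives: when a legitimate flow shares all of its counters with heavy flows, its estimate is inflated and it may cross the threshold as well.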
In this paper, we present ALBUS, a probabilistic monitoring algorithm that
overcomes the inherent limitations of previous schemes: ALBUS is highly
effective at detecting large bursts while reporting no legitimate flows, and
therefore improves on prior work regarding both recall and precision. Besides
improving accuracy, ALBUS scales to high traffic rates, which we demonstrate
with an FPGA implementation, and is suitable for programmable switches, which
we showcase with a P4 implementation.
Comment: Accepted at the 42nd International Symposium on Reliable Distributed
Systems (SRDS 2023)
Maximizing the Potential of Custom RISC-V Vector Extensions for Speeding up SHA-3 Hash Functions
SHA-3 is considered to be one of the most secure standardized hash functions. It relies on the Keccak-f[1 600] permutation, which operates on an internal state of 1 600 bits, mostly represented as a 5×5×64-bit matrix. While software implementations process the state sequentially in chunks of typically 32 or 64 bits, the Keccak-f[1 600] permutation can benefit a lot from speedup through parallelization. This paper is the first to explore the full potential of parallelization of Keccak-f[1 600] in RISC-V based processors through custom vector extensions on 32-bit and 64-bit architectures.
We analyze the Keccak-f[1 600] permutation, composed of five different step mappings, and propose ten custom vector instructions to speed up the computation. We realize these extensions in a SIMD processor described in SystemVerilog. We compare the performance of our hardware/software co-design to a software-only implementation on the one hand and to existing architectures based on (vectorized) hardware/software co-design on the other hand. We show that our design outperforms all related work thanks to our carefully selected custom vector instructions.
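As an illustration of why the permutation lends itself to data-parallel execution, the first of the five step mappings, θ, can be written in a few lines of plain Python. This is a sequential sketch of the step as defined in the SHA-3 standard, with the state indexed as `state[x][y]` and each lane a 64-bit integer; it does not represent the custom vector instructions proposed in the paper, and the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def rol64(value: int, shift: int) -> int:
    """Rotate a 64-bit lane left by `shift` bits."""
    shift %= 64
    return ((value << shift) | (value >> (64 - shift))) & MASK64


def theta(state):
    """Apply the theta step to a 5x5 array of 64-bit lanes and return the new state."""
    # Column parities: five independent XOR reductions over five lanes each.
    c = [
        state[x][0] ^ state[x][1] ^ state[x][2] ^ state[x][3] ^ state[x][4]
        for x in range(5)
    ]
    # Each column is corrected by the parity of its two neighbouring columns.
    d = [c[(x - 1) % 5] ^ rol64(c[(x + 1) % 5], 1) for x in range(5)]
    # The same correction is applied to every lane of a column.
    return [[state[x][y] ^ d[x] for y in range(5)] for x in range(5)]
```

All three list comprehensions apply one operation across independent lanes, which is the kind of pattern a vector unit can execute in a single instruction.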
Enhanced end-to-end security through symmetric-key cryptography in wearable medical sensor networks