Securing IoT Applications through Decentralised and Distributed IoT-Blockchain Architectures
The integration of blockchain into IoT can provide reliable control over the IoT network's ability to distribute computation across a large number of devices. It also allows an AI system to use trusted data for analysis and forecasting while utilising the available IoT hardware to coordinate the execution of tasks in parallel, using a fully distributed approach.
This thesis's first contribution is a practical implementation of a real-world IoT-blockchain application: a flood detection use case demonstrated using Ethereum proof of authority (PoA). This includes performance measurements of the transaction confirmation time, the system end-to-end latency, and the average power consumption. The study showed that blockchain can be integrated into IoT applications, and that Ethereum PoA can be used within IoT for permissioned implementations. This can be achieved while the average energy consumption of running the flood detection system, including the Ethereum Geth client, remains small (around 0.3 J).
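The permissioned PoA setup described above rotates block sealing among a fixed set of authorised nodes. As a minimal sketch (not code from the thesis; the authority names are hypothetical placeholders for permissioned IoT nodes), the round-robin in-turn sealer selection used by Ethereum's clique PoA engine can be modelled as:

```python
# Toy sketch (not code from the thesis): round-robin in-turn sealer selection,
# as in Ethereum's clique proof-of-authority engine. Authority names are
# hypothetical placeholders for the permissioned IoT nodes.
AUTHORITIES = ["node-a", "node-b", "node-c"]

def in_turn_sealer(block_number: int, authorities=AUTHORITIES) -> str:
    # The sealer expected to produce a given block is chosen by simple rotation.
    return authorities[block_number % len(authorities)]

print([in_turn_sealer(n) for n in range(4)])
# ['node-a', 'node-b', 'node-c', 'node-a']
```

Because sealing needs no proof-of-work search, this style of rotation is what keeps the per-device energy cost of block production low on constrained hardware.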
The second contribution is a novel IoT-centric consensus protocol called honesty-based distributed proof of authority (HDPoA) via scalable work. HDPoA was analysed and then deployed and tested. Performance measurements and evaluation, along with security analyses of HDPoA, were conducted using a total of 30 different IoT devices comprising Raspberry Pi, ESP32, and ESP8266 devices. These measurements included energy consumption, the devices' hash power, and the transaction confirmation time. The measured hash-per-joule (h/J) values for mining were 13.8 Kh/J, 54 Kh/J, and 22.4 Kh/J for the Raspberry Pi, ESP32, and ESP8266 devices, respectively; this was achieved with limited impact on each device's power budget. In HDPoA the transaction confirmation time was reduced to only one block, compared to up to six blocks in Bitcoin.
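For context, a hash-per-joule figure like those reported above follows directly from a device's hash rate and its average power draw. A minimal illustration (with made-up numbers, not the thesis's measurements):

```python
# Illustration with made-up numbers (not the thesis's measurements): a
# hash-per-joule figure is just hash rate divided by average power draw.
def hashes_per_joule(hash_rate_hps: float, power_w: float) -> float:
    # (hashes / second) / (joules / second) = hashes / joule
    return hash_rate_hps / power_w

# e.g. a hypothetical device hashing at 108 kh/s while drawing 2.0 W:
print(hashes_per_joule(108_000, 2.0))  # 54000.0, i.e. 54 Kh/J
```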
The third contribution is a novel, secure, distributed and decentralised architecture for supporting the implementation of distributed artificial intelligence (DAI) using hardware platforms provided by IoT. A trained DAI system was implemented over the IoT, where each IoT device hosts one or more neurons within the DAI layers. This is accomplished through the utilisation of blockchain technology, which allows trusted interaction and information exchange between distributed neurons. Three different datasets were tested, and the system achieved accuracy similar to that of a standalone system; both achieved accuracies of 92%-98%. The system accomplished this while keeping overall latency as low as two minutes. This showed the secure architecture's ability to facilitate the implementation of DAI within IoT while preserving the accuracy of the system.
The fourth contribution is a novel and secure architecture that integrates the advantages offered by edge computing, artificial intelligence (AI), IoT end-devices, and blockchain. This new architecture can monitor the environment, collect data, analyse and process it using an AI-expert engine, provide predictions and actionable outcomes, and finally share them on a public blockchain platform. The pandemic caused by the wide and rapid spread of the novel coronavirus COVID-19 was used as a use-case implementation to test and evaluate the proposed system. While providing the AI engine with trusted data, the system achieved an accuracy of 95%. This was achieved while the AI engine required only a 7% increase in power consumption. This demonstrates the system's ability to protect the data and support the AI system, and to improve overall IoT security with limited impact on the IoT devices.
The fifth and final contribution is enhancing the security of HDPoA through the integration of a hardware secure module (HSM) and a hardware wallet (HW). A performance evaluation of the energy consumption of nodes equipped with an HSM and HW, together with a security analysis, was conducted. In addition to enhancing the nodes' security, the HSM can be used to sign more than 120 bytes/joule and encrypt up to 100 bytes/joule, while the HW can be used to sign up to 90 bytes/joule and encrypt up to 80 bytes/joule. The results and analyses demonstrated that the HSM and HW enhance the security of HDPoA, and can also be utilised within IoT-blockchain applications while providing much-needed security in terms of confidentiality, trust in devices, and attack deterrence.
The above contributions showed that blockchain can be integrated into IoT systems, and that it can successfully support the integration of other technologies such as AI, IoT end devices, and edge computing into one system, thus allowing organisations and users to benefit greatly from resilient, distributed, decentralised, self-managed, robust, and secure systems.
Improving low latency applications for reconfigurable devices
This thesis seeks to improve low latency application performance via architectural improvements in reconfigurable devices. This is achieved by improving resource utilisation and access, and by exploiting the different environments within which reconfigurable devices are deployed.
Our first contribution leverages devices deployed at the network level to enable the low latency processing of financial market data feeds. Financial exchanges transmit messages via two identical data feeds to reduce the chance of message loss. We present an approach to arbitrate these redundant feeds at the network level using a Field-Programmable Gate Array (FPGA). With support for any messaging protocol, we evaluate our design using the NASDAQ TotalView-ITCH, OPRA, and ARCA data feed protocols, and provide two simultaneous outputs: one prioritising low latency, and one prioritising high reliability with three dynamically configurable windowing methods.
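As a software caricature of the arbitration idea (our own assumption, not the FPGA design itself), the two redundant feeds can be merged by forwarding each sequence number the first time it arrives on either feed:

```python
# Software caricature (our assumption, not the FPGA design): arbitrate two
# redundant feeds by forwarding each sequence number on first arrival.
def arbitrate(arrivals):
    """arrivals: (sequence_number, source_feed) tuples in wire-arrival order,
    interleaving both redundant feeds."""
    seen, out = set(), []
    for seq, feed in arrivals:
        if seq not in seen:      # first copy wins, whichever feed it came from
            seen.add(seq)
            out.append((seq, feed))
    return out

# Feed B delivers message 2 first; feed A's late duplicate is dropped.
print(arbitrate([(1, "A"), (2, "B"), (2, "A"), (3, "A"), (3, "B")]))
# [(1, 'A'), (2, 'B'), (3, 'A')]
```

The low-latency output described above corresponds to this first-copy-wins rule; the high-reliability output would additionally buffer within a window to recover messages lost on one feed.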
Our second contribution is a new ring-based architecture for low latency, parallel access to FPGA memory. Traditional FPGA memory is formed by grouping block memories (BRAMs) together and accessing them as a single device. Our architecture accesses these BRAMs independently and in parallel. Targeting memory-based computing, which stores pre-computed function results in memory, we benefit low latency applications that rely on: highly-complex functions; iterative computation; or many parallel accesses to a shared resource. We assess square root, power, trigonometric, and hyperbolic functions within the FPGA, and provide a tool to convert Python functions to our new architecture.
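The memory-based computing idea can be sketched in software (a simplification under our own assumptions, not the thesis implementation): pre-compute a function into a table, split the table across independently addressable banks, and answer a call with a single banked lookup:

```python
import math

# Simplified software model (our assumption, not the thesis implementation) of
# memory-based computing: pre-compute sqrt over [0, 1) into a table, then split
# the table across independently addressable banks, mimicking parallel BRAMs.
BITS = 10                                          # 1024-entry table
TABLE = [math.sqrt(i / 2**BITS) for i in range(2**BITS)]
BANKS = 4                                          # independent "BRAM" banks
BANKED = [TABLE[b::BANKS] for b in range(BANKS)]   # interleaved placement

def lookup(x: float) -> float:
    idx = int(x * 2**BITS)                         # quantise the input
    return BANKED[idx % BANKS][idx // BANKS]       # bank select, local address

print(lookup(0.25))  # 0.5, since sqrt(0.25) is stored exactly at index 256
```

With interleaved placement, consecutive indices land in different banks, which is what allows several quantised inputs to be served in the same cycle.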
Our third contribution extends the ring-based architecture to support any FPGA processing element. We unify E heterogeneous processing elements within compute pools, with each element implementing the same function, and the pool serving D parallel function calls. Our implementation-agnostic approach supports processing elements with different latencies, implementations, and pipeline lengths, as well as non-deterministic latencies. Compute pools evenly balance access to processing elements across the entire application, and are evaluated by implementing eight different neural network activation functions within an FPGA.
Automated tailoring of system software stacks
In many industrial sectors, device manufacturers are moving away from expensive special-purpose hardware units and consolidating their systems on commodity hardware. As part of this change, developers can run their applications on general-purpose operating systems like Linux, which already supports thousands of different devices out of the box and can be used in a wide range of target scenarios. Furthermore, the Linux ecosystem allows them to integrate existing implementations of standard functionality in the form of shared libraries.
However, as the libraries and the Linux kernel are designed as generic building blocks in order to support as many applications as possible, they cannot make assumptions about specific use cases for a single-purpose device. This generality leads to unnecessary overheads in narrowly defined target scenarios, as unneeded components not only take up space on the target system but also have to be maintained over the lifetime of the device. While the Linux kernel provides a configuration system to disable unneeded functionality like device drivers, determining the required features from over 16000 options is an infeasible task. Even worse, most shared libraries cannot be customized, even though only around 10 percent of their functions are ever used by applications.
In this thesis, I present my approaches for the automated identification and removal of unnecessary components in all layers of the software stack. As the configuration system is an integral part of the Linux kernel, we embrace its presence and automatically generate custom-fitted configurations for observed target scenarios with the help of an extracted variability model. For the much more diverse realm of shared libraries, with different programming languages, build systems, and a lack of configurability, I demonstrate a different approach. By identifying individual functions as logically distinct units, we construct a symbol-level dependency graph across the applications and all their required libraries. We then remove unneeded code at the binary level and rearrange the remaining parts to take up minimal space in the binary file by formulating their placement as an optimization problem. To lower the number of unnecessary updates to unused components in a deployed system, I lastly present an automated method to determine the impact of software changes on a target scenario and provide guidance for developers on whether they need to update their systems.
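The symbol-level dependency graph idea can be illustrated with a toy reachability computation (the function names are hypothetical, for illustration only): any symbol not reachable from the application's entry points is a candidate for removal from the binary:

```python
# Toy reachability over a symbol-level dependency graph (function names are
# hypothetical, for illustration only): symbols unreachable from the
# application's entry points can be removed from the binary.
CALLS = {
    "main":       ["lib_parse", "lib_send"],
    "lib_parse":  ["lib_alloc"],
    "lib_send":   ["lib_alloc"],
    "lib_alloc":  [],
    "lib_legacy": ["lib_alloc"],   # present in the library, never called
}

def reachable(roots, calls):
    keep, stack = set(), list(roots)
    while stack:
        sym = stack.pop()
        if sym not in keep:
            keep.add(sym)
            stack.extend(calls.get(sym, []))
    return keep

removable = sorted(set(CALLS) - reachable(["main"], CALLS))
print(removable)  # ['lib_legacy']
```

The binary-level rearrangement step then packs the kept symbols tightly, which is what the placement optimisation described above operates on.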
Applying these techniques to different target systems, I demonstrate that we can disable up to 87 percent of configuration options in a Debian Linux kernel, shrink the size of an embedded OpenWrt kernel by 59 percent, and speed up the boot process of the embedded system by 21 percent. As part of the shared library tailoring process, we can remove 13060 functions from all libraries in OpenWrt and reduce their total size by 31 percent. In the memcached Docker container, we identify 381 entirely unneeded shared libraries and shrink the container image size by 82 percent. An analysis of the development history of two large library projects over the course of more than two years further shows that between 68 and 82 percent of all changes are not required for an OpenWrt appliance, reducing the number of patch days by up to 69 percent.
These results demonstrate the broad applicability of our automated methods for both the Linux kernel and shared libraries to a wide range of scenarios. From embedded systems to server applications, custom-tailored system software stacks contribute to the reduction of overheads in space and time.
Energy-Sustainable IoT Connectivity: Vision, Technological Enablers, Challenges, and Future Directions
Technology solutions must effectively balance economic growth, social equity,
and environmental integrity to achieve a sustainable society. Notably, although
the Internet of Things (IoT) paradigm constitutes a key sustainability enabler,
critical issues such as the increasing maintenance operations, energy
consumption, and manufacturing/disposal of IoT devices have long-term negative
economic, societal, and environmental impacts and must be efficiently
addressed. This calls for self-sustainable IoT ecosystems requiring minimal
external resources and intervention, effectively utilizing renewable energy
sources, and recycling materials whenever possible, thus encompassing energy
sustainability. In this work, we focus on energy-sustainable IoT during the
operation phase, although our discussions sometimes extend to other
sustainability aspects and IoT lifecycle phases. Specifically, we provide a
fresh look at energy-sustainable IoT and identify energy provision, transfer,
and energy efficiency as the three main energy-related processes whose
harmonious coexistence pushes toward realizing self-sustainable IoT systems.
Their main related technologies, recent advances, challenges, and research
directions are also discussed. Moreover, we overview relevant performance
metrics to assess the energy-sustainability potential of a certain technique,
technology, device, or network and list some target values for the next
generation of wireless systems. Overall, this paper offers insights that are
valuable for advancing sustainability goals for present and future generations.
Comment: 25 figures, 12 tables, submitted to IEEE Open Journal of the Communications Society.
PUF for the Commons: Enhancing Embedded Security on the OS Level
Security is essential for the Internet of Things (IoT). Cryptographic
operations for authentication and encryption commonly rely on random input of
high entropy and secure, tamper-resistant identities, which are difficult to
obtain on constrained embedded devices. In this paper, we design and analyze a
generic integration of physically unclonable functions (PUFs) into the IoT
operating system RIOT that supports about 250 platforms. Our approach leverages
uninitialized SRAM to act as the digital fingerprint for heterogeneous devices.
We ground our design on an extensive study of PUF performance in the wild,
which involves SRAM measurements on more than 700 IoT nodes that aged naturally
in the real-world. We quantify static SRAM bias, as well as the aging effects
of devices and incorporate the results in our system. This work closes a
previously identified gap of missing statistically significant sample sizes for
testing the unpredictability of PUFs. Our experiments on COTS devices with 64 kB of SRAM indicate that secure random seeds derived from the SRAM PUF provide 256 bits of security, and device-unique keys provide more than 128 bits of security. In a practical security assessment we show that SRAM PUFs resist moderate attack scenarios, which greatly improves the security of low-end IoT devices.
Comment: 18 pages, 12 figures, 3 tables.
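A much-simplified sketch of the SRAM-PUF key derivation described above (our own assumption, not RIOT's actual implementation): hash the power-up state of uninitialized SRAM after masking bit positions found unstable across repeated measurements:

```python
import hashlib

# Much-simplified sketch (our assumption, not RIOT's implementation) of SRAM-PUF
# key derivation: mask out bit positions known to be unstable across power-ups,
# then hash the remaining fingerprint into a fixed-length seed.
def derive_seed(sram_dump: bytes, stable_mask: bytes) -> bytes:
    fingerprint = bytes(b & m for b, m in zip(sram_dump, stable_mask))
    return hashlib.sha256(fingerprint).digest()     # 256-bit seed

dump = bytes([0b10110010, 0b01011100])   # toy 2-byte "uninitialized SRAM" state
mask = bytes([0b11110000, 0b11111111])   # low nibble of byte 0 is unstable
seed = derive_seed(dump, mask)
print(len(seed) * 8)  # 256
```

A real design would use a fuzzy extractor with error correction rather than a plain mask-and-hash; the sketch only shows the overall structure of stabilising the fingerprint and compressing it into key material.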
Efficient and Side-Channel Resistant Implementations of Next-Generation Cryptography
The rapid development of emerging information technologies, such as quantum computing and the Internet of Things (IoT), will have or have already had a huge impact on the world. These technologies can not only improve industrial productivity but they could also bring more convenience to people’s daily lives. However, these techniques have “side effects” in the world of cryptography – they pose new difficulties and challenges from theory to practice. Specifically, when quantum computing capability (i.e., logical qubits) reaches a certain level, Shor’s algorithm will be able to break almost all public-key cryptosystems currently in use. On the other hand, a great number of devices deployed in IoT environments have very constrained computing and storage resources, so the current widely-used cryptographic algorithms may not run efficiently on those devices. A new generation of cryptography has thus emerged, including Post-Quantum Cryptography (PQC), which remains secure under both classical and quantum attacks, and LightWeight Cryptography (LWC), which is tailored for resource-constrained devices. Research on next-generation cryptography is of importance and utmost urgency, and the US National Institute of Standards and Technology in particular has initiated the standardization process for PQC and LWC in 2016 and in 2018 respectively.
Since next-generation cryptography is still at an early stage and has developed rapidly in recent years, its theoretical security and practical deployment are not yet well explored and are in significant need of evaluation. This thesis aims to look into the engineering aspects of next-generation cryptography, i.e., the problems concerning implementation efficiency (e.g., execution time and memory consumption) and security (e.g., countermeasures against timing attacks and power side-channel attacks). In more detail, we first explore efficient software implementation approaches for lattice-based PQC on constrained devices. Then, we study how to speed up isogeny-based PQC on modern high-performance processors, especially by using their powerful vector units. Moreover, we research how to design sophisticated yet low-area instruction set extensions to further accelerate software implementations of LWC and long-integer-arithmetic-based PQC. Finally, to address the threats from potential power side-channel attacks, we present a concept of using special leakage-aware instructions to eliminate overwriting leakage in masked software implementations of next-generation cryptography.
Flexible Hardware-based Security-aware Mechanisms and Architectures
For decades, software security has been the primary focus in securing our computing platforms. Hardware was always assumed trusted, and inherently served as the foundation, and thus the root of trust, of our systems. This has been further leveraged in developing hardware-based dedicated security extensions and architectures to protect software from attacks exploiting software vulnerabilities such as memory corruption. However, the recent outbreak of microarchitectural attacks has shaken these long-established trust assumptions in hardware entirely, thereby threatening the security of all of our computing platforms and bringing hardware and microarchitectural security under scrutiny. These attacks have undeniably revealed the grave consequences of hardware/microarchitecture security flaws to the entire platform security, and how they can even subvert the security guarantees promised by dedicated security architectures. Furthermore, they shed light on the sophisticated challenges particular to hardware/microarchitectural security; it is more critical (and more challenging) to extensively analyze the hardware for security flaws prior to production, since hardware, unlike software, cannot be patched/updated once fabricated.
Hardware can no longer reliably serve as the root of trust unless we develop and adopt new design paradigms where security is proactively addressed and scrutinized across the full stack of our computing platforms, at all hardware design and implementation layers. Furthermore, novel flexible security-aware design mechanisms need to be incorporated in processor microarchitecture and hardware-assisted security architectures, mechanisms that can practically address the inherent conflict between performance and security by allowing the trade-off to be configured to adapt to the desired requirements.
In this thesis, we investigate the prospects and implications at the intersection of hardware and security that emerge across the full stack of our computing platforms and System-on-Chips (SoCs). On one front, we investigate how we can leverage hardware and its advantages, in contrast to software, to build more efficient and effective security extensions that serve security architectures, e.g., by providing execution attestation and enforcement, to protect software from attacks exploiting software vulnerabilities. We further propose that these extensions be microarchitecturally configured at runtime to provide different types of security services, thus adapting flexibly to different deployment requirements. On another front, we investigate how we can protect these hardware-assisted security architectures and extensions themselves from microarchitectural and software attacks that exploit design flaws originating in the hardware, e.g., insecure resource sharing in SoCs. More particularly, we focus on cache-based side-channel attacks, proposing cache designs that fundamentally mitigate these attacks while still preserving performance, by enabling the performance/security trade-off to be configured by design. We also investigate how these designs can be incorporated into flexible and customizable security architectures, thus complementing them to further support a wide spectrum of emerging applications with different performance/security requirements. Lastly, we inspect our computing platforms further beneath the design layer, scrutinizing how the actual implementation of these mechanisms is yet another potential attack surface. We explore how the security of hardware designs and implementations is currently analyzed prior to fabrication, shedding light on how state-of-the-art hardware security analysis techniques are fundamentally limited, and on the potential for improved and scalable approaches.
Applications
Volume 3 describes how resource-aware machine learning methods and techniques are used to successfully solve real-world problems. The book provides numerous specific application examples: in health and medicine for risk modelling, diagnosis, and treatment selection for diseases; in electronics, steel production, and milling for quality control during manufacturing processes; in traffic and logistics for smart cities; and for mobile communications.
Towards Scalable OLTP Over Fast Networks
Online Transaction Processing (OLTP) underpins real-time data processing in many mission-critical applications, from banking to e-commerce.
These applications typically issue short-duration, latency-sensitive transactions that demand immediate processing.
High-volume applications, such as Alibaba's e-commerce platform, achieve peak transaction rates as high as 70 million transactions per second, exceeding the capacity of a single machine.
Instead, distributed OLTP database management systems (DBMS) are deployed across multiple powerful machines.
Historically, such distributed OLTP DBMSs have been primarily designed to avoid network communication, a paradigm largely unchanged since the 1980s.
However, fast networks challenge the conventional belief that network communication is the main bottleneck.
In particular, emerging network technologies, like Remote Direct Memory Access (RDMA), radically alter how data can be accessed over a network.
RDMA's primitives allow direct access to the memory of a remote machine within an order of magnitude of local memory access.
This development invalidates the notion that network communication is the primary bottleneck.
Given that traditional distributed database systems have been designed with the premise that the network is slow, they cannot efficiently exploit these fast network primitives, which requires us to reconsider how we design distributed OLTP systems.
This thesis focuses on the challenges RDMA presents and its implications on the design of distributed OLTP systems.
First, we examine distributed architectures to understand data access patterns and scalability in modern OLTP systems.
Drawing on these insights, we advocate a distributed storage engine optimized for high-speed networks.
The storage engine serves as the foundation of a database, ensuring efficient data access through three central components: indexes, synchronization primitives, and buffer management (caching).
With the introduction of RDMA, the landscape of data access has undergone a significant transformation.
This requires a comprehensive redesign of the storage engine components to exploit the potential of RDMA and similar high-speed network technologies.
Thus, as the second contribution, we design RDMA-optimized tree-based indexes — especially applicable for disaggregated databases to access remote data efficiently.
We then turn our attention to the unique challenges of RDMA.
One-sided RDMA, one of the network primitives introduced by RDMA, presents a performance advantage in enabling remote memory access while bypassing the remote CPU and the operating system.
This allows the remote CPU to process transactions uninterrupted, with no need to be on hand for network communication. However, this means that specialized one-sided RDMA synchronization primitives are required, since the traditional CPU-driven primitives are bypassed.
We found that existing RDMA one-sided synchronization schemes are unscalable or, even worse, fail to synchronize correctly, leading to hard-to-detect data corruption.
As our third contribution, we address this issue by offering guidelines to build scalable and correct one-sided RDMA synchronization primitives.
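One reason correct one-sided synchronization is subtle: with the remote CPU bypassed, clients can coordinate only through RDMA atomic verbs such as compare-and-swap. A toy model (not the thesis design) of a lock built from remote CAS alone:

```python
# Toy model (not the thesis design): with the remote CPU bypassed, clients can
# coordinate only through RDMA atomic verbs; here, a lock built from remote
# compare-and-swap alone.
class RemoteWord:
    """Stands in for a 64-bit word in a remote server's memory."""
    def __init__(self):
        self.value = 0

    def compare_and_swap(self, expected: int, new: int) -> int:
        old = self.value
        if old == expected:
            self.value = new
        return old                 # RDMA CAS returns the value seen remotely

lock = RemoteWord()
print(lock.compare_and_swap(0, 1))  # 0: client 1 acquires the lock
print(lock.compare_and_swap(0, 1))  # 1: client 2 fails, the lock is held
lock.value = 0                      # release via a plain RDMA write
```

Real deployments must additionally contend with the ordering of CAS relative to plain one-sided reads and writes, which is where the correctness pitfalls mentioned above arise.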
Finally, recognizing that maintaining all data in memory becomes economically unattractive, we propose a distributed buffer manager design that efficiently utilizes cost-effective NVMe flash storage.
By leveraging low-latency RDMA messages, our buffer manager provides a transparent memory abstraction, accessing the aggregated DRAM and NVMe storage across nodes.
Central to our approach is a distributed caching protocol that dynamically caches data.
With this approach, our system can outperform RDMA-enabled in-memory distributed databases while managing larger-than-memory datasets efficiently.
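The DRAM/flash tiering described above can be caricatured (under our own assumptions, not the thesis design) as an LRU buffer pool that keeps hot pages in memory and spills cold ones to flash:

```python
from collections import OrderedDict

# Caricature (our assumption, not the thesis design) of DRAM/flash tiering:
# an LRU buffer pool keeps hot pages in "DRAM" and spills cold ones to "flash".
class BufferPool:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.dram = OrderedDict()   # page_id -> data, in LRU order
        self.flash = {}             # spilled pages

    def write(self, page_id, data):
        self.dram[page_id] = data
        self.dram.move_to_end(page_id)          # mark as most recently used
        if len(self.dram) > self.capacity:
            cold_id, cold = self.dram.popitem(last=False)   # evict coldest
            self.flash[cold_id] = cold

    def read(self, page_id):
        if page_id in self.dram:
            self.dram.move_to_end(page_id)
            return self.dram[page_id]
        data = self.flash.pop(page_id)          # miss: fetch back from flash
        self.write(page_id, data)               # cache it, may evict another
        return data

pool = BufferPool(capacity=2)
for p in range(3):
    pool.write(p, f"page-{p}")
print(sorted(pool.flash))  # [0]: the coldest page spilled to flash
```

The distributed version adds a caching protocol so that a page may be served from another node's DRAM before falling back to NVMe, but the eviction structure is the same.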
Models, methods, and tools for developing MMOG backends on commodity clouds
Online multiplayer games have grown to unprecedented scales, attracting millions of players
worldwide. The revenue from this industry has already eclipsed well-established entertainment
industries like music and films and is expected to continue its rapid growth in the future.
Massively Multiplayer Online Games (MMOGs) have also been extensively used in research
studies and education, further motivating the need to improve their development process.
The development of resource-intensive, distributed, real-time applications like MMOG backends
involves a variety of challenges. Past research has primarily focused on the development and
deployment of MMOG backends on dedicated infrastructures such as on-premise data centers
and private clouds, which provide more flexibility but are expensive and hard to set up and
maintain. A limited set of works has also focused on utilizing the Infrastructure-as-a-Service
(IaaS) layer of public clouds to deploy MMOG backends. These clouds can offer various advantages
like a lower barrier to entry, a larger set of resources, etc. but lack resource elasticity,
standardization, and focus on development effort, from which MMOG backends can greatly
benefit.
Meanwhile, other research has also focused on solving various problems related to consistency,
performance, and scalability. Despite major advancements in these areas, there is no standardized
development methodology to facilitate these features and assimilate the development of
MMOG backends on commodity clouds. This thesis is motivated by the results of a systematic
mapping study that identifies a gap in research, evident from the fact that only a handful
of studies have explored the possibility of utilizing serverless environments within commodity
clouds to host these types of backends. These studies are mostly vision papers and do
not provide any novel contributions in terms of methods of development or detailed analyses
of how such systems could be developed. Using the knowledge gathered from this mapping
study, several hypotheses are proposed and a set of technical challenges is identified, guiding
the development of a new methodology.
The peculiarities of MMOG backends have so far constrained their development and deployment
on commodity clouds despite rapid advancements in technology. To explore whether such
environments are viable options, a feasibility study is conducted with a minimalistic MMOG
prototype to evaluate a limited set of public clouds in terms of hosting MMOG backends. Following encouraging results from this study, this thesis first motivates and then presents
a set of models, methods, and tools with which scalable MMOG backends can be developed
for and deployed on commodity clouds. These are encapsulated into a software development
framework called Athlos which allows software engineers to leverage the proposed development
methodology to rapidly create MMOG backend prototypes that utilize the resources of
these clouds to attain scalable states and runtimes. The proposed approach is based on a dynamic
model which aims to abstract the data requirements and relationships of many types of
MMOGs. Based on this model, several methods are outlined that aim to solve various problems
and challenges related to the development of MMOG backends, mainly in terms of performance
and scalability. Using a modular software architecture, and standardization in common development
areas, the proposed framework aims to improve and expedite the development process
leading to higher-quality MMOG backends and a lower time to market. The models and methods
proposed in this approach can be utilized through various tools during the development
lifecycle.
The proposed development framework is evaluated qualitatively and quantitatively. The thesis
presents three case study MMOG backend prototypes that validate the suitability of the proposed
approach. These case studies also provide a proof of concept and are subsequently used
to further evaluate the framework. The propositions in this thesis are assessed with respect to
the performance, scalability, development effort, and code maintainability of MMOG backends
developed using the Athlos framework, using a variety of methods such as small and large-scale
simulations and more targeted experimental setups. The results of these experiments uncover
useful information about the behavior of MMOG backends. In addition, they provide evidence
that MMOG backends developed using the proposed methodology and hosted on serverless
environments can: (a) support a very high number of simultaneous players under a given latency
threshold, (b) elastically scale both in terms of processing power and memory capacity
and (c) significantly reduce the amount of development effort. The results also show that this
methodology can accelerate the development of high-performance, distributed, real-time applications
like MMOG backends, while also exposing the limitations of Athlos in terms of code
maintainability.
Finally, the thesis provides a reflection on the research objectives, considerations on the hypotheses
and technical challenges, and outlines plans for future work in this domain.