3,340 research outputs found
Hidden in the Cloud : Advanced Cryptographic Techniques for Untrusted Cloud Environments
In the contemporary digital age, the ability to search and perform operations on encrypted data has become increasingly important. This significance is primarily due to the exponential growth of data, often referred to as the "new oil," and the corresponding rise in data privacy concerns. As more and more data is stored in the cloud, the need for robust security measures to protect this data from unauthorized access and misuse has become paramount.
One of the key challenges in this context is the ability to perform meaningful operations on the data while it remains encrypted. Traditional encryption techniques, while providing a high level of security, render the data unusable for any practical purpose other than storage. This is where advanced cryptographic protocols like Symmetric Searchable Encryption (SSE), Functional Encryption (FE), Homomorphic Encryption (HE), and Hybrid Homomorphic Encryption (HHE) come into play. These protocols not only ensure the confidentiality of data but also allow computations on encrypted data, thereby offering a higher level of security and privacy.
The ability to search and perform operations on encrypted data has several practical implications. For instance, it enables efficient Boolean queries on encrypted databases, which is crucial for many "big data" applications. It also allows for the execution of phrase searches, which are important for many machine learning applications, such as intelligent medical data analytics. Moreover, these capabilities are particularly relevant in the context of sensitive data, such as health records or financial information, where the privacy and security of user data are of utmost importance.
Furthermore, these capabilities can help build trust in digital systems. Trust is a critical factor in the adoption and use of digital services. By ensuring the confidentiality, integrity, and availability of data, these protocols can help build user trust in cloud services. This trust, in turn, can drive the wider adoption of digital services, leading to a more inclusive digital society.
However, it is important to note that while these capabilities offer significant advantages, they also present certain challenges. For instance, the computational overhead of these protocols can be substantial, making them less suitable for scenarios where efficiency is a critical requirement. Moreover, these protocols often require sophisticated key management mechanisms, which can be challenging to implement in practice. Therefore, there is a need for ongoing research to address these challenges and make these protocols more efficient and practical for real-world applications.
The research publications included in this thesis offer a deep dive into the intricacies and advancements in the realm of cryptographic protocols, particularly in the context of the challenges and needs highlighted above.
Publication I presents a novel approach to hybrid encryption, combining the strengths of ABE and SSE. This fusion aims to overcome the inherent limitations of both techniques, offering a more secure and efficient solution for key sharing and access control in cloud-based systems. Publication II further expands on SSE, showcasing a dynamic scheme that emphasizes forward and backward privacy, crucial for ensuring data integrity and confidentiality. Publication III and Publication IV delve into the potential of MIFE, demonstrating its applicability in real-world scenarios, such as designing encrypted private databases and additive reputation systems. These publications highlight the transformative potential of MIFE in bridging the gap between theoretical cryptographic concepts and practical applications. Lastly, Publication V underscores the significance of HE and HHE as a foundational element for secure protocols, emphasizing its potential in devices with limited computational capabilities.
In essence, these publications not only validate the importance of searching and performing operations on encrypted data but also provide innovative solutions to the challenges mentioned. They collectively underscore the transformative potential of advanced cryptographic protocols in enhancing data security and privacy, paving the way for a more secure digital future
On the Generation of Realistic and Robust Counterfactual Explanations for Algorithmic Recourse
This recent widespread deployment of machine learning algorithms presents many new challenges. Machine learning algorithms are usually opaque and can be particularly difficult to interpret. When humans are involved, algorithmic and automated decisions can negatively impact people’s lives. Therefore, end users would like to be insured against potential harm. One popular way to achieve this is to provide end users access to algorithmic recourse, which gives end users negatively affected by algorithmic decisions the opportunity to reverse unfavorable decisions, e.g., from a loan denial to a loan acceptance. In this thesis, we design recourse algorithms to meet various end user needs. First, we propose methods for the generation of realistic recourses. We use generative models to suggest recourses likely to occur under the data distribution. To this end, we shift the recourse action from the input space to the generative model’s latent space, allowing to generate counterfactuals that lie in regions with data support. Second, we observe that small changes applied to the recourses prescribed to end users likely invalidate the suggested recourse after being nosily implemented in practice. Motivated by this observation, we design methods for the generation of robust recourses and for assessing the robustness of recourse algorithms to data deletion requests. Third, the lack of a commonly used code-base for counterfactual explanation and algorithmic recourse algorithms and the vast array of evaluation measures in literature make it difficult to compare the per formance of different algorithms. To solve this problem, we provide an open source benchmarking library that streamlines the evaluation process and can be used for benchmarking, rapidly developing new methods, and setting up new
experiments. In summary, our work contributes to a more reliable interaction of end users and machine learned models by covering fundamental aspects of the recourse process and suggests new solutions towards generating realistic and robust counterfactual explanations for algorithmic recourse
Measuring the Health and Development of School-age Zimbabwean Children
Health, growth and development during mid-childhood (from 5 to 14 years) are poorly characterised, and this period has been termed the ‘missing middle’. This thesis describes the piloting and application of the School-Age Health, Activity, Resilience, Anthropometry and Neurocognitive (SAHARAN) toolbox to measure growth, cognitive and physical function amongst the SHINE cohort in rural Zimbabwe. The SHINE cluster-randomised trial tested the effects of a household WASH intervention and/or infant and young child feeding (IYCF) on child stunting and anaemia at age 18 months in rural Zimbabwe. SHINE showed that IYCF modestly increased linear growth and reduced stunting by age 18 months, while WASH had no effects. The SAHARAN toolbox was used to measure 1000 HIV-unexposed children (250 in each intervention arm), and 275 HIV-exposed children within the SHINE cohort to evaluate long-term outcomes. Children were re-enrolled at age seven years to evaluate growth, body composition, cognitive and physical function. Four main findings are presented from the SAHARAN toolbox measurements of this cohort. Firstly, child sex, growth and contemporary environmental conditions are associated with school-age physical and cognitive function at seven years. Secondly, early-life growth and baseline environmental conditions suggest the impact of early-life trajectories on multiple aspects of school-age growth, physical and cognitive function. Thirdly, the long-term impact of HIV-exposure in pregnancy is explored, which indicate reduced cognitive function, cardiovascular fitness and head circumference by age 7 years. Finally, associations with the SHINE trial early life interventions are explored, demonstrating that the SHINE early-life nutrition intervention has minimal impact by 7 years of age, except marginally stronger handgrip strength. The public health implications advocate that child interventions need to be earlier (including antenatal), broader (incorporating nurturing care), deeper (providing transformational WASH) and longer (supporting throughout childhood), as well as targeting particularly vulnerable groups such as children born HIV-free
Structuring the State’s Voice of Contention in Harmonious Society: How Party Newspapers Cover Social Protests in China
During the Chinese Communist Party’s (CCP) campaign of building a ‘harmonious society’, how do the official newspapers cover the instances of social contention on the ground? Answering this question will shed light not only on how the party press works but also on how the state and the society interact in today’s China. This thesis conceptualises this phenomenon with a multi-faceted and multi-levelled notion of ‘state-initiated contentious public sphere’ to capture the complexity of mediated relations between the state and social contention in the party press. Adopting a relational approach, this thesis analyses 1758 news reports of ‘mass incident’ in the People’s Daily and the Guangming Daily between 2004 and 2020, employing cluster analysis, qualitative comparative analysis, and social network analysis. The thesis finds significant differences in the patterns of contentious coverage in the party press at the level of event and province and an uneven distribution of attention to social contention across incidents and regions. For ‘reported regions’, the thesis distinguishes four types of coverage and presents how party press responds differently to social contention in different scenarios at the provincial level. For ‘identified incidents’, the thesis distinguishes a cumulative type of visibility based on the quantity of coverage from a relational visibility based on the structure emerging from coverage and explains how different news-making rationales determine whether instances receive similar amounts of coverage or occupy similar positions within coverage. Eventually, by demonstrating how the Chinese state strategically uses party press to respond to social contention and how social contention is journalistically placed in different positions in the state’s eyes, this thesis argues that what social contention leads to is the establishment of complex state-contention relations channelled through the party press
Deep Learning Techniques for Electroencephalography Analysis
In this thesis we design deep learning techniques for training deep neural networks on electroencephalography (EEG) data and in particular on two problems, namely EEG-based motor imagery decoding and EEG-based affect recognition, addressing challenges associated with them. Regarding the problem of motor imagery (MI) decoding, we first consider the various kinds of domain shifts in the EEG signals, caused by inter-individual differences (e.g. brain anatomy, personality and cognitive profile). These domain shifts render multi-subject training a challenging task and impede robust cross-subject generalization. We build a two-stage model ensemble architecture and propose two objectives to train it, combining the strengths of curriculum learning and collaborative training. Our subject-independent experiments on the large datasets of Physionet and OpenBMI, verify the effectiveness of our approach. Next, we explore the utilization of the spatial covariance of EEG signals through alignment techniques, with the goal of learning domain-invariant representations. We introduce a Riemannian framework that concurrently performs covariance-based signal alignment and data augmentation, while training a convolutional neural network (CNN) on EEG time-series. Experiments on the BCI IV-2a dataset show that our method performs superiorly over traditional alignment, by inducing regularization to the weights of the CNN. We also study the problem of EEG-based affect recognition, inspired by works suggesting that emotions can be expressed in relative terms, i.e. through ordinal comparisons between different affective state levels. We propose treating data samples in a pairwise manner to infer the ordinal relation between their corresponding affective state labels, as an auxiliary training objective. We incorporate our objective in a deep network architecture which we jointly train on the tasks of sample-wise classification and pairwise ordinal ranking. We evaluate our method on the affective datasets of DEAP and SEED and obtain performance improvements over deep networks trained without the additional ranking objective
LIPIcs, Volume 251, ITCS 2023, Complete Volume
LIPIcs, Volume 251, ITCS 2023, Complete Volum
The Application of Data Analytics Technologies for the Predictive Maintenance of Industrial Facilities in Internet of Things (IoT) Environments
In industrial production environments, the maintenance of equipment has a decisive influence on costs and on the plannability of production capacities. In particular, unplanned failures during production times cause high costs, unplanned downtimes and possibly additional collateral damage. Predictive Maintenance starts here and tries to predict a possible failure and its cause so early that its prevention can be prepared and carried out in time. In order to be able to predict malfunctions and failures, the industrial plant with its characteristics, as well as wear and ageing processes, must be modelled. Such modelling can be done by replicating its physical properties. However, this is very complex and requires enormous expert knowledge about the plant and about wear and ageing processes of each individual component. Neural networks and machine learning make it possible to train such models using data and offer an alternative, especially when very complex and non-linear behaviour is evident.
In order for models to make predictions, as much data as possible about the condition of a plant and its environment and production planning data is needed. In Industrial Internet of Things (IIoT) environments, the amount of available data is constantly increasing. Intelligent sensors and highly interconnected production facilities produce a steady stream of data. The sheer volume of data, but also the steady stream in which data is transmitted, place high demands on the data processing systems. If a participating system wants to perform live analyses on the incoming data streams, it must be able to process the incoming data at least as fast as the continuous data stream delivers it. If this is not the case, the system falls further and further behind in processing and thus in its analyses. This also applies to Predictive Maintenance systems, especially if they use complex and computationally intensive machine learning models. If sufficiently scalable hardware resources are available, this may not be a problem at first. However, if this is not the case or if the processing takes place on decentralised units with limited hardware resources (e.g. edge devices), the runtime behaviour and resource requirements of the type of neural network used can become an important criterion.
This thesis addresses Predictive Maintenance systems in IIoT environments using neural networks and Deep Learning, where the runtime behaviour and the resource requirements are relevant. The question is whether it is possible to achieve better runtimes with similarly result quality using a new type of neural network. The focus is on reducing the complexity of the network and improving its parallelisability. Inspired by projects in which complexity was distributed to less complex neural subnetworks by upstream measures, two hypotheses presented in this thesis emerged: a) the distribution of complexity into simpler subnetworks leads to faster processing overall, despite the overhead this creates, and b) if a neural cell has a deeper internal structure, this leads to a less complex network. Within the framework of a qualitative study, an overall impression of Predictive Maintenance applications in IIoT environments using neural networks was developed. Based on the findings, a novel model layout was developed named Sliced Long Short-Term Memory Neural Network (SlicedLSTM). The SlicedLSTM implements the assumptions made in the aforementioned hypotheses in its inner model architecture.
Within the framework of a quantitative study, the runtime behaviour of the SlicedLSTM was compared with that of a reference model in the form of laboratory tests. The study uses synthetically generated data from a NASA project to predict failures of modules of aircraft gas turbines. The dataset contains 1,414 multivariate time series with 104,897 samples of test data and 160,360 samples of training data.
As a result, it could be proven for the specific application and the data used that the SlicedLSTM delivers faster processing times with similar result accuracy and thus clearly outperforms the reference model in this respect. The hypotheses about the influence of complexity in the internal structure of the neuronal cells were confirmed by the study carried out in the context of this thesis
NEMISA Digital Skills Conference (Colloquium) 2023
The purpose of the colloquium and events centred around the central role that data plays
today as a desirable commodity that must become an important part of massifying digital
skilling efforts. Governments amass even more critical data that, if leveraged, could
change the way public services are delivered, and even change the social and economic
fortunes of any country. Therefore, smart governments and organisations increasingly
require data skills to gain insights and foresight, to secure themselves, and for improved
decision making and efficiency. However, data skills are scarce, and even more
challenging is the inconsistency of the associated training programs with most curated for
the Science, Technology, Engineering, and Mathematics (STEM) disciplines.
Nonetheless, the interdisciplinary yet agnostic nature of data means that there is
opportunity to expand data skills into the non-STEM disciplines as well.College of Engineering, Science and Technolog
Pairwise versus mutual independence: visualisation, actuarial applications and central limit theorems
Accurately capturing the dependence between risks, if it exists, is an increasingly relevant topic of actuarial research. In recent years, several authors have started to relax the traditional 'independence assumption', in a variety of actuarial settings. While it is known that 'mutual independence' between random variables is not equivalent to their 'pairwise independence', this thesis aims to provide a better understanding of the materiality of this difference. The distinction between mutual and pairwise independence matters because, in practice, dependence is often assessed via pairs only, e.g., through correlation matrices, rank-based measures of association, scatterplot matrices, heat-maps, etc. Using such pairwise methods, it is possible to miss some forms of dependence. In this thesis, we explore how material the difference between pairwise and mutual independence is, and from several angles.
We provide relevant background and motivation for this thesis in Chapter 1, then conduct a literature review in Chapter 2.
In Chapter 3, we focus on visualising the difference between pairwise and mutual independence. To do so, we propose a series of theoretical examples (some of them new) where random variables are pairwise independent but (mutually) dependent, in short, PIBD. We then develop new visualisation tools and use them to illustrate what PIBD variables can look like. We showcase that the dependence involved is possibly very strong. We also use our visualisation tools to identify subtle forms of dependence, which would otherwise be hard to detect.
In Chapter 4, we review common dependence models (such has elliptical distributions and Archimedean copulas) used in actuarial science and show that they do not allow for the possibility of PIBD data. We also investigate concrete consequences of the 'nonequivalence' between pairwise and mutual independence. We establish that many results which hold for mutually independent variables do not hold under sole pairwise independent. Those include results about finite sums of random variables, extreme value theory and bootstrap methods. This part thus illustrates what can potentially 'go wrong' if one assumes mutual independence where only pairwise independence holds.
Lastly, in Chapters 5 and 6, we investigate the question of what happens for PIBD variables 'in the limit', i.e., when the sample size goes to infi nity. We want to see if the 'problems' caused by dependence vanish for sufficiently large samples. This is a broad question, and we concentrate on the important classical Central Limit Theorem (CLT), for which we fi nd that the answer is largely negative. In particular, we construct new sequences of PIBD variables (with arbitrary margins) for which a CLT does not hold. We derive explicitly the asymptotic distribution of the standardised mean of our sequences, which allows us to illustrate the extent of the 'failure' of a CLT for PIBD variables. We also propose a general methodology to construct dependent K-tuplewise independent (K an arbitrary integer) sequences of random variables with arbitrary margins. In the case K = 3, we use this methodology to derive explicit examples of triplewise independent sequences for which no CLT hold. Those results illustrate that mutual independence is a crucial assumption within CLTs, and that having larger samples is not always a viable solution to the problem of non-independent data
Analytical validation of innovative magneto-inertial outcomes: a controlled environment study.
peer reviewe
- …