Bayesian inference for challenging scientific models
Advances in technology and computation have led to ever more complicated
scientific models of phenomena across a wide variety of fields. Many of these
models present challenges for Bayesian inference, as a result of computationally
intensive likelihoods, high-dimensional parameter spaces or large dataset sizes.
In this thesis we show how we can apply developments in probabilistic machine
learning and statistics to do inference with examples of these types of models.
As a demonstration of an applied inference problem involving a non-trivial
likelihood computation, we show how a combination of optimisation and
MCMC methods along with careful consideration of priors can be used to infer
the parameters of an ODE model of the cardiac action potential.
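The optimisation-plus-MCMC pattern described here can be sketched on a toy one-dimensional posterior (an illustration of the general recipe only, not the thesis's cardiac model; the toy likelihood and prior are assumptions):

```python
import numpy as np
from scipy.optimize import minimize

def log_post(theta):
    # toy log-posterior: sharp Gaussian likelihood around 2.0 times a N(0, 1) prior
    return -0.5 * (theta - 2.0) ** 2 / 0.25 - 0.5 * theta ** 2

# step 1: optimisation finds the posterior mode to start the chain from
mode = minimize(lambda t: -log_post(t[0]), x0=[0.0]).x[0]

# step 2: random-walk Metropolis initialised at the mode
rng = np.random.default_rng(0)
theta, samples = mode, []
for _ in range(5000):
    proposal = theta + 0.3 * rng.standard_normal()
    if np.log(rng.uniform()) < log_post(proposal) - log_post(theta):
        theta = proposal
    samples.append(theta)
```

Starting the chain at the mode avoids wasting samples on burn-in from a poor initial point, which matters when each likelihood evaluation requires an expensive ODE solve.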
We then consider the problem of pileup, a phenomenon that occurs in
astronomy when using CCD detectors to observe bright sources. It complicates
the fitting of even simple spectral models by introducing an observation model
with a large number of continuous and discrete latent variables that scales with
the size of the dataset. We develop an MCMC-based method that can work in
the presence of pileup by explicitly marginalising out discrete variables and
using adaptive HMC on the remaining continuous variables. We show with
synthetic experiments that it allows us to fit spectral models in the presence
of pileup without biasing the results. We also compare it to neural Simulation-
Based Inference approaches, and find that they perform comparably to the
MCMC-based approach whilst being able to scale to larger datasets.
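The marginalisation strategy can be sketched for a generic pileup-like model: summing a discrete photon count out of the likelihood leaves a smooth function of continuous parameters only, which HMC can then sample (a hypothetical toy observation model, not the thesis's actual spectral model):

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import norm, poisson

def marginal_loglik(y, rate, mu, sigma, k_max=20):
    """Log-likelihood with the discrete photon count k summed out.

    Toy pileup-style model: each observed energy y_i is the sum of k
    photons, k ~ Poisson(rate) truncated to 1 <= k <= k_max, with
    y | k ~ N(k * mu, k * sigma**2).  Marginalising k leaves a smooth
    function of (rate, mu, sigma) for a gradient-based sampler.
    """
    k = np.arange(1, k_max + 1)            # discrete latent values
    log_prior_k = poisson.logpmf(k, rate)  # (unnormalised, truncated) prior on k
    log_lik_k = norm.logpdf(y[:, None], loc=k * mu, scale=np.sqrt(k) * sigma)
    return logsumexp(log_prior_k + log_lik_k, axis=1).sum()
```

The `logsumexp` over k is the numerically stable form of the mixture sum, and the result is differentiable in the continuous parameters.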
As an example of a problem where we wish to do inference with extremely
large datasets, we consider the Extreme Deconvolution method. The method
fits a probability density to a dataset where each observation has Gaussian
noise added with a known sample-specific covariance, originally intended
for use with astronomical datasets. The existing fitting method is batch EM,
which would not normally be applied to large datasets such as the Gaia catalog
containing noisy observations of a billion stars. In this thesis we propose two
minibatch variants of extreme deconvolution, based on an online variation of
the EM algorithm, and direct gradient-based optimisation of the log-likelihood,
both of which can run on GPUs. We demonstrate that these methods provide
faster fitting, whilst being able to scale to much larger models for use with
larger datasets.
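The direct-optimisation variant reduces to minimising the marginal negative log-likelihood on minibatches, where each mixture component is convolved with the per-sample noise covariance; a minimal numpy sketch of that objective (illustrative shapes and names, not the thesis's implementation):

```python
import numpy as np
from scipy.special import logsumexp
from scipy.stats import multivariate_normal

def xd_minibatch_nll(weights, means, covs, x, S):
    """Negative log-likelihood of an extreme-deconvolution mixture on a minibatch.

    Each noisy observation x_i is modelled as
        x_i ~ sum_j weights[j] * N(means[j], covs[j] + S[i]),
    i.e. every mixture component is convolved with the known per-sample
    noise covariance S[i].  Feeding minibatches of (x, S) to a stochastic
    gradient optimiser gives the direct-optimisation variant.
    """
    n, K = len(x), len(weights)
    logp = np.empty((n, K))
    for j in range(K):
        for i in range(n):
            logp[i, j] = np.log(weights[j]) + multivariate_normal.logpdf(
                x[i], mean=means[j], cov=covs[j] + S[i])
    return -logsumexp(logp, axis=1).mean()
```

In practice the double loop would be vectorised and the whole objective evaluated on a GPU with an autodiff framework, but the objective itself is exactly this.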
We then extend the extreme deconvolution approach to work with non-
Gaussian noise, and to use more flexible density estimators such as normalizing
flows. Since both adjustments lead to an intractable likelihood, we resort to
amortized variational inference in order to fit them. We show that, for some
datasets, flows can outperform Gaussian mixtures for extreme deconvolution,
and that fitting with non-Gaussian noise is now possible.
Assessing the Role and Regulatory Impact of Digital Assets in Decentralizing Finance
This project will explore the development of decentralized financial (DeFi) markets since the first introduction of digital assets created through the application of a form of distributed ledger technology (DLT), known as blockchain, in 2008. More specifically, a qualitative inquiry of the role of digital assets in relation to traditional financial markets infrastructure will be conducted in order to answer the following questions:
(i) can the digital asset and decentralized financial markets examined in this thesis co-exist with traditional assets and financial markets, and, if so,
(ii) are traditional or novel forms of regulation (whether financial or otherwise) needed or desirable for the digital asset and decentralized financial markets examined herein?
The aim of this project is to challenge a preliminary hypothesis that traditional and decentralized finance can be compatible, provided that governments and other centralized authorities approach market innovations as an opportunity to improve existing monetary infrastructure and the delivery of financial services (in both the public and private sector), rather than as an existential threat. Thus, this thesis seeks to establish that, by collaborating with private markets to identify the public good to which DeFi markets contribute, the public sector can foster an environment which is both promotive and protective of the public interest without unduly stifling innovation and progress.
Backpropagation Beyond the Gradient
Automatic differentiation is a key enabler of deep learning: previously, practitioners were limited to models
for which they could manually compute derivatives. Now, they can create sophisticated models with almost
no restrictions and train them using first-order, i.e. gradient, information. Popular libraries like PyTorch
and TensorFlow compute this gradient efficiently, automatically, and conveniently with a single line of
code. Under the hood, reverse-mode automatic differentiation, or gradient backpropagation, powers the
gradient computation in these libraries. Their entire design centers around gradient backpropagation.
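Reverse-mode automatic differentiation itself is compact: walk the computation graph backwards and accumulate local derivatives via the chain rule. A minimal scalar sketch (a pedagogical toy, not PyTorch's implementation):

```python
class Var:
    """Minimal scalar reverse-mode autodiff node (toy, not PyTorch)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents    # (parent_node, local_derivative) pairs
        self.grad = 0.0

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

    def backward(self, seed=1.0):
        # chain rule: push d(output)/d(self) back to every parent
        self.grad += seed
        for parent, local in self.parents:
            parent.backward(seed * local)

x, y = Var(3.0), Var(4.0)
z = x * y + x        # dz/dx = y + 1, dz/dy = x
z.backward()
```

Real frameworks topologically sort the graph instead of recursing, and operate on tensors rather than scalars, but the accumulation rule is the same.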
These frameworks are specialized around one specific task—computing the average gradient in a mini-batch.
This specialization often complicates the extraction of other information like higher-order statistical moments
of the gradient, or higher-order derivatives like the Hessian. It limits practitioners and researchers to methods
that rely on the gradient. Arguably, this hampers the field from exploring the potential of higher-order
information, and there is evidence that focusing solely on the gradient has not led to significant recent
advances in deep learning optimization.
To advance algorithmic research and inspire novel ideas, information beyond the batch-averaged gradient
must be made available at the same level of computational efficiency, automation, and convenience.
This thesis presents approaches to simplify experimentation with rich information beyond the gradient
by making it more readily accessible. We present an implementation of these ideas as an extension to the
backpropagation procedure in PyTorch. Using this newly accessible information, we demonstrate possible use
cases by (i) showing how it can inform our understanding of neural network training by building a diagnostic
tool, and (ii) enabling novel methods to efficiently compute and approximate curvature information.
First, we extend gradient backpropagation for sequential feedforward models to Hessian backpropagation
which enables computing approximate per-layer curvature. This perspective unifies recently proposed block-
diagonal curvature approximations. Like gradient backpropagation, the computation of these second-order
derivatives is modular, and therefore simple to automate and extend to new operations.
Based on the insight that rich information beyond the gradient can be computed efficiently and at the
same time, we extend the backpropagation in PyTorch with the BackPACK library. It provides efficient and
convenient access to statistical moments of the gradient and approximate curvature information, often at a
small overhead compared to computing just the gradient.
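As an illustration of the kind of quantity involved, the per-sample gradients and their statistical moments can be written out by hand for a least-squares model (a plain numpy sketch, independent of the actual BackPACK API):

```python
import numpy as np

def per_sample_gradient_moments(w, X, y):
    """Per-sample gradients of the losses 0.5 * (x_i @ w - y_i)**2,
    plus their mean and elementwise variance.

    Standard backprop frameworks return only the mean gradient and discard
    the per-sample gradients that higher-order statistics are built from.
    """
    residuals = X @ w - y              # shape (n,)
    grads = residuals[:, None] * X     # per-sample gradients, shape (n, d)
    return grads.mean(axis=0), grads.var(axis=0)
```

For deep networks this quantity cannot be recovered from the averaged gradient after the fact, which is why it must be extracted during the backward pass itself.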
Next, we showcase the utility of such information to better understand neural network training. We build
the Cockpit library that visualizes what is happening inside the model during training through various
instruments that rely on BackPACK’s statistics. We show how Cockpit provides a meaningful statistical
summary report to the deep learning engineer to identify bugs in their machine learning pipeline, guide
hyperparameter tuning, and study deep learning phenomena.
Finally, we use BackPACK’s extended automatic differentiation functionality to develop ViViT, an approach
to efficiently compute curvature information, in particular curvature noise. It uses the low-rank structure
of the generalized Gauss-Newton approximation to the Hessian and addresses shortcomings in existing
curvature approximations. Through monitoring curvature noise, we demonstrate how ViViT’s information
helps in understanding challenges to make second-order optimization methods work in practice.
This work develops new tools to experiment more easily with higher-order information in complex deep
learning models. These tools have impacted works on Bayesian applications with Laplace approximations,
out-of-distribution generalization, differential privacy, and the design of automatic differentiation
systems. They constitute one important step towards developing and establishing more efficient deep
learning algorithms.
Design and Implementation of a Portable Framework for Application Decomposition and Deployment in Edge-Cloud Systems
The emergence of cyber-physical systems has brought about a significant increase in complexity and heterogeneity in the infrastructure on which these systems are deployed. One particular example of this complexity is the interplay between cloud, fog, and edge computing. However, the complexity of these systems can pose challenges when it comes to implementing self-organizing mechanisms, which are often designed to work on flat networks. Therefore, it is essential to separate the application logic from the specific deployment aspects to promote reusability and flexibility in infrastructure exploitation.
To address this issue, a novel approach called "pulverization" has been proposed. This approach involves breaking down the system into smaller computational units, which can then be deployed on the available infrastructure.
In this thesis, the design and implementation of a portable framework that enables the "pulverization" of cyber-physical systems are presented.
The main objective of the framework is to pave the way for the deployment of cyber-physical systems in the edge-cloud continuum by reducing the complexity of the infrastructure and opportunistically exploiting the heterogeneous resources available on it. Different scenarios are presented to highlight the effectiveness of the framework across heterogeneous infrastructures and devices.
Current limitations and future work are examined to identify improvement areas for the framework.
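A toy sketch of the pulverization idea: a logical device is split into sub-components, each mapped independently onto whatever host in the edge-cloud continuum can support it (component names and the first-fit policy are illustrative assumptions, not the framework's API):

```python
# Each logical device is "pulverized" into sub-components that can land on
# different tiers of the infrastructure (component names are illustrative).
COMPONENTS = ["state", "behaviour", "communication", "sensors", "actuators"]

def deploy(device, capabilities):
    """Map each sub-component of `device` to the first host that supports it.

    `capabilities` maps host name -> set of components it can run; a real
    framework would also weigh latency, energy and load.
    """
    plan = {}
    for component in COMPONENTS:
        hosts = [h for h, caps in capabilities.items() if component in caps]
        if not hosts:
            raise ValueError(f"no host can run {component} of {device}")
        plan[component] = hosts[0]
    return plan
```

The separation lets the same application logic run unchanged whether a component lands on an edge node or in the cloud.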
Optical ground receivers for satellite based quantum communications
Cryptography has always been a key technology in security, privacy and defence,
from ancient Roman times, when messages were sent ciphered with simple encoding techniques, to modern times and the complex security protocols of the Internet.
During the last decades, security of information has been assumed, since classical
computers do not have the power to break the passwords used every day (if they are
generated properly). However, in 1994 a new threat emerged when Peter Shor presented his factoring algorithm,
which could be run on a quantum computer to break many of the secure communication protocols used today. Current quantum
computers are still in their early stages, with not enough qubits to perform this
algorithm in reasonable times. However, the threat is present, not future, since the
messages that are being sent by important institutions can be stored, and decoded
in the future once quantum computers are available.
Quantum key distribution (QKD) is one of the solutions proposed for this threat,
and the only one mathematically proven to be secure with no assumptions about the
eavesdropper's power. This optical technology has recently attracted interest for use with satellite communications, mainly because of the relative ease of deploying
a global network in this way. In satellite QKD, the parameter space and the
available technology to optimise are very large, so much work remains to be
done to understand the optimal way to exploit this technology.
This dissertation investigates one of these parameters, the encoding scheme.
Most satellite QKD systems nowadays use polarisation encoding. The second
chapter of this thesis presents, for the first time, experimental work on a
time-bin encoding scheme for free-space receivers within a full QKD system.
The third and fourth chapters explore the advantages of multi-protocol
free-space receivers, which can boost interoperability between systems, and
of polarisation filtering techniques to reduce background noise. Finally, the
last chapter presents a new technology that can help increase communication rates.
Paraphrasing evades detectors of AI-generated text, but retrieval is an effective defense
The rise in malicious usage of large language models, such as fake content
creation and academic plagiarism, has motivated the development of approaches
that identify AI-generated text, including those based on watermarking or
outlier detection. However, the robustness of these detection algorithms to
paraphrases of AI-generated text remains unclear. To stress test these
detectors, we build an 11B-parameter paraphrase generation model (DIPPER) that
can paraphrase paragraphs, condition on surrounding context, and control
lexical diversity and content reordering. Using DIPPER to paraphrase text
generated by three large language models (including GPT3.5-davinci-003)
successfully evades several detectors, including watermarking, GPTZero,
DetectGPT, and OpenAI's text classifier. For example, DIPPER drops detection
accuracy of DetectGPT from 70.3% to 4.6% (at a constant false positive rate of
1%), without appreciably modifying the input semantics.
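Evaluation at a fixed false-positive rate amounts to thresholding detector scores at a quantile of the human-text score distribution; a sketch of that metric (illustrative, not the paper's evaluation code):

```python
import numpy as np

def accuracy_at_fpr(human_scores, ai_scores, fpr=0.01):
    """Detection accuracy at a fixed false-positive rate: choose the score
    threshold so that only `fpr` of human-written texts are flagged, then
    report the fraction of AI-generated texts scoring above it."""
    threshold = np.quantile(human_scores, 1.0 - fpr)
    return float(np.mean(ai_scores > threshold))
```

Fixing the false-positive rate makes detectors with different score scales directly comparable.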
To increase the robustness of AI-generated text detection to paraphrase
attacks, we introduce a simple defense that relies on retrieving
semantically-similar generations and must be maintained by a language model API
provider. Given a candidate text, our algorithm searches a database of
sequences previously generated by the API, looking for sequences that match the
candidate text within a certain threshold. We empirically verify our defense
using a database of 15M generations from a fine-tuned T5-XXL model and find
that it can detect 80% to 97% of paraphrased generations across different
settings while only classifying 1% of human-written sequences as AI-generated.
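The defense reduces to similarity search over the provider's generation history; a minimal sketch with a bag-of-words embedding standing in for a real semantic encoder (all names and the similarity model are illustrative assumptions):

```python
import numpy as np
from collections import Counter

def embed(text):
    """Toy semantic embedding: L2-normalised bag-of-words counts."""
    counts = Counter(text.lower().split())
    norm = np.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()}

def cosine(a, b):
    return sum(val * b[tok] for tok, val in a.items() if tok in b)

def is_api_generated(candidate, generation_db, threshold=0.7):
    """Flag `candidate` if it matches any previously served generation
    above the similarity threshold."""
    c = embed(candidate)
    return any(cosine(c, embed(g)) >= threshold for g in generation_db)
```

Paraphrasing changes surface wording but preserves semantics, so a semantic match against the stored generations survives the attack that defeats score-based detectors.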
We open-source our models, code and data.
NeurIPS 2023 camera-ready (32 pages). Code, models, and data available at
https://github.com/martiansideofthemoon/ai-detection-paraphrase
Optimisation for Optical Data Centre Switching and Networking with Artificial Intelligence
Cloud and cluster computing platforms have become standard across almost every domain of business, and they now operate at the scale of warehouse-sized data centres. However, the tier-based opto-electronically packet-switched network infrastructure that is standard across these systems gives rise to several scalability bottlenecks, including resource fragmentation and high energy requirements. Experimental results show that optical circuit switched networks pose a promising alternative that could avoid these.
However, optimality challenges are encountered at realistic commercial scales. Exhaustive optimisation techniques are not applicable to problems at cloud scale, and expert-designed heuristics are performance-limited and typically biased in their design; artificial intelligence can discover more scalable and better-performing optimisation strategies.
This thesis demonstrates these benefits through experimental and theoretical work spanning the component-, system- and commercial-level optimisation problems which stand in the way of practical cloud-scale computer network systems. Firstly, optical components are optimised for fast gating and are demonstrated in a proof-of-concept switching architecture for optical data centres with better wavelength and component scalability than previous demonstrations. Secondly, network-aware resource allocation schemes for optically composable data centres are learnt end-to-end with deep reinforcement learning and graph neural networks, requiring fewer networking resources to achieve the same resource efficiency as conventional methods. Finally, a deep reinforcement learning based method for optimising PID-control parameters is presented, which generates tailored parameters for unseen devices. This method is demonstrated on a market-leading optical switching product based on piezoelectric actuation, where switching speed is improved with no compromise to optical loss and the manufacturing yield of actuators is improved. The method was licensed to and integrated within the manufacturing pipeline of this company. As such, crucial public and private infrastructure utilising these products will benefit from this work.
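For reference, the controller being tuned is a textbook discrete PID loop; the thesis's contribution is learning the gains per device with deep RL, whereas the gains in this sketch are fixed illustrative values:

```python
class PID:
    """Textbook discrete PID controller.  A learned tuner would search the
    (kp, ki, kd) space per device; the values used below are illustrative."""

    def __init__(self, kp, ki, kd, dt=1e-3):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

Driving a simple first-order plant with this controller steers it to the setpoint; a learned tuner searches the gain space for the fastest well-damped response on each individual actuator.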
Adaptive Microarchitectural Optimizations to Improve Performance and Security of Multi-Core Architectures
With the current technological barriers, microarchitectural optimizations are increasingly important to ensure performance scalability of computing systems. The shift to multi-core architectures increases the demands on the memory system and amplifies the role of microarchitectural optimizations in performance improvement. In a multi-core system, microarchitectural resources such as the cache are usually shared to maximize utilization, but sharing can also lead to contention and lower performance. This can be mitigated through partitioning of shared caches. However, microarchitectural optimizations, which were long assumed to be fundamentally secure, can be used in side-channel attacks to exploit secrets such as cryptographic keys. Timing-based side-channels exploit predictable timing variations due to the interaction with microarchitectural optimizations during program execution. Going forward, there is a strong need to be able to leverage microarchitectural optimizations for performance without compromising security. This thesis contributes three adaptive microarchitectural resource management optimizations to improve security and/or performance of multi-core architectures, and a systematization-of-knowledge of timing-based side-channel attacks.
We observe that to achieve high-performance cache partitioning in a multi-core system, three requirements need to be met: i) fine granularity of partitions, ii) locality-aware placement and iii) frequent changes. These requirements lead to high overheads for current centralized partitioning solutions, especially as the number of cores in the system increases. To address this problem, we present an adaptive and scalable cache partitioning solution (DELTA) using a distributed and asynchronous allocation algorithm. The allocations occur through core-to-core challenges, where applications with larger performance benefit will gain cache capacity. The solution is implementable in hardware, due to low computational complexity, and can scale to large core counts.
According to our analysis, better performance can be achieved by coordinating multiple optimizations for different resources, e.g., off-chip bandwidth and cache, but this is challenging due to the increased number of possible allocations which need to be evaluated. Based on these observations, we present a solution (CBP) for coordinated management of three optimizations: cache partitioning, bandwidth partitioning and prefetching. Efficient allocations, considering the inter-resource interactions and trade-offs, are achieved using local resource managers to limit the solution space.
The continuously growing number of side-channel attacks leveraging microarchitectural optimizations prompts us to review attacks and defenses to understand the vulnerabilities of different microarchitectural optimizations. We identify four root causes of timing-based side-channel attacks: determinism, sharing, access violation and information flow. Our key insight is that eliminating any of the exploited root causes, in any of the attack steps, is enough to provide protection. Based on our framework, we present a systematization of the attacks and defenses on a wide range of microarchitectural optimizations, which highlights their key similarities.
Shared caches are an attractive attack surface for side-channel attacks, while defenses need to be efficient since the cache is crucial for performance. To address this issue, we present an adaptive and scalable cache partitioning solution (SCALE) for protection against cache side-channel attacks. The solution leverages randomness and provides quantifiable, information-theoretic security guarantees using differential privacy. The solution closes the performance gap to a state-of-the-art non-secure allocation policy for a mix of secure and non-secure applications.
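The core-to-core challenge mechanism behind DELTA can be caricatured in software: pairs of cores compare the marginal benefit of one more cache way, and the needier core takes it (a toy model with assumed benefit curves, not the hardware algorithm):

```python
import random

def challenge_round(ways, benefit, rng):
    """One asynchronous allocation round: a random pair of cores compares
    the marginal benefit of one extra cache way, and the core with the
    larger gain takes a way from the other.

    `ways[i]` is core i's current allocation; `benefit[i]` maps a way count
    to that core's (assumed) performance benefit.
    """
    a, b = rng.sample(range(len(ways)), 2)
    gain_a = benefit[a](ways[a] + 1) - benefit[a](ways[a])
    gain_b = benefit[b](ways[b] + 1) - benefit[b](ways[b])
    winner, loser = (a, b) if gain_a > gain_b else (b, a)
    if ways[loser] > 0:
        ways[winner] += 1
        ways[loser] -= 1
    return ways
```

Because each round involves only two cores, the scheme needs no central arbiter and its cost does not grow with the core count, which is the property the abstract highlights.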
Cybersecurity applications of Blockchain technologies
With the increase in connectivity, the popularization of cloud services, and the rise
of the Internet of Things (IoT), decentralized approaches for trust management
are gaining momentum. Since blockchain technologies provide a distributed ledger,
they are receiving massive attention from the research community in different application
fields. However, this technology does not provide cybersecurity by itself.
Thus, this thesis first aims to provide a comprehensive review of techniques and
elements that have been proposed to achieve cybersecurity in blockchain-based systems.
The analysis is intended to target area researchers, cybersecurity specialists
and blockchain developers. We present a series of lessons learned as well. One of
them is the rise of Ethereum as one of the most used technologies.
Furthermore, some intrinsic characteristics of the blockchain, such as permanent
availability and immutability, make it attractive for other ends, namely as a covert
channel and for malicious purposes.
On the one hand, the use of blockchains by malware has not yet been characterized.
Therefore, this thesis also analyzes the current state of the art in this area. One
of the lessons learned is that covert communications have received little attention.
On the other hand, although previous works have analyzed the feasibility of
covert channels in a particular blockchain technology called Bitcoin, no previous
work has explored the use of Ethereum to establish a covert channel considering all
transaction fields and smart contracts.
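At its simplest, such a channel hides data in a transaction's `data` field; a minimal round-trip sketch (illustrative only; real use requires signing, gas payment, and the many further fields Zephyrus exploits):

```python
def embed_in_tx_data(secret: bytes) -> str:
    """Hex-encode a secret as the `data` field of an (unsigned)
    Ethereum-style transaction.  A real covert channel must also sign the
    transaction, pay gas, and can spread bits over many other fields."""
    return "0x" + secret.hex()

def extract_from_tx_data(data_field: str) -> bytes:
    """Recover the secret from the hex `data` field."""
    return bytes.fromhex(data_field[2:])
```

Because the ledger is permanently available and immutable, the receiver can recover the payload at any later time without any direct contact with the sender.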
To foster further defence-oriented research, two novel mechanisms are presented
in this thesis. First, Zephyrus takes advantage of all Ethereum transaction fields and
smart-contract bytecode. Second, Smart-Zephyrus is built to complement Zephyrus by
leveraging smart contracts written in Solidity. We also assess the feasibility
and cost of both mechanisms. Our experiments show that Zephyrus, in the best case, can embed
40 Kbits in 0.57 s for US$1.64, and retrieve them in 2.8 s. Smart-Zephyrus, in turn,
can hide a 4 Kb secret in 41 s. Although expensive (around US$1.82 per bit), the
stealthiness provided might be worth the price for attackers. Furthermore,
these two mechanisms can be combined to increase capacity and reduce costs.
Doctoral Programme in Computer Science and Technology (Programa de Doctorado en Ciencia y Tecnología Informática), Universidad Carlos III de Madrid. Committee: President: José Manuel Estévez Tapiador; Secretary: Jorge Blasco Alís; Member: Luis Hernández Encina