8,862 research outputs found
Active Coverage for PAC Reinforcement Learning
Collecting and leveraging data with good coverage properties plays a crucial
role in different aspects of reinforcement learning (RL), including reward-free
exploration and offline learning. However, the notion of "good coverage" really
depends on the application at hand, as data suitable for one context may not be
so for another. In this paper, we formalize the problem of active coverage in
episodic Markov decision processes (MDPs), where the goal is to interact with
the environment so as to fulfill given sampling requirements. This framework is
sufficiently flexible to specify any desired coverage property, making it
applicable to any problem that involves online exploration. Our main
contribution is an instance-dependent lower bound on the sample complexity of
active coverage and a simple game-theoretic algorithm, CovGame, that nearly
matches it. We then show that CovGame can be used as a building block to solve
different PAC RL tasks. In particular, we obtain a simple algorithm for PAC
reward-free exploration with an instance-dependent sample complexity that, in
certain MDPs which are "easy to explore", is lower than the minimax one. By
further coupling this exploration algorithm with a new technique to do implicit
eliminations in policy space, we obtain a computationally-efficient algorithm
for best-policy identification whose instance-dependent sample complexity
scales with gaps between policy values. Comment: Accepted at COLT 202
Reinforcement learning in large state action spaces
Reinforcement learning (RL) is a promising framework for training intelligent agents which learn to optimize long term utility by directly interacting with the environment. Creating RL methods which scale to large state-action spaces is a critical problem towards ensuring real world deployment of RL systems. However, several challenges limit the applicability of RL to large scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints like decentralization and lack of guarantees about important properties like performance, generalization and robustness in potentially unseen scenarios.
This thesis aims to bridge this gap. We propose several principled algorithms and frameworks for studying and addressing the above challenges in RL. The proposed methods cover a wide range of RL settings (single and multi-agent systems (MAS) with all the variations in the latter, prediction and control, model-based and model-free methods, value-based and policy-based methods). In this work we present the first results on several different problems, e.g., tensorization of the Bellman equation, which allows exponential sample efficiency gains (Chapter 4), provable suboptimality arising from structural constraints in MAS (Chapter 3), combinatorial generalization results in cooperative MAS (Chapter 5), generalization results on observation shifts (Chapter 7), and learning deterministic policies in a probabilistic RL framework (Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. Additionally, we shed light on generalization aspects of the agents under different frameworks. These properties have been driven by the use of several advanced tools (e.g., statistical machine learning, state abstraction, variational inference, tensor theory).
In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large scale, real world applications.
Seamless Multimodal Biometrics for Continuous Personalised Wellbeing Monitoring
Artificially intelligent perception is increasingly present in the lives of
every one of us. Vehicles are no exception, (...) In the near future, pattern
recognition will have an even stronger role in vehicles, as self-driving cars
will require automated ways to understand what is happening around (and within)
them and act accordingly. (...) This doctoral work focused on advancing
in-vehicle sensing through the research of novel computer vision and pattern
recognition methodologies for both biometrics and wellbeing monitoring. The
main focus has been on electrocardiogram (ECG) biometrics, a trait well-known
for its potential for seamless driver monitoring. Major efforts were devoted to
achieving improved performance in identification and identity verification in
off-the-person scenarios, well-known for increased noise and variability. Here,
end-to-end deep learning ECG biometric solutions were proposed and important
topics were addressed such as cross-database and long-term performance,
waveform relevance through explainability, and interlead conversion. Face
biometrics, a natural complement to the ECG in seamless unconstrained
scenarios, was also studied in this work. The open challenges of masked face
recognition and interpretability in biometrics were tackled in an effort to
evolve towards algorithms that are more transparent, trustworthy, and robust to
significant occlusions. Within the topic of wellbeing monitoring, improved
solutions to multimodal emotion recognition in groups of people and
activity/violence recognition in in-vehicle scenarios were proposed. Finally,
we also proposed a novel way to learn template security within end-to-end
models, dismissing additional separate encryption processes, and a
self-supervised learning approach tailored to sequential data, in order to
ensure data security and optimal performance. (...) Comment: Doctoral thesis presented and approved on the 21st of December 2022
to the University of Port
Active Self-Supervised Learning: A Few Low-Cost Relationships Are All You Need
Self-Supervised Learning (SSL) has emerged as the solution of choice to learn
transferable representations from unlabeled data. However, SSL requires
building samples that are known to be semantically akin, i.e., positive views.
Requiring such knowledge is the main limitation of SSL and is often tackled by
ad-hoc strategies, e.g., applying known data augmentations to the same input. In
this work, we generalize and formalize this principle through Positive Active
Learning (PAL) where an oracle queries semantic relationships between samples.
PAL achieves three main objectives. First, it unveils a theoretically grounded
learning framework beyond SSL, that can be extended to tackle supervised and
semi-supervised learning depending on the employed oracle. Second, it provides
a consistent algorithm to embed a priori knowledge, e.g. some observed labels,
into any SSL losses without any change in the training pipeline. Third, it
provides a proper active learning framework yielding low-cost solutions to
annotate datasets, arguably bridging the gap between theory and practice of
active learning that is based on simple-to-answer-by-non-experts queries of
semantic relationships between inputs. Comment: 8 main pages, 20 total, 10 figures
Robustness and Interpretability of Neural Networks’ Predictions under Adversarial Attacks
Deep Neural Networks (DNNs) are powerful predictive models, exceeding human capabilities in a variety of tasks. They learn complex and flexible decision systems from the available data and achieve exceptional performance in multiple machine learning fields, ranging from applications in artificial intelligence, such as image, speech and text recognition, to the more traditional sciences, including medicine, physics and biology.
Despite the outstanding achievements, high performance and high predictive accuracy are not sufficient for real-world applications, especially in safety-critical settings, where the usage of DNNs is severely limited by their black-box nature. There is an increasing need to understand how predictions are performed, to provide uncertainty estimates, to guarantee robustness to malicious attacks and to prevent unwanted behaviours.
State-of-the-art DNNs are vulnerable to small perturbations in the input data, known as adversarial attacks: maliciously crafted manipulations of the inputs that are perceptually indistinguishable from the original samples but are capable of fooling the model into incorrect predictions. In this work, we prove that such brittleness is related to the geometry of the data manifold and is therefore likely to be an intrinsic feature of DNNs’ predictions. This negative condition suggests a possible direction to overcome this limitation: we study the geometry of adversarial attacks in the large-data, overparameterized limit for Bayesian Neural Networks and prove that, in this limit, they are immune to gradient-based adversarial attacks. Furthermore, we propose some training techniques to improve the adversarial robustness of deterministic architectures. In particular, we experimentally observe that ensembles of NNs trained on random projections of the original inputs into lower-dimensional spaces are more resilient to the attacks.
Next, we focus on the problem of interpretability of NNs’ predictions in the setting of saliency-based explanations. We analyze the stability of the explanations under adversarial attacks on the inputs and we prove that, in the large-data and overparameterized limit, Bayesian interpretations are more stable than those provided by deterministic networks. We validate this behaviour in multiple experimental settings in the finite data regime.
Finally, we introduce the concept of adversarial perturbations of amino acid sequences for protein Language Models (LMs). Deep Learning models for protein structure prediction, such as AlphaFold2, leverage Transformer architectures and their attention mechanism to capture structural and functional properties of amino acid sequences. Despite the high accuracy of predictions, biologically small perturbations of the input sequences, or even single point mutations, can lead to substantially different 3D structures. On the other hand, protein language models are insensitive to mutations that induce misfolding or dysfunction (e.g., missense mutations). Specifically, predictions of the 3D coordinates do not reveal the structure-disruptive effect of these mutations. Therefore, there is an evident inconsistency between the biological importance of mutations and the resulting change in structural prediction. Inspired by this problem, we introduce the concept of adversarial perturbation of protein sequences in continuous embedding spaces of protein language models. Our method relies on attention scores to detect the most vulnerable amino acid positions in the input sequences. Adversarial mutations are biologically distinct from their references and are able to significantly alter the resulting 3D structures.
Complexity Science in Human Change
This reprint encompasses fourteen contributions that offer avenues towards a better understanding of complex systems in human behavior. The phenomena studied here are generally pattern formation processes that originate in social interaction and psychotherapy. Several accounts are also given of the coordination in body movements and in physiological, neuronal and linguistic processes. A common denominator of such pattern formation is that complexity and entropy of the respective systems become reduced spontaneously, which is the hallmark of self-organization. The various methodological approaches to modeling such processes are presented in some detail. Results from the various methods are systematically compared and discussed. Among these approaches are algorithms for the quantification of synchrony by cross-correlational statistics, surrogate control procedures, recurrence mapping and network models. This volume offers an informative and sophisticated resource for scholars of human change, as well as for students at advanced levels, from graduate to post-doctoral. The reprint is multidisciplinary in nature, binding together the fields of medicine, psychology, physics, and neuroscience.
Economic and Social Consequences of the COVID-19 Pandemic in Energy Sector
The purpose of the Special Issue was to collect the results of research and experience on the consequences of the COVID-19 pandemic for the energy sector and the energy market, broadly understood, that were visible after a year. In particular, the impact of COVID-19 on the energy sector in the EU, including Poland, and the US was examined. The topics concerned various issues, e.g., the situation of energy companies, including those listed on the stock exchange, mining companies, and those dealing with renewable energy. Topics related to the development of electromobility, managerial competences, energy expenditure of local government units, sustainable development of energy, and energy poverty during a pandemic were also discussed.
Determination of the strong coupling αs from transverse energy-energy correlations in multi-jet events at √s = 13 TeV with the ATLAS detector
Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Facultad de Ciencias, Departamento de Física Teórica. Date of defense: 24-02-202