Causal Discovery from Temporal Data: An Overview and New Perspectives
Temporal data, which record chronological observations of complex systems,
are a common data structure generated in many domains, such as industry,
medicine and finance. Analyzing this type of data is
extremely valuable for various applications. Thus, different temporal data
analysis tasks, e.g., classification, clustering and prediction, have been
proposed in the past decades. Among them, causal discovery, learning the causal
relations from temporal data, is considered an interesting yet critical task
and has attracted much research attention. Existing causal discovery works can
be divided into two highly correlated categories according to whether the
temporal data is calibrated, i.e., multivariate time series causal discovery and
event sequence causal discovery. However, most previous surveys focus only
on time series causal discovery and ignore the second category. In
this paper, we specify the correlation between the two categories and provide a
systematic overview of existing solutions. Furthermore, we provide public
datasets, evaluation metrics and new perspectives for temporal data causal
discovery.
Comment: 52 pages, 6 figures
Differentially private Markov chain Monte Carlo
Peer reviewed
Learning Adversarial Low-rank Markov Decision Processes with Unknown Transition and Full-information Feedback
In this work, we study low-rank MDPs with adversarially changing losses in
the full-information feedback setting. In particular, the unknown transition
probability kernel admits a low-rank matrix decomposition \citep{REPUCB22}, and
the loss functions may change adversarially but are revealed to the learner at
the end of each episode. We propose a policy optimization-based algorithm, POLO,
and we prove that it attains a sublinear regret guarantee that depends on the
rank of the transition kernel (and hence the dimension of the unknown
representations), the cardinality of the action space, the cardinality of the
model class, and the discount factor. Notably, our algorithm is oracle-efficient
and has a regret guarantee
with no dependence on the size of the potentially arbitrarily large state space.
Furthermore, we also prove a
regret lower bound for this problem, showing that low-rank MDPs are
statistically more difficult to learn than linear MDPs in the regret
minimization setting. To the best of our knowledge, we present the first
algorithm that interleaves representation learning, exploration, and
exploitation to achieve a sublinear regret guarantee for RL with nonlinear
function approximation and adversarial losses.
MUBen: Benchmarking the Uncertainty of Pre-Trained Models for Molecular Property Prediction
Large Transformer models pre-trained on massive unlabeled molecular data have
shown great success in predicting molecular properties. However, these models
can be prone to overfitting during fine-tuning, resulting in over-confident
predictions on test data that fall outside of the training distribution. To
address this issue, uncertainty quantification (UQ) methods can be used to
improve the models' calibration of predictions. Although many UQ approaches
exist, not all of them lead to improved performance. While some studies have
used UQ to improve molecular pre-trained models, the process of selecting
suitable backbone and UQ methods for reliable molecular uncertainty estimation
remains underexplored. To address this gap, we present MUBen, which evaluates
different combinations of backbone and UQ models to quantify their performance
for both property prediction and uncertainty estimation. By fine-tuning various
backbone molecular representation models using different molecular descriptors
as inputs with UQ methods from different categories, we critically assess the
influence of architectural decisions and training strategies. Our study offers
insights for selecting UQ and backbone models, which can facilitate research on
uncertainty-critical applications in fields such as materials science and drug
discovery.
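The abstract above does not say which calibration metrics MUBen reports. As an illustration only, one widely used measure of over-confidence in classifiers is the binned expected calibration error (ECE). The sketch below is a minimal pure-Python version; the function name, the equal-width binning and the default of 10 bins are assumptions made for this example, not details taken from the paper.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sample-weighted mean of |accuracy - confidence| per bin.

    confidences: predicted probabilities in [0, 1] for the chosen class.
    correct:     booleans, True where the prediction was right.
    n_bins:      number of equal-width confidence bins (illustrative default).
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    # Group samples by confidence bin; a confidence of 1.0 goes in the last bin.
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, ok))

    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


# A model that claims 90% confidence but is right 3 times out of 4.
example_ece = expected_calibration_error(
    [0.9, 0.9, 0.9, 0.9], [True, True, True, False]
)
```

A well-calibrated model drives this value toward zero; an over-confident fine-tuned model of the kind the abstract describes would show a large gap between confidence and accuracy in the high-confidence bins.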
Sparse Gaussian Processes Revisited: Bayesian Approaches to Inducing-Variable Approximations
Variational inference techniques based on inducing variables provide an
elegant framework for scalable posterior estimation in Gaussian process (GP)
models. Besides enabling scalability, one of their main advantages over sparse
approximations using direct marginal likelihood maximization is that they
provide a robust alternative for point estimation of the inducing inputs, i.e.
the location of the inducing variables. In this work we challenge the common
wisdom that optimizing the inducing inputs in the variational framework yields
optimal performance. We show that, by revisiting old model approximations such
as the fully-independent training conditionals endowed with powerful
sampling-based inference methods, treating both inducing locations and GP
hyper-parameters in a Bayesian way can improve performance significantly. Based
on stochastic gradient Hamiltonian Monte Carlo, we develop a fully Bayesian
approach to scalable GP and deep GP models, and demonstrate its
state-of-the-art performance through an extensive experimental campaign across
several regression and classification problems.
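For readers unfamiliar with the fully independent training conditional (FITC) approximation mentioned above, its standard textbook form can be written as follows. The notation is an assumption made here rather than taken from the abstract: f denotes the function values at the training inputs, u the inducing variables, and K the kernel matrices between the corresponding input sets.

```latex
Q_{ff} = K_{fu} K_{uu}^{-1} K_{uf}, \qquad
q_{\mathrm{FITC}}(\mathbf{f} \mid \mathbf{u})
  = \mathcal{N}\!\left( K_{fu} K_{uu}^{-1} \mathbf{u},\;
      \operatorname{diag}\!\left[ K_{ff} - Q_{ff} \right] \right), \qquad
q_{\mathrm{FITC}}(\mathbf{f})
  = \mathcal{N}\!\left( \mathbf{0},\;
      Q_{ff} + \operatorname{diag}\!\left[ K_{ff} - Q_{ff} \right] \right)
```

The inducing inputs enter only through K_{fu} and K_{uu}, which is why they can be treated either as parameters to optimize or, as the abstract proposes, as random variables to sample alongside the GP hyper-parameters.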
my Human Brain Project (mHBP)
How can we make an agent that thinks like us humans? An agent that can have
proprioception, intrinsic motivation, identify deception, use small amounts of energy, transfer
knowledge between tasks and evolve? This is the problem this thesis focuses on.
Being able to create a piece of software that can perform tasks like a human being is
a goal that, if achieved, will allow us to extend our own capabilities to a very high level, and
have more tasks performed in a predictable fashion. This is one of the motivations for this
thesis.
To address this problem, we propose a modular architecture for
Reinforcement Learning computation and develop an implementation to
exercise this architecture. This software, which we call mHBP, is written in Python using Webots
as an environment for the agent, and Neo4J, a graph database, as memory. mHBP takes
the sensory data or other inputs, and produces, based on the body parts / tools that the
agent has available, an output consisting of actions to perform.
This thesis involves experimental design with several iterations, exploring a
theoretical approach to RL based on graph databases. We conclude from this work
that it is possible to represent episodic data in a graph, and that it is also possible to
interconnect Webots, Python and Neo4J to support a stable architecture for Reinforcement
Learning. In this work we also find a way to search for policies using the Neo4J query
language, Cypher. Another key conclusion of this work is that state representation needs
further research to find a state definition that enables policy search to produce more
useful policies.
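The idea of storing episodes as a graph and then querying that graph for a policy can be sketched in plain Python, without Neo4J or Webots. This is a hypothetical illustration, not the thesis's implementation or its Cypher queries: the class name `EpisodicGraph`, its methods, and the "highest mean observed reward" selection rule are all assumptions made for the example.

```python
from collections import defaultdict


class EpisodicGraph:
    """Hypothetical in-memory stand-in for a graph-database episodic memory.

    States are nodes; each recorded (state, action) pair is an edge that
    stores the next states and rewards observed when taking that action.
    """

    def __init__(self):
        # (state, action) -> list of (next_state, reward) observations
        self.edges = defaultdict(list)

    def record(self, state, action, next_state, reward):
        """Store one observed transition from an episode."""
        self.edges[(state, action)].append((next_state, reward))

    def greedy_policy(self):
        """For each state, pick the action with the highest mean observed reward."""
        best = {}
        for (state, action), outcomes in self.edges.items():
            mean_reward = sum(r for _, r in outcomes) / len(outcomes)
            if state not in best or mean_reward > best[state][1]:
                best[state] = (action, mean_reward)
        return {state: action for state, (action, _) in best.items()}


# Record a few transitions and query the resulting policy.
memory = EpisodicGraph()
memory.record("s0", "left", "s1", 0.0)
memory.record("s0", "right", "s2", 1.0)
memory.record("s2", "left", "s0", 0.5)
policy = memory.greedy_policy()
```

In a graph database the same lookup would be expressed as a query that matches state nodes, aggregates rewards per outgoing action edge, and keeps the best one. As the abstract notes, how useful such a policy is depends heavily on how states are defined.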
The article “REINFORCEMENT LEARNING: A LITERATURE REVIEW (2020)” at
Research Gate with doi 10.13140/RG.2.2.30323.76327 is an outcome of this thesis.