18 research outputs found

The Wasserstein Believer: Learning Belief Updates for Partially Observable Environments through Reliable Latent Space Models

Author: Avalos Raphael
Delgrange Florent
Nowé Ann
Pérez Guillermo A.
Roijers Diederik M.
Publication date: 26/10/2023

Partially Observable Markov Decision Processes (POMDPs) are used to model environments where the full state cannot be perceived by an agent. As such, the agent needs to reason while taking into account past observations and actions. However, simply remembering the full history is generally intractable due to the exponential growth of the history space. Maintaining a probability distribution that models the belief over the true state can serve as a sufficient statistic of the history, but its computation requires access to the model of the environment and is often intractable. While state-of-the-art algorithms use Recurrent Neural Networks to compress the observation-action history, aiming to learn a sufficient statistic, they lack guarantees of success and can lead to sub-optimal policies. To overcome this, we propose the Wasserstein Belief Updater, an RL algorithm that learns a latent model of the POMDP and an approximation of the belief update. Our approach comes with theoretical guarantees on the quality of our approximation, ensuring that the beliefs it outputs allow for learning the optimal value function.
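
For orientation, the belief update that the abstract describes as requiring access to the environment model can be sketched as the textbook Bayes filter for a small discrete POMDP. This is not the Wasserstein Belief Updater itself, which learns an approximation in a latent space. The function name, the table layouts `T` and `O`, and the two-state example are illustrative assumptions.

```python
def belief_update(belief, action, observation, T, O):
    """Exact Bayes-filter belief update for a small discrete POMDP.

    belief[s]       : current probability of being in state s
    T[action][s][s2]: probability of moving from s to s2 under the action
    O[action][s2][o]: probability of observing o after landing in s2
    (These table layouts are assumptions made for this sketch.)
    """
    n = len(belief)
    # Predict: push the current belief through the transition model.
    predicted = [sum(belief[s] * T[action][s][s2] for s in range(n))
                 for s2 in range(n)]
    # Correct: weight each predicted state by the observation likelihood.
    unnormalized = [O[action][s2][observation] * predicted[s2]
                    for s2 in range(n)]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return [p / total for p in unnormalized]


# Hypothetical two-state example with one action (index 0) that leaves the
# state unchanged and yields a noisy observation of it.
T = [[[1.0, 0.0], [0.0, 1.0]]]
O = [[[0.85, 0.15], [0.15, 0.85]]]
b = belief_update([0.5, 0.5], action=0, observation=0, T=T, O=O)
```

The update needs both `T` and `O`, which is exactly the model access the abstract notes is often unavailable or intractable for large state spaces.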

arXiv.org e-Print Archive

Time to harmonize dengue nomenclature and classification

Author: Cuypers Lize
Santiago Gilberto A.
Muñoz-Jordán Jorge Luis
Alcântara Luiz Carlos Júnior
Nowé Ann
Simmonds Peter
Libin Pieter J.K.
Theys Kristof
Vandamme AM
Publication venue: MDPI AG
Publication date: 18/10/2018

Dengue virus (DENV) is estimated to cause 390 million infections per year worldwide. A quarter of these infections manifest clinically and are associated with a morbidity and mortality that put a significant burden on the affected regions. Reports of increased frequency, intensity, and extended geographical range of outbreaks highlight the virus's ongoing global spread. Persistent transmission in endemic areas and the emergence in territories formerly devoid of transmission have shaped DENV's current genetic diversity and divergence. This genetic layout is hierarchically organized in serotypes, genotypes, and sub-genotypic clades. While serotypes are well defined, the genotype nomenclature and classification system lack consistency, which complicates a broader analysis of their clinical and epidemiological characteristics. We identify five key challenges: (1) Currently, there is no formal definition of a DENV genotype; (2) Two different nomenclature systems are used in parallel, which causes significant confusion; (3) A standardized classification procedure is lacking so far; (4) No formal definition of sub-genotypic clades is in place; (5) There is no consensus on how to report antigenic diversity. Therefore, we believe that the time is right to re-evaluate DENV genetic diversity in an essential effort to provide harmonization across DENV studies.

Repositório da Universidade Nova de Lisboa

On the equilibrium of query reformulation and document retrieval

Author: Burges Christopher JC
Mikolov Tomas
Miller George A
Nowé Ann
Robertson Stephen E
Slantchev Branislav L
Publication venue: Association for Computing Machinery (ACM)
Publication date: 20/07/2018

In this paper, we jointly study query reformulation and document relevance estimation, the two essential aspects of information retrieval (IR). Their interactions are modelled as a two-player strategic game: one player, a query formulator, taking actions to produce the optimal query, is expected to maximize its own utility with respect to the relevance estimation of documents produced by the other player, a retrieval modeler; simultaneously, the retrieval modeler, taking actions to produce the document relevance scores, needs to optimize its likelihood from the training data with respect to the refined query produced by the query formulator. Their equilibrium or equilibria will be reached when both are the best responses to each other. We derive our equilibrium theory of IR using normal-form representations: when a standard relevance feedback algorithm is coupled with a retrieval model, they would share the same objective function and thus form a partnership game; by contrast, pseudo relevance feedback pursues a rather different objective than that of retrieval models, therefore the interaction between them would lead to a general-sum game (though implicitly collaborative). Our game-theoretical analyses not only yield useful insights into the two major aspects of IR, but also offer new practical algorithms for achieving the equilibrium state of retrieval, which have been shown to bring consistent performance improvements in both text retrieval and item recommendation.

arXiv.org e-Print Archive

Crossref

Birkbeck Institutional Research Online

A Practical Guide to Multi-Objective Reinforcement Learning and Planning

Author: Bargiacchi Eugenio
Dazeley Richard
Hayes Conor F.
Heintz Fredrik
Howley Enda
Irissappane Athirai A.
Källström Johan
Macfarlane Matthew
Mannion Patrick
Nowé Ann
Ramos Gabriel
Restelli Marcello
Reymond Mathieu
Roijers Diederik M.
Rădulescu Roxana
Vamplew Peter
Verstraeten Timothy
Zintgraf Luisa M.
Publication date: 17/03/2021

Real-world decision-making tasks are generally complex, requiring trade-offs between multiple, often conflicting, objectives. Despite this, the majority of research in reinforcement learning and decision-theoretic planning either assumes only a single objective, or that multiple objectives can be adequately handled via a simple linear combination. Such approaches may oversimplify the underlying problem and hence produce suboptimal results. This paper serves as a guide to the application of multi-objective methods to difficult problems, and is aimed at researchers who are already familiar with single-objective reinforcement learning and planning methods who wish to adopt a multi-objective perspective on their research, as well as practitioners who encounter multi-objective decision problems in practice. It identifies the factors that may influence the nature of the desired solution, and illustrates by example how these influence the design of multi-objective decision-making systems for complex problems

arXiv.org e-Print Archive

Publikationer från Linköpings universitet

Deakin Research Online

Federation ResearchOnline

Digitala Vetenskapliga Arkivet - Academic Archive On-line

A computational method for the identification of dengue, zika and chikungunya virus species and genotypes

Author: Abecasis AB
Alcantara L. C. J.
Azevedo Vasco Ariston De Carvalho
Cuypers Lize
da Cunha Rivaldo Venâncio
de Filippis Ana Maria Bispo
de Oliveira Túlio
de Siqueira Isadora Cristina
Deforche Koen
Faria Nuno Rodrigues
Fonseca Vagner S.
Freire Murilo
Giovanetti Marta
Libin Pieter J. K.
Machado Kaliane C.B.
Nowé Ann
Nunes Márcio Roberto Texeira
Pybus Oliver George
Restovic Maria I.
San Emmanuel J.
Santiago Gilberto A.
Theys Kristof
Vandamme AM
Publication venue: Public Library of Science (PLoS)
Publication date: 08/05/2019

In recent years, an increasing number of outbreaks of Dengue, Chikungunya and Zika viruses have been reported in Asia and the Americas. Monitoring virus genotype diversity is crucial to understand the emergence and spread of outbreaks, both aspects that are vital to develop effective prevention and treatment strategies. Hence, we developed an efficient method to classify virus sequences with respect to their species and sub-species (i.e. serotype and/or genotype). This tool provides an easy-to-use software implementation of this new method and was validated on a large dataset assessing the classification performance with respect to whole-genome sequences and partial-genome sequences.

Repositório da Universidade Nova de Lisboa

Unlocking the potential of publicly available microarray data using inSilicoDb and inSilicoMerging R/Bioconductor packages

Author: A (Ed) Scherer
A Coletta
A Sims
AA Shabalin
AH Sims
Alain Coletta
Ann Nowé
C Lazar
Colin Molter
Cosmin Lazar
D Sean
David Steenhoff
David Y Weiss Solís
E Parzen
ES Han
H Huang
H Parkinson
Hugues Bersini
J Brettschneider
J Rudy
J Taminau
Jonatan Taminau
JS Brown
JT Leek
KK Dobbin
M Bakay
M Benito
MN McCall
O Larsson
R Edgar
RC Gentleman
Robin Duque
S Zakharkin
Stijn Meganck
T Barrett
TM Chu
Virginie de Schaetzen
WE Johnson
Publication venue: Springer Nature
Publication date: 01/12/2012

BACKGROUND: With an abundant amount of microarray gene expression data sets available through public repositories, new possibilities lie in combining multiple existing data sets. In this new context, analysis itself is no longer the problem, but retrieving and consistently integrating all this data before delivering it to the wide variety of existing analysis tools becomes the new bottleneck. RESULTS: We present the newly released inSilicoMerging R/Bioconductor package which, together with the earlier released inSilicoDb R/Bioconductor package, allows consistent retrieval, integration and analysis of publicly available microarray gene expression data sets. Inside the inSilicoMerging package a set of five visual and six quantitative validation measures are available as well. CONCLUSIONS: By providing (i) access to uniformly curated and preprocessed data, (ii) a collection of techniques to remove the batch effects between data sets from different sources, and (iii) several validation tools enabling the inspection of the integration process, these packages enable researchers to fully explore the potential of combining gene expression data for downstream analysis. The power of using both packages is demonstrated by programmatically retrieving and integrating gene expression studies from the InSilico DB repository [https://insilicodb.org/app/]

Crossref

Springer - Publisher Connector

PubMed Central

DI-fusion

Unlocking the potential of publicly available microarray data using inSilicoDb and inSilicoMerging R/Bioconductor packages

Author: A (Ed) Scherer
A Coletta
A Sims
AA Shabalin
AH Sims
Alain Coletta
Ann Nowé
C Lazar
Colin Molter
Cosmin Lazar
D Sean
David Steenhoff
David Y Weiss Solís
E Parzen
ES Han
H Huang
H Parkinson
Hugues Bersini
J Brettschneider
J Rudy
J Taminau
Jonatan Taminau
JS Brown
JT Leek
KK Dobbin
M Bakay
M Benito
MN McCall
O Larsson
R Edgar
RC Gentleman
Robin Duque
S Zakharkin
Stijn Meganck
T Barrett
TM Chu
Virginie de Schaetzen
WE Johnson
Publication venue: Springer Science and Business Media LLC

Crossref

Designing multi-objective multi-armed bandits algorithms: a study

Author: Drugan MM Madalina
Nowé A Ann
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2013

We propose an algorithmic framework for multi-objective multi-armed bandits with multiple rewards. Different partial order relationships from multi-objective optimization can be considered for a set of reward vectors, such as scalarization functions and Pareto search. Scalarization functions transform the multi-objective environment into a single-objective environment and are a popular choice in multi-objective reinforcement learning. Scalarization techniques can be straightforwardly implemented in the current multi-armed bandit framework, but the efficiency of these algorithms depends very much on their type, linear or non-linear (e.g. Chebyshev), and their parameters. Using the Pareto dominance order relationship allows the multi-objective environment to be explored directly; however, this can result in large sets of Pareto optimal solutions. In this paper we propose and evaluate the performance of multi-objective MABs using three regret metric criteria. The standard UCB1 is extended to scalarized multi-objective UCB1, and we propose a Pareto UCB1 algorithm. Both algorithms are proven to have a logarithmic upper bound for their expected regret. We also introduce a variant of the scalarized multi-objective UCB1 that removes inefficient scalarizations online in order to improve the algorithm's efficiency. These algorithms are experimentally compared on multi-objective Bernoulli distributions, with Pareto UCB1 being the algorithm with the best empirical performance.
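
The two building blocks named in the abstract, the Pareto dominance relation and the UCB1 index, can be sketched as follows. This is a minimal illustration assuming reward maximization. It shows the standard single-objective UCB1 index, not the exploration bonus of the paper's Pareto UCB1, and the arm means are made-up values.

```python
import math


def dominates(u, v):
    """True if reward vector u Pareto-dominates v (maximization)."""
    return (all(a >= b for a, b in zip(u, v))
            and any(a > b for a, b in zip(u, v)))


def pareto_front(vectors):
    """Return the vectors that no other vector in the list dominates."""
    return [u for u in vectors if not any(dominates(v, u) for v in vectors)]


def ucb1_index(mean, pulls, total_pulls):
    """Standard single-objective UCB1 index: mean plus exploration bonus."""
    return mean + math.sqrt(2.0 * math.log(total_pulls) / pulls)


# Hypothetical mean reward vectors of four arms with two objectives each.
arm_means = [(0.8, 0.2), (0.5, 0.5), (0.2, 0.8), (0.4, 0.4)]
front = pareto_front(arm_means)
```

Here three of the four arms are mutually non-dominated, which illustrates the abstract's point that Pareto search can leave a large set of optimal arms to choose among.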

Repository TU/e

Hypervolume-based multi-objective reinforcement learning

Author: Drugan MM Madalina
Nowé A Ann
Van Moffaert K Kristof
Publication venue: Springer Fachmedien Wiesbaden GmbH
Publication date: 01/01/2013

Indicator-based evolutionary algorithms are amongst the best performing methods for solving multi-objective optimization (MOO) problems. In reinforcement learning (RL), introducing a quality indicator into an algorithm’s decision logic has not been attempted before. In this paper, we propose a novel on-line multi-objective reinforcement learning (MORL) algorithm that uses the hypervolume indicator as an action selection strategy. We call this algorithm the hypervolume-based MORL algorithm, or HB-MORL, and conduct an empirical study of its performance using multiple quality assessment metrics from multi-objective optimization. We compare the hypervolume-based learning algorithm on different environments to two multi-objective algorithms that rely on scalarization techniques, namely linear scalarization and the weighted Chebyshev function. We conclude that HB-MORL significantly outperforms the linear scalarization method and performs similarly to the Chebyshev algorithm without requiring any user-specified emphasis on particular objectives.
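
As a rough illustration of the hypervolume indicator mentioned above, the sketch below computes it for two objectives under maximization and then picks the candidate vector that yields the largest hypervolume when added to an archive. The function name, the archive, the candidates, and the reference point are assumptions for this example. It is not the HB-MORL algorithm itself.

```python
def hypervolume_2d(points, ref):
    """Area dominated by `points` and bounded below by `ref` (maximization)."""
    # Keep only points that are strictly better than the reference point.
    pts = [p for p in points if p[0] > ref[0] and p[1] > ref[1]]
    # Sweep from the largest first objective to the smallest.
    pts.sort(key=lambda p: p[0], reverse=True)
    area = 0.0
    best_y = ref[1]
    for x, y in pts:
        if y > best_y:
            # Add the new horizontal strip this point contributes.
            area += (x - ref[0]) * (y - best_y)
            best_y = y
    return area


# Hypothetical archive of value vectors and two candidate additions.
ref = (0.0, 0.0)
archive = [(1.0, 3.0), (3.0, 1.0)]
candidates = [(2.0, 2.0), (1.0, 1.0)]
best = max(candidates, key=lambda c: hypervolume_2d(archive + [c], ref))
```

The candidate `(1.0, 1.0)` is already dominated by the archive and adds no area, so the selection favors `(2.0, 2.0)`. No objective weights are needed, only a reference point.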

Repository TU/e

Scalarized multi-objective reinforcement learning: novel design techniques

Author: Drugan MM Madalina
Nowé A Ann
Van Moffaert K Kristof
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2013

In multi-objective problems, it is key to find compromise solutions that balance different objectives. The linear scalarization function is often utilized to translate the multi-objective nature of a problem into a standard, single-objective problem. Generally, it is noted that such a linear combination can only find solutions in convex areas of the Pareto front, therefore making the method inapplicable in situations where the shape of the front is not known beforehand, as is often the case. We propose a non-linear scalarization function, called the Chebyshev scalarization function, as a basis for action selection strategies in multi-objective reinforcement learning. The Chebyshev scalarization method overcomes the flaws of the linear scalarization function as it (i) can discover Pareto optimal solutions regardless of the shape of the front, i.e. convex as well as non-convex, (ii) obtains a better spread amongst the set of Pareto optimal solutions, and (iii) is not particularly dependent on the actual weights used.
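
The contrast between the two scalarizations can be illustrated on a tiny non-convex front. This is a sketch under stated assumptions: the three points, the utopian point, and the function names are made up for the example, and the Chebyshev form used here is the generic weighted Chebyshev distance to a utopian point rather than the exact formulation in the paper.

```python
def linear_score(point, weights):
    """Weighted sum of the objectives (higher is better)."""
    return sum(w * f for w, f in zip(weights, point))


def chebyshev_distance(point, weights, utopia):
    """Largest weighted deviation from the utopian point (lower is better)."""
    return max(w * abs(f - z) for w, f, z in zip(weights, point, utopia))


# Hypothetical front: (0.4, 0.4) is Pareto optimal but lies in a
# non-convex region between the two extreme points.
front = [(1.0, 0.0), (0.0, 1.0), (0.4, 0.4)]
utopia = (1.0, 1.0)

# Sweep the linear weights from (0, 1) to (1, 0) in steps of 0.1.
linear_picks = set()
for i in range(11):
    w = (i / 10, 1 - i / 10)
    linear_picks.add(max(front, key=lambda p: linear_score(p, w)))

# A single balanced weight vector is enough for the Chebyshev distance.
chebyshev_pick = min(front,
                     key=lambda p: chebyshev_distance(p, (0.5, 0.5), utopia))
```

In this example no weight in the linear sweep selects `(0.4, 0.4)`, because one of the extreme points always scores at least 0.5, while the Chebyshev distance with balanced weights selects it directly.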

Repository TU/e

Crossref

Pure OAI Repository