Control Strategies for COVID-19 Epidemic with Vaccination, Shield Immunity and Quarantine: A Metric Temporal Logic Approach
Ever since the outbreak of the COVID-19 epidemic, various public health
control strategies have been proposed and tested against the coronavirus
SARS-CoV-2. We study three specific COVID-19 epidemic control models: the
susceptible, exposed, infectious, recovered (SEIR) model with vaccination
control; the SEIR model with shield immunity control; and the susceptible,
un-quarantined infected, quarantined infected, confirmed infected (SUQC) model
with quarantine control. We express the control requirement in metric temporal
logic (MTL) formulas (a type of formal specification language), which can
specify the expected control outcomes such as "the deaths from the infection
should never exceed one thousand per day within the next three months" or "the
population immune from the disease should eventually exceed 200 thousand within
the next 100 to 120 days". We then develop methods for synthesizing control
strategies with MTL specifications. To the best of our knowledge, this is the
first paper to systematically synthesize control strategies based on the
COVID-19 epidemic models with formal specifications. We provide simulation
results in three different case studies: vaccination control for the COVID-19
epidemic with model parameters estimated from data in Lombardy, Italy; shield
immunity control for the COVID-19 epidemic with model parameters estimated from
data in Lombardy, Italy; and quarantine control for the COVID-19 epidemic with
model parameters estimated from data in Wuhan, China. The results show that the
proposed synthesis approach can generate control inputs such that the
time-varying numbers of individuals in each category (e.g., infectious, immune)
satisfy the MTL specifications. The results also show that early intervention
is essential in mitigating the spread of COVID-19, and more control effort is
needed for more stringent MTL specifications.
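The idea of checking a control outcome against an MTL-style requirement can be sketched in a few lines. The following is an illustrative example, not the paper's synthesis method: a discrete-time SEIR model with a constant vaccination rate, followed by a check of a simple "globally" (invariant) specification over the simulated horizon. All parameter values are hypothetical, chosen only for demonstration.

```python
# Hypothetical sketch: forward-Euler SEIR dynamics with vaccination control,
# then a check of an MTL 'globally' specification G[0,T](infectious <= bound).

def simulate_seir(beta, sigma, gamma, v, s0, e0, i0, r0, days):
    """Simulate SEIR; v is the vaccination (control) rate moving S -> R."""
    s, e, i, r = s0, e0, i0, r0
    trajectory = []
    for _ in range(days):
        n = s + e + i + r
        ds = -beta * s * i / n - v * s   # susceptibles infected or vaccinated
        de = beta * s * i / n - sigma * e
        di = sigma * e - gamma * i
        dr = gamma * i + v * s           # recovered plus vaccinated
        s, e, i, r = s + ds, e + de, i + di, r + dr
        trajectory.append((s, e, i, r))
    return trajectory

def always_below(trajectory, index, bound):
    """MTL 'globally' operator: the chosen quantity never exceeds the bound."""
    return all(state[index] <= bound for state in trajectory)

traj = simulate_seir(beta=0.3, sigma=0.2, gamma=0.1, v=0.02,
                     s0=10_000, e0=100, i0=10, r0=0, days=90)
# Spec: "the infectious count never exceeds 2000 within the next 90 days".
spec_holds = always_below(traj, index=2, bound=2000)
```

A synthesis procedure would search over control inputs (here, `v`) until such a specification evaluates to true; this sketch only shows the monitoring side.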
Noisy Symbolic Abstractions for Deep RL: A case study with Reward Machines
Natural and formal languages provide an effective mechanism for humans to
specify instructions and reward functions. We investigate how to generate
policies via RL when reward functions are specified in a symbolic language
captured by Reward Machines, an increasingly popular automaton-inspired
structure. We are interested in the case where the mapping of environment state
to a symbolic (here, Reward Machine) vocabulary -- commonly known as the
labelling function -- is uncertain from the perspective of the agent. We
formulate the problem of policy learning in Reward Machines with noisy symbolic
abstractions as a special class of POMDP optimization problem, and investigate
several methods to address the problem, building on existing and new
techniques, the latter focused on predicting Reward Machine state, rather than
on grounding of individual symbols. We analyze these methods and evaluate them
experimentally under varying degrees of uncertainty in the correct
interpretation of the symbolic vocabulary. We verify the strength of our
approach and the limitation of existing methods via an empirical investigation
on both illustrative, toy domains and partially observable, deep RL domains.
Comment: NeurIPS Deep Reinforcement Learning Workshop 202
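The core objects in this abstract can be made concrete with a small sketch. The following is illustrative only, not the authors' implementation: a tiny Reward Machine encoded as a transition table, and a belief update over its states when the labelling function is noisy, i.e., when it outputs a distribution over symbols rather than a single symbol. The state names ("u0", "u1", "u2"), symbols ("coffee", "office"), and probabilities are all hypothetical.

```python
# Hypothetical Reward Machine: DELTA[(u, symbol)] = (next_state, reward).
# Task sketch: pick up coffee, then deliver it to the office for reward 1.
DELTA = {
    ("u0", "coffee"): ("u1", 0.0),
    ("u0", "office"): ("u0", 0.0),
    ("u1", "office"): ("u2", 1.0),
    ("u1", "coffee"): ("u1", 0.0),
}

def rm_step(u, symbol):
    """Advance the Reward Machine; unknown pairs self-loop with zero reward."""
    return DELTA.get((u, symbol), (u, 0.0))

def belief_update(belief, symbol_probs):
    """Propagate a belief over RM states given the noisy labelling function's
    distribution over symbols for the current environment state."""
    new_belief = {}
    for u, p_u in belief.items():
        for symbol, p_sym in symbol_probs.items():
            v, _ = rm_step(u, symbol)
            new_belief[v] = new_belief.get(v, 0.0) + p_u * p_sym
    return new_belief

belief = {"u0": 1.0}
# Noisy detector: 80% confident it observed "coffee", 20% "office".
belief = belief_update(belief, {"coffee": 0.8, "office": 0.2})
# belief is now {"u1": 0.8, "u0": 0.2}
```

Tracking a distribution over Reward Machine states like this, rather than committing to a single grounding of each symbol, is one way to treat the problem as a POMDP in the spirit the abstract describes.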