2,837 research outputs found
Model-based Dynamic Shielding for Safe and Efficient Multi-Agent Reinforcement Learning
Multi-Agent Reinforcement Learning (MARL) discovers policies that maximize
reward but do not have safety guarantees during the learning and deployment
phases. Although shielding with Linear Temporal Logic (LTL) is a promising
formal method to ensure safety in single-agent Reinforcement Learning (RL), it
results in conservative behaviors when scaling to multi-agent scenarios.
Additionally, it poses computational challenges for synthesizing shields in
complex multi-agent environments. This work introduces Model-based Dynamic
Shielding (MBDS) to support MARL algorithm design. Our algorithm synthesizes
distributive shields, which are reactive systems running in parallel with each
MARL agent, to monitor and rectify unsafe behaviors. The shields can
dynamically split, merge, and recompute based on agents' states. This design
enables efficient synthesis of shields to monitor agents in complex
environments without coordination overheads. We also propose an algorithm to
synthesize shields without prior knowledge of the dynamics model. The proposed
algorithm obtains an approximate world model by interacting with the
environment during the early stage of exploration, making our MBDS enjoy formal
safety guarantees with high probability. We demonstrate in simulations that our
framework can surpass existing baselines in terms of safety guarantees and
learning performance. Comment: Accepted in AAMAS 202
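To make the shielding idea in the abstract above concrete, the following is a minimal Python sketch of a reactive shield that runs beside one agent, checks each proposed action against a one-step model, and substitutes a safe action when the proposal would lead to an unsafe state. It is not the MBDS algorithm: there is no LTL synthesis, no learned world model, and no dynamic splitting or merging of shields. The grid world, the `step` model, the `is_safe` predicate, and the `UNSAFE` set are all hypothetical and exist only for illustration.

```python
from typing import Callable, Iterable, Set, Tuple

State = Tuple[int, int]
Action = Tuple[int, int]

# Hypothetical action set: stay, right, left, up, down.
ACTIONS: Tuple[Action, ...] = ((0, 0), (1, 0), (-1, 0), (0, 1), (0, -1))


def step(state: State, action: Action) -> State:
    """Hypothetical deterministic model: move by the action offset."""
    return (state[0] + action[0], state[1] + action[1])


def make_shield(
    is_safe: Callable[[State], bool],
    actions: Iterable[Action] = ACTIONS,
) -> Callable[[State, Action], Action]:
    """Build a shield that lets safe proposals through and rectifies unsafe ones."""
    candidates = tuple(actions)

    def shield(state: State, proposed: Action) -> Action:
        # Monitor: keep the agent's own action if its successor state is safe.
        if is_safe(step(state, proposed)):
            return proposed
        # Rectify: fall back to the first candidate with a safe successor.
        for alternative in candidates:
            if is_safe(step(state, alternative)):
                return alternative
        raise RuntimeError("no safe action available from this state")

    return shield


# Illustrative 3x3 grid with one forbidden cell (assumed, not from the paper).
UNSAFE: Set[State] = {(1, 0)}


def is_safe(state: State) -> bool:
    x, y = state
    return 0 <= x < 3 and 0 <= y < 3 and state not in UNSAFE


shield = make_shield(is_safe)
overridden_action = shield((0, 0), (1, 0))  # proposal enters the forbidden cell
kept_action = shield((0, 0), (0, 1))        # proposal is already safe
```

In this sketch the learning agent never needs to know about the shield; it only sees that some of its proposals are replaced, which is the sense in which a shield "runs in parallel" with the agent.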
Failing with Grace: Learning Neural Network Controllers that are Boundedly Unsafe
In this work, we consider the problem of learning a feed-forward neural
network (NN) controller to safely steer an arbitrarily shaped planar robot in a
compact and obstacle-occluded workspace. Unlike existing methods that depend
strongly on the density of data points close to the boundary of the safe state
space to train NN controllers with closed-loop safety guarantees, we propose an
approach that lifts such assumptions on the data that are hard to satisfy in
practice and instead allows for graceful safety violations, i.e., of a bounded
magnitude that can be spatially controlled. To do so, we employ reachability
analysis methods to encapsulate safety constraints in the training process.
Specifically, to obtain a computationally efficient over-approximation of the
forward reachable set of the closed-loop system, we partition the robot's state
space into cells and adaptively subdivide the cells that contain states which
may escape the safe set under the trained control law. To do so, we first
design appropriate under- and over-approximations of the robot's footprint to
adaptively subdivide the configuration space into cells. Then, using the
overlap between each cell's forward reachable set and the set of infeasible
robot configurations as a measure for safety violations, we introduce penalty
terms into the loss function that penalize this overlap in the training
process. As a result, our method can learn a safe vector field for the
closed-loop system and, at the same time, provide numerical worst-case bounds
on safety violation over the whole configuration space, defined by the overlap
between the over-approximation of the forward reachable set of the closed-loop
system and the set of unsafe states. Moreover, it can control the tradeoff
between computational complexity and tightness of these bounds. Finally, we
provide a simulation study that verifies the efficacy of the proposed scheme.
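The cell-based reachability idea in the abstract above can be illustrated with a one-dimensional Python sketch. It over-approximates the image of each cell under a closed-loop map using a Lipschitz bound, measures how much of that image leaves the safe interval, and subdivides only the cells that may escape. This is a simplified stand-in, not the paper's method: the map `closed_loop`, the Lipschitz constant, the safe interval, and the minimum cell width are assumptions, and the robot-footprint approximations and neural network training are omitted.

```python
from typing import Callable, List, Tuple

Interval = Tuple[float, float]


def reach_over_approx(
    f: Callable[[float], float], cell: Interval, lipschitz: float
) -> Interval:
    """Enclose f(cell), assuming `lipschitz` bounds the Lipschitz constant of f."""
    lo, hi = cell
    mid, radius = (lo + hi) / 2, (hi - lo) / 2
    centre = f(mid)
    return (centre - lipschitz * radius, centre + lipschitz * radius)


def overlap_with_unsafe(reach: Interval, safe: Interval) -> float:
    """Length of the part of `reach` that lies outside the safe interval."""
    r_lo, r_hi = reach
    s_lo, s_hi = safe
    below = max(0.0, min(r_hi, s_lo) - r_lo)
    above = max(0.0, r_hi - max(r_lo, s_hi))
    return below + above


def adaptive_partition(
    f: Callable[[float], float],
    safe: Interval,
    lipschitz: float,
    min_width: float,
) -> List[Tuple[Interval, float]]:
    """Split cells whose reachable-set enclosure may leave `safe`."""
    stack: List[Interval] = [safe]
    leaves: List[Tuple[Interval, float]] = []
    while stack:
        cell = stack.pop()
        violation = overlap_with_unsafe(reach_over_approx(f, cell, lipschitz), safe)
        if violation > 0.0 and (cell[1] - cell[0]) > min_width:
            mid = (cell[0] + cell[1]) / 2
            stack.extend([(cell[0], mid), (mid, cell[1])])
        else:
            leaves.append((cell, violation))
    return leaves


def closed_loop(x: float) -> float:
    """Hypothetical closed-loop step that pulls the state toward 0.5."""
    return x + 0.5 * (0.5 - x)


# The true Lipschitz constant of closed_loop is 0.5; 1.5 is a deliberately
# loose (still valid) bound so that the first, coarse cell appears unsafe.
leaves = adaptive_partition(closed_loop, safe=(0.0, 1.0), lipschitz=1.5, min_width=0.01)
penalty = sum(violation for _, violation in leaves)  # term one could add to a loss
worst_case = max(violation for _, violation in leaves)
```

Here the single coarse cell is flagged as possibly unsafe, while its two halves are each certified, which shows how subdivision trades extra computation for tighter bounds.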
How to Certify Machine Learning Based Safety-critical Systems? A Systematic Literature Review
Context: Machine Learning (ML) has been at the heart of many innovations over
the past years. However, including it in so-called 'safety-critical' systems
such as automotive or aeronautic has proven to be very challenging, since the
shift in paradigm that ML brings completely changes traditional certification
approaches.
Objective: This paper aims to elucidate challenges related to the
certification of ML-based safety-critical systems, as well as the solutions
that are proposed in the literature to tackle them, answering the question 'How
to Certify Machine Learning Based Safety-critical Systems?'.
Method: We conduct a Systematic Literature Review (SLR) of research papers
published between 2015 and 2020, covering topics related to the certification of
ML systems. In total, we identified 217 papers covering topics considered to be
the main pillars of ML certification: Robustness, Uncertainty, Explainability,
Verification, Safe Reinforcement Learning, and Direct Certification. We
analyzed the main trends and problems of each sub-field and provided summaries
of the papers extracted.
Results: The SLR results highlighted the enthusiasm of the community for this
subject, as well as the lack of diversity in terms of datasets and types of
models. It also emphasized the need to further develop connections between
academia and industry to deepen the domain study. Finally, it
illustrated the necessity to build connections between the above-mentioned main
pillars, which are for now mainly studied separately.
Conclusion: We highlighted current efforts deployed to enable the
certification of ML-based software systems, and discussed some future research
directions. Comment: 60 pages (92 pages with references and complements), submitted to a
journal (Automated Software Engineering). Changes: Emphasizing the difference
between traditional software engineering and the ML approach. Adding Related
Works, Threats to Validity and Complementary Materials. Adding a table listing
paper references for each section/subsection
Tracking Foodborne Pathogens from Farm to Table: Data Needs to Evaluate Control Options
Food safety policymakers and scientists came together at a conference in January 1995 to evaluate data available for analyzing control of foodborne microbial pathogens. This proceedings starts with data regarding human illnesses associated with foodborne pathogens and moves backwards in the food chain to examine pathogen data in the processing sector and at the farm level. Of special concern is the inability to link pathogen data throughout the food chain. Analytical tools to evaluate the impact of changing production and consumption practices on foodborne disease risks and their economic consequences are presented. The available data are examined to see how well they meet current analytical needs to support policy analysis. The policymaker roundtable highlights the tradeoffs involved in funding databases, the economic evaluation of USDA's Hazard Analysis Critical Control Point (HACCP) proposal and other food safety policy issues, and the necessity of a multidisciplinary approach toward improving food safety databases.
Keywords: food safety, cost benefit analysis, foodborne disease risk, foodborne pathogens, Hazard Analysis Critical Control Point (HACCP), probabilistic scenario analysis, fault-tree analysis, Food Consumption/Nutrition/Food Safety
Bowtie models as preventive models in maritime safety
This work arose from a proposal by Dr. Rodrigo de Larrucea, who has just published an ambitious book on maritime safety. As he himself says, the subject "far exceeds the author's capabilities", and in my case this is even more true. One can nevertheless aspire to make a modest contribution to the study and dissemination of maritime safety culture, which only appears in the news when isolated disasters occur.
In any case, the professor proposed that I focus on Bowtie Models, which integrate the tree of causes and the tree of consequences (Fault Tree Analysis, FTA, and Event Tree Analysis, ETA). Other methodologies and approaches certainly exist (his book presents several in summary form), but the conceptual simplicity of the bowtie model and the possibility of generalizing and integrating its results made it a good choice. Thus, after a phase of reflection and information gathering, I decided to present a very general bowtie model that accommodates the main causes of accidents (environmental factors, human error and mechanical failure), also allowing for a combination of causes.
However, exploiting this model runs into the great difficulty of assigning a probability of occurrence, a number between 0 and 1, to each branch. Probabilities of occurrence are normally small and therefore hard to estimate. Every accident is different, great catastrophes are few, and each accident is already studied exhaustively (the more serious it is, the more exhaustively). Another factor that hinders estimating the probability of failure is the constant evolution of the maritime world from the technical, training, legal and even generational points of view, since each generation of seafarers is different. Efforts are therefore focused on increasing safety, though always with an eye on costs. I have thus presented a bowtie model for its didactic and graphic value, without going into numerical details, which I will refine and internalize, if need be, in the practice of the profession.
In this work I have also tried not to stay entirely on the side of theory (as is well known, if everything is done well, everything turns out perfectly, etc.) but to present in some detail two well-known maritime accidents: the oil tanker Exxon Valdez in 1989 and the ferry Estonia in 1994, among others mentioned. These cases are somewhat old, but they helped raise the culture of safety to the level we enjoy today, at least in Western countries. For safety, as Rodrigo de Larrucea puts it, "is an attitude and is never fortuitous; it is always the result of determined will, sincere effort, intelligent direction and careful execution. Without a doubt, it always represents the best alternative".
This work was initially inspired by the book of my tutor, Jaime Rodrigo de Larrucea, which presents the state of the art in all maritime aspects related to safety. Since it covers every topic, it evidently cannot go into depth on each one. It was my opportunity to go deeper into the Bowtie Model, although in the end I have also covered a wide variety of topics.
Later, when I began to study the topics, I realized that people in the maritime world usually do not have a strong grasp of statistics. Everybody is concerned about safety, but few nautical students take a probabilistic approach to accidents. For this reason it is extremely important to characterize the population under study: in our case, the SOLAS ships.
Also, during my time in Riga, I took a keen interest in a wide range of accidents, some of them studied during the courses in Barcelona. I have seen that accidents are difficult to model mathematically, since each one has different characteristics and angles, and surely no two are alike.
Finally, it was agreed that I should concentrate on the Bowtie Model, which is not very complex from a statistical point of view: it is simply a fault tree of events combined with a tree of effects. I present some examples in Chapter 2. The difficulty I point out is estimating the probabilities of occurrence of unusual events.
We concentrated on major accidents, those that may cause victims or heavy losses. Then, for the sake of generality, in Chapter 4 I have divided the causes into four broad classes: natural hazards, the human factor, mechanical failure, and attacks (piracy and terrorism). The last class perhaps should not be included beside the others, since acts of terrorism and piracy are not accidents; but because an important code, ISPS, is dedicated to preventing security threats, it serves as an example of the design of barriers to prevent an undesired event (although it mainly gives guidelines to be followed by States, port terminals and shipping companies). I have presented a detailed study of the Estonia tragedy, showing how a mechanical failure triggered the loss of the ferry, by its nature a delicate ship, although other factors such as poor maintenance and heavy seas also played a part.
In the next chapter, certain characteristics of error chains are analyzed. Finally, conclusions are drawn, offering a fairly optimistic view of the safety (and security) culture in the Western world, one that may not easily permeate the rest of the world because of the associated costs.
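The abstract above describes a bowtie model as a fault tree of causes joined to a tree of effects, with a probability between 0 and 1 on each branch. As a rough illustration of the arithmetic involved, the Python sketch below combines independent causes through an OR gate to get the probability of the top event, then propagates it through a sequence of barriers to get outcome probabilities. Every number here is invented purely for illustration and is not taken from the thesis or from any accident statistics; the independence of causes, the two sequential barriers, and the outcome names are likewise assumptions.

```python
from math import prod
from typing import Dict, Sequence


def or_gate(probabilities: Sequence[float]) -> float:
    """P(at least one event occurs), assuming the events are independent."""
    return 1.0 - prod(1.0 - p for p in probabilities)


def event_tree(
    p_top: float,
    barrier_failure_probs: Sequence[float],
    outcome_names: Sequence[str],
) -> Dict[str, float]:
    """Propagate the top event through barriers applied in sequence.

    outcome_names[i] is the outcome when barrier i is the first to hold;
    the last name is the outcome when every barrier fails.
    """
    if len(outcome_names) != len(barrier_failure_probs) + 1:
        raise ValueError("need exactly one more outcome name than barriers")
    outcomes: Dict[str, float] = {}
    p_reaching = p_top  # probability of arriving at the current barrier
    for name, q_fail in zip(outcome_names, barrier_failure_probs):
        outcomes[name] = p_reaching * (1.0 - q_fail)
        p_reaching *= q_fail
    outcomes[outcome_names[-1]] = p_reaching
    return outcomes


# Hypothetical per-voyage probabilities for the four cause classes (made up).
causes = {
    "natural hazard": 0.01,
    "human factor": 0.02,
    "mechanical failure": 0.005,
    "attack": 0.001,
}
p_top = or_gate(list(causes.values()))

# Two hypothetical barriers with made-up failure probabilities.
outcomes = event_tree(
    p_top,
    barrier_failure_probs=[0.1, 0.2],
    outcome_names=["contained", "minor damage", "major accident"],
)
```

The outcome probabilities always sum to the top-event probability, which gives a quick consistency check. The practical difficulty noted in the abstract remains: for rare events the input probabilities themselves are hard to estimate.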
- …