5 research outputs found

Predictable Artificial Intelligence

Author: Burden John
Burnell Ryan
Cheke Lucy
Ferri Cèsar
Hernández-Orallo José
hÉigeartaigh Seán Ó
Marcoci Alexandru
Martínez-Plumed Fernando
Mehrbakhsh Behzad
Moreno-Casares Pablo A.
Moros-Daval Yael
Rutar Danaja
Schellaert Wout
Voudouris Konstantinos
Zhou Lexin
Publication date: 09/10/2023

We introduce the fundamental ideas and challenges of Predictable AI, a nascent research area that explores the ways in which we can anticipate key indicators of present and future AI ecosystems. We argue that achieving predictability is crucial for fostering trust, liability, control, alignment and safety of AI ecosystems, and thus should be prioritised over performance. While distinctive from other areas of technical and non-technical AI research, the questions, hypotheses and challenges relevant to Predictable AI were yet to be clearly described. This paper aims to elucidate them, calls for identifying paths towards AI predictability and outlines the potential impact of this emergent field.

Comment: 11 pages excluding references, 4 figures, and 2 tables. Paper Under Review

arXiv.org e-Print Archive

Training on the Test Set: Mapping the System-Problem Space in AI

Author: Hernández-Orallo José
Martínez-Plumed Fernando
Schellaert Wout
Publication venue: Association for the Advancement of Artificial Intelligence
Publication date: 28/06/2022

Many present and future problems associated with artificial intelligence are not due to its limitations, but to our poor assessment of its behaviour. Our evaluation procedures produce aggregated performance metrics that lack detail and quantified uncertainty about the following question: how will an AI system, with a particular profile $\pi$, behave for a new problem, characterised by a particular situation $\mu$? Instead of just aggregating test results, we can use machine learning methods to fully capitalise on this evaluation information. In this paper, we introduce the concept of an assessor model, $\hat{R}(r|\pi,\mu)$, a conditional probability estimator trained on test data. We discuss how these assessors can be built by using information of the full system-problem space and illustrate a broad range of applications that derive from varied inferences and aggregations from $\hat{R}$. Building good assessor models will change the predictive and explanatory power of AI evaluation and will lead to new research directions for building and using them. We propose accompanying every deployed AI system with its own assessor.

Association for the Advancement of Artificial Intelligence: AAAI Publications
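
As a rough illustration of the assessor idea in the abstract above, the sketch below fits a conditional probability estimator for P(r = 1 | pi, mu) from per-instance test records. It is not the authors' implementation: logistic regression is swapped in as a simple stand-in estimator, and the function names (`train_assessor`, `assess`), the single-number features for system profile and problem situation, and the toy records are all assumptions made up for this example.

```python
import math


def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_assessor(records, epochs=2000, lr=0.1):
    """Fit a logistic model of P(r=1 | pi, mu) by batch gradient descent.

    records: list of (pi, mu, r) where pi and mu are lists of floats
    (hypothetical system-profile and problem features) and r is 0 or 1.
    """
    n_features = len(records[0][0]) + len(records[0][1])
    weights = [0.0] * n_features
    bias = 0.0
    m = len(records)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for pi, mu, r in records:
            x = list(pi) + list(mu)
            p = _sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - r  # gradient of the log loss w.r.t. the logit
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        weights = [w - lr * g / m for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / m
    return weights, bias


def assess(model, pi, mu):
    """Estimated probability that system pi succeeds on problem mu."""
    weights, bias = model
    x = list(pi) + list(mu)
    return _sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Made-up evaluation records, for illustration only:
# pi = [system capacity], mu = [problem difficulty], r = success (1) or failure (0).
TOY_RECORDS = [
    ([1.0], [0.2], 1),
    ([1.0], [0.8], 1),
    ([0.5], [0.2], 1),
    ([0.5], [0.8], 0),
    ([0.1], [0.2], 0),
    ([0.1], [0.8], 0),
]

model = train_assessor(TOY_RECORDS)
p_easy = assess(model, [0.9], [0.3])  # strong system, easy problem
p_hard = assess(model, [0.2], [0.7])  # weak system, hard problem
print(f"P(success | strong, easy) = {p_easy:.2f}")
print(f"P(success | weak, hard)   = {p_hard:.2f}")
```

The point of the sketch is that the estimator is trained on individual (system, problem, result) records rather than on an aggregate score, so it can be queried for system-problem pairs that were never tested together.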

Comunica

Author: Schellaert Wout
Taelman Ruben
Van Herwegen Joachim
Publication venue: Zenodo
Publication date: 27/10/2023

A knowledge graph querying framework for JavaScript. If you use this software, please cite the article

ZENODO

Investigating Object Permanence in Deep Reinforcement Learning Agents

Author: Cheke Lucy G
Liu Jason Darwin
Schellaert Wout
Siwinska Natasza
Voudouris Konstantinos
Publication venue: eScholarship, University of California
Publication date: 01/01/2024

Object Permanence (OP) is the understanding that objects continue to exist when not directly observable. To date, this ability has proven difficult to build into AI systems, with Deep Reinforcement Learning (DRL) systems performing significantly worse than human children. Here, DRL Agents, PPO and Dreamer-v3 were tested against a number of comparators (Human children, random agents and hard coded Heuristic agents) on three object permanence tasks (OP) and a range of control tasks. As expected, the children performed well across all tasks, while performance of the DRL agents was mixed. Overall the pattern of performance across OP and control tasks did not suggest that any agent tested except children showed evidence of robust OP.

eScholarship - University of California

Rethink reporting of evaluation results in AI.

Author: Burden John
Burnell Ryan
Cheke Lucy G
Cohn Anthony G
Hernandez-Orallo Jose
Kiela Douwe
Leibo Joel Z
Martinez-Plumed Fernando
Mitchell Melanie
Rutar Danaja
Schellaert Wout
Shanahan Murray
Sohl-Dickstein Jascha
Tenenbaum Joshua B
Ullman Tomer D
Voorhees Ellen M
Publication venue: Science
Publication date: 14/04/2023

Aggregate metrics and lack of access to results limit understanding

Apollo (Cambridge)