Affinity-Based Reinforcement Learning : A New Paradigm for Agent Interpretability
The steady increase in complexity of reinforcement learning (RL) algorithms is accompanied by a corresponding increase in opacity that obfuscates insights into their devised strategies. Methods in explainable artificial intelligence seek to mitigate this opacity by either creating transparent algorithms or extracting explanations post hoc. A third category exists that allows the developer to affect what agents learn: constrained RL has been used in safety-critical applications and prohibits agents from visiting certain states; preference-based RL agents have been used in robotics applications and learn state-action preferences instead of traditional reward functions. We propose a new affinity-based RL paradigm in which agents learn strategies that are partially decoupled from reward functions. Unlike entropy regularisation, we regularise the objective function with a distinct action distribution that represents a desired behaviour; we encourage the agent to act according to a prior while learning to maximise rewards. The result is an inherently interpretable agent that solves problems with an intrinsic affinity for certain actions. We demonstrate the utility of our method in a financial application: we learn continuous time-variant compositions of prototypical policies, each interpretable by its action affinities, that are globally interpretable according to customers’ financial personalities.
Our method combines advantages from both constrained RL and preference-based RL: it retains the reward function but generalises the policy to match a defined behaviour, thus avoiding problems such as reward shaping and hacking. Unlike Boolean task composition, our method is a fuzzy superposition of different prototypical strategies to arrive at a more complex, yet interpretable, strategy.
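To make the regularisation idea in this abstract concrete, here is a minimal sketch of an objective that trades expected reward against divergence from a prior action distribution. The abstract does not give the regulariser's exact form, so the KL penalty, the weight `lam`, and every function name below are assumptions made for illustration and are not the authors' formulation.

```python
import math

def softmax(logits):
    """Turn raw action preferences into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * math.log(max(pi, eps) / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )

def affinity_regularised_objective(logits, action_values, prior, lam):
    """Expected reward minus a penalty for straying from the prior.

    ASSUMPTION: a KL penalty stands in for the regulariser, whose exact
    form is not stated in the abstract. `prior` is the desired action
    distribution and `lam` sets how strongly the agent is pulled to it.
    """
    policy = softmax(logits)
    expected_return = sum(p * v for p, v in zip(policy, action_values))
    return expected_return - lam * kl_divergence(policy, prior)

if __name__ == "__main__":
    uniform = [1 / 3, 1 / 3, 1 / 3]
    skewed = [0.8, 0.1, 0.1]  # hypothetical affinity for the first action
    values = [1.0, 2.0, 3.0]  # hypothetical per-action rewards
    print(affinity_regularised_objective([0.0, 0.0, 0.0], values, uniform, 0.5))
    print(affinity_regularised_objective([0.0, 0.0, 0.0], values, skewed, 0.5))
```

With `lam = 0` the objective reduces to plain expected reward. Larger values of `lam` lower the score of any policy that departs from the prior, which is one way an agent could be nudged towards an affinity for certain actions while it still maximises reward.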
Reinforcement learning with intrinsic affinity for personalized prosperity management
The purpose of applying reinforcement learning (RL) to portfolio management is commonly the maximization of profit. The extrinsic reward function used to learn an optimal strategy typically does not take into account any other preferences or constraints. We have developed a regularization method that ensures that strategies have global intrinsic affinities, i.e., different personalities may have preferences for certain asset classes which may change over time. We capitalize on these intrinsic policy affinities to make our RL model inherently interpretable. We demonstrate how RL agents can be trained to orchestrate such individual policies for particular personality profiles and still achieve high returns.
Can Interpretable Reinforcement Learning Manage Prosperity Your Way?
Personalisation of products and services is fast becoming the driver of success in banking and commerce. Machine learning holds the promise of gaining a deeper understanding of and tailoring to customers’ needs and preferences. Whereas traditional solutions to financial decision problems frequently rely on model assumptions, reinforcement learning is able to exploit large amounts of data to improve customer modelling and decision-making in complex financial environments with fewer assumptions. Model explainability and interpretability present challenges from a regulatory perspective which demands transparency for acceptance; they also offer the opportunity for improved insight into and understanding of customers. Post-hoc approaches are typically used for explaining pretrained reinforcement learning models. Based on our previous modeling of customer spending behaviour, we adapt our recent reinforcement learning algorithm that intrinsically characterizes desirable behaviours and we transition to the problem of prosperity management. We train inherently interpretable reinforcement learning agents to give investment advice that is aligned with prototype financial personality traits which are combined to make a final recommendation. We observe that the trained agents’ advice adheres to their intended characteristics and that the agents learn the value of compound growth and, without any explicit reference, the notion of risk; we also observe improved policy convergence.
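The abstract above says that advice from prototype personality agents is combined into a final recommendation but does not say how. The sketch below assumes the simplest option, a convex combination of per-asset allocations weighted by a customer's trait scores. The function name, the trait labels, and the numbers are all hypothetical.

```python
def combine_prototype_advice(advice, trait_weights):
    """Blend per-trait asset allocations into one recommendation.

    ASSUMPTION: a weighted average is used; the abstract does not
    specify the combination rule. `advice` maps a trait name to a list
    of allocations (one per asset class) and `trait_weights` maps the
    same trait names to non-negative scores for one customer.
    """
    total = sum(trait_weights.values())
    if total <= 0:
        raise ValueError("trait weights must sum to a positive value")
    n_assets = len(next(iter(advice.values())))
    combined = [0.0] * n_assets
    for trait, weight in trait_weights.items():
        share = weight / total  # normalise so the shares sum to one
        for i, allocation in enumerate(advice[trait]):
            combined[i] += share * allocation
    return combined

if __name__ == "__main__":
    # Hypothetical prototype agents advising on [stocks, savings].
    advice = {"risk_seeking": [1.0, 0.0], "cautious": [0.0, 1.0]}
    customer = {"risk_seeking": 3.0, "cautious": 1.0}
    print(combine_prototype_advice(advice, customer))
```

Because the shares are normalised, the blended allocation stays on the same scale as the prototype allocations, and each prototype's contribution can be read directly from its weight.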
Building Ethically Bounded AI
The more AI agents are deployed in scenarios with possibly unexpected
situations, the more they need to be flexible, adaptive, and creative in
achieving the goal we have given them. Thus, a certain level of freedom to
choose the best path to the goal is inherent in making AI robust and flexible
enough. At the same time, however, the pervasive deployment of AI in our life,
whether AI is autonomous or collaborating with humans, raises several ethical
challenges. AI agents should be aware of and follow appropriate ethical principles
and should thus exhibit properties such as fairness or other virtues. These
ethical principles should define the boundaries of AI's freedom and creativity.
However, it is still a challenge to understand how to specify and reason with
ethical boundaries in AI agents and how to combine them appropriately with
subjective preferences and goal specifications. Some initial attempts employ
either a data-driven example-based approach for both, or a symbolic rule-based
approach for both. We envision a modular approach where any AI technique can be
used for any of these essential ingredients in decision making or decision
support systems, paired with a contextual approach to define their combination
and relative weight. In a world where neither humans nor AI systems work in
isolation, but are tightly interconnected, e.g., the Internet of Things, we
also envision a compositional approach to building ethically bounded AI, where
the ethical properties of each component can be fruitfully exploited to derive
those of the overall system. In this paper we define and motivate the notion of
ethically-bounded AI, we describe two concrete examples, and we outline some
outstanding challenges. Comment: Published at AAAI Blue Sky Track, winner of Blue Sky Award
A Survey on Explainable AI for 6G O-RAN: Architecture, Use Cases, Challenges and Research Directions
The recent O-RAN specifications promote the evolution of RAN architecture by
function disaggregation, adoption of open interfaces, and instantiation of a
hierarchical closed-loop control architecture managed by RAN Intelligent
Controllers (RICs) entities. This paves the road to novel data-driven network
management approaches based on programmable logic. Aided by Artificial
Intelligence (AI) and Machine Learning (ML), novel solutions targeting
traditionally unsolved RAN management issues can be devised. Nevertheless, the
adoption of such smart and autonomous systems is limited by the current
inability of human operators to understand the decision process of such AI/ML
solutions, affecting their trust in such novel tools. eXplainable AI (XAI) aims
at solving this issue, enabling human users to better understand and
effectively manage the emerging generation of artificially intelligent schemes,
reducing the human-to-machine barrier. In this survey, we provide a summary of
the XAI methods and metrics before studying their deployment over the O-RAN
Alliance RAN architecture along with its main building blocks. We then present
various use-cases and discuss the automation of XAI pipelines for O-RAN as well
as the underlying security aspects. We also review some projects/standards that
tackle this area. Finally, we identify different challenges and research
directions that may arise from the heavy adoption of AI/ML decision entities in
this context, focusing on how XAI can help to interpret, understand, and
improve trust in O-RAN operational networks. Comment: 33 pages, 13 figures
EMOTE: An Explainable architecture for Modelling the Other Through Empathy
We can usually assume others have goals analogous to our own. This assumption
can also, at times, be applied to multi-agent games - e.g. Agent 1's attraction
to green pellets is analogous to Agent 2's attraction to red pellets. This
"analogy" assumption is tied closely to the cognitive process known as empathy.
Inspired by empathy, we design a simple and explainable architecture to model
another agent's action-value function. This involves learning an "Imagination
Network" to transform the other agent's observed state in order to produce a
human-interpretable "empathetic state" which, when presented to the learning
agent, produces behaviours that mimic the other agent. Our approach is
applicable to multi-agent scenarios consisting of a single learning agent and
other (independent) agents acting according to fixed policies. This
architecture is particularly beneficial for (but not limited to) algorithms
using a composite value or reward function. We show our method produces better
performance in multi-agent games, where it robustly estimates the other's model
in different environment configurations. Additionally, we show that the
empathetic states are human interpretable, and thus verifiable
- …