49 research outputs found

Explainability in Deep Reinforcement Learning

Author: Couthouis Fabien
Díaz-Rodríguez Natalia
Heuillet Alexandre
Publication date: 18/12/2020

A large set of the explainable Artificial Intelligence (XAI) literature is emerging on feature relevance techniques to explain a deep neural network (DNN) output or on explaining models that ingest image source data. However, how XAI techniques can help understand models beyond classification tasks, e.g. for reinforcement learning (RL), has not been extensively studied. We review recent works aimed at attaining Explainable Reinforcement Learning (XRL), a relatively new subfield of Explainable Artificial Intelligence, intended to be used in general public applications, with diverse audiences, requiring ethical, responsible and trustable algorithms. In critical situations where it is essential to justify and explain the agent's behaviour, better explainability and interpretability of RL models could help gain scientific insight on the inner workings of what is still considered a black box. We evaluate mainly studies directly linking explainability to RL, and split these into two categories according to the way the explanations are generated: transparent algorithms and post-hoc explainability. We also review the most prominent XAI works through the lens of how they could inform the further deployment of the latest advances in RL, in the demanding present and future of everyday problems.

Comment: Article accepted at Knowledge-Based System

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Independent Learning Policy (Analysis of Learning Curriculum)

Author: Nifriza Ifna
Rifma Rifma
Syahril Syahril
Publication venue: Universitas Muhammadiyah Enrekang
Publication date: 01/03/2023

The independent learning curriculum issued by the Ministry of Education and Culture has brought changes to the national education system. In the first period of the policy, four measures launched it. First, the national exam was eliminated and replaced with a minimum competency assessment and a character survey, with literacy and numeracy. Second, the national school-based exam (USBN) was replaced with a school exam held by each school. Third, the Lesson Implementation Plan (RPP) was simplified with the aim of reducing the teacher's administrative burden; the RPP made by the teacher only includes 3 components, namely learning objectives, learning activities and evaluation. Fourth, the zoning system is enforced: in PPDB, the zoning pathway can accept a minimum of 50 percent of students, the affirmation pathway at least 15 percent, and the displacement pathway a maximum of 5 percent. The independent learning curriculum, as a new paradigm in education, is oriented towards the profile of Pancasila students, who are the target in directing the implementation and assessment of policies. Although there are many criticisms of the Free Learning policy, there are also many educational practitioners who say the realization of the independent learning curriculum is a breath of fresh air for teachers and students who want changes to the learning system that are emancipatory in nature and develop student competencies, especially in the context of the globalization era and the industrial revolution era 4.0 towards society 5.

OJS (Muhammadiyah University of Enrekang)

Few-shot Class-incremental Audio Classification Using Stochastic Classifier

Author: Cao Wenchang
He Qianhua
Li Jialong
Li Yanxiong
Xie Wei
Publication date: 03/06/2023

It is generally assumed that the number of classes is fixed in current audio classification methods, and that the model can recognize only pre-given classes. When new classes emerge, the model needs to be retrained with adequate samples of all classes. If new classes continually emerge, these methods will not work well and may even be infeasible. In this study, we propose a method for few-shot class-incremental audio classification, which continually recognizes new classes and remembers old ones. The proposed model consists of an embedding extractor and a stochastic classifier. The former is trained in the base session and frozen in incremental sessions, while the latter is incrementally expanded in all sessions. Two datasets (NS-100 and LS-100) are built by choosing samples from the audio corpora of NSynth and LibriSpeech, respectively. Results show that our method exceeds four baseline methods in average accuracy and performance dropping rate. Code is at https://github.com/vinceasvp/meta-sc.

Comment: 5 pages, 3 figures, 4 tables. Accepted for publication in INTERSPEECH 202
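The split between a frozen embedding extractor and an incrementally expanded stochastic classifier can be illustrated with a minimal sketch. It is not the authors' implementation (that is in the repository linked above). Everything here is an assumption made for illustration: the identity `embed` function stands in for the frozen extractor, each class weight is modelled as a Gaussian estimated from the few support embeddings, and scoring uses cosine similarity.

```python
import math
import random


def embed(x):
    """Hypothetical stand-in for the frozen embedding extractor (identity map)."""
    return list(x)


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    return dot / (na * nb) if na and nb else 0.0


class StochasticClassifier:
    """Assumed design: each class weight is a Gaussian (mean, std) per dimension."""

    def __init__(self, seed=0):
        self.mean = {}
        self.std = {}
        self.rng = random.Random(seed)

    def add_classes(self, support):
        """Expand the classifier with new classes; existing entries are not modified."""
        for label, samples in support.items():
            embs = [embed(s) for s in samples]
            dim = len(embs[0])
            mu = [sum(e[d] for e in embs) / len(embs) for d in range(dim)]
            sd = [
                math.sqrt(sum((e[d] - mu[d]) ** 2 for e in embs) / len(embs))
                for d in range(dim)
            ]
            self.mean[label] = mu
            self.std[label] = sd

    def sample_weight(self, label):
        """Draw one weight vector for a class from its Gaussian."""
        return [self.rng.gauss(m, s) for m, s in zip(self.mean[label], self.std[label])]

    def predict(self, x, stochastic=False):
        """Return the best-scoring class, using sampled or mean weights."""
        z = embed(x)

        def score(label):
            w = self.sample_weight(label) if stochastic else self.mean[label]
            return cosine(z, w)

        return max(self.mean, key=score)
```

In this sketch, a base session would call `add_classes` once with the base classes, and each incremental session would call it again with only the few-shot samples of the new classes, leaving `embed` untouched.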

arXiv.org e-Print Archive

Always Strengthen Your Strengths: A Drift-Aware Incremental Learning Framework for CTR Prediction

Author: Hu Jinghe
Lin Zhangang
Liu Congcong
Shao Jingping
Teng Fei
Zhao Xiwei
Publication date: 17/04/2023

Click-through rate (CTR) prediction is of great importance in recommendation systems and online advertising platforms. When served in industrial scenarios, the user-generated data observed by the CTR model typically arrives as a stream. Streaming data has the characteristic that the underlying distribution drifts over time and may recur. This can lead to catastrophic forgetting if the model simply adapts to the new data distribution all the time. It is also inefficient to relearn a distribution that has already occurred. Due to memory constraints and the diversity of data distributions in large-scale industrial applications, conventional strategies for catastrophic forgetting such as replay, parameter isolation, and knowledge distillation are difficult to deploy. In this work, we design a novel drift-aware incremental learning framework based on ensemble learning to address catastrophic forgetting in CTR prediction. With explicit error-based drift detection on streaming data, the framework further strengthens well-adapted ensembles and freezes ensembles that do not match the input distribution, avoiding catastrophic interference. Both offline experiments and an A/B test show that our method outperforms all baselines considered.

Comment: This work has been accepted by SIGIR2
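The idea of strengthening well-adapted ensemble members and freezing the rest can be sketched as follows. This is a toy illustration rather than the framework from the paper. Several things are assumptions: each member is reduced to a running click-rate estimate, drift is detected by comparing a member's estimate with the observed click rate of the incoming batch, the `threshold` and `lr` values are arbitrary, and a new member is spawned when no existing member matches the batch.

```python
class Member:
    """Toy stand-in for a CTR model: a single running click-rate estimate."""

    def __init__(self, rate):
        self.rate = rate

    def error(self, batch_rate):
        """Error-based drift signal: gap between estimate and observed rate."""
        return abs(batch_rate - self.rate)

    def update(self, batch_rate, lr=0.5):
        """Move the estimate toward the observed click rate of the batch."""
        self.rate += lr * (batch_rate - self.rate)


class DriftAwareEnsemble:
    def __init__(self, threshold=0.15):
        self.threshold = threshold  # assumed tolerance for "well-adapted"
        self.members = []

    def step(self, clicks):
        """Process one batch of 0/1 clicks; return indices of updated members."""
        batch_rate = sum(clicks) / len(clicks)
        adapted = [
            i for i, m in enumerate(self.members)
            if m.error(batch_rate) <= self.threshold
        ]
        if not adapted:
            # No member matches this distribution: keep all frozen, add a new one.
            self.members.append(Member(batch_rate))
            return [len(self.members) - 1]
        for i in adapted:
            # Strengthen only the well-adapted members; the others stay frozen.
            self.members[i].update(batch_rate)
        return adapted
```

When a previously seen distribution recurs, the member that was frozen during the drift is selected again and updated, so nothing has to be relearned from scratch in this toy setting.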

arXiv.org e-Print Archive