388 research outputs found

Many-agent Reinforcement Learning

Author: Yang Yaodong
Publication venue: UCL (University College London)
Publication date: 28/03/2021

Multi-agent reinforcement learning (RL) addresses the problem of how each agent should behave optimally in a stochastic environment in which multiple agents are learning simultaneously. It is an interdisciplinary domain with a long history that lies at the intersection of psychology, control theory, game theory, reinforcement learning, and deep learning. Following the remarkable success of the AlphaGo series in single-agent RL, 2019 was a booming year that witnessed significant advances in multi-agent RL techniques; impressive breakthroughs have been made in developing AIs that outperform humans on many challenging tasks, especially multi-player video games. Nonetheless, one of the key challenges of multi-agent RL techniques is scalability; it is still non-trivial to design efficient learning algorithms that can solve tasks involving far more than two agents ($N \gg 2$), which I refer to as \emph{many-agent reinforcement learning} (MARL\footnote{I use the word ``MARL'' to denote multi-agent reinforcement learning with a particular focus on the cases of many agents; otherwise, it is denoted as ``Multi-Agent RL'' by default.}) problems. In this thesis, I contribute to tackling MARL problems from four aspects. Firstly, I offer a self-contained overview of multi-agent RL techniques from a game-theoretical perspective. This overview fills a research gap: most existing work either fails to cover the advances made since 2010 or does not pay adequate attention to game theory, which I believe is the cornerstone of solving many-agent learning problems. Secondly, I develop a tractable policy evaluation algorithm, $\alpha^\alpha$-Rank, for many-agent systems. The critical advantage of $\alpha^\alpha$-Rank is that it can compute the solution concept of $\alpha$-Rank tractably in multi-player general-sum games with no need to store the entire pay-off matrix. This is in contrast to classic solution concepts such as Nash equilibrium, whose computation is known to be PPAD-hard even in two-player cases. $\alpha^\alpha$-Rank allows us, for the first time, to practically conduct large-scale multi-agent evaluations. Thirdly, I introduce a scalable policy learning algorithm, mean-field MARL, for many-agent systems. The mean-field MARL method takes advantage of the mean-field approximation from physics, and it is the first provably convergent algorithm that tries to break the curse of dimensionality for MARL tasks. With the proposed algorithm, I report the first result of solving the Ising model and multi-agent battle games through a MARL approach. Fourthly, I investigate the many-agent learning problem in open-ended meta-games (i.e., the game of a game in the policy space). Specifically, I focus on modelling behavioural diversity in meta-games and on developing algorithms that are guaranteed to enlarge diversity during training. The proposed metric based on determinantal point processes serves as the first mathematically rigorous definition of diversity. Importantly, the diversity-aware learning algorithms beat the existing state-of-the-art game solvers in terms of exploitability by a large margin. On top of the algorithmic developments, I also contribute two real-world applications of MARL techniques. Specifically, I demonstrate the great potential of applying MARL to study the emergent population dynamics in nature and to model diverse and realistic interactions in autonomous driving. Both applications embody the prospect that MARL techniques could achieve a huge impact in the real physical world, beyond video games.

UCL Discovery

Formative Assessment: A Significant Facilitator of Student Learning

Author: Yang Yaodong
Publication venue: Insights Publisher
Publication date: 29/02/2024

In recent decades, formative assessment has garnered substantial interest from teachers and educational researchers. Definitions of formative assessment vary in the literature. A relatively well-accepted one describes it as the process of seeking and interpreting evidence for learners and their teachers to determine where the learners are in their learning, where they need to go, and how best to get there (Antoniou & James, 2014). As opposed to summative assessment, formative assessment provides comprehensive evaluation and feedback throughout students’ learning process, with the purpose of helping students identify learning gaps, modify learning methods, and enhance learning outcomes. Common forms of formative assessment include quizzes, observation records, face-to-face conversations, questionnaires, feedback, and student self-assessment. It is meant to evaluate not only students’ academic performance but also their progress in learning attitudes, learning strategies, emotional skills, and other aspects (Wu, 2023).

Insights Publisher

Enriching Students’ Academic Life with Creative Education

Author: Yang Yaodong
Publication venue: Insights Publisher
Publication date: 29/07/2023

As research and practical exploration in education intensify, the educational community has become increasingly aware that school education should aim at outcomes beyond students’ academic achievement. Social advancement places more demanding requirements on education and greater responsibilities on schools and instructors. There is a growing consensus that schools should provide students with a more colorful academic life, one that goes beyond curricular content and offers rich academic experiences, in order to foster their all-round development.

Insights Publisher

Argumentation Training Boosts the Outcome of Negotiation in Collaborative Learning

Author: Yang Yaodong
Publication venue: Insights Publisher
Publication date: 25/05/2024

Negotiation is an essential technique in collaborative learning. In the negotiation process, students learn to listen to others’ opinions, express their own ideas, discuss disagreements, and reach agreement. It also plays an important role in students’ construction of knowledge. According to constructivist theory, learning is a process of active construction, in which learners construct their knowledge and understanding through interaction with the environment (Yu-Jun, 2013). Negotiation is a crucial component of this interaction, contributing to greater learning engagement and a deeper understanding of information. Moreover, it is particularly effective in fostering students’ critical thinking ability, in that it involves discriminating among and judging differing information and perspectives (Olson, 1997).

Insights Publisher

Internet Addiction: A Concerning Issue among Chinese College Students

Author: Yang Yaodong
Publication venue: 'Bonoi Science Advancement and Education LLC'
Publication date: 31/05/2023

The advent of computers initiated the information revolution, fundamentally transforming individuals’ ways of life. The use of the internet brings unprecedented convenience, efficiency, and abundance of information. For college students, the internet has significantly enriched their lives, opened more learning channels, and increased learning resources, thereby broadening their horizons, facilitating academic exchanges, and making learning more engaging. On the other hand, problematic use of the internet, such as internet dependence, among some of them has had detrimental effects on their academic achievement as well as their mental and physical health.

Insights Publisher

ON OPTIMIZATIONS OF VIRTUAL MACHINE LIVE STORAGE MIGRATION FOR THE CLOUD

Author: Yang Yaodong
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/01/2016

Virtual Machine (VM) live storage migration is widely performed in the data centers of the Cloud, for the purposes of load balancing, reliability, availability, hardware maintenance and system upgrade. It entails moving all the state information of the VM being migrated, including memory state, network state and storage state, from one physical server to another within the same data center or across different data centers. To minimize its performance impact, this migration process is required to be transparent to applications running within the migrating VM, meaning that applications will keep running inside the VM as if there were no migration operations at all. In this dissertation, a thorough literature review is conducted to provide a big picture of the VM live storage migration process, its problems and existing solutions. After an in-depth examination, we observe that severe IO interference exists between the VM IO threads and the migration IO threads and causes both types of IO threads to suffer from performance degradation. This interference stems from the fact that both types of IO threads share the same critical IO path by reading from and writing to the same shared storage system. Owing to IO resource contention and request interference between the two types of IO requests, not only will the IO request queue in the storage system lengthen, but the time-consuming disk seek operations will also become more frequent. Based on this fundamental observation, this dissertation research presents three related but orthogonal solutions that tackle the IO interference problem in order to improve the VM live storage migration performance. First, we introduce the Workload-Aware IO Outsourcing scheme, called WAIO, to improve the VM live storage migration efficiency.
Second, we address this problem by proposing a novel scheme, called SnapMig, to improve the VM live storage migration efficiency and eliminate its performance impact on user applications at the source server by effectively leveraging the existing VM snapshots in the backup servers. Third, we propose the IOFollow scheme to improve both the VM performance and migration performance simultaneously. Finally, we outline directions for future research. Advisor: Hong Jian

DigitalCommons@University of Nebraska

Thermostat-assisted continuously-tempered Hamiltonian Monte Carlo for Bayesian learning

Author: Luo Rui, Wang Jianhong, Wang Jun, Yang Yaodong, Zhu Zhanxing
Publication date: 01/01/2018

We propose a new sampling method, the thermostat-assisted continuously-tempered Hamiltonian Monte Carlo, for Bayesian learning on large datasets and multimodal distributions. It simulates the Nosé-Hoover dynamics of a continuously-tempered Hamiltonian system built on the distribution of interest. A significant advantage of this method is that it is not only able to efficiently draw representative i.i.d. samples when the distribution contains multiple isolated modes, but also capable of adaptively neutralising the noise arising from mini-batches and maintaining accurate sampling. While the properties of this method have been studied using synthetic distributions, experiments on three real datasets also demonstrated a performance gain over several strong baselines with various types of neural networks plugged in.

arXiv.org e-Print Archive

Southampton (e-Prints Soton)

UCL Discovery

BIOMECHANICAL ANALYSIS OF VERTICAL JUMP WITH DIFFERENT FOREFOOT MORPHOLOGY

Author: Gu Yaodong, Ruan Guoqing, Shu Yang, Yang Li
Publication venue: International Society of Biomechanics in Sports (ISBS)
Publication date: 06/11/2016

This study examined biomechanical differences between habitually barefoot males and habitually shod males during vertical jump. Foot morphology was measured with Easy-Foot-Scan. Foot kinetics and ankle kinematics were obtained from an EMED pressure platform and a Vicon motion analysis system while subjects completed vertical jumps under the barefoot condition. The results showed that habitually barefoot subjects had a significantly larger minimal distance between the hallux and the other toes. Habitually barefoot subjects showed larger loading under the hallux and medial forefoot, while habitually shod subjects presented larger loading under the medial and central forefoot. In addition, habitually barefoot males had smaller ankle plantarflexion, eversion and external rotation during vertical jump. Differences in kinematics and kinetics during vertical jump might be attributable to the morphological differences in the toe region, which may explain the difference in foot injury risk between habitually barefoot and habitually shod individuals.

ISBS (International Society of Biomechanics in Sports): Conference Proceedings Archive