Together we stand, Together we fall, Together we win: Dynamic Team Formation in Massive Open Online Courses
Massive Open Online Courses (MOOCs) offer a new scalable paradigm for
e-learning by providing students with global exposure and opportunities for
connecting and interacting with millions of people all around the world. Very
often, students work as teams to effectively accomplish course related tasks.
However, due to the lack of face-to-face interaction, it is difficult for MOOC
students to collaborate. Instructors also face challenges in manually
organizing students into teams, because students flock to these MOOCs in huge
numbers. Thus, the proposed research is aimed at developing a robust
methodology for dynamic team formation in MOOCs, the theoretical framework for
which is grounded at the confluence of organizational team theory, social
network analysis, and machine learning. A prerequisite for such an undertaking
is the understanding that every informal tie established among students offers
an opportunity to influence and be influenced.
Therefore, we aim to extract value from the inherent connectedness of students
in the MOOC. These connections carry with them radical implications for the way
students understand each other in the networked learning community. Our
approach will enable course instructors to automatically group students in
teams that have fairly balanced social connections with their peers, well
defined in terms of appropriately selected qualitative and quantitative network
metrics.
Comment: In Proceedings of the 5th IEEE International Conference on Application
of Digital Information & Web Technologies (ICADIWT), India, February 2014 (6
pages, 3 figures)
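The balanced-grouping idea can be illustrated with a minimal sketch, not the paper's actual methodology: build the informal-tie network, rank students by degree (one simple quantitative network metric), and deal them out round-robin so that highly connected and weakly connected students are spread across teams. The `form_teams` helper and its degree-based heuristic are illustrative assumptions.

```python
def form_teams(edges, students, team_size):
    """Sketch: assign students to teams so that social connectivity
    (here approximated by degree in the tie network) is spread
    roughly evenly across teams."""
    # degree of each student in the informal-tie network
    degree = {s: 0 for s in students}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # rank by degree (most connected first), then deal round-robin
    ranked = sorted(students, key=lambda s: -degree[s])
    n_teams = (len(students) + team_size - 1) // team_size
    teams = [[] for _ in range(n_teams)]
    for i, s in enumerate(ranked):
        teams[i % n_teams].append(s)
    return teams
```

In practice the paper's framework selects richer qualitative and quantitative network metrics; degree is used here only because it keeps the sketch self-contained.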
Joint Intrinsic Motivation for Coordinated Exploration in Multi-Agent Deep Reinforcement Learning
Multi-agent deep reinforcement learning (MADRL) problems often encounter the
challenge of sparse rewards. This challenge becomes even more pronounced when
coordination among agents is necessary. As performance depends not only on one
agent's behavior but rather on the joint behavior of multiple agents, finding
an adequate solution becomes significantly harder. In this context, a group of
agents can benefit from actively exploring different joint strategies in order
to determine the most efficient one. In this paper, we propose an approach for
rewarding strategies where agents collectively exhibit novel behaviors. We
present JIM (Joint Intrinsic Motivation), a multi-agent intrinsic motivation
method that follows the centralized learning with decentralized execution
paradigm. JIM rewards joint trajectories based on a centralized measure of
novelty designed to function in continuous environments. We demonstrate the
strengths of this approach both in a synthetic environment designed to reveal
shortcomings of state-of-the-art MADRL methods, and in simulated robotic tasks.
Results show that joint exploration is crucial for solving tasks where the
optimal strategy requires a high level of coordination.
Comment: 13 pages, 13 figures. Published as an extended abstract at AAMAS 202
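A centralized novelty measure over joint behavior can be sketched with a simple count-based bonus on the discretized joint observation of all agents. This is an illustrative stand-in: JIM's actual novelty measure is designed for continuous environments and differs from the `JointNoveltyBonus` class and binning scheme assumed here.

```python
import numpy as np

class JointNoveltyBonus:
    """Sketch of a centralized, count-based intrinsic reward computed
    on the *joint* observation of all agents, so that novelty reflects
    the group's behavior rather than any single agent's."""

    def __init__(self, bin_width=0.5):
        self.bin_width = bin_width
        self.counts = {}

    def __call__(self, joint_obs):
        # joint_obs: concatenation of all agents' observations;
        # discretize to hash continuous states into bins
        key = tuple(np.floor(np.asarray(joint_obs) / self.bin_width).astype(int))
        self.counts[key] = self.counts.get(key, 0) + 1
        # bonus decays as the same joint region is revisited
        return 1.0 / np.sqrt(self.counts[key])
```

The centralized-learning, decentralized-execution split means this bonus would only be computed during training, when all agents' observations are available to a central critic.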
Reinforcement Learning Based Robust Volt/Var Control in Active Distribution Networks With Imprecisely Known Delay
Active distribution networks (ADNs) incorporating massive photovoltaic (PV)
devices encounter challenges of rapid voltage fluctuations and potential
violations. Due to the fluctuation and intermittency of PV generation, the
state gap, arising from time-inconsistent states and exacerbated by imprecisely
known system delays, significantly impacts the accuracy of voltage control.
This paper addresses this challenge by introducing a framework for delay
adaptive Volt/Var control (VVC) in the presence of imprecisely known system
delays to regulate the reactive power of PV inverters. The proposed approach
formulates the voltage control, based on predicted system operation states, as
a robust VVC problem. It employs sample selection from the state prediction
interval to promptly identify the worst-performing system operation state.
Furthermore, we leverage the decentralized partially observable Markov decision
process (Dec-POMDP) to reformulate the robust VVC problem. We design a Multiple
Policy Networks and Reward Shaping-based Multi-agent Twin Delayed Deep
Deterministic Policy Gradient (MPNRS-MATD3) algorithm to efficiently solve the
Dec-POMDP model-based problem.
Simulation results demonstrate the delay-adaptive behavior of the proposed
framework, and MPNRS-MATD3 outperforms other multi-agent reinforcement
learning algorithms in robust voltage control.
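The sample-selection step for identifying the worst-performing operation state can be sketched as follows, assuming the state prediction interval is given by elementwise lower and upper bounds and that `voltage_dev` is a hypothetical evaluator of voltage deviation from the reference; the paper's actual selection procedure may differ.

```python
import numpy as np

def worst_case_state(lower, upper, voltage_dev, n_samples=32, seed=0):
    """Sketch: draw candidate operation states uniformly from the
    prediction interval [lower, upper] and keep the one with the
    largest voltage deviation, i.e. the worst case for robust VVC."""
    rng = np.random.default_rng(seed)
    lower, upper = np.asarray(lower), np.asarray(upper)
    samples = rng.uniform(lower, upper, size=(n_samples, len(lower)))
    devs = [voltage_dev(s) for s in samples]
    return samples[int(np.argmax(devs))]
```

Controlling reactive power against this worst-case sample, rather than the point prediction, is what makes the resulting VVC policy robust to the imprecisely known delay.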
Successor features based multi-agent RL for event-based decentralized MDPs
Decentralized MDPs (Dec-MDPs) provide a rigorous framework for collaborative multi-agent sequential decision-making under uncertainty. However, their computational complexity limits their practical impact. To address this, we focus on a class of Dec-MDPs consisting of independent collaborating agents that are tied together through a global reward function, which depends upon their entire histories of states and actions to accomplish joint tasks. To overcome this scalability barrier, our main contributions are: (a) we propose a new actor-critic based Reinforcement Learning (RL) approach for event-based Dec-MDPs using successor features (SF), a value function representation that decouples the dynamics of the environment from the rewards; (b) we then present Dec-ESR (Decentralized Event based Successor Representation), which generalizes learning for event-based Dec-MDPs using SF within an end-to-end deep RL framework; (c) we also show that Dec-ESR allows useful transfer of information across related but different tasks, and hence bootstraps learning for faster convergence on new tasks; (d) for validation purposes, we test our approach on a large multi-agent coverage problem which models schedule coordination of agents in a real urban subway network, and it achieves better-quality solutions than previous best approaches.
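The successor-feature decomposition behind contributions (a) and (c) can be sketched in a few lines: if rewards factor as r(s,a) = phi(s,a) . w, then Q(s,a) = psi(s,a) . w, where psi is the expected discounted sum of future features. Swapping the reward weights w re-evaluates a new task without relearning the dynamics captured in psi, which is what enables transfer. The tabular layout and update below are a generic SF sketch, not the Dec-ESR architecture itself.

```python
import numpy as np

def q_from_sf(psi_sa, w):
    """Q(s,a) = psi(s,a) . w : successor features times reward weights."""
    return psi_sa @ w

def sf_td_update(psi, s, a, phi_sa, s_next, a_next, gamma=0.9, lr=0.5):
    """One TD step on tabular successor features:
    psi(s,a) <- psi(s,a) + lr * (phi(s,a) + gamma*psi(s',a') - psi(s,a)).
    psi has shape [n_states, n_actions, n_features]."""
    target = phi_sa + gamma * psi[s_next, a_next]
    psi[s, a] += lr * (target - psi[s, a])
    return psi
```

Because `sf_td_update` never touches w, a psi table learned on one task can be reused verbatim for any task whose reward is linear in the same features, which is the transfer property Dec-ESR exploits.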
A formal evaluation of the performance of different corporate styles in stable and turbulent environments
The notion of "parenting styles", introduced by Goold, Campbell and Alexander, has been widely acknowledged by the Corporate Strategy literature as a good broad description of the different ways in which corporate managers choose to manage and organize multibusiness firms. The purpose of this paper is to present a formal test of the relationship between parenting style and performance. For this test, we developed a set of agent-based simulations using the Performance Landscapes framework, which captures and describes the evolution of firms led by different parenting styles in business environments with different levels of complexity and dynamism. We found that the relative performance of each style is contingent upon the characteristics of the environment in which the firm operates. In less complex business environments, the Strategic Planning style outperforms the Strategic Control and Financial Control styles. In highly complex and highly dynamic environments, by contrast, the Strategic Control style performs best. Our results also demonstrate the importance of planning and flexibility at the corporate level and so contribute to the wider debate on Strategic Planning vs. Emergent Strategies.
Keywords: Corporate strategy; Parenting styles; Agent-based models
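Performance-landscape simulations of this kind are commonly built on NK-style fitness landscapes, where N binary policy choices interact with K neighbors each and firms adapt by local search. The `nk_fitness` and `local_search` functions below are a generic sketch of that setup under assumed parameters, not the paper's actual model of parenting styles.

```python
import numpy as np

def nk_fitness(config, contribs, K):
    """NK-style landscape sketch: each of N binary policy choices
    contributes a value that depends on itself and its K successors."""
    N = len(config)
    total = 0.0
    for i in range(N):
        idx = [(i + j) % N for j in range(K + 1)]
        key = tuple(config[j] for j in idx)
        total += contribs[i][key]
    return total / N

def local_search(config, contribs, K, steps=50):
    """Greedy one-bit hill climbing: a crude stand-in for a firm
    incrementally adapting its strategy on the landscape."""
    config = list(config)
    for _ in range(steps):
        base = nk_fitness(config, contribs, K)
        improved = False
        for i in range(len(config)):
            config[i] ^= 1  # try flipping one policy choice
            if nk_fitness(config, contribs, K) > base:
                improved = True
                break
            config[i] ^= 1  # revert if no improvement
        if not improved:
            break
    return config
```

Varying K (complexity) and reshuffling the contribution tables over time (dynamism) is the standard way such frameworks model the environmental conditions under which different corporate styles are compared.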
Reinforcement Learning
Brains rule the world, and brain-like computation is increasingly used in computers and electronic devices. Brain-like computation is about processing and interpreting data, and about deciding on and performing actions; learning is a central aspect of it. This book is on reinforcement learning, which involves performing actions to achieve a goal. The first 11 chapters of this book describe and extend the scope of reinforcement learning. The remaining 11 chapters show that there is already wide usage in numerous fields. Reinforcement learning can tackle control tasks that are too complex for traditional, hand-designed, non-learning controllers. As learning computers can deal with technical complexities, the task of human operators is to specify goals at increasingly higher levels. This book shows that reinforcement learning is a very dynamic area in terms of theory and applications, and it should stimulate and encourage new research in this field.