
    Together we stand, Together we fall, Together we win: Dynamic Team Formation in Massive Open Online Courses

    Full text link
    Massive Open Online Courses (MOOCs) offer a scalable paradigm for e-learning, giving students global exposure and opportunities to connect and interact with millions of people around the world. Students often work in teams to accomplish course-related tasks effectively. However, the lack of face-to-face interaction makes it difficult for MOOC students to collaborate, and instructors face challenges in manually organizing students into teams because students flock to these MOOCs in huge numbers. The proposed research therefore aims to develop a robust methodology for dynamic team formation in MOOCs, with a theoretical framework grounded in the confluence of organizational team theory, social network analysis, and machine learning. A prerequisite for such an undertaking is recognizing that every informal tie established among students offers an opportunity to influence and be influenced. We therefore aim to extract value from the inherent connectedness of students in the MOOC. These connections carry radical implications for the way students understand each other in the networked learning community. Our approach will enable course instructors to automatically group students into teams whose social connections with their peers are fairly balanced, as measured by appropriately selected qualitative and quantitative network metrics.
    Comment: In Proceedings of the 5th IEEE International Conference on Application of Digital Information & Web Technologies (ICADIWT), India, February 2014 (6 pages, 3 figures)
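
    To make the proposal concrete, here is a minimal sketch of how network metrics could drive balanced team assignment, assuming students' informal ties are available as an edge list. The use of degree centrality and the snake-draft balancing heuristic are illustrative assumptions, not the authors' algorithm.

    import networkx as nx

    def form_teams(interactions, num_teams):
        # interactions: iterable of (student_a, student_b) informal ties,
        # e.g. mined from forum replies or chat logs (an assumption here).
        g = nx.Graph()
        g.add_edges_from(interactions)
        centrality = nx.degree_centrality(g)  # one quantitative network metric
        ranked = sorted(g.nodes, key=centrality.get, reverse=True)
        teams = [[] for _ in range(num_teams)]
        # Snake-draft assignment: alternate direction each round so every team
        # receives a mix of highly and weakly connected students, approximating
        # "fairly balanced social connections".
        for i, student in enumerate(ranked):
            rnd, pos = divmod(i, num_teams)
            teams[pos if rnd % 2 == 0 else num_teams - 1 - pos].append(student)
        return teams

    # Example: form_teams([("ana", "bo"), ("bo", "cy"), ("ana", "cy"), ("cy", "dee")], 2)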

    Joint Intrinsic Motivation for Coordinated Exploration in Multi-Agent Deep Reinforcement Learning

    Full text link
    Multi-agent deep reinforcement learning (MADRL) problems often encounter the challenge of sparse rewards. This challenge becomes even more pronounced when coordination among agents is necessary. As performance depends not on any single agent's behavior but on the joint behavior of multiple agents, finding an adequate solution becomes significantly harder. In this context, a group of agents can benefit from actively exploring different joint strategies in order to determine the most efficient one. In this paper, we propose an approach for rewarding strategies where agents collectively exhibit novel behaviors. We present JIM (Joint Intrinsic Motivation), a multi-agent intrinsic motivation method that follows the centralized learning with decentralized execution paradigm. JIM rewards joint trajectories based on a centralized measure of novelty designed to function in continuous environments. We demonstrate the strengths of this approach both in a synthetic environment designed to reveal shortcomings of state-of-the-art MADRL methods, and in simulated robotic tasks. Results show that joint exploration is crucial for solving tasks where the optimal strategy requires a high level of coordination.
    Comment: 13 pages, 13 figures. Published as an extended abstract at AAMAS 202
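
    As a rough illustration of a centralized novelty measure suited to continuous joint observations, the sketch below uses Random Network Distillation as a stand-in; JIM's actual novelty definition is given in the paper and may differ.

    import torch
    import torch.nn as nn

    class JointNoveltyBonus(nn.Module):
        # Centralized intrinsic reward over the concatenated observations of
        # all agents. RND is an illustrative novelty measure, not a
        # reproduction of JIM's definition.
        def __init__(self, joint_obs_dim, hidden=128):
            super().__init__()
            def mlp():
                return nn.Sequential(nn.Linear(joint_obs_dim, hidden),
                                     nn.ReLU(), nn.Linear(hidden, hidden))
            self.target = mlp()     # frozen random network
            self.predictor = mlp()  # trained to imitate the target
            for p in self.target.parameters():
                p.requires_grad_(False)

        def forward(self, joint_obs):
            # Prediction error is large on joint states the team has rarely
            # visited, so it rewards collectively novel behavior.
            return (self.predictor(joint_obs) - self.target(joint_obs)).pow(2).mean(-1)

    During centralized training, such a bonus would be added to the shared reward while the predictor is regressed toward the target on visited joint observations; execution can stay decentralized because the bonus is only needed at training time.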

    Reinforcement Learning Based Robust Volt/Var Control in Active Distribution Networks With Imprecisely Known Delay

    Full text link
    Active distribution networks (ADNs) incorporating massive photovoltaic (PV) devices encounter challenges of rapid voltage fluctuations and potential violations. Due to the fluctuation and intermittency of PV generation, the state gap, arising from time-inconsistent states and exacerbated by imprecisely known system delays, significantly impacts the accuracy of voltage control. This paper addresses this challenge by introducing a framework for delay-adaptive Volt/Var control (VVC) that regulates the reactive power of PV inverters in the presence of imprecisely known system delays. The proposed approach formulates voltage control, based on predicted system operation states, as a robust VVC problem, and employs sample selection from the state prediction interval to promptly identify the worst-performing system operation state. Furthermore, we leverage the decentralized partially observable Markov decision process (Dec-POMDP) to reformulate the robust VVC problem, and design a Multiple Policy Networks and Reward Shaping-based Multi-agent Twin Delayed Deep Deterministic Policy Gradient (MPNRS-MATD3) algorithm to solve it efficiently. Simulation results demonstrate the delay-adaptation capability of the proposed framework and show that MPNRS-MATD3 outperforms other multi-agent reinforcement learning algorithms in robust voltage control.
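
    The sample-selection step can be pictured as follows. This is a minimal sketch, not the paper's implementation: power_flow is a hypothetical stand-in for a distribution-network solver returning per-unit bus voltages, and the uniform sampling over the prediction interval is an illustrative assumption.

    import numpy as np

    def worst_case_state(state_lo, state_hi, q_inverters, power_flow, n_samples=32):
        # Draw candidate operating states from the prediction interval that
        # covers the imprecisely known delay.
        rng = np.random.default_rng(0)
        candidates = rng.uniform(state_lo, state_hi,
                                 size=(n_samples, len(state_lo)))
        # Score each candidate by its worst bus-voltage deviation from 1.0 p.u.
        # under the current inverter reactive-power setpoints.
        deviations = [np.abs(power_flow(s, q_inverters) - 1.0).max()
                      for s in candidates]
        # The worst-performing state defines the robust VVC objective.
        return candidates[int(np.argmax(deviations))]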

    Successor features based multi-agent RL for event-based decentralized MDPs

    Get PDF
    Decentralized MDPs (Dec-MDPs) provide a rigorous framework for collaborative multi-agent sequential decision-making under uncertainty. However, their computational complexity limits their practical impact. To address this, we focus on a class of Dec-MDPs consisting of independent collaborating agents that are tied together through a global reward function which depends upon their entire histories of states and actions to accomplish joint tasks. To overcome the scalability barrier, our main contributions are: (a) we propose a new actor-critic based Reinforcement Learning (RL) approach for event-based Dec-MDPs using successor features (SF), a value function representation that decouples the dynamics of the environment from the rewards; (b) we then present Dec-ESR (Decentralized Event-based Successor Representation), which generalizes learning for event-based Dec-MDPs using SF within an end-to-end deep RL framework; (c) we also show that Dec-ESR allows useful transfer of information between related but different tasks, and hence bootstraps learning for faster convergence on new tasks; (d) for validation, we test our approach on a large multi-agent coverage problem which models schedule coordination of agents in a real urban subway network, and it achieves better-quality solutions than previous best approaches.
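
    The successor-feature decoupling behind contribution (a) can be written as Q(s, a) = psi(s, a) . w, where psi summarizes the expected discounted features of the dynamics and w encodes the task reward. Below is a minimal sketch of that decomposition; the network shapes and the omitted TD update are generic illustrations, not Dec-ESR itself.

    import torch
    import torch.nn as nn

    class SuccessorFeatures(nn.Module):
        def __init__(self, obs_dim, n_actions, feat_dim):
            super().__init__()
            # psi(s, a): expected discounted sum of future features, one
            # feat_dim-vector per action; learned with a TD rule (omitted).
            self.psi = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                     nn.Linear(128, n_actions * feat_dim))
            self.n_actions, self.feat_dim = n_actions, feat_dim
            # Task-specific reward weights; swapping w transfers the learned
            # dynamics to a related task, the basis of contribution (c).
            self.w = nn.Parameter(torch.zeros(feat_dim))

        def q_values(self, obs):
            psi = self.psi(obs).view(-1, self.n_actions, self.feat_dim)
            return psi @ self.w  # Q(s, a) = psi(s, a) . w for every action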

    A formal evaluation of the performance of different corporate styles in stable and turbulent environments

    Get PDF
    The notion of "parenting styles", introduced by Goold, Campbell and Alexander, has been widely acknowledged by the Corporate Strategy literature as a good broad description of the different ways in which corporate managers choose to manage and organize multibusiness firms. The purpose of this paper is to present a formal test of the relationship between parenting style and performance. For this test, we developed a set of agent-based simulations using the Performance Landscapes framework, which captures and describes the evolution of firms led by different parenting styles in business environments with different levels of complexity and dynamism. We found that the relative performance of each style is contingent upon the characteristics of the environment in which the firm operates. In less complex business environments, the Strategic Planning style outperforms the Strategic Control and Financial Control styles. In highly complex and highly dynamic environments, by contrast, the Strategic Control style performs best. Our results also demonstrate the importance of planning and flexibility at the corporate level and so contribute to the wider debate on Strategic Planning vs. Emergent Strategies.Corporate strategy; Parenting styles; Agent-based models;

    Reinforcement Learning

    Get PDF
    Brains rule the world, and brain-like computation is increasingly used in computers and electronic devices. Brain-like computation is about processing and interpreting data, or directly proposing and performing actions. Learning is a very important aspect of it. This book is about reinforcement learning, which involves performing actions to achieve a goal. The first 11 chapters of this book describe and extend the scope of reinforcement learning; the remaining 11 chapters show that it is already widely used in numerous fields. Reinforcement learning can tackle control tasks that are too complex for traditional, hand-designed, non-learning controllers. As learning computers can deal with technical complexities, the task of human operators shifts to specifying goals at increasingly higher levels. This book shows that reinforcement learning is a very dynamic area in terms of theory and applications, and it should stimulate and encourage new research in the field.
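
    As a concrete example of "performing actions to achieve a goal", here is a textbook tabular Q-learning loop; the env object with reset/step/actions is a hypothetical interface, and nothing here is specific to any chapter of the book.

    import random

    def q_learning(env, episodes=500, alpha=0.1, gamma=0.99, epsilon=0.1):
        q = {}  # (state, action) -> estimated return
        for _ in range(episodes):
            state, done = env.reset(), False
            while not done:
                # Epsilon-greedy: usually exploit the best known action,
                # occasionally explore a random one.
                if random.random() < epsilon:
                    action = random.choice(env.actions)
                else:
                    action = max(env.actions,
                                 key=lambda a: q.get((state, a), 0.0))
                next_state, reward, done = env.step(action)
                best_next = max(q.get((next_state, a), 0.0) for a in env.actions)
                target = reward + (0.0 if done else gamma * best_next)
                old = q.get((state, action), 0.0)
                q[(state, action)] = old + alpha * (target - old)
                state = next_state
        return q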