Multi-agent Deep Covering Option Discovery

Author: Aggarwal Vaneet
Chen Jiayu
Haliem Marina
Lan Tian
Publication date: 06/10/2022

The use of options can greatly accelerate exploration in reinforcement learning, especially when only sparse reward signals are available. While option discovery methods have been proposed for individual agents, discovering collaborative options in multi-agent reinforcement learning (MARL) settings, which can coordinate the behavior of multiple agents and encourage them to visit the under-explored regions of their joint state space, has not been considered. To address this gap, we propose Multi-agent Deep Covering Option Discovery, which constructs multi-agent options by minimizing the expected cover time of the agents' joint state space. We also propose a novel framework for adopting the multi-agent options in the MARL process. In practice, a multi-agent task can usually be divided into sub-tasks, each of which can be completed by a sub-group of the agents. Our algorithm framework therefore first leverages an attention mechanism to find collaborative agent sub-groups that would benefit most from coordinated actions. Then, a hierarchical algorithm, HA-MSAC, is developed to learn the multi-agent options for each sub-group to complete its sub-task, and then to integrate them through a high-level policy as the solution to the whole task. This hierarchical option construction allows our framework to strike a balance between scalability and effective collaboration among the agents. The evaluation on multi-agent collaborative tasks shows that the proposed algorithm can effectively capture agent interactions with the attention mechanism, successfully identify multi-agent options, and significantly outperform prior works using single-agent options or no options, in terms of both faster exploration and higher task rewards. Comment: This paper was presented in part at the ICML Reinforcement Learning for Real Life Workshop, July 202

arXiv.org e-Print Archive

Beyond A/B Testing: Sequential Randomization for Developing Interventions in Scaled Digital Learning Environments

Author: Almirall D
Brooks C
Brusilovsky P
Davis D
Dweck CS
González-Brenes JP
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 31/01/2019

Randomized experiments ensure robust causal inferences that are critical to effective learning analytics research and practice. However, traditional randomized experiments, like A/B tests, are limiting in large-scale digital learning environments. While traditional experiments can accurately compare two treatment options, they are less able to inform how to adapt interventions to continually meet learners' diverse needs. In this work, we introduce a trial design for developing adaptive interventions in scaled digital learning environments: the sequential randomized trial (SRT). With the goal of improving learner experience and developing interventions that benefit all learners at all times, SRTs inform how to sequence, time, and personalize interventions. In this paper, we provide an overview of SRTs and illustrate the advantages they hold over traditional experiments. We describe a novel SRT run in a large-scale data science MOOC. The trial results contextualize how learner engagement can be addressed through inclusive, culturally targeted reminder emails. We also provide practical advice for researchers who aim to run their own SRTs to develop adaptive interventions in scaled digital learning environments.
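As a rough illustration of how sequential randomization differs from a single A/B assignment, the sketch below re-randomizes every learner at each decision point. It is not the trial design from the paper: the function name `sequential_randomization`, the arm labels, and the number of decision points are all hypothetical choices made for this example.

```python
import random
from typing import Dict, List


def sequential_randomization(
    learners: List[str],
    arms: List[str],
    decision_points: int,
    seed: int = 0,
) -> Dict[str, List[str]]:
    """Assign every learner to a randomly chosen arm at each decision point.

    In a one-shot A/B test each learner is randomized once; here the
    randomization is repeated, so a learner can receive different
    interventions over time. All names are illustrative assumptions.
    """
    rng = random.Random(seed)  # seeded so the schedule is reproducible
    return {
        learner: [rng.choice(arms) for _ in range(decision_points)]
        for learner in learners
    }


# Hypothetical arms and learners, for illustration only.
ARMS = ["no_email", "generic_email", "targeted_email"]
schedule = sequential_randomization(["learner_a", "learner_b"], ARMS, decision_points=3)
```

Each learner ends up with one assignment per decision point, which is the raw material for comparing sequences of interventions instead of single treatments.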

arXiv.org e-Print Archive

Crossref

Heuristic usability evaluation on games: a modular approach

Author: Cascado Caballero Daniel
Font Calvo Juan Luis
Sevillano Ramos José Luis
Yáñez Gómez Rosa
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019

Heuristic evaluation is the preferred method for assessing usability in games when experts conduct the evaluation. Many heuristic guidelines have been proposed that address the specificities of games, but they focus only on specific subsets of games or platforms. In fact, the most widely used guideline for evaluating game usability is still Nielsen’s proposal, which targets generic software. As a result, most evaluations do not cover important aspects of games such as mobility, multiplayer interactions, enjoyability, and playability. To promote the use of new heuristics adapted to different game and platform aspects, we propose a modular approach based on classifying existing game heuristics using metadata, together with a tool, MUSE (Meta-heUristics uSability Evaluation tool) for games, which rebuilds heuristic guidelines from a metadata selection in order to obtain a customized list for each real evaluation case. Using these rebuilt heuristic guidelines allows evaluators to attend explicitly to a wide range of usability aspects in games and to detect usability issues better. We preliminarily evaluate MUSE with an analysis of two different games, using both Nielsen’s heuristics and the customized heuristic lists generated by our tool. Unión Europea PI055-15/E0

Crossref

idUS. Depósito de Investigación Universidad de Sevilla

Socio-technical transition processes: A real option based reasoning.

Author: Arman Avadikyan
Patrick Llerena

Using a real options reasoning perspective, we study the uncertainties and irreversibilities that affect firms' investment decisions during the different phases of technological transitions. Analyzing transition dynamics via real options reasoning provides an alternative and more qualified explanation of investment decisions according to the sequentiality of the pathways considered. In our framework, flexibility management through option investments concerns both the incumbent and the future technological regime: in the first case it refers to ex-post flexibility management, and in the second to ex-ante flexibility management.

Research Papers in Economics

Planning as Optimization: Dynamically Discovering Optimal Configurations for Runtime Situations

Author: Fredericks Erik M.
Gerostathopoulos Ilias
Krupitzer Christian
Vogel Thomas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 03/05/2019

The large number of possible configurations of modern software-based systems, combined with the large number of possible environmental situations of such systems, prohibits enumerating all adaptation options at design time and necessitates planning at runtime to dynamically identify an appropriate configuration for a situation. While numerous planning techniques exist, they typically assume a detailed state-based model of the system and that the situations that warrant adaptations are known. Both of these assumptions can be violated in complex, real-world systems. As a result, adaptation planning must rely on simple models that capture what can be changed (input parameters) and what can be observed in the system and environment (output and context parameters). We therefore propose planning as optimization: the use of optimization strategies to discover optimal system configurations at runtime for each distinct situation, where the situations are also identified dynamically at runtime. We apply our approach to CrowdNav, an open-source traffic routing system with the characteristics of a real-world system. We identify situations via clustering and conduct an empirical study that compares Bayesian optimization and two types of evolutionary optimization (NSGA-II and novelty search) in CrowdNav.

arXiv.org e-Print Archive

Crossref

Boolean Matrix Factorization Meets Consecutive Ones Property

Author: Miettinen P.
Tatti N.
Publication date: 01/01/2019

Boolean matrix factorization is a natural and popular technique for summarizing binary matrices. In this paper, we study a variant of Boolean matrix factorization in which we additionally require that the factor matrices have the consecutive ones property (OBMF). A major application of this optimization problem comes from graph visualization: standard techniques for visualizing graphs are circular or linear layouts, where nodes are ordered in a circle or on a line. A common problem with visualizing graphs is clutter due to too many edges. The standard approach to deal with this is to bundle edges together and represent them as ribbons. We show that OBMF can be used for edge bundling combined with circular or linear layout techniques. We demonstrate that not only is this problem NP-hard, but it also admits no polynomial-time algorithm with a multiplicative approximation guarantee (unless P = NP). On the positive side, we develop a greedy algorithm that at each step looks for the best rank-1 factorization. Since even obtaining a rank-1 factorization is NP-hard, we propose an iterative algorithm in which we fix one side and find the other, reverse the roles, and repeat. We show that this step can be done in linear time using PQ-trees. We also extend the problem to the cyclic ones property and symmetric factorizations. Our experiments show that our algorithms find high-quality factorizations and scale well.
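To make the two ingredients of this abstract concrete, the sketch below computes a Boolean matrix product, counts the entries where a factorization disagrees with the input, and checks whether the ones in a vector are consecutive. It is a minimal illustration and not the authors' algorithm: the helper names are hypothetical, and the consecutive-ones check assumes a fixed, given ordering, whereas the property in general only requires that some ordering exists.

```python
from typing import List

Matrix = List[List[int]]


def boolean_product(b: Matrix, c: Matrix) -> Matrix:
    """Boolean product: entry (i, j) is 1 if b[i][k] and c[k][j] are both 1 for some k."""
    inner, cols = len(c), len(c[0])
    return [
        [int(any(row[k] and c[k][j] for k in range(inner))) for j in range(cols)]
        for row in b
    ]


def reconstruction_error(a: Matrix, b: Matrix, c: Matrix) -> int:
    """Number of entries where the Boolean product of b and c differs from a."""
    approx = boolean_product(b, c)
    return sum(x != y for row_a, row_p in zip(a, approx) for x, y in zip(row_a, row_p))


def ones_are_consecutive(vector: List[int]) -> bool:
    """True if the 1s form one contiguous run in the given order (or there are none)."""
    positions = [i for i, v in enumerate(vector) if v]
    return not positions or positions[-1] - positions[0] + 1 == len(positions)


# Small hand-made example (assumed data, not from the paper).
A = [[1, 1, 0], [1, 1, 1], [0, 1, 1]]
B = [[1, 0], [1, 1], [0, 1]]  # 3 x 2 factor
C = [[1, 1, 0], [0, 1, 1]]    # 2 x 3 factor

error = reconstruction_error(A, B, C)
columns_of_b_ok = all(ones_are_consecutive(list(col)) for col in zip(*B))
rows_of_c_ok = all(ones_are_consecutive(row) for row in C)
```

In this example the two factors reproduce `A` exactly, and every column of `B` and every row of `C` has its ones in a single run under the given ordering.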

MPG.PuRe

Futures Exchange Innovations: Reinforcement versus Cannibalism

Author: Joost M.E. Pennings
Raymond M. Leuthold

Futures exchanges are in constant search of futures contracts that will generate a profitable level of trading volume. In this context, it is of interest to determine what effect the introduction of new futures contracts has on the trading volume of the contracts already listed. The introduction of new futures contracts may increase the volume of contracts already listed and hence contribute to the success of a futures exchange. On the other hand, it could decrease the volume of contracts already listed, thereby undermining the success of the exchange. We study these effects using a multi-product hedging model in which the perspective has been shifted from portfolio management to exchange management. Using data from two exchanges that differ in market liquidity (Amsterdam Exchanges versus Chicago Board of Trade), we show the usefulness of the proposed tool. Our findings have several important implications for a futures exchange's innovation policy.

Research Papers in Economics