110,508 research outputs found

Feature Reinforcement Learning: Part I: Unstructured MDPs

Author: Hutter Marcus
Publication date: 01/01/2009

General-purpose, intelligent, learning agents cycle through sequences of observations, actions, and rewards that are complex, uncertain, unknown, and non-Markovian. On the other hand, reinforcement learning is well-developed for small finite state Markov decision processes (MDPs). Up to now, extracting the right state representations out of bare observations, that is, reducing the general agent setup to the MDP framework, has been an art that involves significant effort by designers. The primary goal of this work is to automate the reduction process and thereby significantly expand the scope of many existing reinforcement learning algorithms and the agents that employ them. Before we can think of mechanizing this search for suitable MDPs, we need a formal objective criterion. The main contribution of this article is to develop such a criterion. I also integrate the various parts into one learning algorithm. Extensions to more realistic dynamic Bayesian networks are developed in Part II. The role of POMDPs is also considered there. Comment: 24 LaTeX pages, 5 diagrams
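As a rough illustration of the reduction this abstract describes, the sketch below maps an observation history to a state with a hypothetical feature map `phi_last_k` (the last k observations) and estimates the transition frequencies of the induced MDP from counts. The feature map, the function names, and the (observation, action, reward) layout of the history are assumptions made for illustration; the paper's criterion for choosing among candidate maps is not reproduced here.

```python
from collections import Counter, defaultdict


def phi_last_k(observations, k):
    """Hypothetical feature map: the state is the tuple of the last k observations."""
    return tuple(observations[-k:])


def estimate_mdp(history, k=1):
    """Estimate transition frequencies of the MDP induced by phi_last_k.

    history is a list of (observation, action, reward) triples, where the
    action is the one taken after seeing the observation (an assumed layout).
    Returns {(state, action): {next_state: relative frequency}}.
    """
    observations = [obs for obs, _, _ in history]
    counts = defaultdict(Counter)
    # Start once k observations are available; stop one step before the end
    # because each transition needs a successor observation.
    for t in range(k - 1, len(history) - 1):
        state = phi_last_k(observations[: t + 1], k)
        action = history[t][1]
        next_state = phi_last_k(observations[: t + 2], k)
        counts[(state, action)][next_state] += 1
    # Normalise the counts into relative frequencies.
    return {
        sa: {s2: n / sum(c.values()) for s2, n in c.items()}
        for sa, c in counts.items()
    }


# Toy history that alternates between two observations under one action.
toy_history = [(0, "a", 0.0), (1, "a", 1.0), (0, "a", 0.0), (1, "a", 1.0), (0, "a", 0.0)]
model = estimate_mdp(toy_history, k=1)
```

A larger `k` yields more states and a more faithful but harder-to-estimate model; trading these off is the kind of choice an objective criterion would have to make.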

arXiv.org e-Print Archive

CiteSeerX

The Australian National University

Decentralized Cooperative Planning for Automated Vehicles with Continuous Monte Carlo Tree Search

Author: Engelhorn Florian
Kurzer Karl
Zöllner J. Marius
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 10/09/2018

Urban traffic scenarios often require a high degree of cooperation between traffic participants to ensure safety and efficiency. Observing the behavior of others, humans infer whether or not others are cooperating. This work aims to extend the capabilities of automated vehicles, enabling them to cooperate implicitly in heterogeneous environments. Continuous actions allow for arbitrary trajectories and hence are applicable to a much wider class of problems than existing cooperative approaches with discrete action spaces. Based on cooperative modeling of other agents, Monte Carlo Tree Search (MCTS) in conjunction with Decoupled-UCT evaluates the action-values of each agent in a cooperative and decentralized way, respecting the interdependence of actions among traffic participants. The extension to continuous action spaces is addressed by incorporating novel MCTS-specific enhancements for efficient search space exploration. The proposed algorithm is evaluated under different scenarios, showing that it achieves effective cooperative planning and generates solutions that egocentric planning fails to identify.
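To make the Decoupled-UCT idea mentioned in this abstract concrete, the sketch below shows a selection step at a single tree node where each agent keeps statistics only over its own actions and picks by UCB1 independently of the others. It is a minimal sketch under stated assumptions, not the authors' implementation: it uses a small discrete action set for simplicity (the paper targets continuous actions), and the class name `AgentStats`, the helper names, and the reward values are hypothetical.

```python
import math


class AgentStats:
    """Visit counts and value sums over one agent's own actions at a node."""

    def __init__(self, actions):
        self.visits = {a: 0 for a in actions}
        self.value_sum = {a: 0.0 for a in actions}

    def select(self, c=math.sqrt(2)):
        # Try every action once before applying the UCB1 formula.
        for action, n in self.visits.items():
            if n == 0:
                return action
        total = sum(self.visits.values())

        def ucb1(action):
            n = self.visits[action]
            mean = self.value_sum[action] / n
            return mean + c * math.sqrt(math.log(total) / n)

        return max(self.visits, key=ucb1)

    def update(self, action, reward):
        self.visits[action] += 1
        self.value_sum[action] += reward


def select_joint_action(node_stats):
    """Each agent chooses from its own statistics; the joint action is the tuple of choices."""
    return {agent: stats.select() for agent, stats in node_stats.items()}


def backpropagate(node_stats, joint_action, rewards):
    """Credit each agent's chosen action with that agent's own reward."""
    for agent, action in joint_action.items():
        node_stats[agent].update(action, rewards[agent])


# Two hypothetical vehicles with two maneuvers each.
stats = {
    "car_a": AgentStats(["keep", "yield"]),
    "car_b": AgentStats(["keep", "yield"]),
}
for _ in range(10):
    backpropagate(stats, {"car_a": "keep", "car_b": "yield"}, {"car_a": 1.0, "car_b": 1.0})
    backpropagate(stats, {"car_a": "yield", "car_b": "keep"}, {"car_a": 0.0, "car_b": 0.0})
joint = select_joint_action(stats)
```

Because the statistics are kept per agent rather than per joint action, the table size grows with the sum of the agents' action counts instead of their product; the interdependence of actions enters only through the rewards observed in simulation.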

arXiv.org e-Print Archive

Crossref

Operationalizing the circular city model for Naples' city-port: A hybrid development strategy

Author: Cerreta M.
di Girasole E. G.
Poli G.
Regalbuto S.
Publication venue: 'MDPI AG'
Publication date: 01/01/2020

The city-port context is decisive for the economic development of territories and nations: it can significantly influence conditions of well-being and quality of life, and it can make the Circular City Model (CCM) operational, preserving and enhancing seas and marine resources in a sustainable way. This can be achieved through the construction of appropriate production and consumption models, with attention to relations with the urban and territorial system. This paper presents an adaptive decision-making process for the development strategies of the commercial port of Naples (Italy), aimed at re-establishing a sustainable city-port relationship and making Circular Economy (CE) principles operative. The approach aims to implement a CCM by operationalizing European recommendations provided within both the Sustainable Development Goals (SDGs) framework, specifically focusing on Goals 9, 11 and 12, and the Maritime Spatial Planning European Directive 2014/89, in order to address conflicts over the overlapping areas of the city-port through the principles and tools of multidimensional evaluation. In this perspective, a four-step methodological framework has been structured, applying a place-based approach with mixed evaluation methods and eliciting soft and hard knowledge domains, which have been expressed and assessed by a core set of Sustainability Indicators (SI) linked to the SDGs. The outcomes of the contribution centre on the assessment of three design alternatives for the East Naples port and the development of a hybrid regeneration scenario consistent with CE and sustainability principles. The structured decision-making process has allowed us to test how an adaptive approach can expand the knowledge base underpinning policy design and decisions to achieve better outcomes and cultivate a broad civic and technical engagement that can enhance the legitimacy and transparency of policies.

Archivio della ricerca - Università degli studi di Napoli Federico II