31,918 research outputs found
Automating Vehicles by Deep Reinforcement Learning using Task Separation with Hill Climbing
Within the context of autonomous driving a model-based reinforcement learning
algorithm is proposed for the design of neural network-parameterized
controllers. Classical model-based control methods, which include sampling- and
lattice-based algorithms and model predictive control, suffer from the
trade-off between model complexity and computational burden required for the
online solution of expensive optimization or search problems at every short
sampling time. To circumvent this trade-off, a 2-step procedure is motivated:
first learning of a controller during offline training based on an arbitrarily
complicated mathematical system model, before online fast feedforward
evaluation of the trained controller. The contribution of this paper is the
proposition of a simple gradient-free and model-based algorithm for deep
reinforcement learning using task separation with hill climbing (TSHC). In
particular, (i) simultaneous training on separate deterministic tasks with the
purpose of encoding many motion primitives in a neural network, and (ii) the
employment of maximally sparse rewards in combination with virtual velocity
constraints (VVCs) in setpoint proximity are advocated.
Comment: 10 pages, 6 figures, 1 table
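The gradient-free idea in the abstract above can be illustrated with a generic hill-climbing loop. This is a minimal sketch, not the paper's TSHC algorithm: the function names, the Gaussian perturbation, and the toy reward (a sum over two hypothetical deterministic tasks) are all assumptions made for illustration, and the sparse rewards and virtual velocity constraints mentioned in the abstract are not modeled.

```python
import random


def hill_climb(reward, theta, steps=200, sigma=0.1, seed=0):
    """Generic hill climbing: keep a random perturbation only if it scores higher."""
    rng = random.Random(seed)
    best = reward(theta)
    for _ in range(steps):
        # Perturb every parameter with Gaussian noise (no gradients needed).
        candidate = [t + rng.gauss(0.0, sigma) for t in theta]
        score = reward(candidate)
        if score > best:
            theta, best = candidate, score
    return theta, best


def multi_task_reward(theta):
    """Hypothetical stand-in: total reward summed over separate deterministic tasks."""
    targets = [(1.0, -2.0), (1.5, -1.5)]  # one illustrative target per task
    return -sum((theta[0] - a) ** 2 + (theta[1] - b) ** 2 for a, b in targets)


theta, best = hill_climb(multi_task_reward, [0.0, 0.0])
```

Because a candidate is accepted only when the summed reward improves, the returned score can never be worse than the score of the starting parameters.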
Q-Strategy: A Bidding Strategy for Market-Based Allocation of Grid Services
The application of autonomous agents to the provisioning and usage of computational services is an attractive research field. Various methods and technologies from artificial intelligence, statistics, and economics work together to i) achieve autonomic provisioning and usage of Grid services, ii) devise competitive bidding strategies for widely used market mechanisms, and iii) incentivize consumers and providers to use such market-based systems.
The contributions of the paper are threefold. First, we present a bidding agent framework for implementing artificial bidding agents, supporting consumers and providers in technical and economic preference elicitation as well as automated bid generation for the requesting and provisioning of Grid services. Second, we introduce a novel consumer-side bidding strategy, which enables goal-oriented and strategic behavior in the generation and submission of consumer service requests and the selection of provider offers. Third, we evaluate and compare the Q-strategy, implemented within the presented framework, against the Truth-Telling bidding strategy in three mechanisms: a centralized CDA, a decentralized on-line machine scheduling mechanism, and a FIFO-scheduling mechanism.
Self-Learning Cloud Controllers: Fuzzy Q-Learning for Knowledge Evolution
Cloud controllers aim at responding to application demands by automatically
scaling the compute resources at runtime to meet performance guarantees and
minimize resource costs. Existing cloud controllers often resort to scaling
strategies that are codified as a set of adaptation rules. However, for a cloud
provider, applications running on top of the cloud infrastructure are more or
less black-boxes, making it difficult at design time to define optimal or
pre-emptive adaptation rules. Thus, the burden of taking adaptation decisions
often is delegated to the cloud application. Yet, in most cases, application
developers in turn have limited knowledge of the cloud infrastructure. In this
paper, we propose learning adaptation rules during runtime. To this end, we
introduce FQL4KE, a self-learning fuzzy cloud controller. In particular, FQL4KE
learns and modifies fuzzy rules at runtime. The benefit is that for designing
cloud controllers, we do not have to rely solely on precise design-time
knowledge, which may be difficult to acquire. FQL4KE empowers users to specify
cloud controllers by simply adjusting weights representing priorities in system
goals instead of specifying complex adaptation rules. The applicability of
FQL4KE has been experimentally assessed as part of the cloud application
framework ElasticBench. The experimental results indicate that FQL4KE
outperforms our previously developed fuzzy controller without learning
mechanisms and the native Azure auto-scaling.
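For orientation, the learning step behind Q-learning-based controllers can be sketched with the standard tabular update rule. This is a deliberate simplification and not FQL4KE itself, which learns fuzzy rules: the state names, action names, learning rate, and discount factor below are hypothetical values chosen only for illustration.

```python
def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; returns the new Q-value."""
    best_next = max(q[next_state].values())
    q[state][action] += alpha * (reward + gamma * best_next - q[state][action])
    return q[state][action]


# Hypothetical scaling problem: states are load levels, actions are scaling decisions.
q_table = {
    "overloaded": {"scale_out": 0.0, "hold": 0.0},
    "normal": {"scale_out": 0.0, "hold": 1.0},
}
new_value = q_update(q_table, "overloaded", "scale_out", reward=1.0, next_state="normal")
```

With these values the updated entry is 0.1 * (1.0 + 0.9 * 1.0 - 0.0) = 0.19.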
Decentralized Cooperative Planning for Automated Vehicles with Hierarchical Monte Carlo Tree Search
Today's automated vehicles lack the ability to cooperate implicitly with
others. This work presents a Monte Carlo Tree Search (MCTS) based approach for
decentralized cooperative planning using macro-actions for automated vehicles
in heterogeneous environments. Based on cooperative modeling of other agents
and Decoupled-UCT (a variant of MCTS), the algorithm evaluates the
state-action-values of each agent in a cooperative and decentralized manner,
explicitly modeling the interdependence of actions between traffic
participants. Macro-actions allow for temporal extension over multiple time
steps and increase the effective search depth requiring fewer iterations to
plan over longer horizons. Without predefined policies for macro-actions, the
algorithm simultaneously learns policies over and within macro-actions. The
proposed method is evaluated under several conflict scenarios, showing that the
algorithm can achieve effective cooperative planning with learned macro-actions
in heterogeneous environments.
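The selection step that UCT-style tree search relies on can be sketched with the standard UCB1 rule, applied independently per agent in the spirit of a decoupled variant. This is a minimal sketch under assumptions: the data layout (per-agent visit counts and summed returns), the function names, and the exploration constant are illustrative and not taken from the paper, and macro-actions are not modeled.

```python
import math


def ucb1_select(counts, values, c=math.sqrt(2)):
    """Pick an action index by UCB1; untried actions are selected first."""
    for action, n in enumerate(counts):
        if n == 0:
            return action
    total = sum(counts)
    scores = [
        values[a] / counts[a] + c * math.sqrt(math.log(total) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(counts)), key=scores.__getitem__)


def decoupled_select(per_agent_stats):
    """Each agent selects its own action from its own statistics."""
    return [ucb1_select(counts, values) for counts, values in per_agent_stats]


# Two agents, two actions each: (visit counts, summed returns) per agent.
joint_action = decoupled_select([([10, 10], [9.0, 1.0]), ([5, 5], [1.0, 4.0])])
```

Here both actions of each agent have equal visit counts, so the exploration bonus is identical and each agent picks the action with the higher mean return.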
State-of-the-art on research and applications of machine learning in the building life cycle
Fueled by big data, powerful and affordable computing resources, and advanced algorithms, machine learning has been explored and applied to buildings research over the past decades and has demonstrated its potential to enhance building performance. This study systematically surveyed how machine learning has been applied at different stages of the building life cycle. By conducting a literature search on the Web of Knowledge platform, we found 9579 papers in this field and selected 153 papers for an in-depth review. The number of published papers is increasing year by year, with a focus on building design, operation, and control. However, no study was found using machine learning in building commissioning. There are successful pilot studies on fault detection and diagnosis of HVAC equipment and systems, load prediction, energy baseline estimation, load shape clustering, occupancy prediction, and learning occupant behaviors and energy use patterns. None of the existing studies has been adopted broadly by the building industry, due to common challenges including (1) lack of large-scale labeled data to train and validate the model, (2) lack of model transferability, which limits the use of a model trained on one data-rich building in another building with limited data, (3) lack of strong justification of the costs and benefits of deploying machine learning, and (4) performance that might not be reliable and robust for the stated goals, as a method might work for some buildings but not generalize to others. Findings from the study can inform future machine learning research to improve occupant comfort, energy efficiency, demand flexibility, and resilience of buildings, as well as inspire young researchers in the field to explore multidisciplinary approaches that integrate building science, computing science, data science, and social science.
When users control the algorithms: Values expressed in practices on the Twitter platform
Recent interest in ethical AI has brought a slew of values, including fairness, into conversations about technology design. Research in the area of algorithmic fairness tends to be rooted in questions of distribution that can be subject to precise formalism and technical implementation. We seek to expand this conversation to include the experiences of people subject to algorithmic classification and decision-making. By examining tweets about the "Twitter algorithm" we consider the wide range of concerns and desires Twitter users express. We find a concern with fairness (narrowly construed) is present, particularly in the ways users complain that the platform enacts a political bias against conservatives. However, we find another important category of concern, evident in attempts to exert control over the algorithm. Twitter users who seek control do so for a variety of reasons, many well justified. We argue for the need for better and clearer definitions of what constitutes legitimate and illegitimate control over algorithmic processes and to consider support for users who wish to enact their own collective choices.