Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks
Future wireless networks have substantial potential in terms of supporting
a broad range of complex compelling applications both in military and civilian
fields, where the users are able to enjoy high-rate, low-latency, low-cost and
reliable information services. Achieving this ambitious goal requires new radio
techniques for adaptive learning and intelligent decision making because of the
complex heterogeneous nature of the network structures and wireless services.
Machine learning (ML) algorithms have had great success in supporting big data
analytics, efficient parameter estimation and interactive decision making.
Hence, in this article, we review the thirty-year history of ML by elaborating
on supervised learning, unsupervised learning, reinforcement learning and deep
learning. Furthermore, we investigate their employment in the compelling
applications of wireless networks, including heterogeneous networks (HetNets),
cognitive radios (CR), Internet of things (IoT), machine to machine networks
(M2M), and so on. This article aims to assist readers in clarifying the
motivation and methodology of the various ML algorithms, so that they can be
invoked for hitherto unexplored services and scenarios of future wireless
networks. Comment: 46 pages, 22 fig
Factorized Q-Learning for Large-Scale Multi-Agent Systems
Deep Q-learning has achieved significant success in single-agent decision
making tasks. However, it is challenging to extend Q-learning to large-scale
multi-agent scenarios, due to the explosion of action space resulting from the
complex dynamics between the environment and the agents. In this paper, we
propose to make the computation of multi-agent Q-learning tractable by treating
the Q-function (w.r.t. state and joint-action) as a high-order high-dimensional
tensor and then approximating it with factorized pairwise interactions.
Furthermore, we utilize a composite deep neural network architecture for
computing the factorized Q-function, share the model parameters among all the
agents within the same group, and estimate the agents' optimal joint actions
through a coordinate descent type algorithm. All these simplifications greatly
reduce the model complexity and accelerate the learning process. Extensive
experiments on two different multi-agent problems demonstrate the performance
gain of our proposed approach in comparison with strong baselines, particularly
when there are a large number of agents. Comment: 7 pages, 5 figures, DAI 201
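To make the idea in the abstract above concrete, the following is a minimal sketch of a joint Q-value built from per-agent and pairwise terms, with the joint action chosen by a coordinate-wise search. It is an illustration under stated assumptions, not the authors' implementation: the tables `unary` and `pairwise`, the function names, and the use of plain lookup tables instead of a deep network are all hypothetical choices made here. The search maximizes Q one agent at a time, so it is written as coordinate ascent, and it only guarantees a local optimum.

```python
import random


def joint_q(unary, pairwise, actions):
    """Joint Q-value as a sum of per-agent terms and pairwise interaction terms.

    unary[i][a] and pairwise[(i, j)][a][b] are hypothetical lookup tables that
    stand in for the outputs of a learned model at one fixed state.
    """
    n = len(actions)
    total = sum(unary[i][actions[i]] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            total += pairwise[(i, j)][actions[i]][actions[j]]
    return total


def coordinate_ascent(unary, pairwise, num_actions, init=None):
    """Improve one agent's action at a time while holding the others fixed.

    Stops when a full sweep over the agents changes nothing. Every accepted
    change strictly increases the joint Q-value, so the loop terminates.
    """
    n = len(unary)
    actions = list(init) if init is not None else [0] * n
    changed = True
    while changed:
        changed = False
        for i in range(n):
            current = joint_q(unary, pairwise, actions)
            for a in range(num_actions):
                candidate = actions[:i] + [a] + actions[i + 1:]
                value = joint_q(unary, pairwise, candidate)
                if value > current:
                    actions, current, changed = candidate, value, True
    return actions


# Usage with randomly generated tables (illustrative values only).
random.seed(0)
n_agents, n_actions = 4, 3
unary = [[random.uniform(-1, 1) for _ in range(n_actions)] for _ in range(n_agents)]
pairwise = {
    (i, j): [[random.uniform(-1, 1) for _ in range(n_actions)] for _ in range(n_actions)]
    for i in range(n_agents)
    for j in range(i + 1, n_agents)
}
best_actions = coordinate_ascent(unary, pairwise, n_actions)
```

Each sweep evaluates a number of candidates that grows linearly with the number of agents, whereas exhaustive search over joint actions grows exponentially, which is the tractability argument the abstract makes.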
Towards Informed Exploration for Deep Reinforcement Learning
In this thesis, we discuss various techniques for improving exploration in deep reinforcement learning. We begin with a brief review of reinforcement learning (RL) and the fundamental exploration vs. exploitation trade-off. Then we review how deep RL has improved upon classical RL and summarize six categories of the latest exploration methods for deep RL, in order of increasing usage of prior information. We then explore representative works in three categories and discuss their strengths and weaknesses. The first category, represented by Soft Q-learning, uses regularization to encourage exploration. The second category, represented by count-based exploration via hashing, maps states to hash codes for counting and assigns higher exploration bonuses to less-encountered states. The third category utilizes hierarchy and is represented by a modular architecture for RL agents that play StarCraft II. Finally, we conclude that exploration by prior knowledge is a promising research direction and suggest topics of potentially impact
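The second category in the abstract above, counting states through hash codes, can be sketched as follows. This is a minimal illustration under stated assumptions rather than the thesis's method: the class name `HashCounter`, the random sign-projection hash, the parameter names, and the bonus form `beta / sqrt(count)` are choices made here for the example.

```python
import math
import random
from collections import Counter


class HashCounter:
    """Count visits per hash code and return a bonus that shrinks with the count.

    The hash is a random sign projection: each bit is the sign of a random
    linear function of the state. Similar states tend to share a code.
    """

    def __init__(self, state_dim, num_bits=8, beta=0.1, seed=0):
        rng = random.Random(seed)
        # One random Gaussian row per hash bit (fixed after construction).
        self.proj = [[rng.gauss(0.0, 1.0) for _ in range(state_dim)]
                     for _ in range(num_bits)]
        self.beta = beta
        self.counts = Counter()

    def hash_code(self, state):
        """Map a real-valued state vector to a tuple of 0/1 bits."""
        return tuple(
            1 if sum(w * x for w, x in zip(row, state)) >= 0 else 0
            for row in self.proj
        )

    def bonus(self, state):
        """Record a visit and return the exploration bonus for this state."""
        code = self.hash_code(state)
        self.counts[code] += 1
        return self.beta / math.sqrt(self.counts[code])


# Usage: the bonus for a repeatedly visited state decays as 1 / sqrt(count).
counter = HashCounter(state_dim=3, beta=1.0)
state = [0.5, -1.0, 2.0]
bonuses = [counter.bonus(state) for _ in range(4)]
```

In a training loop, a bonus of this kind would typically be added to the environment reward, so that rarely visited hash codes look more attractive to the agent.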
Inverse Reinforcement Learning in Swarm Systems
Inverse reinforcement learning (IRL) has become a useful tool for learning
behavioral models from demonstration data. However, IRL remains mostly
unexplored for multi-agent systems. In this paper, we show how the principle of
IRL can be extended to homogeneous large-scale problems, inspired by the
collective swarming behavior of natural systems. In particular, we make the
following contributions to the field: 1) We introduce the swarMDP framework, a
sub-class of decentralized partially observable Markov decision processes
endowed with a swarm characterization. 2) Exploiting the inherent homogeneity
of this framework, we reduce the resulting multi-agent IRL problem to a
single-agent one by proving that the agent-specific value functions in this
model coincide. 3) To solve the corresponding control problem, we propose a
novel heterogeneous learning scheme that is particularly tailored to the swarm
setting. Results on two example systems demonstrate that our framework is able
to produce meaningful local reward models from which we can replicate the
observed global system dynamics. Comment: 9 pages, 8 figures; Version 2: version accepted at AAMAS 201
Human Motion Trajectory Prediction: A Survey
With growing numbers of intelligent autonomous systems in human environments,
the ability of such systems to perceive, understand and anticipate human
behavior becomes increasingly important. Specifically, predicting future
positions of dynamic agents and planning considering such predictions are key
tasks for self-driving vehicles, service robots and advanced surveillance
systems. This paper provides a survey of human motion trajectory prediction. We
review, analyze and structure a large selection of work from different
communities and propose a taxonomy that categorizes existing methods based on
the motion modeling approach and level of contextual information used. We
provide an overview of the existing datasets and performance metrics. We
discuss limitations of the state of the art and outline directions for further
research. Comment: Submitted to the International Journal of Robotics Research (IJRR),
37 page