Markov Decision Processes with Applications in Wireless Sensor Networks: A Survey
Wireless sensor networks (WSNs) consist of autonomous and resource-limited
devices. The devices cooperate to monitor one or more physical phenomena within
an area of interest. WSNs operate as stochastic systems because of randomness
in the monitored environments. To achieve long service time and low maintenance cost, WSNs require adaptive and robust methods to address data exchange, topology formation, resource and power optimization, sensing coverage and object detection, and security challenges. In these problems, sensor nodes must make optimized decisions from a set of accessible strategies to achieve design goals. This survey reviews numerous applications of the Markov decision process (MDP) framework, a powerful decision-making tool for developing adaptive algorithms and protocols for WSNs. Furthermore, various solution methods are discussed and compared to serve as a guide for using MDPs in WSNs.
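As an illustration of the MDP framework the survey reviews, the sketch below runs value iteration on a hypothetical two-state sensor-node model (a node with high or low battery chooses between sensing and sleeping). All states, actions, rewards, and transition probabilities are invented for illustration, not taken from the survey.

```python
# Hypothetical WSN node MDP: states 0 = high battery, 1 = low battery.
# transitions[action][state] -> list of (next_state, probability); all
# numbers below are illustrative assumptions.
transitions = {
    "sense": {0: [(0, 0.7), (1, 0.3)], 1: [(0, 0.1), (1, 0.9)]},
    "sleep": {0: [(0, 0.95), (1, 0.05)], 1: [(0, 0.5), (1, 0.5)]},
}
# rewards[action][state] -> expected immediate reward (assumed values):
# sensing pays off on high battery but is costly on low battery.
rewards = {"sense": {0: 2.0, 1: -1.0}, "sleep": {0: 0.0, 1: 0.5}}

def value_iteration(gamma=0.9, tol=1e-6):
    """Iterate the Bellman optimality backup until the value
    function changes by less than tol; return values and policy."""
    V = [0.0, 0.0]
    while True:
        new_V, policy = [], []
        for s in (0, 1):
            # Q-value of each action: immediate reward plus
            # discounted expected value of the next state.
            q = {a: rewards[a][s]
                 + gamma * sum(p * V[s2] for s2, p in transitions[a][s])
                 for a in transitions}
            best = max(q, key=q.get)
            policy.append(best)
            new_V.append(q[best])
        if max(abs(a - b) for a, b in zip(new_V, V)) < tol:
            return new_V, policy
        V = new_V

V, policy = value_iteration()
# For these numbers the node senses on high battery and sleeps on low.
```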
Strategic Learning for Active, Adaptive, and Autonomous Cyber Defense
The increasing incidence of advanced attacks calls for a new defense paradigm
that is active, autonomous, and adaptive, referred to as the '3A' defense
paradigm. This chapter introduces three defense schemes that actively interact
with attackers to increase the attack cost and gather threat information, i.e.,
defensive deception for detection and counter-deception, feedback-driven Moving
Target Defense (MTD), and adaptive honeypot engagement. Due to cyber
deception, external noise, and the absence of knowledge about other players'
behaviors and goals, these schemes face three progressively stricter levels of
information restriction: parameter uncertainty, payoff uncertainty, and
environmental uncertainty. To estimate unknown quantities and reduce
uncertainty, we adopt three different strategic learning schemes, each suited
to the associated information restrictions. All three learning schemes share
the same feedback structure of sensation, estimation, and actions so that the
most rewarding policies get reinforced and converge to the optimal ones in
autonomous and adaptive fashions. This work aims to shed light on proactive
defense strategies, lay a solid foundation for strategic learning under
incomplete information, and quantify the tradeoff between security and cost.
Applications of Repeated Games in Wireless Networks: A Survey
A repeated game is an effective tool for modeling interactions and conflicts
among players pursuing their objectives on a long-term basis. In contrast to
static noncooperative games, which model an interaction among players in only
one period, repeated games let interactions recur over multiple periods;
players thus become aware of other players' past behaviors and future
benefits, and adapt their behavior accordingly. In wireless
networks, conflicts among wireless nodes can lead to selfish behaviors,
resulting in poor network performance and detrimental individual payoffs. In
this paper, we survey the applications of repeated games in different wireless
networks. The main goal is to demonstrate the use of repeated games to
encourage wireless nodes to cooperate, thereby improving network performance
and avoiding network disruption due to selfish behaviors. Furthermore, various
problems in wireless networks and variations of repeated game models together
with the corresponding solutions are discussed in this survey. Finally, we
outline some open issues and future research directions.
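The core mechanism of this abstract — players weighing future benefits against a one-shot gain from selfish deviation — can be sketched with the classic grim-trigger cooperation condition in an infinitely repeated prisoner's dilemma. The payoff values below are the standard textbook ones, not numbers from the survey.

```python
# Standard prisoner's dilemma payoffs: temptation T, mutual-cooperation
# reward R, mutual-defection punishment P, sucker payoff S (T > R > P > S).
T, R, P, S = 5.0, 3.0, 1.0, 0.0

def cooperation_sustainable(delta):
    """Under grim trigger, cooperation is an equilibrium iff the
    discounted stream of mutual cooperation beats a one-shot deviation
    followed by mutual defection forever (delta = discount factor)."""
    cooperate_value = R / (1 - delta)            # R + delta*R + delta^2*R + ...
    deviate_value = T + delta * P / (1 - delta)  # T once, then P forever
    return cooperate_value >= deviate_value

# Solving the inequality gives the threshold delta >= (T - R) / (T - P):
# sufficiently patient players sustain cooperation (0.5 for these payoffs).
threshold = (T - R) / (T - P)
```

This is exactly the "future benefits" logic the abstract describes: cooperation survives only when players value future periods enough.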
Agent-based transportation planning compared with scheduling heuristics
Here we consider the problem of dynamically assigning vehicles to transportation orders that have different time windows and must be handled in real time. We introduce a new agent-based system for the planning and scheduling of these transportation networks. Intelligent vehicle agents schedule their own routes. They interact with job agents, who strive for minimum transportation costs, using a Vickrey auction for each incoming order. We use simulation to compare the on-time delivery percentage and the vehicle utilization of an agent-based planning system to a traditional system based on OR heuristics (look-ahead rules, serial scheduling). Numerical experiments show that a properly designed multi-agent system may perform as well as, or even better than, traditional methods.
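The per-order mechanism described above can be sketched as a second-price (Vickrey) auction among vehicle agents: each agent bids its marginal cost of inserting the order into its route, the cheapest vehicle wins, and it is paid the second-cheapest bid, which makes truthful cost-bidding a dominant strategy. The vehicle names and bid values below are illustrative, not from the paper.

```python
def vickrey_auction(bids):
    """bids: dict mapping vehicle_id -> marginal insertion cost.
    Returns (winner, payment): lowest bidder wins, and the payment
    equals the second-lowest bid (second-price rule)."""
    ordered = sorted(bids, key=bids.get)
    winner = ordered[0]
    # With a single bidder there is no second price; pay the lone bid.
    payment = bids[ordered[1]] if len(ordered) > 1 else bids[winner]
    return winner, payment

# Hypothetical round for one incoming order:
winner, payment = vickrey_auction(
    {"truck_a": 40.0, "truck_b": 55.0, "truck_c": 48.0}
)
# truck_a wins the order and is paid 48.0 (truck_c's bid).
```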
Reinforcement Learning
Brains rule the world, and brain-like computation is increasingly used in computers and electronic devices. Brain-like computation is about processing and interpreting data, or directly proposing and performing actions. Learning is a very important aspect of it. This book is about reinforcement learning, which involves performing actions to achieve a goal. The first 11 chapters of this book describe and extend the scope of reinforcement learning. The remaining 11 chapters show that there is already wide usage in numerous fields. Reinforcement learning can tackle control tasks that are too complex for traditional, hand-designed, non-learning controllers. As learning computers can deal with technical complexities, the task of human operators remains to specify goals at increasingly higher levels. This book shows that reinforcement learning is a very dynamic area in terms of theory and applications, and it should stimulate and encourage new research in the field.
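As a concrete instance of the learning-by-acting idea this abstract describes, here is a minimal tabular Q-learning sketch on a toy four-state chain: moving right eventually reaches a rewarding terminal state, and the agent learns this purely from trial-and-error feedback. The environment and hyperparameters are illustrative, not from the book.

```python
import random

def step(state, action):
    """Toy chain with states 0..3: action 1 moves right, action 0 moves
    left; reaching state 3 pays reward 1 and ends the episode."""
    next_state = min(state + 1, 3) if action == 1 else max(state - 1, 0)
    reward = 1.0 if next_state == 3 else 0.0
    return next_state, reward, next_state == 3

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    random.seed(seed)
    Q = {(s, a): 0.0 for s in range(4) for a in (0, 1)}
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Epsilon-greedy: mostly exploit, sometimes explore.
            if random.random() < epsilon:
                action = random.choice((0, 1))
            else:
                action = max((0, 1), key=lambda a: Q[(state, a)])
            next_state, reward, done = step(state, action)
            # Temporal-difference update toward reward + discounted
            # value of the best next action.
            target = reward + gamma * max(Q[(next_state, a)] for a in (0, 1))
            Q[(state, action)] += alpha * (target - Q[(state, action)])
            state = next_state
    return Q

Q = q_learning()
greedy = [max((0, 1), key=lambda a: Q[(s, a)]) for s in range(3)]
# The learned greedy policy moves right toward the rewarding state.
```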