Eagle: End-to-end Deep Reinforcement Learning based Autonomous Control of PTZ Cameras
Existing approaches for autonomous control of pan-tilt-zoom (PTZ) cameras use
multiple stages where object detection and localization are performed
separately from the control of the PTZ mechanisms. These approaches require
manual labels and suffer from performance bottlenecks due to error propagation
across the multi-stage flow of information. The large size of object detection
neural networks also makes prior solutions infeasible for real-time deployment
in resource-constrained devices. We present an end-to-end deep reinforcement
learning (RL) solution called Eagle to train a neural network policy that
directly takes images as input to control the PTZ camera. Training
reinforcement learning policies is cumbersome in the real world due to labeling effort,
runtime environment stochasticity, and fragile experimental setups. We
introduce a photo-realistic simulation framework for training and evaluation of
PTZ camera control policies. Eagle achieves superior camera control performance
by maintaining the object of interest close to the center of captured images at
high resolution and has up to 17% more tracking duration than the
state-of-the-art. Eagle policies are lightweight (90x fewer parameters than
YOLOv5s) and can run on embedded camera platforms such as Raspberry Pi (33 FPS)
and Jetson Nano (38 FPS), facilitating real-time PTZ tracking for
resource-constrained environments. With domain randomization, Eagle policies
trained in our simulator can be transferred directly to real-world scenarios.
Comment: 20 pages, IoTD
Markov Decision Processes with Applications in Wireless Sensor Networks: A Survey
Wireless sensor networks (WSNs) consist of autonomous and resource-limited
devices. The devices cooperate to monitor one or more physical phenomena within
an area of interest. WSNs operate as stochastic systems because of randomness
in the monitored environments. For long service time and low maintenance cost,
WSNs require adaptive and robust methods to address data exchange, topology
formulation, resource and power optimization, sensing coverage and object
detection, and security challenges. In these problems, sensor nodes are to make
optimized decisions from a set of accessible strategies to achieve design
goals. This survey reviews numerous applications of the Markov decision process
(MDP) framework, a powerful decision-making tool to develop adaptive algorithms
and protocols for WSNs. Furthermore, various solution methods are discussed and
compared to serve as a guide for using MDPs in WSNs.
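To make the MDP framing above concrete, the following is a minimal sketch of value iteration, one standard MDP solution method, applied to a toy sensor node. The two battery states, the sleep/transmit actions, and every probability and reward are hypothetical numbers chosen for illustration; none of them come from the survey.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8, max_iter=10_000):
    """Solve a finite MDP by value iteration.

    P: array of shape (A, S, S), P[a, s, t] = Pr(next state t | state s, action a)
    R: array of shape (A, S), expected immediate reward for action a in state s
    Returns the value function V (S,) and a greedy policy (S,).
    """
    n_states = P.shape[1]
    V = np.zeros(n_states)
    for _ in range(max_iter):
        Q = R + gamma * (P @ V)          # Bellman backup, shape (A, S)
        V_new = Q.max(axis=0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    Q = R + gamma * (P @ V)
    return V, Q.argmax(axis=0)

# Hypothetical toy node: states 0 = battery low, 1 = battery high;
# actions 0 = sleep (harvest energy), 1 = transmit.
P = np.array([
    [[0.5, 0.5],    # sleep while low: may recharge
     [0.0, 1.0]],   # sleep while high: stays high
    [[1.0, 0.0],    # transmit while low: stays low
     [0.7, 0.3]],   # transmit while high: usually drains the battery
])
R = np.array([
    [0.0, 0.0],     # sleeping earns nothing
    [-1.0, 1.0],    # transmitting fails when low, succeeds when high
])

V, policy = value_iteration(P, R)
print("values:", V)
print("policy (0 = sleep, 1 = transmit):", policy)
```

Under these assumed numbers the greedy policy sleeps when the battery is low and transmits when it is high, which is the kind of adaptive rule the MDP formulation is meant to produce.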
Human-Machine Collaborative Optimization via Apprenticeship Scheduling
Coordinating agents to complete a set of tasks with intercoupled temporal and
resource constraints is computationally challenging, yet human domain experts
can solve these difficult scheduling problems using paradigms learned through
years of apprenticeship. A process for manually codifying this domain knowledge
within a computational framework is necessary to scale beyond the
"single-expert, single-trainee" apprenticeship model. However, human domain
experts often have difficulty describing their decision-making processes,
causing the codification of this knowledge to become laborious. We propose a
new approach for capturing domain-expert heuristics through a pairwise ranking
formulation. Our approach is model-free and does not require enumerating or
iterating through a large state space. We empirically demonstrate that this
approach accurately learns multifaceted heuristics on a synthetic data set
incorporating job-shop scheduling and vehicle routing problems, as well as on
two real-world data sets consisting of demonstrations of experts solving a
weapon-to-target assignment problem and a hospital resource allocation problem.
We also demonstrate that policies learned from human scheduling demonstration
via apprenticeship learning can substantially improve the efficiency of a
branch-and-bound search for an optimal schedule. We employ this human-machine
collaborative optimization technique on a variant of the weapon-to-target
assignment problem. We demonstrate that this technique generates solutions
substantially superior to those produced by human domain experts at a rate up
to 9.5 times faster than an optimization approach and can be applied to
optimally solve problems twice as complex as those solved by a human
demonstrator.
Comment: Portions of this paper were published in the Proceedings of the
International Joint Conference on Artificial Intelligence (IJCAI) in 2016 and
in the Proceedings of Robotics: Science and Systems (RSS) in 2016. The paper
consists of 50 pages with 11 figures and 4 tables.
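The pairwise ranking idea in the abstract above can be illustrated with a small sketch. It assumes each expert decision is recorded as a feature matrix over the candidate tasks plus the index of the task the expert chose, and it swaps in a plain logistic-regression ranker trained by gradient descent; the abstract does not state which classifier the authors use. The function names, the two features, and the synthetic "earliest deadline first" expert are all hypothetical.

```python
import numpy as np

def make_pairwise_examples(decisions):
    """Turn expert decisions into pairwise training examples.

    decisions: list of (feats, chosen), where feats has shape (n_tasks, d)
    and chosen is the index of the task the expert scheduled.
    """
    X, y = [], []
    for feats, chosen in decisions:
        for j in range(len(feats)):
            if j == chosen:
                continue
            X.append(feats[chosen] - feats[j]); y.append(1.0)  # chosen beats j
            X.append(feats[j] - feats[chosen]); y.append(0.0)  # j loses to chosen
    return np.array(X), np.array(y)

def fit_logistic(X, y, lr=0.5, steps=2000):
    """Logistic regression without intercept, trained by gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

def pick_task(w, feats):
    """With a linear ranker, the task winning the most pairwise
    comparisons is the one with the highest score feats @ w."""
    return int(np.argmax(feats @ w))

# Synthetic expert (assumption): always schedules the task with the
# earliest deadline (feature 0); feature 1 is an irrelevant distractor.
rng = np.random.default_rng(0)

def sample_decision():
    feats = rng.uniform(0.0, 1.0, size=(4, 2))
    return feats, int(np.argmin(feats[:, 0]))

train = [sample_decision() for _ in range(50)]
w = fit_logistic(*make_pairwise_examples(train))

test = [sample_decision() for _ in range(100)]
accuracy = np.mean([pick_task(w, f) == c for f, c in test])
print("learned weights:", w)
print("agreement with synthetic expert:", accuracy)
```

The point of the design is that no state space is enumerated: the learner only sees differences between the chosen task and the alternatives at each decision.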