359 research outputs found
LIPIcs, Volume 251, ITCS 2023, Complete Volume
LIPIcs, Volume 251, ITCS 2023, Complete Volum
Probabilistic Inference for Model Based Control
Robotic systems are essential for enhancing productivity, automation, and performing hazardous tasks. Addressing the unpredictability of physical systems, this thesis advances robotic planning and control under uncertainty, introducing learning-based methods for managing uncertain parameters and adapting to changing environments in real-time.
Our first contribution is a framework using Bayesian statistics for likelihood-free inference of model parameters. This allows employing complex simulators for designing efficient, robust controllers. The method, integrating the unscented transform with a variant of information theoretical model predictive control, shows better performance in trajectory evaluation compared to Monte Carlo sampling, easing the computational load in various control and robotics tasks.
Next, we reframe robotic planning and control as a Bayesian inference problem, focusing on the posterior distribution of actions and model parameters. An implicit variational inference algorithm, performing Stein Variational Gradient Descent, estimates distributions over model parameters and control inputs in real-time. This Bayesian approach effectively handles complex multi-modal posterior distributions, vital for dynamic and realistic robot navigation.
Finally, we tackle diversity in high-dimensional spaces. Our approach mitigates underestimation of uncertainty in posterior distributions, which leads to locally optimal solutions. Using the theory of rough paths, we develop an algorithm for parallel trajectory optimisation, enhancing solution diversity and avoiding mode collapse. This method extends our variational inference approach for trajectory estimation, employing diversity-enhancing kernels and leveraging path signature representation of trajectories. Empirical tests, ranging from 2-D navigation to robotic manipulators in cluttered environments, affirm our method's efficiency, outperforming existing alternatives
Acceleration in Policy Optimization
We work towards a unifying paradigm for accelerating policy optimization
methods in reinforcement learning (RL) by integrating foresight in the policy
improvement step via optimistic and adaptive updates. Leveraging the connection
between policy iteration and policy gradient methods, we view policy
optimization algorithms as iteratively solving a sequence of surrogate
objectives, local lower bounds on the original objective. We define optimism as
predictive modelling of the future behavior of a policy, and adaptivity as
taking immediate and anticipatory corrective actions to mitigate accumulating
errors from overshooting predictions or delayed responses to change. We use
this shared lens to jointly express other well-known algorithms, including
model-based policy improvement based on forward search, and optimistic
meta-learning algorithms. We analyze properties of this formulation, and show
connections to other accelerated optimization algorithms. Then, we design an
optimistic policy gradient algorithm, adaptive via meta-gradient learning, and
empirically highlight several design choices pertaining to acceleration, in an
illustrative task
Suspicion-Agent: Playing Imperfect Information Games with Theory of Mind Aware GPT-4
Unlike perfect information games, where all elements are known to every
player, imperfect information games emulate the real-world complexities of
decision-making under uncertain or incomplete information. GPT-4, the recent
breakthrough in large language models (LLMs) trained on massive passive data,
is notable for its knowledge retrieval and reasoning abilities. This paper
delves into the applicability of GPT-4's learned knowledge for imperfect
information games. To achieve this, we introduce \textbf{Suspicion-Agent}, an
innovative agent that leverages GPT-4's capabilities for performing in
imperfect information games. With proper prompt engineering to achieve
different functions, Suspicion-Agent based on GPT-4 demonstrates remarkable
adaptability across a range of imperfect information card games. Importantly,
GPT-4 displays a strong high-order theory of mind (ToM) capacity, meaning it
can understand others and intentionally impact others' behavior. Leveraging
this, we design a planning strategy that enables GPT-4 to competently play
against different opponents, adapting its gameplay style as needed, while
requiring only the game rules and descriptions of observations as input. In the
experiments, we qualitatively showcase the capabilities of Suspicion-Agent
across three different imperfect information games and then quantitatively
evaluate it in Leduc Hold'em. The results show that Suspicion-Agent can
potentially outperform traditional algorithms designed for imperfect
information games, without any specialized training or examples. In order to
encourage and foster deeper insights within the community, we make our
game-related data publicly available
Barriers and Challenges to the Adoption of Collaborative Project Delivery Methods: A Multiple-Case Study Analysis
The last 30 years have seen considerable advantages afforded to those who chose to adopt more collaborative project delivery methods (IPD, Progressive Design Build) in the Construction Industry; however, despite these benefits, construction projects utilizing collaborative delivery methods still only account for a small fraction of overall construction deliveries. This research focused on probing leading industry professionals to better identify the barriers and challenges which are currently preventing project stakeholders across the United States from adopting these more collaborative project delivery methods, particularly first-time adopters. This study collected data from semi-structured interviews with 13 professionals in the Construction Industry who had experience with collaborative project delivery methods. Detailed analysis of stakeholder responses using NVivo software led to key insights associated with the implementation of IPD and Progressive Design Build projects including those related to teaming, learning, and administration. A new class of challenges is proposed related to managerial obstacles. The implication of this research is that appropriately recognizing and categorizing the pitfalls associated with the implementation of collaborative project delivery methods can provide Construction Industry professionals with a valuable framework for formulating effective solutions to overcome them
Securing multi-robot systems with inter-robot observations and accusations
In various industries, such as manufacturing, logistics, agriculture, defense, search and rescue, and transportation, Multi-robot systems (MRSs) are increasingly gaining popularity. These systems involve multiple robots working together towards a shared objective, either autonomously or under human supervision. However, as MRSs operate in uncertain or even adversarial environments, and the sensors and actuators of each robot may be error-prone, they are susceptible to faults and security threats unique to MRSs. Classical techniques from distributed systems cannot detect or mitigate these threats. In this dissertation, novel techniques are proposed to enhance the security and fault-tolerance of MRSs through inter-robot observations and accusations.
A fundamental security property is proposed for MRSs, which ensures that forbidden deviations from a desired multi-robot motion plan by the system supervisor are detected. Relying solely on self-reported motion information from the robots for monitoring deviations can leave the system vulnerable to attacks from a single compromised robot. The concept of co-observations is introduced, which are additional data reported to the supervisor to supplement the self-reported motion information. Co-observation-based detection is formalized as a method of identifying deviations from the expected motion plan based on discrepancies in the sequence of co-observations reported. An optimal deviation-detecting motion planning problem is formulated that achieves all the original application objectives while ensuring that all forbidden plan-deviation attacks trigger co-observation-based detection by the supervisor. A secure motion planner based on constraint solving is proposed as a proof-of-concept to implement the deviation-detecting security property.
The security and resilience of MRSs against plan deviation attacks are further improved by limiting the information available to attackers. An efficient algorithm is proposed that verifies the inability of an attacker to stealthily perform forbidden plan deviation attacks with a given motion plan and announcement scheme. Such announcement schemes are referred to as horizon-limiting. An optimal horizon-limiting planning problem is formulated that maximizes planning lookahead while maintaining the announcement scheme as horizon-limiting. Co-observations and horizon-limiting announcements are shown to be efficient and scalable in protecting MRSs, including systems with hundreds of robots, as evidenced by a case study in a warehouse setting.
Lastly, the Decentralized Blocklist Protocol (DBP), a method for designing Byzantine-resilient decentralized MRSs, is introduced. DBP is based on inter-robot accusations and allows cooperative robots to identify misbehavior through co-observations and share this information through the network. The method is adaptive to the number of faulty robots and is widely applicable to various decentralized MRS applications. It also permits fast information propagation, requires fewer cooperative observers of application-specific variables, and reduces the worst-case connectivity requirement, making it more scalable than existing methods. Empirical results demonstrate the scalability and effectiveness of DBP in cooperative target tracking, time synchronization, and localization case studies with hundreds of robots.
The techniques proposed in this dissertation enhance the security and fault-tolerance of MRSs operating in uncertain and adversarial environments, aiding in the development of secure MRSs for emerging applications
A Finite-Horizon Approach to Active Level Set Estimation
We consider the problem of active learning in the context of spatial sampling
for level set estimation (LSE), where the goal is to localize all regions where
a function of interest lies above/below a given threshold as quickly as
possible. We present a finite-horizon search procedure to perform LSE in one
dimension while optimally balancing both the final estimation error and the
distance traveled for a fixed number of samples. A tuning parameter is used to
trade off between the estimation accuracy and distance traveled. We show that
the resulting optimization problem can be solved in closed form and that the
resulting policy generalizes existing approaches to this problem. We then show
how this approach can be used to perform level set estimation in higher
dimensions under the popular Gaussian process model. Empirical results on
synthetic data indicate that as the cost of travel increases, our method's
ability to treat distance nonmyopically allows it to significantly improve on
the state of the art. On real air quality data, our approach achieves roughly
one fifth the estimation error at less than half the cost of competing
algorithms
On Transforming Reinforcement Learning by Transformer: The Development Trajectory
Transformer, originally devised for natural language processing, has also
attested significant success in computer vision. Thanks to its super expressive
power, researchers are investigating ways to deploy transformers to
reinforcement learning (RL) and the transformer-based models have manifested
their potential in representative RL benchmarks. In this paper, we collect and
dissect recent advances on transforming RL by transformer (transformer-based RL
or TRL), in order to explore its development trajectory and future trend. We
group existing developments in two categories: architecture enhancement and
trajectory optimization, and examine the main applications of TRL in robotic
manipulation, text-based games, navigation and autonomous driving. For
architecture enhancement, these methods consider how to apply the powerful
transformer structure to RL problems under the traditional RL framework, which
model agents and environments much more precisely than deep RL methods, but
they are still limited by the inherent defects of traditional RL algorithms,
such as bootstrapping and "deadly triad". For trajectory optimization, these
methods treat RL problems as sequence modeling and train a joint state-action
model over entire trajectories under the behavior cloning framework, which are
able to extract policies from static datasets and fully use the long-sequence
modeling capability of the transformer. Given these advancements, extensions
and challenges in TRL are reviewed and proposals about future direction are
discussed. We hope that this survey can provide a detailed introduction to TRL
and motivate future research in this rapidly developing field.Comment: 26 page
- …