Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks
Future wireless networks have substantial potential to support a broad range of
complex, compelling applications in both military and civilian fields, where
users can enjoy high-rate, low-latency, low-cost, and reliable information
services. Achieving this ambitious goal requires new radio techniques for
adaptive learning and intelligent decision making, owing to the complex,
heterogeneous nature of network structures and wireless services.
Machine learning (ML) algorithms have achieved great success in supporting big
data analytics, efficient parameter estimation, and interactive decision making.
Hence, in this article, we review the thirty-year history of ML by elaborating
on supervised learning, unsupervised learning, reinforcement learning and deep
learning. Furthermore, we investigate their employment in the compelling
applications of wireless networks, including heterogeneous networks (HetNets),
cognitive radios (CR), Internet of things (IoT), machine to machine networks
(M2M), and so on. This article aims to assist readers in clarifying the
motivation and methodology of the various ML algorithms, so that they can be
invoked for hitherto unexplored services and scenarios of future wireless
networks. (Comment: 46 pages, 22 figures)
Adaptive and learning-based formation control of swarm robots
Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face several open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in robot swarms with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation control could be performed by swarm robots with limited communication and perception (e.g., the Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between humans and swarm robots (e.g., the BristleBot) for artistic creation. In particular, we combine bio-inspired (i.e., flocking, foraging) techniques with learning-based control strategies (using artificial neural networks) for adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking controller for UAV swarms using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP) and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy, and each UAV performs actions based on the local information it collects. In addition, to avoid collisions among UAVs and guarantee flocking and navigation, a reward function is designed that combines global flocking maintenance, a mutual reward, and a collision penalty.
We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in the arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walks to control the communication among a team of robots with swarming behavior for musical creation.
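As a concrete illustration of the reward shaping described above, the sketch below combines the three terms the abstract names: flocking maintenance, a mutual (navigation) reward, and a collision penalty. The function name, weights, and distance thresholds are illustrative assumptions, not the thesis's actual values.

```python
import numpy as np

def flocking_reward(positions, leader_pos, d_ref=1.0, d_safe=0.3,
                    w_flock=1.0, w_nav=0.5, c_collision=10.0):
    """Shared reward for a leader-follower flocking POMDP (hedged sketch).

    positions : (N, 2) array of follower UAV positions
    leader_pos: (2,) position of the leader the flock should track
    All weights and thresholds are placeholder assumptions.
    """
    n = len(positions)
    flock = 0.0
    collisions = 0
    for i in range(n):
        for j in range(i + 1, n):
            d = np.linalg.norm(positions[i] - positions[j])
            # Flocking maintenance: penalize deviation from reference spacing
            flock -= (d - d_ref) ** 2
            # Count near-collisions for the collision penalty
            if d < d_safe:
                collisions += 1
    # Mutual/navigation reward: stay close to the leader
    nav = -np.mean(np.linalg.norm(positions - leader_pos, axis=1))
    return w_flock * flock + w_nav * nav - c_collision * collisions
```

A well-spaced flock near the leader scores higher than a colliding one, which is the gradient the shared DDPG policy would exploit during centralized training.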
Coalition Formation under Uncertainty
Many multiagent systems require allocation of agents to tasks in order to ensure successful task execution. Most systems that perform this allocation assume that the quantity of agents needed for a task is known beforehand. Coalition formation approaches relax this assumption, allowing multiple agents to be dynamically assigned. Unfortunately, many current approaches to coalition formation lack provisions for uncertainty. This prevents application of coalition formation techniques to complex domains, such as real-world robotic systems and agent domains where full state knowledge is not available. Those that do handle uncertainty have no ability to handle dynamic addition or removal of agents from the collective, and they constrain the environment to limit the sources of uncertainty. A modeling approach and algorithm for coalition formation is presented that decreases the collective's dependence on knowing agent types. The agent modeling approach enforces stability, allows for arbitrary expansion of the collective, and serves as a basis for calculation of individual coalition payoffs. It explicitly captures uncertainty in agent type, allows uncertainty in coalition value and agent cost, and requires no agent in the collective to perfectly know another agent's type. The modeling approach is incorporated into a two-part algorithm to generate, evaluate, and join stable coalitions for task execution. A comparison with a prior approach designed to handle uncertainty in agent type shows that the protocol not only provides greater flexibility but also handles uncertainty on a greater scale. Additional results show the application of the approach to real-world robotics and demonstrate the algorithm's scalability. This provides a framework well suited to decentralized task allocation in general collectives.
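The payoff calculation over uncertain agent types can be sketched as an expectation over joint type assignments; `expected_coalition_value` and its belief representation are hypothetical stand-ins for the modeling approach described above, tractable only for small coalitions.

```python
from itertools import product

def expected_coalition_value(members, type_beliefs, value_by_types):
    """Expected coalition value under agent-type uncertainty (sketch).

    members       : list of agent ids forming the candidate coalition
    type_beliefs  : {agent_id: {type: probability}} -- the evaluating
                    agent's belief over each member's type
    value_by_types: function mapping a tuple of concrete types to the
                    coalition's value for the task
    Enumerates every joint type assignment and weights its value by the
    product of the individual belief probabilities.
    """
    dists = [type_beliefs[a] for a in members]
    total = 0.0
    for combo in product(*(d.items() for d in dists)):
        types = tuple(t for t, _ in combo)
        prob = 1.0
        for _, p in combo:
            prob *= p
        total += prob * value_by_types(types)
    return total
```

For example, if two agents are each believed to be "strong" with probability 0.5 and the task pays only when both are strong, the expected value is 0.25, so a decentralized bidder would discount its bid accordingly.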
A Decentralized Partially Observable Markov Decision Model with Action Duration for Goal Recognition in Real Time Strategy Games
Multiagent goal recognition is a tough yet important problem in many real-time strategy games and simulation systems. Traditional modeling methods either demand detailed domain knowledge of the agents and training datasets for policy estimation or lack a clear definition of action duration. To solve these problems, we propose a novel Dec-POMDM-T model, combining the classic Dec-POMDP, an observation model for the recognizer, a joint goal with its termination indicator, and time duration variables for actions with action termination variables. In this paper, a model-free algorithm named cooperative colearning based on Sarsa is used. Considering that Dec-POMDM-T usually encounters multiagent goal recognition problems with different sorts of noise, partially missing data, and unknown action durations, the paper exploits the SIS particle filter (PF) with resampling for inference under the dynamic Bayesian network structure of Dec-POMDM-T. In experiments, a modified predator-prey scenario is adopted to study the multiagent joint goal recognition problem, i.e., recognizing the joint target shared among cooperative predators. Experimental results show that (a) Dec-POMDM-T works effectively in multiagent goal recognition and adapts well to dynamically changing goals within the agent group; and (b) Dec-POMDM-T outperforms traditional Dec-MDP-based methods in terms of precision, recall, and F-measure.
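A minimal version of the SIS particle filter with resampling used for goal inference might look like the following; the goal-switching model, ESS threshold, and function names are illustrative assumptions, not the paper's exact formulation of the Dec-POMDM-T network.

```python
import random

def goal_particle_filter(observations, goals, likelihood,
                         switch_prob=0.05, n_particles=500, rng=None):
    """SIS particle filter with resampling over a discrete joint-goal
    variable (hedged sketch of the inference step only).

    observations: sequence of observations
    goals       : list of candidate joint goals
    likelihood  : function (observation, goal) -> p(obs | goal)
    switch_prob : chance the team switches goal between steps, which
                  lets the filter track dynamically changing goals
    Returns the final posterior {goal: probability}.
    """
    rng = rng or random.Random(0)
    particles = [rng.choice(goals) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    for obs in observations:
        # Propagate: occasionally redraw a goal (goal-switching dynamics)
        particles = [rng.choice(goals) if rng.random() < switch_prob else g
                     for g in particles]
        # Reweight each particle by the observation likelihood
        weights = [w * likelihood(obs, g) for w, g in zip(weights, particles)]
        total = sum(weights)
        if total <= 0.0:  # degenerate weights: reset to uniform
            weights = [1.0 / n_particles] * n_particles
        else:
            weights = [w / total for w in weights]
        # Resample when the effective sample size collapses
        ess = 1.0 / sum(w * w for w in weights)
        if ess < 0.5 * n_particles:
            particles = rng.choices(particles, weights=weights, k=n_particles)
            weights = [1.0 / n_particles] * n_particles
    posterior = {g: 0.0 for g in goals}
    for g, w in zip(particles, weights):
        posterior[g] += w
    return posterior
```

With observations consistently favoring one goal, the posterior concentrates on it while the switch model keeps a small reserve of probability on alternatives, which is what allows tracking goal changes mid-episode.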
Intention prediction for interactive navigation in distributed robotic systems
Modern applications of mobile robots require them to have the ability to safely and
effectively navigate in human environments. New challenges arise when these
robots must plan their motion in a human-aware fashion. Current methods
addressing this problem have focused mainly on the activity forecasting aspect,
aiming at improving predictions without considering the active nature of the
interaction, i.e. the robot's effect on the environment and consequent issues such as
reciprocity. Furthermore, many methods rely on computationally expensive offline
training of predictive models that may not be well suited to rapidly evolving
dynamic environments.
This thesis presents a novel approach for enabling autonomous robots to navigate
socially in environments with humans. Following formulations of the inverse
planning problem, agents reason about the intentions of other agents and make
predictions about their future interactive motion. A technique is proposed to
implement counterfactual reasoning over a parametrised set of light-weight
reciprocal motion models, thus making it more tractable to maintain beliefs over the
future trajectories of other agents towards plausible goals. The speed of inference
and the effectiveness of the algorithms is demonstrated via physical robot
experiments, where computationally constrained robots navigate amongst humans
in a distributed multi-sensor setup, able to infer other agents' intentions as fast as
100ms after the first observation.
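The inverse-planning style of intention inference described above can be sketched as a single Bayesian belief update over candidate goals; here `predict_step` stands in for the parametrised reciprocal motion models, and the Gaussian motion-noise model is an assumption made for illustration.

```python
import math

def update_goal_belief(belief, observed_step, predict_step, noise=0.2):
    """One Bayesian update of a belief over goals from one observed
    motion step (hedged sketch of counterfactual intention inference).

    belief       : {goal: prior probability}
    observed_step: (dx, dy) the agent actually moved
    predict_step : function goal -> (dx, dy) the motion model predicts
                   if the agent were heading toward that goal
    noise        : std of assumed Gaussian motion noise
    """
    posterior = {}
    for goal, prior in belief.items():
        px, py = predict_step(goal)
        ox, oy = observed_step
        # Likelihood of the observation under the counterfactual model
        err2 = (ox - px) ** 2 + (oy - py) ** 2
        posterior[goal] = prior * math.exp(-err2 / (2 * noise ** 2))
    total = sum(posterior.values()) or 1e-300
    return {g: p / total for g, p in posterior.items()}
```

Because each update costs one model evaluation per goal, repeating it per observation stays cheap on computationally constrained robots, consistent with the sub-100 ms inference reported above.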
While intention inference is a key aspect of successful human-robot interaction,
executing any task requires planning that takes into account the predicted goals and
trajectories of other agents, e.g., pedestrians. It is well known that robots
demonstrate unwanted behaviours, such as freezing or becoming sluggishly
responsive, when placed in dynamic and cluttered environments, because safety
margins derived from simple heuristics end up covering the entire feasible
space of motion. The presented approach makes more refined predictions
about future movement, which enables robots to find collision-free paths quickly
and efficiently.
This thesis describes a novel technique for generating "interactive costmaps", a
representation of the planner's costs and rewards across time and space,
providing an autonomous robot with the information required to navigate socially
given the estimate of other agents' intentions. This multi-layered costmap
deters the robot from obstructing other agents while encouraging social
navigation respectful of their activity.
Results show that this approach minimises collisions and near-collisions, minimises
travel times for agents, and importantly offers the same computational cost as the
most common costmap alternatives for navigation.
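A multi-layered costmap of this kind can be sketched by overlaying a static obstacle layer with time-discounted layers built from predicted trajectories; the layer weights, decay factor, and function name are illustrative assumptions, not the thesis's actual layer definitions.

```python
import numpy as np

def interactive_costmap(static_obstacles, predicted_paths, shape,
                        obstacle_cost=100.0, social_cost=50.0, decay=0.8):
    """Combine a static layer with predicted-trajectory layers into one
    planning costmap (hedged sketch).

    static_obstacles: list of (row, col) occupied cells
    predicted_paths : {agent: [(row, col), ...]} predicted cells, in
                      time order, for each other agent
    shape           : (rows, cols) of the grid
    Later steps of a prediction are discounted by `decay`, so imminent
    occupancy deters the planner more than distant occupancy.
    """
    cost = np.zeros(shape)
    for r, c in static_obstacles:
        cost[r, c] = obstacle_cost  # hard obstacle layer
    for path in predicted_paths.values():
        for t, (r, c) in enumerate(path):
            # social layer: keep the highest cost any agent imposes
            cost[r, c] = max(cost[r, c], social_cost * (decay ** t))
    return cost
```

A standard grid planner can consume this map unchanged, which is one way the approach could match the computational cost of common costmap alternatives.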
A key part of the practical deployment of such technologies is their ease of
implementation and configuration. Since every use case and environment is
different and distinct, the presented methods use online adaptation to learn
parameters of the navigating agents during runtime. Furthermore, this thesis
includes a novel technique for allocating tasks in distributed robotics systems,
where a tool is provided to maximise the performance on any distributed setup by
automatic parameter tuning. All of these methods are implemented in ROS and
distributed as open-source. The ultimate aim is to provide an accessible and efficient
framework that may be seamlessly deployed on modern robots, enabling
widespread use of intention prediction for interactive navigation in distributed
robotic systems.
Scalable Decision-Theoretic Planning in Open and Typed Multiagent Systems
In open agent systems, the set of agents that are cooperating or competing
changes over time and in ways that are nontrivial to predict. For example, if
collaborative robots were tasked with fighting wildfires, they may run out of
suppressants and be temporarily unavailable to assist their peers. We consider
the problem of planning in these contexts with the additional challenges that
the agents are unable to communicate with each other and that there are many of
them. Because an agent's optimal action depends on the actions of others, each
agent must not only predict the actions of its peers, but, before that, reason
whether they are even present to perform an action. Addressing openness thus
requires agents to model each other's presence, which becomes computationally
intractable with high numbers of agents. We present a novel, principled, and
scalable method in this context that enables an agent to reason about others'
presence in its shared environment and their actions. Our method extrapolates
models of a few peers to the overall behavior of the many-agent system and
combines this extrapolation with a generalization of Monte Carlo tree search to perform
individual agent reasoning in many-agent open environments. Theoretical
analyses establish the number of agents to model in order to achieve acceptable
worst case bounds on extrapolation error, as well as regret bounds on the
agent's utility from modeling only some neighbors. Simulations of multiagent
wildfire suppression problems demonstrate our approach's efficacy compared with
alternative baselines. (Comment: Pre-print with appendices for AAAI 202)
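The extrapolation step, modeling only a few peers and scaling to the full open system, can be sketched under an exchangeability assumption as a binomial approximation; the function and its interface are hypothetical, not the paper's actual estimator.

```python
from math import comb

def extrapolate_joint_action(sampled_models, n_total):
    """Estimate the distribution over how many of n_total agents act,
    from a handful of sampled peer models (hedged sketch).

    sampled_models: list of per-peer probabilities of taking the action
                    (e.g., still having suppressant and fighting a fire)
    n_total       : number of agents in the open system
    Averages the sampled models into one action probability, then treats
    the population as n_total exchangeable draws (a binomial model).
    """
    p = sum(sampled_models) / len(sampled_models)
    return {k: comb(n_total, k) * p ** k * (1 - p) ** (n_total - k)
            for k in range(n_total + 1)}
```

An agent's tree search could then branch on these aggregate counts instead of on every peer's individual action, which is what keeps reasoning tractable as the number of agents grows.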
Theory of mind and decision science: Towards a typology of tasks and computational models
The ability to form a Theory of Mind (ToM), i.e., to theorize about others' mental states to explain and predict behavior in relation to attributed intentional states, constitutes a hallmark of human cognition. These abilities are multi-faceted and include a variety of different cognitive sub-functions. Here, we focus on decision processes in social contexts and review a number of experimental and computational modeling approaches in this field. We provide an overview of experimental accounts and formal computational models with respect to two dimensions: interactivity and uncertainty. Thereby, we aim at capturing the nuances of ToM functions in the context of social decision processes. We suggest that ToM engagement and multiplexing increase as social cognitive decision-making tasks become more interactive and uncertain. We propose that representing others as intentional, goal-directed agents who perform consequential actions is elicited only at the edges of these two dimensions. Further, we argue that computational models of valuation and beliefs follow these dimensions to best allow researchers to effectively model sophisticated ToM processes. Finally, we relate this typology to neuroimaging findings in neurotypical (NT) humans, studies of persons with autism spectrum (AS), and studies of nonhuman primates.
A Unified Framework for Solving Multiagent Task Assignment Problems
Multiagent task assignment problem descriptors do not fully represent the complex interactions in a multiagent domain, and algorithmic solutions vary widely depending on how the domain is represented. This issue is compounded as related research fields contain descriptors that similarly describe multiagent task assignment problems, including complex domain interactions, but generally do not provide the mechanisms needed to solve the multiagent aspect of task assignment. This research presents a unified approach to representing and solving the multiagent task assignment problem for complex problem domains. Ideas central to multiagent task allocation, project scheduling, constraint satisfaction, and coalition formation are combined to form the basis of the constrained multiagent task scheduling (CMTS) problem. Basic analysis reveals the exponential size of the solution space for a CMTS problem, approximated by O(2^(n(m+n))) based on the number of agents and tasks involved in a problem. The shape of the solution space is shown to contain numerous discontinuous regions due to the complexities involved in relational constraints defined between agents and tasks. The CMTS descriptor represents a wide range of classical and modern problems, such as job shop scheduling, the traveling salesman problem, vehicle routing, and cooperative multi-object tracking. Problems using the CMTS representation are solvable by a suite of algorithms, with varying degrees of suitability. Solution generating methods range from simple random scheduling to state-of-the-art biologically inspired approaches. Techniques from classical task assignment solvers are extended to handle multiagent task problems where agents can also multitask.
Additional ideas are incorporated from constraint satisfaction, project scheduling, evolutionary algorithms, dynamic coalition formation, auctioning, and behavior-based robotics to highlight how different solution generation strategies apply to the complex problem space.
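Reading the abstract's solution-space bound as O(2^(n(m+n))) with n agents and m tasks (the exponent appears to have been flattened during text extraction, so treat this reading as an assumption), its growth can be checked with a one-line helper:

```python
def cmts_solution_space(n_agents, n_tasks):
    """Approximate CMTS solution-space size, assuming the bound is
    2^(n(m+n)) for n agents and m tasks (a reading of the abstract's
    flattened notation, not a verified formula)."""
    return 2 ** (n_agents * (n_tasks + n_agents))
```

Even two agents and one task yield 2^6 = 64 candidate schedules, and each added agent multiplies the exponent, which is why the suite of heuristic and biologically inspired solvers matters.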