131 research outputs found

    Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks

    Future wireless networks have substantial potential to support a broad range of complex, compelling applications in both military and civilian fields, where users are able to enjoy high-rate, low-latency, low-cost and reliable information services. Achieving this ambitious goal requires new radio techniques for adaptive learning and intelligent decision making because of the complex heterogeneous nature of the network structures and wireless services. Machine learning (ML) algorithms have achieved great success in supporting big data analytics, efficient parameter estimation and interactive decision making. Hence, in this article, we review the thirty-year history of ML by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning. Furthermore, we investigate their employment in compelling applications of wireless networks, including heterogeneous networks (HetNets), cognitive radios (CR), the Internet of things (IoT), machine-to-machine networks (M2M), and so on. This article aims to assist readers in clarifying the motivation and methodology of the various ML algorithms, so as to invoke them for hitherto unexplored services as well as scenarios of future wireless networks. Comment: 46 pages, 22 fig

    Adaptive and learning-based formation control of swarm robots

    Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face several open challenges, including robust autonomy and adaptive coordination based on the environment and operating conditions, particularly in swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation could be performed by swarm robots with limited communication and perception (e.g., Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between human and swarm robots (e.g., BristleBot) for artistic creation. In particular, we combine bio-inspired techniques (i.e., flocking, foraging) with learning-based control strategies (using artificial neural networks) for adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking control for UAV swarms using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP), and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy, and each UAV performs actions based on the local information it collects. In addition, to avoid collisions among UAVs and guarantee flocking and navigation, the reward function combines global flocking maintenance, a mutual reward, and a collision penalty. We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walk to control the communication between a team of robots with swarming behavior for musical creation.
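
As a rough illustration of the particle swarm optimization (PSO) technique named in the abstract above, the following is a minimal generic sketch and not the thesis's method. The function name `pso_minimize`, the coefficient values, and the sphere objective in the usage note are all assumptions chosen for illustration; the thesis applies PSO to robot communication for musical creation, which is not modeled here.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, iterations=100,
                 inertia=0.7, cognitive=1.5, social=1.5, seed=0):
    """Minimize `objective` over a box using a basic global-best PSO.

    All parameter values are illustrative assumptions, not taken from the thesis.
    """
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions inside the box, zero initial velocities.
    positions = [[rng.uniform(lo, hi) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers the best position it has visited so far.
    personal_best = [p[:] for p in positions]
    personal_best_val = [objective(p) for p in positions]
    g = min(range(n_particles), key=lambda i: personal_best_val[i])
    global_best = personal_best[g][:]
    global_best_val = personal_best_val[g]

    for _ in range(iterations):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull toward the particle's own
                # best position, and pull toward the swarm's best position.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + cognitive * r1 * (personal_best[i][d] - positions[i][d])
                    + social * r2 * (global_best[d] - positions[i][d])
                )
                # Move and clamp the coordinate to the search box.
                positions[i][d] = min(hi, max(lo, positions[i][d] + velocities[i][d]))
            val = objective(positions[i])
            if val < personal_best_val[i]:
                personal_best[i] = positions[i][:]
                personal_best_val[i] = val
                if val < global_best_val:
                    global_best = positions[i][:]
                    global_best_val = val
    return global_best, global_best_val


def sphere(x):
    """Toy objective (an assumption): sum of squares, minimized at the origin."""
    return sum(v * v for v in x)
```

Calling `pso_minimize(sphere, dim=2, bounds=(-5.0, 5.0))` returns the best position found and its objective value; the fixed seed only makes the sketch reproducible.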

    Coalition Formation under Uncertainty

    Many multiagent systems require allocation of agents to tasks in order to ensure successful task execution. Most systems that perform this allocation assume that the quantity of agents needed for a task is known beforehand. Coalition formation approaches relax this assumption, allowing multiple agents to be dynamically assigned. Unfortunately, many current approaches to coalition formation lack provisions for uncertainty. This prevents application of coalition formation techniques to complex domains, such as real-world robotic systems and agent domains where full state knowledge is not available. Those that do handle uncertainty cannot handle dynamic addition or removal of agents from the collective, and they constrain the environment to limit the sources of uncertainty. A modeling approach and algorithm for coalition formation is presented that decreases the collective's dependence on knowing agent types. The agent modeling approach enforces stability, allows for arbitrary expansion of the collective, and serves as a basis for calculation of individual coalition payoffs. It explicitly captures uncertainty in agent type and allows uncertainty in coalition value and agent cost, and no agent in the collective is required to perfectly know another agent's type. The modeling approach is incorporated into a two-part algorithm to generate, evaluate, and join stable coalitions for task execution. A comparison with a prior approach designed to handle uncertainty in agent type shows that the protocol not only provides greater flexibility, but also handles uncertainty on a greater scale. Additional results show the application of the approach to real-world robotics and demonstrate the algorithm's scalability. This provides a framework well suited to decentralized task allocation in general collectives.

    A Decentralized Partially Observable Markov Decision Model with Action Duration for Goal Recognition in Real Time Strategy Games

    Multiagent goal recognition is a tough yet important problem in many real time strategy games or simulation systems. Traditional modeling methods either demand detailed domain knowledge of the agents and a training dataset for policy estimation, or lack a clear definition of action duration. To solve these problems, we propose a novel Dec-POMDM-T model, combining the classic Dec-POMDP, an observation model for the recognizer, a joint goal with its termination indicator, and time duration variables for actions with action termination variables. In this paper, a model-free algorithm named cooperative colearning based on Sarsa is used. Considering that Dec-POMDM-T usually encounters multiagent goal recognition problems with various kinds of noise, partially missing data, and unknown action durations, the paper exploits the sequential importance sampling particle filter (SIS PF) with resampling for inference under the dynamic Bayesian network structure of Dec-POMDM-T. In experiments, a modified predator-prey scenario is adopted to study the multiagent joint goal recognition problem, which is the recognition of the joint target shared among cooperative predators. Experiment results show that (a) Dec-POMDM-T works effectively in multiagent goal recognition and adapts well to dynamically changing goals within the agent group; (b) Dec-POMDM-T outperforms traditional Dec-MDP-based methods in terms of precision, recall, and F-measure.
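
To make the inference step named above more concrete, here is a minimal sketch of a sequential importance sampling particle filter with resampling. It is a generic one-dimensional toy and not the paper's Dec-POMDM-T inference: the random-walk transition model, the Gaussian observation model, the effective-sample-size threshold, and every function name below are assumptions made for illustration.

```python
import math
import random


def gaussian_likelihood(obs, state, sigma):
    """Unnormalized Gaussian likelihood of an observation given a state."""
    return math.exp(-0.5 * ((obs - state) / sigma) ** 2)


def systematic_resample(particles, weights, rng):
    """Draw len(particles) samples in proportion to normalized weights."""
    n = len(particles)
    u = rng.random()
    positions = [(u + i) / n for i in range(n)]
    cumulative, total = [], 0.0
    for w in weights:
        total += w
        cumulative.append(total)
    cumulative[-1] = 1.0  # guard against floating-point round-off
    resampled, j = [], 0
    for p in positions:
        while cumulative[j] < p:
            j += 1
        resampled.append(particles[j])
    return resampled


def sis_step(particles, weights, obs, rng,
             process_sigma=0.1, obs_sigma=0.5, ess_fraction=0.5):
    """One filter step: propagate, reweight, and resample if degenerate."""
    n = len(particles)
    # Propagate each particle through the assumed random-walk transition.
    particles = [x + rng.gauss(0.0, process_sigma) for x in particles]
    # Importance weighting by the assumed observation likelihood.
    weights = [w * gaussian_likelihood(obs, x, obs_sigma)
               for w, x in zip(weights, particles)]
    total = sum(weights)
    if total == 0.0:
        weights = [1.0 / n] * n  # all weights underflowed: fall back to uniform
    else:
        weights = [w / total for w in weights]
    # Resample when the effective sample size drops below the threshold.
    ess = 1.0 / sum(w * w for w in weights)
    if ess < ess_fraction * n:
        particles = systematic_resample(particles, weights, rng)
        weights = [1.0 / n] * n
    return particles, weights


def run_demo(seed=0, true_state=2.0, steps=30, n_particles=500):
    """Track a fixed hidden value from noisy observations; return the estimate."""
    rng = random.Random(seed)
    particles = [rng.uniform(-5.0, 5.0) for _ in range(n_particles)]
    weights = [1.0 / n_particles] * n_particles
    for _ in range(steps):
        obs = true_state + rng.gauss(0.0, 0.5)
        particles, weights = sis_step(particles, weights, obs, rng)
    return sum(w * x for w, x in zip(weights, particles))
```

In this toy, `run_demo()` returns the weighted mean of the particles, which should settle near the hidden value. In a goal-recognition setting the scalar state would be replaced by the hidden variables of the model, such as the joint goal and the action termination variables.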

    Intention prediction for interactive navigation in distributed robotic systems

    Modern applications of mobile robots require them to have the ability to safely and effectively navigate in human environments. New challenges arise when these robots must plan their motion in a human-aware fashion. Current methods addressing this problem have focused mainly on the activity forecasting aspect, aiming at improving predictions without considering the active nature of the interaction, i.e. the robot’s effect on the environment and consequent issues such as reciprocity. Furthermore, many methods rely on computationally expensive offline training of predictive models that may not be well suited to rapidly evolving dynamic environments. This thesis presents a novel approach for enabling autonomous robots to navigate socially in environments with humans. Following formulations of the inverse planning problem, agents reason about the intentions of other agents and make predictions about their future interactive motion. A technique is proposed to implement counterfactual reasoning over a parametrised set of light-weight reciprocal motion models, thus making it more tractable to maintain beliefs over the future trajectories of other agents towards plausible goals. The speed of inference and the effectiveness of the algorithms are demonstrated via physical robot experiments, where computationally constrained robots navigate amongst humans in a distributed multi-sensor setup and are able to infer other agents’ intentions as fast as 100ms after the first observation. While intention inference is a key aspect of successful human-robot interaction, executing any task requires planning that takes into account the predicted goals and trajectories of other agents, e.g., pedestrians. It is well known that robots demonstrate unwanted behaviours, such as freezing or becoming sluggishly responsive, when placed in dynamic and cluttered environments, because safety margins derived from simple heuristics end up covering the entire feasible space of motion. The presented approach makes more refined predictions about future movement, which enables robots to find collision-free paths quickly and efficiently. This thesis describes a novel technique for generating "interactive costmaps", a representation of the planner’s costs and rewards across time and space, providing an autonomous robot with the information required to navigate socially given the estimate of other agents’ intentions. This multi-layered costmap deters the robot from obstructing while encouraging social navigation respectful of other agents’ activity. Results show that this approach minimises collisions and near-collisions, minimises travel times for agents, and importantly offers the same computational cost as the most common costmap alternatives for navigation. A key part of the practical deployment of such technologies is their ease of implementation and configuration. Since every use case and environment is different, the presented methods use online adaptation to learn parameters of the navigating agents during runtime. Furthermore, this thesis includes a novel technique for allocating tasks in distributed robotics systems, where a tool is provided to maximise the performance on any distributed setup by automatic parameter tuning. All of these methods are implemented in ROS and distributed as open-source. The ultimate aim is to provide an accessible and efficient framework that may be seamlessly deployed on modern robots, enabling widespread use of intention prediction for interactive navigation in distributed robotic systems.

    Scalable Decision-Theoretic Planning in Open and Typed Multiagent Systems

    In open agent systems, the set of agents that are cooperating or competing changes over time and in ways that are nontrivial to predict. For example, if collaborative robots were tasked with fighting wildfires, they may run out of suppressants and be temporarily unavailable to assist their peers. We consider the problem of planning in these contexts with the additional challenges that the agents are unable to communicate with each other and that there are many of them. Because an agent's optimal action depends on the actions of others, each agent must not only predict the actions of its peers, but, before that, reason whether they are even present to perform an action. Addressing openness thus requires agents to model each other's presence, which becomes computationally intractable with large numbers of agents. We present a novel, principled, and scalable method in this context that enables an agent to reason about others' presence in its shared environment and their actions. Our method extrapolates models of a few peers to the overall behavior of the many-agent system, and combines it with a generalization of Monte Carlo tree search to perform individual agent reasoning in many-agent open environments. Theoretical analyses establish the number of agents to model in order to achieve acceptable worst case bounds on extrapolation error, as well as regret bounds on the agent's utility from modeling only some neighbors. Simulations of multiagent wildfire suppression problems demonstrate our approach's efficacy compared with alternative baselines. Comment: Pre-print with appendices for AAAI 202

    Theory of mind and decision science: Towards a typology of tasks and computational models

    The ability to form a Theory of Mind (ToM), i.e., to theorize about others’ mental states to explain and predict behavior in relation to attributed intentional states, constitutes a hallmark of human cognition. These abilities are multi-faceted and include a variety of different cognitive sub-functions. Here, we focus on decision processes in social contexts and review a number of experimental and computational modeling approaches in this field. We provide an overview of experimental accounts and formal computational models with respect to two dimensions: interactivity and uncertainty. Thereby, we aim at capturing the nuances of ToM functions in the context of social decision processes. We suggest that ToM engagement and multiplexing increase as social cognitive decision-making tasks become more interactive and uncertain. We propose that representing others as intentional and goal-directed agents who perform consequential actions is elicited only at the edges of these two dimensions. Further, we argue that computational models of valuation and beliefs follow these dimensions to best allow researchers to effectively model sophisticated ToM processes. Finally, we relate this typology to neuroimaging findings in neurotypical (NT) humans, studies of persons with autism spectrum (AS), and studies of nonhuman primates.

    A Unified Framework for Solving Multiagent Task Assignment Problems

    Multiagent task assignment problem descriptors do not fully represent the complex interactions in a multiagent domain, and algorithmic solutions vary widely depending on how the domain is represented. This issue is compounded as related research fields contain descriptors that similarly describe multiagent task assignment problems, including complex domain interactions, but generally do not provide the mechanisms needed to solve the multiagent aspect of task assignment. This research presents a unified approach to representing and solving the multiagent task assignment problem for complex problem domains. Ideas central to multiagent task allocation, project scheduling, constraint satisfaction, and coalition formation are combined to form the basis of the constrained multiagent task scheduling (CMTS) problem. Basic analysis reveals the exponential size of the solution space for a CMTS problem, approximated by O(2n(m+n)) based on the number of agents and tasks involved in a problem. The shape of the solution space is shown to contain numerous discontinuous regions due to the complexities involved in relational constraints defined between agents and tasks. The CMTS descriptor represents a wide range of classical and modern problems, such as job shop scheduling, the traveling salesman problem, vehicle routing, and cooperative multi-object tracking. Problems using the CMTS representation are solvable by a suite of algorithms, with varying degrees of suitability. Solution generating methods range from simple random scheduling to state-of-the-art biologically inspired approaches. Techniques from classical task assignment solvers are extended to handle multiagent task problems where agents can also multitask. Additional ideas are incorporated from constraint satisfaction, project scheduling, evolutionary algorithms, dynamic coalition formation, auctioning, and behavior-based robotics to highlight how different solution generation strategies apply to the complex problem space.