    Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization

    Reinforcement learning can acquire complex behaviors from high-level specifications. However, defining a cost function that can be optimized effectively and encodes the correct task is challenging in practice. We explore how inverse optimal control (IOC) can be used to learn behaviors from demonstrations, with applications to torque control of high-dimensional robotic systems. Our method addresses two key challenges in inverse optimal control: first, the need for informative features and effective regularization to impose structure on the cost, and second, the difficulty of learning the cost function under unknown dynamics for high-dimensional continuous systems. To address the former challenge, we present an algorithm capable of learning arbitrary nonlinear cost functions, such as neural networks, without meticulous feature engineering. To address the latter challenge, we formulate an efficient sample-based approximation for MaxEnt IOC. We evaluate our method on a series of simulated tasks and real-world robotic manipulation problems, demonstrating substantial improvement over prior methods both in terms of task complexity and sample efficiency. Comment: International Conference on Machine Learning (ICML), 2016, to appear
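
The abstract above mentions a sample-based approximation for MaxEnt IOC. As a rough illustration only, and not the authors' implementation, the sketch below assumes the standard maximum-entropy model p(tau) proportional to exp(-cost(tau)) and estimates the log partition function by importance sampling from a proposal distribution q. The function names and the toy inputs are hypothetical.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def maxent_ioc_loss(demo_costs, sample_costs, sample_log_q):
    """Negative log-likelihood of demonstrations under p(tau) ~ exp(-cost(tau)).

    demo_costs:   cost of each demonstrated trajectory
    sample_costs: cost of each trajectory drawn from the proposal q
    sample_log_q: log q(tau) for each sampled trajectory (assumed known)
    """
    mean_demo_cost = sum(demo_costs) / len(demo_costs)
    # Importance weights exp(-cost) / q, kept in log space for stability.
    log_weights = [-c - lq for c, lq in zip(sample_costs, sample_log_q)]
    # log Z is approximated by log of the mean importance weight.
    log_z = logsumexp(log_weights) - math.log(len(log_weights))
    return mean_demo_cost + log_z
```

In a full method the costs would come from a parameterized cost function (for example a neural network) and this loss would be minimized with respect to its parameters; here the costs are plain numbers so the estimator itself is easy to inspect.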

    Distributed Constraint Optimization Problems and Applications: A Survey

    The field of Multi-Agent Systems (MAS) is an active area of research within Artificial Intelligence, with an increasingly important impact in industrial and other real-world applications. Within a MAS, autonomous agents interact to pursue personal interests and/or to achieve common objectives. Distributed Constraint Optimization Problems (DCOPs) have emerged as one of the prominent agent architectures to govern the agents' autonomous behavior, where both algorithms and communication models are driven by the structure of the specific problem. During the last decade, several extensions to the DCOP model have enabled them to support MAS in complex, real-time, and uncertain environments. This survey aims at providing an overview of the DCOP model, giving a classification of its multiple extensions and addressing both resolution methods and applications that find a natural mapping within each class of DCOPs. The proposed classification suggests several future perspectives for DCOP extensions, and identifies challenges in the design of efficient resolution algorithms, possibly through the adaptation of strategies from different areas.
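
To make the DCOP model mentioned above concrete, here is a minimal sketch that assumes the usual formulation: each variable has a finite domain, each constraint assigns a cost to a pair of values, and the goal is the assignment with the lowest total cost. It solves a toy instance by centralized brute force, which is not one of the distributed resolution methods the survey covers. The variable names and the `clash` cost function are hypothetical.

```python
from itertools import product


def solve_dcop_bruteforce(domains, constraints):
    """Return (assignment, cost) minimizing the sum of binary constraint costs.

    domains:     {variable_name: list of allowed values}
    constraints: {(var_a, var_b): cost_function(value_a, value_b)}
    """
    names = list(domains)
    best_assignment, best_cost = None, float("inf")
    # Enumerate every joint assignment; only feasible for tiny problems.
    for values in product(*(domains[name] for name in names)):
        assignment = dict(zip(names, values))
        cost = sum(
            cost_fn(assignment[a], assignment[b])
            for (a, b), cost_fn in constraints.items()
        )
        if cost < best_cost:
            best_assignment, best_cost = assignment, cost
    return best_assignment, best_cost


def clash(u, v):
    """Toy cost: neighbouring agents pay 1 when they pick the same value."""
    return 1 if u == v else 0


# Three agents in a chain x1 - x2 - x3, each choosing a value in {0, 1}.
domains = {"x1": [0, 1], "x2": [0, 1], "x3": [0, 1]}
constraints = {("x1", "x2"): clash, ("x2", "x3"): clash}
best_assignment, best_cost = solve_dcop_bruteforce(domains, constraints)
```

In an actual DCOP each agent controls only its own variables and exchanges messages with neighbours; the exhaustive search here serves only to show what objective those algorithms optimize.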

    Network Multiple-Input and Multiple-Output for Wireless Local Area Networks

    This paper presents a tutorial for network multiple-input and multiple-output (netMIMO) in wireless local area networks (WLAN). Wireless traffic demand is growing exponentially. NetMIMO allows access points (APs) in a WLAN to cooperate in their transmissions as if the APs form a single virtual MIMO node. NetMIMO can significantly increase network capacity by reducing interference and contention through the cooperation of the APs. This paper covers a few representative netMIMO methods, ranging from interference alignment and cancellation, through a channel access protocol that lets MIMO nodes join ongoing transmissions and distributed synchronization, to interference and contention mitigation in multiple contention domains. We believe the netMIMO methods described here are just the beginning of the new technologies to address the challenge of ever-increasing wireless traffic demand, and the future will see even more new developments in this field. Comment: This paper has been withdrawn by the authors due to a crucial error in algorithm

    A Vision of 6G Wireless Systems: Applications, Trends, Technologies, and Open Research Problems

    The ongoing deployment of 5G cellular systems is continuously exposing the inherent limitations of this system, compared to its original premise as an enabler for Internet of Everything applications. These 5G drawbacks are currently spurring worldwide activities focused on defining the next-generation 6G wireless system that can truly integrate far-reaching applications ranging from autonomous systems to extended reality and haptics. Despite recent 6G initiatives, the fundamental architectural and performance components of the system remain largely undefined. In this paper, we present a holistic, forward-looking vision that defines the tenets of a 6G system. We opine that 6G will not be a mere exploration of more spectrum at high-frequency bands, but it will rather be a convergence of upcoming technological trends driven by exciting, underlying services. In this regard, we first identify the primary drivers of 6G systems, in terms of applications and accompanying technological trends. Then, we propose a new set of service classes and expose their target 6G performance requirements. We then identify the enabling technologies for the introduced 6G services and outline a comprehensive research agenda that leverages those technologies. We conclude by providing concrete recommendations for the roadmap toward 6G. Ultimately, the intent of this article is to serve as a basis for stimulating more out-of-the-box research around 6G. Comment: This paper has been accepted by IEEE Network

    Generating and designing DNA with deep generative models

    We propose generative neural network methods to generate DNA sequences and tune them to have desired properties. We present three approaches: creating synthetic DNA sequences using a generative adversarial network; a DNA-based variant of the activation maximization ("deep dream") design method; and a joint procedure which combines these two approaches. We show that these tools capture important structures of the data and, when applied to designing probes for protein binding microarrays, allow us to generate new sequences whose properties are estimated to be superior to those found in the training data. We believe that these results open the door for applying deep generative models to advance genomics research. Comment: NIPS 2017 Computational Biology Workshop

    Search and Placement in Tiered Cache Networks

    Content distribution networks have been extremely successful in today's Internet. Despite their success, there are still a number of scalability and performance challenges that motivate clean slate solutions for content dissemination, such as content centric networking. In this paper, we address two of the fundamental problems faced by any content dissemination system: content search and content placement. We consider a multi-tiered, multi-domain hierarchical system wherein random walks are used to cope with the tradeoff between exploitation of known paths towards custodians versus opportunistic exploration of replicas in a given neighborhood. TTL-like mechanisms, referred to as reinforced counters, are used for content placement. We propose an analytical model to study the interplay between search and placement. The model yields closed form expressions for metrics of interest such as the average delay experienced by users and the load placed on custodians. Then, leveraging the model solution, we pose a joint placement-search optimization problem. We show that previously proposed strategies for optimal placement, such as the square-root allocation, follow as special cases of ours, and that a bang-bang search policy is optimal if content allocation is given.
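
The square-root allocation named above can be illustrated in its classical setting, which is an assumption here and not the tiered model of this paper: a content item requested with probability p_i and stored in r_i replicas costs on the order of p_i / r_i to find by random probing, and a fixed replica budget is split across items. The sketch below compares three allocation rules under that assumed cost. The function names and the toy popularity values are hypothetical.

```python
import math


def allocate(popularities, budget, rule):
    """Split a replica budget across items according to the named rule."""
    if rule == "uniform":
        scores = [1.0] * len(popularities)
    elif rule == "proportional":
        scores = list(popularities)
    elif rule == "square_root":
        scores = [math.sqrt(p) for p in popularities]
    else:
        raise ValueError(f"unknown rule: {rule}")
    total = sum(scores)
    return [budget * s / total for s in scores]


def expected_search_cost(popularities, replicas):
    """Assumed cost model: sum over items of popularity / replica count."""
    return sum(p / r for p, r in zip(popularities, replicas))


popularities = [0.64, 0.36]  # toy request probabilities
budget = 10.0                # total replicas (fractional for simplicity)
costs = {
    rule: expected_search_cost(popularities, allocate(popularities, budget, rule))
    for rule in ("uniform", "proportional", "square_root")
}
```

On this toy instance the uniform and proportional rules both give a cost of 0.2, while the square-root rule gives 0.196, the minimum under the assumed cost model.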

    Reinforcement and Imitation Learning for Diverse Visuomotor Skills

    We propose a model-free deep reinforcement learning method that leverages a small amount of demonstration data to assist a reinforcement learning agent. We apply this approach to robotic manipulation tasks and train end-to-end visuomotor policies that map directly from RGB camera inputs to joint velocities. We demonstrate that our approach can solve a wide variety of visuomotor tasks, for which engineering a scripted controller would be laborious. In experiments, our reinforcement and imitation agent achieves significantly better performance than agents trained with reinforcement learning or imitation learning alone. We also illustrate that these policies, trained with large visual and dynamics variations, can achieve preliminary successes in zero-shot sim2real transfer. A brief visual description of this work can be viewed at https://youtu.be/EDl8SQUNjj0 Comment: 13 pages, 6 figures, Published in RSS 201

    An introduction to domain adaptation and transfer learning

    In machine learning, if the training data is an unbiased sample of an underlying distribution, then the learned classification function will make accurate predictions for new samples. However, if the training data is not an unbiased sample, then there will be differences between how the training data is distributed and how the test data is distributed. Standard classifiers cannot cope with changes in data distributions between training and test phases, and will not perform well. Domain adaptation and transfer learning are sub-fields within machine learning that are concerned with accounting for these types of changes. Here, we present an introduction to these fields, guided by the question: when and how can a classifier generalize from a source to a target domain? We will start with a brief introduction to risk minimization, and how transfer learning and domain adaptation expand upon this framework. Following that, we discuss three special cases of data set shift, namely prior, covariate and concept shift. For more complex domain shifts, there are a wide variety of approaches. These are categorized into: importance-weighting, subspace mapping, domain-invariant spaces, feature augmentation, minimax estimators and robust algorithms. A number of points will arise, which we will discuss in the last section. We conclude with the remark that many open questions will have to be addressed before transfer learners and domain-adaptive classifiers become practical. Comment: Technical Report. 41 pages, 5 figures
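
Importance-weighting, the first approach listed above, can be sketched for the covariate-shift case: losses measured on source samples are reweighted by the density ratio p_target(x) / p_source(x) so that their average estimates the risk on the target domain. The example below assumes both densities are known one-dimensional Gaussians, which is rarely true in practice, where the ratio has to be estimated. All names and default parameters are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def importance_weight(x, source=(0.0, 1.0), target=(1.0, 1.0)):
    """Density ratio p_target(x) / p_source(x) for (mean, std) pairs."""
    return gaussian_pdf(x, *target) / gaussian_pdf(x, *source)


def weighted_risk(xs, losses, source=(0.0, 1.0), target=(1.0, 1.0)):
    """Importance-weighted empirical risk computed from source samples."""
    weights = [importance_weight(x, source, target) for x in xs]
    return sum(w * loss for w, loss in zip(weights, losses)) / len(xs)
```

With the default parameters the weight simplifies to exp(x - 0.5), so source samples lying closer to the target mean count more heavily in the risk estimate.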

    Software Defined Optical Networks (SDONs): A Comprehensive Survey

    The emerging Software Defined Networking (SDN) paradigm separates the data plane from the control plane and centralizes network control in an SDN controller. Applications interact with controllers to implement network services, such as network transport with Quality of Service (QoS). SDN facilitates the virtualization of network functions so that multiple virtual networks can operate over a given installed physical network infrastructure. Due to the specific characteristics of optical (photonic) communication components and the high optical transmission capacities, SDN-based optical networking poses particular challenges, but also holds great potential. In this article, we comprehensively survey studies that examine the SDN paradigm in optical networks; in brief, we survey the area of Software Defined Optical Networks (SDONs). We mainly organize the SDON studies into studies focused on the infrastructure layer, the control layer, and the application layer. Moreover, we cover SDON studies focused on network virtualization, as well as SDON studies focused on the orchestration of multilayer and multidomain networking. Based on the survey, we identify open challenges for SDONs and outline future directions.

    A Distributed Reinforcement Learning Solution With Knowledge Transfer Capability for A Bike Rebalancing Problem

    Rebalancing is a critical service bottleneck for many transportation services, such as Citi Bike. Citi Bike relies on manual orchestration of bike rebalancing between dispatchers and field agents. Motivated by this problem and the lack of smart autonomous solutions in this area, this project explored a new RL architecture called Distributed RL (DiRL) with Transfer Learning (TL) capability. The DiRL solution adapts to changing traffic dynamics while keeping bike stock under control at the minimum cost. DiRL achieved a 350% improvement in bike rebalancing autonomously and TL offered a 62.4% performance boost in managing an entire bike network. Lastly, a field trip to the dispatch office of Chariot, a ride-sharing service, provided insights to overcome challenges of deploying an RL solution in the real world.