Machine Learning for Heterogeneous Ultra-Dense Networks with Graphical Representations
Heterogeneous ultra-dense network (H-UDN) is envisioned as a promising
solution to sustain the explosive mobile traffic demand through network
densification. By placing access points, processors, and storage units as close
as possible to mobile users, H-UDNs bring forth a number of advantages,
including high spectral efficiency, high energy efficiency, and low latency.
Nonetheless, the high density and diversity of network entities in H-UDNs
introduce formidable design challenges in collaborative signal processing and
resource management. This article illustrates the great potential of machine
learning techniques in solving these challenges. In particular, we show how to
utilize graphical representations of H-UDNs to design efficient machine
learning algorithms.
Machine Learning Methods for Data Association in Multi-Object Tracking
Data association is a key step within the multi-object tracking pipeline that
is notoriously challenging due to its combinatorial nature. A popular and
general way to formulate data association is as the NP-hard multidimensional
assignment problem (MDAP). Over the last few years, data-driven approaches to
assignment have become increasingly prevalent as these techniques have started
to mature. We focus this survey solely on learning algorithms for the
assignment step of multi-object tracking, and we attempt to unify various
methods by highlighting their connections to linear assignment as well as to
the MDAP. First, we review probabilistic and end-to-end optimization approaches
to data association, followed by methods that learn association affinities from
data. We then compare the performance of the methods presented in this survey,
and conclude by discussing future research directions.
Comment: Accepted for publication in ACM Computing Surveys
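The linear assignment formulation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a small, hypothetical cost matrix between existing tracks (rows) and new detections (columns), and it solves the problem by exhaustive search rather than with any method from the survey. The function name `linear_assignment` is chosen here for illustration only.

```python
from itertools import permutations


def linear_assignment(cost):
    """Return (total_cost, assignment) minimizing sum of cost[i][assignment[i]].

    Exhaustive search over all permutations: exact, but O(n!), so it is
    only suitable for tiny illustrative problems.
    """
    n = len(cost)
    best_total, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best_total:
            best_total, best_perm = total, perm
    return best_total, list(best_perm)


# Hypothetical association costs: rows are tracks, columns are detections.
cost = [
    [1.0, 4.0, 5.0],
    [6.0, 2.0, 7.0],
    [8.0, 9.0, 3.0],
]
total, assignment = linear_assignment(cost)
print(total, assignment)
```

Practical trackers would replace the exhaustive search with a polynomial-time solver such as the Hungarian algorithm. The multidimensional variant (MDAP), which associates across more than two frames at once, is the NP-hard case the abstract refers to.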
Generalized Grounding Graphs: A Probabilistic Framework for Understanding Grounded Commands
Many task domains require robots to interpret and act upon natural language
commands which are given by people and which refer to the robot's physical
surroundings. Such interpretation is known variously as the symbol grounding
problem, grounded semantics and grounded language acquisition. This problem is
challenging because people employ diverse vocabulary and grammar, and because
robots have substantial uncertainty about the nature and contents of their
surroundings, making it difficult to associate the constitutive language
elements (principally noun phrases and spatial relations) of the command text
to elements of those surroundings. Symbolic models capture linguistic structure
but have not scaled successfully to handle the diverse language produced by
untrained users. Existing statistical approaches can better handle diversity,
but have not to date modeled complex linguistic structure, limiting achievable
accuracy. Recent hybrid approaches have addressed limitations in scaling and
complexity, but have not effectively associated linguistic and perceptual
features. Our framework, called Generalized Grounding Graphs (G^3), addresses
these issues by defining a probabilistic graphical model dynamically according
to the linguistic parse structure of a natural language command. This approach
scales effectively, handles linguistic diversity, and enables the system to
associate parts of a command with the specific objects, places, and events in
the external world to which they refer. We show that robots can learn word
meanings and use those learned meanings to robustly follow natural language
commands produced by untrained users. We demonstrate our approach for both
mobility commands and mobile manipulation commands involving a variety of
semi-autonomous robotic platforms, including a wheelchair, a micro-air vehicle,
a forklift, and the Willow Garage PR2.
Comment: Submitted to the Journal of Artificial Intelligence Research
Safe Navigation with Human Instructions in Complex Scenes
In this paper, we present a robotic navigation algorithm with natural
language interfaces, which enables a robot to safely walk through a changing
environment with moving persons by following human instructions such as "go to
the restaurant and keep away from people". We first classify human instructions
into three types: the goal, the constraints, and uninformative phrases. Next,
we provide grounding for the extracted goal and constraint items in a dynamic
manner along with the navigation process, to deal with the target objects that
are too far away for sensor observation and the appearance of moving obstacles
like humans. In particular, for a goal phrase (e.g., "go to the restaurant"),
we ground it to a location in a predefined semantic map and treat it as a goal
for a global motion planner, which plans a collision-free path in the workspace
for the robot to follow. For a constraint phrase (e.g., "keep away from
people"), we dynamically add the corresponding constraint into a local planner
by adjusting the values of a local costmap according to the results returned by
the object detection module. The updated costmap is then used to compute a
local collision avoidance control for the safe navigation of the robot. By
combining natural language processing, motion planning, and computer vision,
our developed system is demonstrated to be able to successfully follow natural
language navigation instructions to achieve navigation tasks in both simulated
and real-world scenarios. Videos are available at
https://sites.google.com/view/snh
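The costmap adjustment described for constraint phrases such as "keep away from people" can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a 2D grid costmap stored as nested lists, detections given as hypothetical (row, column) cells, and a fixed penalty added within a circular radius. The name `add_keep_away_constraint` and all parameter values are assumptions.

```python
def add_keep_away_constraint(costmap, detections, radius, penalty):
    """Return a copy of costmap with `penalty` added to every cell that lies
    within `radius` cells (Euclidean distance) of any detected person."""
    updated = [row[:] for row in costmap]
    for r, row in enumerate(updated):
        for c in range(len(row)):
            near_person = any(
                (r - dr) ** 2 + (c - dc) ** 2 <= radius ** 2
                for dr, dc in detections
            )
            if near_person:
                row[c] += penalty
    return updated


# Hypothetical 5x5 free-space costmap with one person detected at cell (2, 2).
costmap = [[0] * 5 for _ in range(5)]
updated = add_keep_away_constraint(costmap, detections=[(2, 2)], radius=1, penalty=50)
for row in updated:
    print(row)
```

A local planner would then prefer paths through low-cost cells. Because the function returns a copy, the constraint can be recomputed from the base costmap each time the detector reports new positions.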
Coloring Big Graphs with AlphaGoZero
We show that recent innovations in deep reinforcement learning can
effectively color very large graphs -- a well-known NP-hard problem with clear
commercial applications. Because the Monte Carlo Tree Search with Upper
Confidence Bound algorithm used in AlphaGoZero can improve the performance of a
given heuristic, our approach allows deep neural networks trained using high
performance computing (HPC) technologies to transform computation into improved
heuristics with zero prior knowledge. Key to our approach is the introduction
of a novel deep neural network architecture (FastColorNet) that has access to
the full graph context and requires O(V) time and space to color a graph with
V vertices, which enables scaling to very large graphs that arise in real
applications like parallel computing, compilers, numerical solvers, and design
automation, among others. As a result, we are able to learn new state of the
art heuristics for graph coloring.
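For context on what a coloring heuristic looks like, here is a minimal sketch of the classic greedy heuristic, the kind of hand-written baseline that learned approaches aim to improve on. It is not FastColorNet and involves no learning. The adjacency-dict representation and the name `greedy_coloring` are assumptions made for this example.

```python
def greedy_coloring(adjacency, order=None):
    """Assign each vertex the smallest color index not used by its
    already-colored neighbors, visiting vertices in the given order."""
    colors = {}
    for v in (order if order is not None else adjacency):
        used = {colors[u] for u in adjacency[v] if u in colors}
        color = 0
        while color in used:
            color += 1
        colors[v] = color
    return colors


# Hypothetical example: a 4-cycle 0-1-2-3-0, which is 2-colorable.
graph = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
colors = greedy_coloring(graph)
print(colors)
```

The number of colors the greedy heuristic uses depends on the vertex order, which is one place where a learned policy can make better choices than a fixed rule.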
Multi-Hop Knowledge Graph Reasoning with Reward Shaping
Multi-hop reasoning is an effective approach for query answering (QA) over
incomplete knowledge graphs (KGs). The problem can be formulated in a
reinforcement learning (RL) setup, where a policy-based agent sequentially
extends its inference path until it reaches a target. However, in an incomplete
KG environment, the agent receives low-quality rewards corrupted by false
negatives in the training data, which harms generalization at test time.
Furthermore, since no golden action sequence is used for training, the agent
can be misled by spurious search trajectories that incidentally lead to the
correct answer. We propose two modeling advances to address both issues: (1) we
reduce the impact of false negative supervision by adopting a pretrained
one-hop embedding model to estimate the reward of unobserved facts; (2) we
counter the sensitivity to spurious paths of on-policy RL by forcing the agent
to explore a diverse set of paths using randomly generated edge masks. Our
approach significantly improves over existing path-based KGQA models on several
benchmark datasets and is comparable to or better than embedding-based models.
Comment: Accepted to EMNLP 2018, 12 pages
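The reward-shaping idea in point (1) above can be sketched as follows. This is a simplified illustration under stated assumptions: facts are (source, relation, target) triples, the pretrained one-hop embedding model is stood in for by a hypothetical scoring function returning values in [0, 1], and the exact reward form used in the paper may differ. The names `shaped_reward` and `toy_score` are invented for this example.

```python
def shaped_reward(source, relation, target, known_facts, score_fn):
    """Return 1.0 for a fact observed in the knowledge graph; otherwise fall
    back to a soft estimate from a (hypothetical) pretrained scoring model."""
    if (source, relation, target) in known_facts:
        return 1.0
    return score_fn(source, relation, target)


# Toy stand-in for a pretrained embedding model (an assumption, not a real model).
toy_scores = {("berlin", "capital_of", "germany"): 0.9}


def toy_score(source, relation, target):
    return toy_scores.get((source, relation, target), 0.0)


known_facts = {("paris", "capital_of", "france")}

observed = shaped_reward("paris", "capital_of", "france", known_facts, toy_score)
plausible = shaped_reward("berlin", "capital_of", "germany", known_facts, toy_score)
implausible = shaped_reward("berlin", "capital_of", "france", known_facts, toy_score)
print(observed, plausible, implausible)
```

With a binary reward, the plausible but unobserved fact would receive 0 and be treated as a false negative. The soft estimate lets the agent receive partial credit for reaching it.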
Video Relationship Reasoning using Gated Spatio-Temporal Energy Graph
Visual relationship reasoning is a crucial yet challenging task for
understanding rich interactions across visual concepts. For example, a
relationship 'man, open, door' involves a complex relation 'open' between
concrete entities 'man, door'. While much of the existing work has studied this
problem in the context of still images, understanding visual relationships in
videos has received limited attention. Due to their temporal nature, videos
enable us to model and reason about a more comprehensive set of visual
relationships, such as those requiring multiple (temporal) observations (e.g.,
'man, lift up, box' vs. 'man, put down, box'), as well as relationships that
are often correlated through time (e.g., 'woman, pay, money' followed by
'woman, buy, coffee'). In this paper, we construct a Conditional Random Field
on a fully-connected spatio-temporal graph that exploits the statistical
dependency between relational entities spatially and temporally. We introduce a
novel gated energy function parametrization that learns adaptive relations
conditioned on visual observations. Our model optimization is computationally
efficient, and its space computation complexity is significantly amortized
through our proposed parameterization. Experimental results on benchmark video
datasets (ImageNet Video and Charades) demonstrate state-of-the-art performance
across three standard relationship reasoning tasks: Detection, Tagging, and
Recognition.
Comment: CVPR 2019. Supplementary included. Fixing a small typo
TensorFlow: A system for large-scale machine learning
TensorFlow is a machine learning system that operates at large scale and in
heterogeneous environments. TensorFlow uses dataflow graphs to represent
computation, shared state, and the operations that mutate that state. It maps
the nodes of a dataflow graph across many machines in a cluster, and within a
machine across multiple computational devices, including multicore CPUs,
general-purpose GPUs, and custom designed ASICs known as Tensor Processing
Units (TPUs). This architecture gives flexibility to the application developer:
whereas in previous "parameter server" designs the management of shared state
is built into the system, TensorFlow enables developers to experiment with
novel optimizations and training algorithms. TensorFlow supports a variety of
applications, with particularly strong support for training and inference on
deep neural networks. Several Google services use TensorFlow in production, we
have released it as an open-source project, and it has become widely used for
machine learning research. In this paper, we describe the TensorFlow dataflow
model in contrast to existing systems, and demonstrate the compelling
performance that TensorFlow achieves for several real-world applications.
Comment: 18 pages, 9 figures; v2 has a spelling correction in the metadata
Memory-based Deep Reinforcement Learning for Obstacle Avoidance in UAV with Limited Environment Knowledge
This paper presents our method for enabling a UAV quadrotor, equipped with a
monocular camera, to autonomously avoid collisions with obstacles in
unstructured and unknown indoor environments. When compared to obstacle
avoidance in ground vehicular robots, UAV navigation brings in additional
challenges because the UAV motion is no longer constrained to a well-defined
indoor ground or street environment. Horizontal structures in indoor and
outdoor environments like decorative items, furnishings, ceiling fans,
sign-boards, tree branches etc., also become relevant obstacles unlike those
for ground vehicular robots. Thus, methods of obstacle avoidance developed for
ground robots are clearly inadequate for UAV navigation. Current control
methods using monocular images for UAV obstacle avoidance are heavily dependent
on environment information. These controllers do not fully retain and utilize
the extensively available information about the ambient environment for
decision making. We propose a deep reinforcement learning based method for UAV
obstacle avoidance (OA) and autonomous exploration that is capable of doing
exactly that. The crucial idea in our method is the concept of partial
observability and how UAVs can retain relevant information about the
environment structure to make better future navigation decisions. Our OA
technique uses recurrent neural networks with temporal attention and provides
better results compared to prior works in terms of distance covered during
navigation without collisions. In addition, our technique has a high inference
rate (a key factor in robotic applications) and is energy-efficient as it
minimizes oscillatory motion of the UAV and reduces power wastage.
Comment: Submitted to IEEE Transactions on Cybernetics. Supplementary Video:
https://www.youtube.com/watch?v=Lqh_B9U3Gv
Reinforcement Learning with Deep Energy-Based Policies
We propose a method for learning expressive energy-based policies for
continuous states and actions, which was previously feasible only in tabular
domains. We apply our method to learning maximum entropy policies, resulting
in a new algorithm, called soft Q-learning, that expresses the optimal policy
via a Boltzmann distribution. We use the recently proposed amortized Stein
variational gradient descent to learn a stochastic sampling network that
approximates samples from this distribution. The benefits of the proposed
algorithm include improved exploration and compositionality that allows
transferring skills between tasks, which we confirm in simulated experiments
with swimming and walking robots. We also draw a connection to actor-critic
methods, which can be viewed as performing approximate inference on the
corresponding energy-based model.
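The Boltzmann form of the policy mentioned above can be illustrated with a minimal sketch. It assumes a finite set of actions with known Q-values, a simplification made for illustration only, since the paper targets continuous actions where sampling requires a learned network. The temperature parameter and the function names are assumptions of this example.

```python
import math


def boltzmann_policy(q_values, temperature=1.0):
    """Action probabilities proportional to exp(Q / temperature).

    Subtracting the maximum Q-value first keeps the exponentials
    numerically stable without changing the resulting distribution.
    """
    m = max(q_values)
    weights = [math.exp((q - m) / temperature) for q in q_values]
    z = sum(weights)
    return [w / z for w in weights]


def soft_value(q_values, temperature=1.0):
    """Soft state value: temperature * log(sum(exp(Q / temperature)))."""
    m = max(q_values)
    total = sum(math.exp((q - m) / temperature) for q in q_values)
    return m + temperature * math.log(total)


# Hypothetical Q-values for three discrete actions.
q_values = [1.0, 2.0, 3.0]
probs = boltzmann_policy(q_values)
print(probs, soft_value(q_values))
```

Lower temperatures concentrate probability on the highest-valued action, while higher temperatures approach a uniform distribution, which is the source of the improved exploration the abstract describes.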