Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers of object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and the recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publication.
Self-paced Convolutional Neural Network for Computer Aided Detection in Medical Imaging Analysis
Tissue characterization has long been an important component of Computer
Aided Diagnosis (CAD) systems for automatic lesion detection and further
clinical planning. Motivated by the superior performance of deep learning
methods on various computer vision problems, there has been increasing work
applying deep learning to medical image analysis. However, the development of a
robust and reliable deep learning model for computer-aided diagnosis is still
highly challenging due to the combination of the high heterogeneity in the
medical images and the relative lack of training samples. Specifically,
annotation and labeling of the medical images is much more expensive and
time-consuming than other applications and often involves manual labor from
multiple domain experts. In this work, we propose a multi-stage, self-paced
learning framework utilizing a convolutional neural network (CNN) to classify
Computed Tomography (CT) image patches. The key contribution of this approach
is that we augment the size of training samples by refining the unlabeled
instances with a self-paced learning CNN. By implementing the framework on
high-performance computing servers including the NVIDIA DGX1 machine, we obtained
experimental results showing that the self-paced boosted network
consistently outperformed the original network even with very scarce manual
labels. The performance gain indicates that applications with limited training
samples such as medical image analysis can benefit from using the proposed
framework.
Comment: accepted by 8th International Workshop on Machine Learning in Medical
Imaging (MLMI 2017)
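The abstract above describes growing the training set in stages by admitting unlabeled instances. A minimal sketch of one common way to do this follows, assuming a confidence threshold that is relaxed from stage to stage. The function names, the threshold values, and the use of fixed class probabilities are illustrative assumptions, not the authors' method; in a real pipeline the CNN would be retrained and the probabilities recomputed between stages.

```python
def select_confident(probabilities, threshold):
    """Return (index, predicted_label) pairs for samples whose top class
    probability is at least `threshold`."""
    selected = []
    for index, probs in enumerate(probabilities):
        label = max(range(len(probs)), key=probs.__getitem__)
        if probs[label] >= threshold:
            selected.append((index, label))
    return selected


def self_paced_schedule(probabilities, thresholds=(0.9, 0.8, 0.7)):
    """Admit pseudo-labeled samples stage by stage, easiest first.

    Hypothetical simplification: `probabilities` is held fixed across
    stages, whereas a real loop would retrain the model after each stage.
    """
    admitted = {}
    stages = []
    for threshold in thresholds:
        newly_admitted = [
            (index, label)
            for index, label in select_confident(probabilities, threshold)
            if index not in admitted
        ]
        admitted.update(newly_admitted)
        stages.append(newly_admitted)
    return stages
```

Calling `self_paced_schedule` on the softmax outputs for a batch of unlabeled patches returns one list per stage, each holding the samples first admitted at that stage along with their pseudo-labels.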
An Attention-based Graph Neural Network for Heterogeneous Structural Learning
In this paper, we focus on graph representation learning for heterogeneous
information networks (HINs), in which various types of vertices are connected by
various types of relations. Most existing methods for HINs adapt
homogeneous graph embedding models via meta-paths to learn a
low-dimensional vector space for the HIN. In this paper, we propose a novel
Heterogeneous Graph Structural Attention Neural Network (HetSANN) to directly
encode structural information of HIN without meta-paths and achieve more
informative representations. With this method, domain experts will not be
needed to design meta-path schemes and the heterogeneous information can be
processed automatically by our proposed model. Specifically, we implicitly
represent heterogeneous information using the following two methods: 1) we
model the transformation between heterogeneous vertices through a projection in
low-dimensional entity spaces; 2) afterwards, we apply the graph neural network
to aggregate multi-relational information of projected neighborhood by means of an
attention mechanism. We also present three extensions of HetSANN, i.e.,
voices-sharing product attention for the pairwise relationships in HIN,
cycle-consistency loss to retain the transformation between heterogeneous
entity spaces, and multi-task learning with full use of information. The
experiments conducted on three public datasets demonstrate that our proposed
models achieve significant and consistent improvements compared to
state-of-the-art solutions.
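The two steps named in the abstract above (project neighbors into the target vertex's space, then aggregate them with attention) can be sketched in a few lines. This is a minimal illustration, assuming a GAT-style concatenation score. The helper names, the per-edge-type projection matrices, and the attention vector are hypothetical and are not taken from the HetSANN paper.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(target, neighbors, projections, attn_vector):
    """Attention-weighted sum of neighbors projected into the target space.

    target:      feature vector of the target vertex
    neighbors:   list of (edge_type, feature_vector) pairs
    projections: dict mapping edge_type to a matrix that maps the
                 neighbor's space into the target's space (assumed)
    attn_vector: weights applied to the concatenation [target, projected]
    """
    projected = [matvec(projections[etype], feat) for etype, feat in neighbors]
    scores = [
        sum(a * v for a, v in zip(attn_vector, target + p)) for p in projected
    ]
    weights = softmax(scores)
    return [
        sum(w * p[d] for w, p in zip(weights, projected))
        for d in range(len(target))
    ]
```

With an all-zero attention vector every neighbor receives the same weight, so the result reduces to the mean of the projected neighbors, which is a convenient sanity check.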
Domain Adaptation with Incomplete Target Domains
Domain adaptation, as a task of reducing the annotation cost in a target
domain by exploiting the existing labeled data in an auxiliary source domain,
has received a lot of attention in the research community. However,
standard domain adaptation assumes perfectly observed data in both domains,
whereas in real-world applications missing data can be
prevalent. In this paper, we tackle a more challenging domain adaptation
scenario where one has an incomplete target domain with partially observed
data. We propose an Incomplete Data Imputation based Adversarial Network
(IDIAN) model to address this new domain adaptation challenge. In the proposed
model, we design a data imputation module to fill the missing feature values
based on the partial observations in the target domain, while aligning the two
domains via deep adversarial adaptation. We conduct experiments on both
cross-domain benchmark tasks and a real-world adaptation task with imperfect
target domains. The experimental results demonstrate the effectiveness of the
proposed method.
CoRide: Joint Order Dispatching and Fleet Management for Multi-Scale Ride-Hailing Platforms
How to optimally dispatch orders to vehicles and how to trade off between
immediate and future returns are fundamental questions for a typical
ride-hailing platform. We model ride-hailing as a large-scale parallel ranking
problem and study the joint decision-making task of order dispatching and fleet
management in online ride-hailing platforms. This task brings unique challenges
in the following four aspects. First, to enable a huge number of vehicles
to act and learn efficiently and robustly, we treat each region cell as an
agent and build a multi-agent reinforcement learning framework. Second, to
coordinate the agents from different regions to achieve long-term benefits, we
leverage the geographical hierarchy of the region grids to perform hierarchical
reinforcement learning. Third, to deal with the heterogeneous and varying
action space for joint order dispatching and fleet management, we design the
action as the ranking weight vector to rank and select the specific order or
the fleet management destination in a unified formulation. Fourth, to support
the multi-scale ride-hailing platform, we conduct the decision-making process
in a hierarchical way where a multi-head attention mechanism is utilized to
incorporate the impacts of neighbor agents and capture the key agent in each
scale. The whole framework is named CoRide. Extensive experiments
based on real-world data from multiple cities as well as analytic synthetic data
demonstrate that CoRide provides superior performance in terms of platform
revenue and user experience in the task of city-wide hybrid order dispatching
and fleet management over strong baselines.
Comment: CIKM 201
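The abstract above says the action is a ranking weight vector used to rank both orders and fleet management destinations in one formulation. A minimal sketch of that idea follows, assuming each candidate is described by a feature vector and scored by a dot product with the weight vector. The function name, the feature layout, and the candidate names are hypothetical, and the sketch leaves out the learning of the weight vector entirely.

```python
def rank_candidates(weight_vector, candidates):
    """Rank candidates by the dot product of their features with the
    weight vector, highest score first.

    candidates: list of (name, feature_vector) pairs, where a candidate
    may be an order to serve or a reposition destination (assumed layout).
    """
    scored = [
        (sum(w * f for w, f in zip(weight_vector, features)), name)
        for name, features in candidates
    ]
    # Python's sort is stable, so ties keep their input order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored]
```

Because orders and reposition destinations share one feature space here, a single weight vector ranks both, and the agent simply takes the top-ranked candidates.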