Multi-instance graphical transfer clustering for traffic data learning
© 2016 IEEE. To better model complex real-world data and to develop robust features that capture relevant information, unsupervised feature learning is usually employed to learn a layer of feature representations from unlabeled data. However, developing domain-specific features for each task is expensive and time-consuming, and it requires expert knowledge of the data. In this paper, we introduce multi-instance clustering and graphical learning to unsupervised transfer learning. To improve clustering efficiency, we propose a set of algorithms for traffic data learning: instance feature representation, distance calculation for multi-instance clustering, multi-instance graphical cluster initialisation, multi-instance multi-cluster update, and graphical multi-instance transfer clustering (GMITC). Finally, we evaluate the proposed algorithms on the Eastwest datasets against several baselines. The experimental results indicate that the proposed algorithms achieve higher clustering accuracy and run much faster
A Principled Approach for Learning Task Similarity in Multitask Learning
Multitask learning aims at solving a set of related tasks simultaneously, by
exploiting the shared knowledge for improving the performance on individual
tasks. Hence, an important aspect of multitask learning is to understand the
similarities within a set of tasks. Previous works have incorporated this
similarity information explicitly (e.g., weighted loss for each task) or
implicitly (e.g., adversarial loss for feature adaptation), for achieving good
empirical performances. However, the theoretical motivations for adding task
similarity knowledge are often missing or incomplete. In this paper, we give a
different perspective from a theoretical point of view to understand this
practice. We first provide an upper bound on the generalization error of
multitask learning, showing the benefit of explicit and implicit task
similarity knowledge. We systematically derive the bounds based on two distinct
task similarity metrics: H divergence and Wasserstein distance. From these
theoretical results, we revisit the Adversarial Multi-task Neural Network,
proposing a new training algorithm to learn the task relation coefficients and
neural network parameters iteratively. We assess our new algorithm empirically
on several benchmarks, showing not only that we find interesting and robust
task relations, but that the proposed approach outperforms the baselines,
reaffirming the benefits of theoretical insight in algorithm design
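
The "weighted loss for each task" idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the paper's training algorithm: the function names `normalize` and `weighted_multitask_loss` are hypothetical, and the assumption that the task relation coefficients are non-negative and sum to one is made here only for illustration.

```python
from typing import List, Sequence


def normalize(weights: Sequence[float]) -> List[float]:
    """Scale non-negative weights so they sum to 1 (assumed constraint)."""
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return [w / total for w in weights]


def weighted_multitask_loss(task_losses: Sequence[float],
                            alpha: Sequence[float]) -> float:
    """Combine per-task losses with task relation coefficients alpha."""
    if len(task_losses) != len(alpha):
        raise ValueError("one coefficient is needed per task")
    return sum(a * loss for a, loss in zip(alpha, task_losses))


if __name__ == "__main__":
    # Hypothetical per-task losses and unnormalized similarity scores.
    alpha = normalize([1.0, 1.0, 2.0])
    print(weighted_multitask_loss([0.8, 0.4, 0.2], alpha))
```

In an iterative scheme of the kind the abstract describes, coefficients such as `alpha` and the network parameters would be updated in alternation; the update rules themselves are not given in the abstract and are not sketched here.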
MMFL-Net: Multi-scale and Multi-granularity Feature Learning for Cross-domain Fashion Retrieval
Instance-level image retrieval in fashion is a challenging problem of
increasing importance in real-world visual fashion search. Cross-domain
fashion retrieval aims to match unconstrained customer images, used as queries,
against photographs provided by retailers; however, it is a difficult task due to a
wide range of consumer-to-shop (C2S) domain discrepancies and because
clothing images are vulnerable to various non-rigid deformations. To this
end, we propose a novel multi-scale and multi-granularity feature learning
network (MMFL-Net), which can jointly learn global-local aggregation feature
representations of clothing images in a unified framework, aiming to train a
cross-domain model for C2S fashion visual similarity. First, a new
semantic-spatial feature fusion part is designed to bridge the semantic-spatial
gap by applying top-down and bottom-up bidirectional multi-scale feature
fusion. Next, a multi-branch deep network architecture is introduced to capture
global salient, part-informed, and local detailed information, and to extract
robust and discriminative feature embeddings by integrating the similarity
learning of coarse-to-fine embeddings at multiple granularities. Finally,
the improved trihard loss, center loss, and multi-task classification loss are
adopted for our MMFL-Net, which can jointly optimize intra-class and
inter-class distance and thus explicitly improve intra-class compactness and
inter-class discriminability between its visual representations for feature
learning. Furthermore, our proposed model also combines the multi-task
attribute recognition and classification module with multi-label semantic
attributes and product ID labels. Experimental results demonstrate that our
proposed MMFL-Net achieves significant improvement over the state-of-the-art
methods on the two datasets, DeepFashion-C2S and Street2Shop.
Comment: 27 pages, 12 figures. Published in Multimedia Tools and Applications
FiLM: Visual Reasoning with a General Conditioning Layer
We introduce a general-purpose conditioning method for neural networks called
FiLM: Feature-wise Linear Modulation. FiLM layers influence neural network
computation via a simple, feature-wise affine transformation based on
conditioning information. We show that FiLM layers are highly effective for
visual reasoning - answering image-related questions which require a
multi-step, high-level process - a task which has proven difficult for standard
deep learning methods that do not explicitly model reasoning. Specifically, we
show on visual reasoning tasks that FiLM layers 1) halve state-of-the-art error
for the CLEVR benchmark, 2) modulate features in a coherent manner, 3) are
robust to ablations and architectural modifications, and 4) generalize well to
challenging, new data from few examples or even zero-shot.
Comment: AAAI 2018. Code available at http://github.com/ethanjperez/film .
Extends arXiv:1707.0301
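
The feature-wise affine transformation described in this abstract can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' implementation: `film` and `film_generator` are hypothetical names, and the single linear map from the conditioning vector to the scale and shift parameters is an assumption made for brevity.

```python
from typing import List, Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def film(features: Matrix, gamma: Sequence[float],
         beta: Sequence[float]) -> List[List[float]]:
    """Apply gamma[c] * x + beta[c] to every value x of feature channel c."""
    if not (len(features) == len(gamma) == len(beta)):
        raise ValueError("need one (gamma, beta) pair per feature channel")
    return [[g * x + b for x in channel]
            for channel, g, b in zip(features, gamma, beta)]


def film_generator(condition: Sequence[float], w_gamma: Matrix,
                   w_beta: Matrix) -> Tuple[List[float], List[float]]:
    """Hypothetical linear generator: conditioning vector -> (gamma, beta)."""
    gamma = [sum(w * c for w, c in zip(row, condition)) for row in w_gamma]
    beta = [sum(w * c for w, c in zip(row, condition)) for row in w_beta]
    return gamma, beta


if __name__ == "__main__":
    # Two feature channels with two values each; a 2-d conditioning vector.
    feats = [[1.0, 2.0], [3.0, 4.0]]
    gamma, beta = film_generator([1.0, 2.0],
                                 w_gamma=[[1.0, 0.0], [0.0, 1.0]],
                                 w_beta=[[0.0, 0.0], [1.0, 1.0]])
    print(film(feats, gamma, beta))
```

The point of the sketch is that the conditioning information never touches individual activations directly; it only sets one scale and one shift per feature channel.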