Deep Domain-Adversarial Image Generation for Domain Generalisation
Machine learning models typically suffer from the domain shift problem when
trained on a source dataset and evaluated on a target dataset of different
distribution. To overcome this problem, domain generalisation (DG) methods aim
to leverage data from multiple source domains so that a trained model can
generalise to unseen domains. In this paper, we propose a novel DG approach
based on \emph{Deep Domain-Adversarial Image Generation} (DDAIG). Specifically,
DDAIG consists of three components, namely a label classifier, a domain
classifier and a domain transformation network (DoTNet). The goal for DoTNet is
to map the source training data to unseen domains. This is achieved by having a
learning objective formulated to ensure that the generated data can be
correctly classified by the label classifier while fooling the domain
classifier. By augmenting the source training data with the generated unseen
domain data, we can make the label classifier more robust to unknown domain
changes. Extensive experiments on four DG datasets demonstrate the
effectiveness of our approach.
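The adversarial objective described above (classify correctly, fool the domain classifier) can be sketched as a single scalar loss. This is a minimal illustration, not the paper's implementation: the cross-entropy form and the trade-off weight `lam` are assumptions.

```python
import numpy as np

def cross_entropy(probs, label):
    # negative log-likelihood of the true class
    return -np.log(probs[label] + 1e-12)

def ddaig_transformer_loss(label_probs, domain_probs, y, d, lam=0.5):
    """Sketch of a DoTNet-style objective: the transformed image should
    still be classified correctly (minimize the label loss) while fooling
    the domain classifier (maximize its loss, hence the minus sign).
    `lam` is a hypothetical trade-off weight."""
    return cross_entropy(label_probs, y) - lam * cross_entropy(domain_probs, d)
```

Under this sketch, a transformation that leaves the label prediction confident but makes the domain prediction uncertain yields a lower loss, which is exactly the behavior the abstract describes.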
People detection and re-identification from a stationary camera located indoors
The goal of this thesis is the creation of a system that is able to detect and track persons using information from a stationary camera. The system is also able to extract biometric information such as age and gender from the detections. This can be useful, for example, in a commercial setting, where a retail store can use this information to predict customer behavior and/or plan marketing strategies.
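The per-person aggregation step such a system needs (turning noisy per-frame age/gender predictions into one estimate per tracked person) can be sketched as follows. The input record layout is a hypothetical assumption, not taken from the thesis.

```python
from collections import Counter, defaultdict

def aggregate_attributes(tracked_detections):
    """Sketch: each detection carries a track id plus per-frame age/gender
    guesses; the per-track gender majority vote and mean age give a more
    stable estimate than any single frame. Data layout is assumed."""
    genders, ages = defaultdict(Counter), defaultdict(list)
    for det in tracked_detections:
        genders[det["track_id"]][det["gender"]] += 1
        ages[det["track_id"]].append(det["age"])
    return {tid: {"gender": genders[tid].most_common(1)[0][0],
                  "age": sum(ages[tid]) / len(ages[tid])}
            for tid in genders}
```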
A Strong Baseline for Fashion Retrieval with Person Re-Identification Models
Fashion retrieval is the challenging task of finding an exact match for
fashion items contained within an image. Difficulties arise from the fine-grained nature of clothing items and from very large intra-class and inter-class variance. Additionally, query and source images for the task usually come from
different domains - street photos and catalogue photos respectively. Due to
these differences, a significant gap in quality, lighting, contrast, background
clutter and item presentation exists between domains. As a result, fashion
retrieval is an active field of research both in academia and the industry.
Inspired by recent advancements in Person Re-Identification research, we
adapt leading ReID models to be used in fashion retrieval tasks. We introduce a
simple baseline model for fashion retrieval, significantly outperforming
previous state-of-the-art results despite a much simpler architecture. We
conduct in-depth experiments on Street2Shop and DeepFashion datasets and
validate our results. Finally, we propose a cross-domain (cross-dataset)
evaluation method to test the robustness of fashion retrieval models.
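The core retrieval step behind such ReID-style baselines can be sketched as plain embedding ranking. The cosine-similarity ranking below is a generic illustration, not the paper's exact pipeline, and the embeddings are assumed to come from some trained backbone.

```python
import numpy as np

def retrieve(query_emb, gallery_embs, k=5):
    """Sketch: rank catalogue (gallery) items by cosine similarity to a
    street-photo query embedding and return the top-k gallery indices."""
    q = query_emb / (np.linalg.norm(query_emb) + 1e-12)
    g = gallery_embs / (np.linalg.norm(gallery_embs, axis=1, keepdims=True) + 1e-12)
    sims = g @ q                  # cosine similarity of each gallery item to the query
    return np.argsort(-sims)[:k]  # indices sorted from most to least similar
```

Cross-domain (street-to-shop) evaluation then amounts to drawing queries and gallery from different sources and checking whether the true catalogue item appears in the top-k.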
SparseTrack: Multi-Object Tracking by Performing Scene Decomposition based on Pseudo-Depth
Exploring robust and efficient association methods has always been an
important issue in multiple-object tracking (MOT). Although existing tracking
methods have achieved impressive performance, congestion and frequent
occlusions still pose challenging problems in multi-object tracking. We reveal
that performing sparse decomposition on dense scenes is a crucial step to
enhance the performance of associating occluded targets. To this end, we first propose a pseudo-depth estimation method for obtaining the relative depth of targets from 2D images. Second, we design a depth cascading matching (DCM)
algorithm, which can use the obtained depth information to convert a dense
target set into multiple sparse target subsets and perform data association on
these sparse target subsets in order from near to far. By integrating the
pseudo-depth method and the DCM strategy into the data association process, we
propose a new tracker, called SparseTrack. SparseTrack provides a new
perspective for solving the challenging crowded scene MOT problem. Only using
IoU matching, SparseTrack achieves comparable performance with the
state-of-the-art (SOTA) methods on the MOT17 and MOT20 benchmarks. Code and
models are publicly available at \url{https://github.com/hustvl/SparseTrack}.
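The pseudo-depth idea and the cascaded matching can be sketched as follows. The bottom-edge depth proxy, the uniform depth levels, and the greedy IoU matching are simplifying assumptions for illustration, not the repository's implementation.

```python
def iou(a, b):
    # intersection-over-union of two (x1, y1, x2, y2) boxes
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-12)

def pseudo_depth(box, img_h):
    # proxy: a target whose bottom edge sits higher in the image is farther away
    return img_h - box[3]

def depth_cascade_match(tracks, dets, img_h, n_levels=3, iou_thr=0.3):
    """Sketch of depth-cascaded matching: split tracks and detections into
    depth levels, then greedily IoU-match within each level from near to far,
    so each dense scene is associated as several sparse subsets."""
    step = img_h / n_levels

    def level(box):
        return min(int(pseudo_depth(box, img_h) / step), n_levels - 1)

    matches, used_dets = [], set()
    for lvl in range(n_levels):  # near (level 0) to far
        t_idx = [i for i, t in enumerate(tracks) if level(t) == lvl]
        d_idx = [j for j, d in enumerate(dets)
                 if level(d) == lvl and j not in used_dets]
        pairs = sorted(((iou(tracks[i], dets[j]), i, j)
                        for i in t_idx for j in d_idx), reverse=True)
        used_tracks = set()
        for score, i, j in pairs:
            if score >= iou_thr and i not in used_tracks and j not in used_dets:
                matches.append((i, j))
                used_tracks.add(i)
                used_dets.add(j)
    return matches
```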
Learning to Generate Novel Domains for Domain Generalization
This paper focuses on domain generalization (DG), the task of learning from
multiple source domains a model that generalizes well to unseen domains. A main
challenge for DG is that the available source domains often exhibit limited
diversity, hampering the model's ability to learn to generalize. We therefore
employ a data generator to synthesize data from pseudo-novel domains to augment
the source domains. This explicitly increases the diversity of available
training domains and leads to a more generalizable model. To train the
generator, we model the distribution divergence between source and synthesized
pseudo-novel domains using optimal transport, and maximize the divergence. To
ensure that semantics are preserved in the synthesized data, we further impose
cycle-consistency and classification losses on the generator. Our method, L2A-OT (Learning to Augment by Optimal Transport), outperforms current state-of-the-art DG methods on four benchmark datasets.
Learning Domain Invariant Representations for Generalizable Person Re-Identification
Generalizable person Re-Identification (ReID) has recently attracted growing attention in the computer vision community. In this work, we construct a structural causal model among identity labels, identity-specific factors (clothes/shoes color, etc.), and domain-specific factors (background, viewpoints, etc.). According
to the causal analysis, we propose a novel Domain Invariant Representation
Learning for generalizable person Re-Identification (DIR-ReID) framework.
Specifically, we first propose to disentangle the identity-specific and
domain-specific feature spaces, based on which we propose an effective
algorithmic implementation for backdoor adjustment, essentially serving as a
causal intervention towards the SCM. Extensive experiments have been conducted,
showing that DIR-ReID outperforms state-of-the-art methods on large-scale
domain generalization ReID benchmarks.
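The backdoor adjustment mentioned above has a simple closed form, P(y | do(x)) = Σ_d P(y | x, d) P(d): average the per-domain predictions weighted by the domain priors rather than by each domain's co-occurrence with x. The sketch below is an illustration of that formula, not DIR-ReID's algorithmic implementation; the data layout is an assumption.

```python
import numpy as np

def backdoor_adjust(per_domain_preds, domain_priors):
    """Illustrative backdoor adjustment: each entry of `per_domain_preds`
    is the class distribution predicted with one domain-specific factor
    held fixed; weighting by the domain priors P(d) removes the
    confounding path through the domain."""
    return sum(p * w for p, w in zip(per_domain_preds, domain_priors))
```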