Dual Embedding Expansion for Vehicle Re-identification
Vehicle re-identification plays a crucial role in the management of
transportation infrastructure and traffic flow. However, this is a challenging
task due to large viewpoint variations in appearance, as well as environmental and
instance-related factors. Modern systems deploy CNNs to produce unique
representations from the images of each vehicle instance. Most work focuses on
leveraging new losses and network architectures to improve the descriptiveness
of these representations. In contrast, our work concentrates on re-ranking and
embedding expansion techniques. We propose an efficient approach for combining
the outputs of multiple models at various scales while exploiting tracklet and
neighbor information, called dual embedding expansion (DEx). Additionally, a
comparative study of several common image retrieval techniques is presented in
the context of vehicle re-ID. Our system yields competitive performance in the
2020 NVIDIA AI City Challenge with promising results. We demonstrate that DEx,
when combined with other re-ranking techniques, can produce an even larger gain
without any additional attribute labels or manual supervision.
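The abstract above mentions exploiting tracklet and neighbor information to expand embeddings. The following is a minimal sketch of two generic retrieval techniques in that spirit (tracklet averaging and nearest-neighbor averaging); it is not the authors' DEx implementation. The function names, the cosine similarity measure, and the parameter `k` are assumptions made for illustration.

```python
import numpy as np


def l2_normalize(x):
    """Scale each row to unit L2 norm."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)


def tracklet_average(embeddings, tracklet_ids):
    """Replace each embedding with the mean of its tracklet (hypothetical helper)."""
    out = np.empty_like(embeddings)
    for t in np.unique(tracklet_ids):
        mask = tracklet_ids == t
        out[mask] = embeddings[mask].mean(axis=0)
    return l2_normalize(out)


def neighbor_expansion(embeddings, k=2):
    """Average each embedding with its nearest neighbors by cosine similarity.

    The top k + 1 entries per row normally include the sample itself,
    since its self-similarity is 1 after normalization.
    """
    e = l2_normalize(embeddings)
    sims = e @ e.T
    idx = np.argsort(-sims, axis=1)[:, : k + 1]
    return l2_normalize(e[idx].mean(axis=1))
```

Both functions return unit-norm embeddings, so they can be chained or followed by a separate re-ranking step.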
MLA-BIN: Model-level Attention and Batch-instance Style Normalization for Domain Generalization of Federated Learning on Medical Image Segmentation
The privacy protection mechanism of federated learning (FL) offers an
effective solution for cross-center medical collaboration and data sharing. In
multi-site medical image segmentation, each medical site serves as a client of
FL, and its data naturally forms a domain. FL offers the possibility of
improving the model's performance on seen domains. However, domain
generalization (DG) remains a problem in actual deployment: the performance
of a model trained by FL decreases in unseen domains. Hence, MLA-BIN is
proposed to solve the DG of FL in this study. Specifically, the model-level
attention module (MLA) and batch-instance style normalization (BIN) block were
designed. The MLA represents the unseen domain as a linear combination of seen
domain models. The attention mechanism is introduced for the weighting
coefficient to obtain the optimal coefficient according to the similarity of
inter-domain data features. MLA enables the global model to generalize to
unseen domains. In the BIN block, batch normalization (BN) and instance
normalization (IN) are combined in the shallow layers of the
segmentation network for style normalization, mitigating the influence of
inter-domain image style differences on DG. Extensive experimental results
on two medical image segmentation tasks demonstrate that the proposed MLA-BIN
outperforms state-of-the-art methods.
Comment: 9 pages, 8 figures, 2 table
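The two ideas in the abstract above (a similarity-weighted linear combination of seen-domain models, and a blend of BN and IN) can be illustrated with a small NumPy sketch. This is not the authors' code: the cosine similarity, the softmax weighting, the flat parameter vectors, and the fixed gate `rho` are all assumptions chosen to keep the example short.

```python
import numpy as np


def attention_weights(query_feat, domain_feats):
    """Softmax over cosine similarities between one unseen-domain feature
    vector and one feature vector per seen domain (assumed similarity)."""
    q = query_feat / np.linalg.norm(query_feat)
    d = domain_feats / np.linalg.norm(domain_feats, axis=1, keepdims=True)
    scores = d @ q
    exp = np.exp(scores - scores.max())
    return exp / exp.sum()


def combine_models(weights, domain_params):
    """Linear combination of per-domain parameter vectors, one row per domain."""
    return np.tensordot(weights, domain_params, axes=1)


def batch_instance_norm(x, rho=0.5, eps=1e-5):
    """Blend batch-normalized and instance-normalized activations.

    x has shape (N, C, H, W); rho in [0, 1] is a hypothetical fixed gate
    (rho = 1 gives pure BN statistics, rho = 0 pure IN statistics).
    """
    bn_mean = x.mean(axis=(0, 2, 3), keepdims=True)
    bn_var = x.var(axis=(0, 2, 3), keepdims=True)
    in_mean = x.mean(axis=(2, 3), keepdims=True)
    in_var = x.var(axis=(2, 3), keepdims=True)
    x_bn = (x - bn_mean) / np.sqrt(bn_var + eps)
    x_in = (x - in_mean) / np.sqrt(in_var + eps)
    return rho * x_bn + (1.0 - rho) * x_in
```

In a trained network the gate and the affine scale and shift would typically be learned per channel; they are omitted here for brevity.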
Automated Synthetic-to-Real Generalization
Models trained on synthetic images often face degraded generalization to real
data. By convention, these models are often initialized with ImageNet
pre-trained representations. Yet the role of ImageNet knowledge is seldom
discussed despite common practices that leverage this knowledge to maintain the
generalization ability. An example is the careful hand-tuning of early stopping
and layer-wise learning rates, which is shown to improve synthetic-to-real
generalization but is also laborious and heuristic. In this work, we explicitly
encourage the synthetically trained model to maintain similar representations
with the ImageNet pre-trained model, and propose a learning-to-optimize
(L2O) strategy to automate the selection of layer-wise learning rates. We
demonstrate that the proposed framework can significantly improve the
synthetic-to-real generalization performance without seeing and training on
real data, while also benefiting downstream tasks such as domain adaptation.
Code is available at: https://github.com/NVlabs/ASG.
Comment: Accepted to ICML 202
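As a rough illustration of the two ingredients named in the abstract above (keeping representations close to a pre-trained reference, and using layer-wise learning rates), here is a minimal NumPy sketch. It is not taken from the linked repository: the squared-distance penalty, the weight `lam`, and the function names are assumptions, and the learning rates are passed in by hand rather than selected by an L2O policy.

```python
import numpy as np


def representation_gap(feats, reference_feats):
    """Mean squared distance between L2-normalized feature rows
    (an assumed similarity measure, chosen for simplicity)."""
    a = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    b = reference_feats / np.linalg.norm(reference_feats, axis=1, keepdims=True)
    return float(np.mean(np.sum((a - b) ** 2, axis=1)))


def total_loss(task_loss, feats, reference_feats, lam=0.1):
    """Task loss plus a penalty for drifting from the reference features."""
    return task_loss + lam * representation_gap(feats, reference_feats)


def layerwise_sgd_step(params, grads, layer_lrs):
    """One SGD step with a separate learning rate per named layer."""
    return {name: params[name] - layer_lrs[name] * grads[name] for name in params}
```

For example, `layerwise_sgd_step(params, grads, {"conv1": 0.01, "fc": 0.1})` updates a hypothetical early layer more conservatively than the final layer.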