    Beyond Intra-modality: A Survey of Heterogeneous Person Re-identification

    An efficient and effective person re-identification (ReID) system relieves users of painful and boring video watching and accelerates the process of video analysis. Recently, with the explosive demands of practical applications, much research effort has been dedicated to heterogeneous person re-identification (Hetero-ReID). In this paper, we provide a comprehensive review of state-of-the-art Hetero-ReID methods that address the challenge of inter-modality discrepancies. According to the application scenario, we classify the methods into four categories: low-resolution, infrared, sketch, and text. We begin with an introduction to ReID and compare the Homogeneous ReID (Homo-ReID) and Hetero-ReID tasks. Then, we describe and compare existing datasets for performing evaluations, and survey the models that have been widely employed in Hetero-ReID. We also summarize and compare the representative approaches from two perspectives, i.e., the application scenario and the learning pipeline. We conclude with a discussion of some future research directions. Follow-up updates are available at: https://github.com/lightChaserX/Awesome-Hetero-reID
    Comment: Accepted by IJCAI 2020.

    A Survey of Dataset Refinement for Problems in Computer Vision Datasets

    Large-scale datasets have played a crucial role in the advancement of computer vision. However, they often suffer from problems such as class imbalance, noisy labels, dataset bias, or high resource costs, which can inhibit model performance and reduce trustworthiness. With the advocacy of data-centric research, various data-centric solutions have been proposed to solve the dataset problems mentioned above. They improve the quality of datasets by re-organizing them, which we call dataset refinement. In this survey, we provide a comprehensive and structured overview of recent advances in dataset refinement for problematic computer vision datasets. First, we summarize and analyze the various problems encountered in large-scale computer vision datasets. Then, we classify the dataset refinement algorithms into three categories based on the refinement process: data sampling, data subset selection, and active learning. In addition, we organize these dataset refinement methods according to the data problems they address and provide a systematic comparative description. We point out that these three types of dataset refinement have distinct advantages and disadvantages for dataset problems, which informs the choice of the data-centric method appropriate to a particular research objective. Finally, we summarize the current literature and propose potential future research topics.
    Comment: 33 pages, 10 figures, to be published in ACM Computing Survey
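As a rough illustration of the "data sampling" category applied to class imbalance, the sketch below resamples indices with probability inversely proportional to class frequency. It is a generic, minimal example and not a method taken from the survey; the function name `balanced_resample` and the toy labels are hypothetical.

```python
import random
from collections import Counter


def balanced_resample(labels, n_samples, seed=0):
    """Draw indices so that each class is (in expectation) equally represented.

    Hypothetical helper for illustration: every sample gets weight
    1 / (size of its class), so rare classes are drawn more often.
    """
    counts = Counter(labels)
    weights = [1.0 / counts[y] for y in labels]
    rng = random.Random(seed)
    # Sampling with replacement from the index range, using per-sample weights.
    return rng.choices(range(len(labels)), weights=weights, k=n_samples)


# Toy imbalanced dataset (assumed for the example): 90 "cat" vs 10 "dog".
labels = ["cat"] * 90 + ["dog"] * 10
idx = balanced_resample(labels, 1000)
resampled = Counter(labels[i] for i in idx)
```

After resampling, `resampled` should hold roughly equal counts for both classes, although the minority samples are repeated many times, which is one of the trade-offs such sampling schemes carry.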

    MetaGCD: Learning to Continually Learn in Generalized Category Discovery

    In this paper, we consider a real-world scenario where a model that is trained on pre-defined classes continually encounters unlabeled data that contains both known and novel classes. The goal is to continually discover novel classes while maintaining performance on known classes. We name the setting Continual Generalized Category Discovery (C-GCD). Existing methods for novel class discovery cannot directly handle the C-GCD setting due to some unrealistic assumptions, such as the unlabeled data only containing novel classes. Furthermore, they fail to discover novel classes in a continual fashion. In this work, we lift all these assumptions and propose an approach, called MetaGCD, to learn how to incrementally discover with less forgetting. Our proposed method uses a meta-learning framework and leverages the offline labeled data to simulate the testing incremental learning process. A meta-objective is defined to revolve around two conflicting learning objectives to achieve novel class discovery without forgetting. Furthermore, a soft neighborhood-based contrastive network is proposed to discriminate uncorrelated images while attracting correlated images. We build strong baselines and conduct extensive experiments on three widely used benchmarks to demonstrate the superiority of our method.
    Comment: This paper has been accepted by ICCV202

    Intermediate intraseasonal variability in the western tropical Pacific Ocean: meridional distribution of equatorial Rossby waves influenced by a tilted boundary

    Author Posting. © American Meteorological Society, 2020. This article is posted here by permission of American Meteorological Society for personal use, not for redistribution. The definitive version was published in Journal of Physical Oceanography 50(4), (2020): 921-933, doi:10.1175/JPO-D-19-0184.1.
    Intermediate-depth intraseasonal variability (ISV) at a 20–90-day period, as detected in velocity measurements from seven subsurface moorings in the tropical western Pacific, is interpreted in terms of equatorial Rossby waves. The moorings were deployed between 0° and 7.5°N along 142°E from September 2014 to October 2015. The strongest ISV energy at 1200 m occurs at 4.5°N. Peak energy at 4.5°N is also seen in an eddy-resolving global circulation model. An analysis of the model output identifies the source of the ISV as short equatorial Rossby waves with westward phase speed but southeastward and downward group velocity. Additionally, it is shown that a superposition of the first three baroclinic modes is required to represent the ISV energy propagation. Further analysis using a 1.5-layer shallow water model suggests that the first meridional mode Rossby wave accounts for the specific meridional distribution of ISV in the western Pacific. The same model suggests that the tilted coastlines of Irian Jaya and Papua New Guinea, which lie to the south of the moorings, shift the location of the northern peak of meridional velocity oscillation from 3°N to near 4.5°N. The tilt of this boundary with respect to a purely zonal alignment therefore needs to be taken into account to explain this meridional shift of the peak. Calculation of the barotropic conversion rate indicates that the intraseasonal kinetic energy below 1000 m can be transferred into the mean flows, suggesting a possible forcing mechanism for intermediate-depth zonal jets.
    This study is supported by the National Natural Science Foundation of China (Grants 91958204 and 41776022), the China Ocean Mineral Resources Research and Development Association Program (DY135-E2-3-02), and the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant XDA22000000). L. Pratt was supported by the U.S. National Science Foundation Grant OCE-1657870. F. Wang acknowledges support from the Scientific and Technological Innovation Project by Qingdao National Laboratory for Marine Science and Technology (Grant 2016ASKJ12), the National Program on Global Change and Air-Sea Interaction (Grant GASI-IPOVAI-01-01), and the National Natural Science Foundation of China (Grants 41730534, 41421005, and U1406401).
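The statement that short equatorial Rossby waves have westward phase speed but an eastward component of group velocity can be illustrated with the standard textbook low-frequency dispersion relation on the equatorial beta-plane. This is a sketch under stated assumptions, not the analysis of the paper: the symbols $\omega$ (frequency), $k$ (zonal wavenumber), $\beta$ (meridional gradient of the Coriolis parameter), $c$ (gravity wave speed of a given baroclinic mode) and $n$ (meridional mode number) are introduced here for illustration, and the southward and downward components of the group velocity are not covered.

```latex
% Low-frequency equatorial Rossby wave dispersion relation (textbook form)
\omega = -\frac{\beta k}{k^{2} + (2n+1)\,\beta/c}

% Zonal phase speed: negative for all k, i.e. always westward
\frac{\omega}{k} = -\frac{\beta}{k^{2} + (2n+1)\,\beta/c} < 0

% Zonal group velocity: positive (eastward) only for short waves
\frac{\partial \omega}{\partial k}
  = \frac{\beta\left[k^{2} - (2n+1)\,\beta/c\right]}
         {\left[k^{2} + (2n+1)\,\beta/c\right]^{2}}
  > 0 \quad \text{when} \quad k^{2} > (2n+1)\,\beta/c
```

Under this approximation, waves short enough that $k^{2} > (2n+1)\beta/c$ carry energy eastward even though their phase propagates westward, which is consistent with the behaviour described in the abstract.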

    Multi-level feature fusion network combining attention mechanisms for polyp segmentation

    Clinically, automated polyp segmentation techniques have the potential to significantly improve the efficiency and accuracy of medical diagnosis, thereby reducing the risk of colorectal cancer in patients. Unfortunately, existing methods suffer from two significant weaknesses that can impact the accuracy of segmentation. First, features extracted by encoders are not adequately filtered and utilized. Second, semantic conflicts and information redundancy caused by feature fusion are not addressed. To overcome these limitations, we propose a novel approach for polyp segmentation, named MLFF-Net, which leverages multi-level feature fusion and attention mechanisms. Specifically, MLFF-Net comprises three modules: Multi-scale Attention Module (MAM), High-level Feature Enhancement Module (HFEM), and Global Attention Module (GAM). Among these, MAM is used to extract multi-scale information and polyp details from the shallow output of the encoder. In HFEM, the deep features of the encoders complement each other by aggregation. Meanwhile, the attention mechanism redistributes the weight of the aggregated features, weakening the conflicting redundant parts and highlighting the information useful to the task. GAM combines features from the encoder and decoder and computes global dependencies to avoid the locality of the receptive field. Experimental results on five public datasets show that the proposed method can not only segment multiple types of polyps but also has advantages over current state-of-the-art methods in both accuracy and generalization ability.