40 research outputs found

    YYDS: Visible-Infrared Person Re-Identification with Coarse Descriptions

    Visible-infrared person re-identification (VI-ReID) is challenging due to considerable cross-modality discrepancies. Existing works mainly focus on learning modality-invariant features while suppressing modality-specific ones. However, retrieving visible images based only on infrared samples is an extreme problem because of the absence of color information. To this end, we present the Refer-VI-ReID setting, which aims to match target visible images from both infrared images and coarse language descriptions (e.g., "a man with red top and black pants") to complement the missing color information. To address this task, we design a Y-Y-shape decomposition structure, dubbed YYDS, to decompose and aggregate texture and color features of targets. Specifically, a text-IoU regularization strategy is first presented to facilitate the decomposition training, and a joint relation module is then proposed to infer the aggregation. Furthermore, a cross-modal version of the k-reciprocal re-ranking algorithm, named CMKR, is investigated, in which three neighbor search strategies and one local query expansion method are explored to alleviate the modality bias problem of the near neighbors. We conduct experiments on the SYSU-MM01, RegDB and LLCM datasets with our manually annotated descriptions. Both YYDS and CMKR achieve remarkable improvements over SOTA methods on all three datasets. Codes are available at https://github.com/dyhBUPT/YYDS.
    Comment: 14 pages, 6 figures

    iKUN: Speak to Trackers without Retraining

    Referring multi-object tracking (RMOT) aims to track multiple objects based on input textual descriptions. Previous works realize it by simply integrating an extra textual module into the multi-object tracker. However, they typically need to retrain the entire framework and have difficulties in optimization. In this work, we propose an insertable Knowledge Unification Network, termed iKUN, to enable communication with off-the-shelf trackers in a plug-and-play manner. Concretely, a knowledge unification module (KUM) is designed to adaptively extract visual features based on textual guidance. Meanwhile, to improve the localization accuracy, we present a neural version of the Kalman filter (NKF) to dynamically adjust process noise and observation noise based on the current motion status. Moreover, to address the problem of the open-set long-tail distribution of textual descriptions, a test-time similarity calibration method is proposed to refine the confidence score with pseudo frequency. Extensive experiments on the Refer-KITTI dataset verify the effectiveness of our framework. Finally, to speed up the development of RMOT, we also contribute a more challenging dataset, Refer-Dance, by extending the public DanceTrack dataset with motion and dressing descriptions. The codes and dataset are available at https://github.com/dyhBUPT/iKUN.
    Comment: CVPR 2024 camera-ready

    Video-based Visible-Infrared Person Re-Identification with Auxiliary Samples

    Visible-infrared person re-identification (VI-ReID) aims to match persons captured by visible and infrared cameras, allowing person retrieval and tracking in 24-hour surveillance systems. Previous methods focus on learning from cross-modality person images in different cameras. However, temporal information and single-camera samples tend to be neglected. To crack this nut, in this paper, we first contribute a large-scale VI-ReID dataset named BUPTCampus. Different from most existing VI-ReID datasets, it 1) collects tracklets instead of images to introduce rich temporal information, 2) contains pixel-aligned cross-modality sample pairs for better modality-invariant learning, and 3) provides one auxiliary set to help enhance the optimization, in which each identity only appears in a single camera. Based on our constructed dataset, we present a two-stream framework as a baseline and apply a Generative Adversarial Network (GAN) to narrow the gap between the two modalities. To exploit the advantages introduced by the auxiliary set, we propose a curriculum-learning-based strategy to jointly learn from both primary and auxiliary sets. Moreover, we design a novel temporal k-reciprocal re-ranking method to refine the ranking list with fine-grained temporal correlation cues. Experimental results demonstrate the effectiveness of the proposed methods. We also reproduce 9 state-of-the-art image-based and video-based VI-ReID methods on BUPTCampus, and our methods show substantial superiority over them. The codes and dataset are available at: https://github.com/dyhBUPT/BUPTCampus.
    Comment: Accepted by Transactions on Information Forensics & Security 202

    Ad-UDDI: An Active and Distributed Service Registry

    In SOA (Service Oriented Architecture), web service providers use service registries to publish services and requestors use registries to find them. The major current service registry specification, UDDI (Universal Description, Discovery and Integration), has the following drawbacks. First, it replicates all public service publications in all UBR (Universal Business Registry) nodes, which is neither scalable nor efficient. Second, it collects service information in a passive manner: it waits for service publication, update or discovery requests and thus cannot guarantee the real-time validity of the service information. In this paper, we propose an active and distributed UDDI architecture called Ad-UDDI, which extends and organizes the private or semi-private UDDIs based on industry classifications. Further, Ad-UDDI adopts an active monitoring mechanism, so that service information can be updated automatically and service requestors can conveniently find the latest service information. We evaluate Ad-UDDI by comprehensive simulations, and experimental results show that it outperforms existing approaches significantly.

    Pushing the Limits of Machine Design: Automated CPU Design with AI

    Design activity -- constructing an artifact description satisfying given goals and constraints -- distinguishes humanity from other animals and traditional machines, and endowing machines with design abilities at the human level or beyond has been a long-term pursuit. Though machines have already demonstrated their abilities in designing new materials, proteins, and computer programs with advanced artificial intelligence (AI) techniques, the search space for designing such objects is relatively small, and thus, "Can machines design like humans?" remains an open question. To explore the boundary of machine design, here we present a new AI approach to automatically design a central processing unit (CPU), the brain of a computer, and one of the world's most intricate devices humanity has ever designed. This approach generates the circuit logic, which is represented by a graph structure called Binary Speculation Diagram (BSD), of the CPU design from only external input-output observations instead of formal program code. During the generation of BSD, Monte Carlo-based expansion and the distance of Boolean functions are used to guarantee accuracy and efficiency, respectively. By efficiently exploring a search space of unprecedented size 10^{10^{540}}, which is the largest one of all machine-designed objects to the best of our knowledge, and thus pushing the limits of machine design, our approach generates an industrial-scale RISC-V CPU within only 5 hours. The taped-out CPU successfully runs the Linux operating system and performs comparably against the human-designed Intel 80486SX CPU. In addition to learning the world's first CPU only from input-output observations, which may reform the semiconductor industry by significantly reducing the design cycle, our approach even autonomously discovers human knowledge of the von Neumann architecture.
    Comment: 28 pages

    Surface-initiated Cu(0) mediated controlled radical polymerization (SI-CuCRP) using a copper plate

    Surface engineering with polymer brushes has become one of the most versatile techniques to tailor surface properties of substrates for a broad variety of (bio-) technological applications. We report on a new facile approach to prepare defined and dense polymer brushes on planar substrates by surface-initiated Cu(0) mediated controlled radical polymerization (SI-CuCRP) of numerous vinyl monomers using a copper plate at room temperature. The fabrication of a variety of homo-, block, gradient and patterned polymer brushes as well as polymer brush arrays is demonstrated. The SI-CuCRP was found to be strictly surface-confined and of highly living character, to proceed remarkably fast, and to result in polymer brushes of very high grafting densities. The brush layer thickness can be modulated by the polymerization time or by the distance of the copper plate to the modified substrate. As the copper plate can be reused multiple times, no additional copper salts are added and only a minimal amount of chemicals is needed, the simple and low-cost experimental conditions allow researchers from various fields to prepare tailored polymer brush surfaces for their needs.

    PAMI-AD: An Activity Detector Exploiting Part-attention and Motion Information in Surveillance Videos

    Activity detection in surveillance videos is a challenging task because of small objects, complex activity categories, the untrimmed nature of the videos, etc. Existing methods are generally limited in performance due to inaccurate proposals, poor classifiers or inadequate post-processing methods. In this work, we propose a comprehensive and effective activity detection system in untrimmed surveillance videos for person-centered and vehicle-centered activities. It consists of four modules, i.e., object localizer, proposal filter, activity classifier and activity refiner. For person-centered activities, a novel part-attention mechanism is proposed to explore detailed features in different body parts. As for vehicle-centered activities, we propose a localization masking method to jointly encode motion and foreground attention features. We conduct experiments on the large-scale activity detection dataset VIRAT, and achieve the best results for both groups of activities. Furthermore, our team won 1st place in the TRECVID 2021 ActEV challenge.
    Comment: ICME 2022 Workshop