10 research outputs found

    Keypoint-Aligned Embeddings for Image Retrieval and Re-identification

    Full text link
    Learning embeddings that are invariant to the pose of the object is crucial in visual image retrieval and re-identification. The existing approaches for person, vehicle, or animal re-identification tasks suffer from high intra-class variance due to deformable shapes and different camera viewpoints. To overcome this limitation, we propose to align the image embedding with a predefined order of the keypoints. The proposed keypoint-aligned embeddings model (KAE-Net) learns part-level features via multi-task learning guided by keypoint locations. More specifically, KAE-Net extracts channels from a feature map activated by a specific keypoint by learning the auxiliary task of heatmap reconstruction for that keypoint. KAE-Net is compact, generic, and conceptually simple. It achieves state-of-the-art performance on the benchmark datasets CUB-200-2011, Cars196, and VeRi-776 for retrieval and re-identification tasks. Comment: 8 pages, 7 figures, accepted to WACV 202
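The auxiliary heatmap-reconstruction task described above can be sketched in a few lines. This is an illustrative toy, not the paper's architecture: the even channel grouping and the channel-mean "decoder" are stand-ins for the learned components, used only to show how a reconstruction loss ties channel groups to keypoints.

```python
import numpy as np

def keypoint_reconstruction_loss(feat, heatmaps):
    """Toy auxiliary task: the channels of `feat` are split into K groups,
    one per keypoint, and each group must reconstruct that keypoint's
    heatmap (channel mean as a stand-in for a learned decoder).

    feat:     (K * d, H, W) feature map
    heatmaps: (K, H, W) target keypoint heatmaps
    """
    K, H, W = heatmaps.shape
    d = feat.shape[0] // K
    groups = feat.reshape(K, d, H, W)
    recon = groups.mean(axis=1)  # (K, H, W) reconstruction per keypoint
    return float(((recon - heatmaps) ** 2).mean())

rng = np.random.default_rng(0)
maps = rng.standard_normal((4, 16, 16))        # 4 keypoint heatmaps
feat = rng.standard_normal((4 * 8, 16, 16))    # 8 channels per keypoint
loss = keypoint_reconstruction_loss(feat, maps)
```

Minimizing such a loss alongside the main embedding loss pressures each channel group to carry information about its assigned keypoint, which is the mechanism the abstract refers to.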

    Learning from limited annotated data for re-identification problem

    No full text
    The project develops machine learning methods for the re-identification task: matching images from the same category in a database. The thesis proposes approaches to reduce the influence of two critical challenges in image re-identification: pose variations that affect the appearance of objects, and the need to annotate a large dataset to train a neural network. Depending on the domain, these challenges occur to different extents. Our approach demonstrates superior performance on several benchmarks for people, cars, and animal categories.

    Learning geometric equivalence between patterns using embedding neural networks

    No full text
    Despite impressive results in object classification, verification and recognition, most deep neural network-based recognition systems become brittle when the viewpoint of the camera changes dramatically. Robustness to geometric transformations is highly desirable for applications like wildlife monitoring, where there is no control over the pose of the objects of interest. The images of different objects viewed from various observation points define equivalence classes, where by definition two images are said to be equivalent if they are views of the same object. These equivalence classes can be learned via embeddings that map the input images to vectors of real numbers. During training, equivalent images are mapped to vectors that get pulled closer together, whereas if the images are not equivalent their associated vectors get pulled apart. In this work, we present an effective deep neural network model for learning the homographic equivalence between patterns. The long-term aim of this research is to develop more robust manta ray recognizers. Manta rays bear unique natural spot patterns on their bellies. Visual identification based on these patterns from underwater images enables a better understanding of habitat use by monitoring individuals within populations. We test our model on a dataset of artificially generated patterns that resemble natural patterning. Our experiments demonstrate that the proposed architecture is able to discriminate between patterns subjected to large homographic transformations.
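The "pull equivalent views together, push non-equivalent views apart" training objective described above is commonly realized as a triplet margin loss. A minimal sketch (the margin value and Euclidean distance are illustrative choices, not necessarily those of the paper):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Triplet margin loss on embedding vectors: penalize the anchor being
    farther from an equivalent view (positive) than from a non-equivalent
    one (negative), by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)  # distance to equivalent view
    d_neg = np.linalg.norm(anchor - negative)  # distance to other object
    return max(0.0, d_pos - d_neg + margin)
```

When the positive already sits well inside the margin relative to the negative, the loss is zero and no gradient is produced, so training focuses on the violating triplets.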

    Learning landmark guided embeddings for animal re-identification

    No full text
    Re-identification of individual animals in images can be ambiguous due to subtle variations in body markings between different individuals and the lack of constraints on the poses of animals in the wild. Person re-identification is a similar task, and it has been approached with a deep convolutional neural network (CNN) that learns discriminative embeddings for images of people. However, learning discriminative features for an individual animal is more challenging than for a person's appearance due to the relatively small size of ecological datasets compared to labelled datasets of people's identities. We propose to improve embedding learning by exploiting body landmark information explicitly. Body landmarks are provided to the input of a CNN as confidence heatmaps that can be obtained from a separate body landmark predictor. The model is encouraged to use the heatmaps by learning an auxiliary task of reconstructing the input heatmaps. Body landmarks guide a feature extraction network to learn the representation of a distinctive pattern and its position on the body. We evaluate the proposed method on a large synthetic dataset and a small real dataset. Our method outperforms the same model without body landmark input by 26% and 18% on the synthetic and the real datasets respectively. The method is robust to noise in input coordinates and can tolerate an error in coordinates of up to 10% of the image size.
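The landmark-to-heatmap input representation mentioned above can be sketched as rendering each (x, y) coordinate as an isotropic Gaussian confidence map; the `sigma` value here is an illustrative assumption, and a real pipeline would take coordinates from a landmark predictor rather than hard-coding them.

```python
import numpy as np

def landmark_heatmaps(coords, size, sigma=2.0):
    """Render (x, y) landmark coordinates as Gaussian confidence heatmaps
    that can be concatenated to the RGB channels of a CNN input.

    coords: iterable of (x, y) pixel coordinates
    size:   (H, W) of the output maps
    """
    H, W = size
    ys, xs = np.mgrid[0:H, 0:W]
    maps = [
        np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
        for x, y in coords
    ]
    return np.stack(maps)  # (num_landmarks, H, W), peak value 1 at each landmark

heatmaps = landmark_heatmaps([(3, 5), (10, 2)], size=(16, 16))
```

Feeding soft heatmaps instead of raw coordinates is what gives the method its tolerance to coordinate noise: a landmark that is off by a few pixels still produces a largely overlapping Gaussian.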

    Semi-supervised Keypoint Localization

    No full text
    Knowledge about the locations of keypoints of an object in an image can assist in fine-grained classification and identification tasks, particularly for objects that exhibit large variations in poses that greatly influence their visual appearance, such as wild animals. However, supervised training of a keypoint detection network requires annotating a large image dataset for each animal species, which is a labor-intensive task. To reduce the need for labeled data, we propose to simultaneously learn keypoint heatmaps and pose-invariant keypoint representations in a semi-supervised manner, using a small set of labeled images along with a larger set of unlabeled images. Keypoint representations are learnt with a semantic keypoint consistency constraint that forces the keypoint detection network to learn similar features for the same keypoint across the dataset. Pose invariance is achieved by making keypoint representations for the image and its augmented copies closer together in feature space. Our semi-supervised approach significantly outperforms previous methods on several benchmarks for human and animal body landmark localization.
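The consistency constraint described above, matching each keypoint's representation between an image and its augmented copy, can be sketched as a cosine-similarity loss over per-keypoint descriptors. The descriptor shape and the exact loss form are illustrative assumptions:

```python
import numpy as np

def consistency_loss(feats_a, feats_b):
    """Mean (1 - cosine similarity) between corresponding keypoint
    descriptors of an image (feats_a) and its augmented copy (feats_b).

    feats_a, feats_b: (K, D) arrays, one D-dim descriptor per keypoint.
    """
    a = feats_a / np.linalg.norm(feats_a, axis=1, keepdims=True)
    b = feats_b / np.linalg.norm(feats_b, axis=1, keepdims=True)
    return float((1.0 - (a * b).sum(axis=1)).mean())
```

Driving this loss toward zero makes the network produce the same descriptor for a keypoint regardless of the augmentation applied, which is one way to realize the pose invariance the abstract describes.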

    Robust Re-identification of Manta Rays from Natural Markings by Learning Pose Invariant Embeddings

    No full text
    Visual re-identification of individual animals that bear unique natural body markings is an essential task in wildlife conservation. The photo databases of animal markings grow with each new observation, and identifying an individual means matching against thousands of images. We focus on the re-identification of manta rays because the existing process is time-consuming and only semi-automatic. The current solution, Manta Matcher, requires images of high quality with the pattern of interest in a near-frontal view, limiting the use of photos sourced from citizen scientists. This paper presents a novel application of a deep convolutional neural network (CNN) for visual re-identification based on natural markings. Our contribution is an experimental demonstration of the superiority of CNNs in learning embeddings for patterns under viewpoint changes on a novel and challenging dataset. We show that our system can handle more variation in viewing angle, occlusion and illumination than the current solution. Our system achieves a top-10 accuracy of 98% with only 2 matching examples in the database, which makes it of practical value and ready for adoption by marine biologists. We also evaluate our system on a dataset of humpback whale flukes to demonstrate that the approach is generic and not species-specific.

    Flukebook: an open-source AI platform for cetacean photo identification

    No full text
    Determining which species are at greatest risk, where they are most vulnerable, and what the trajectories of their communities and populations are is critical for conservation and management. Globally distributed, wide-ranging whales and dolphins present a particular challenge in data collection because no single research team can record data over biologically meaningful areas. Flukebook.org is an open-source web platform that addresses these gaps by providing researchers with the latest computational tools. It integrates photo-identification algorithms with data management, sharing, and privacy infrastructure for whale and dolphin research, enabling the global collaborative study of these global species. With seven automatic identification algorithms trained for 15 different species, resulting in 37 species-specific identification pipelines, Flukebook is an extensible foundation that continually incorporates emerging AI techniques and applies them to cetacean photo identification through continued collaboration between computer vision researchers, software engineers, and biologists. With over 2.0 million photos of over 52,000 identified individual animals submitted by over 250 researchers, the platform enables a comprehensive understanding of cetacean populations, fostering international and cross-institutional collaboration while respecting data ownership and privacy. We outline the technology stack and architecture of Flukebook, its performance on real-world cetacean imagery, and its development as an example of scalable, extensible, and reusable open-source conservation software. Flukebook is a step change in our ability to conduct large-scale research on cetaceans across biologically meaningful geographic ranges, to rapidly iterate population assessments and abundance trajectories, and to engage the public in actions to protect them.