
    Local wavelet features for statistical object classification and localisation

    This article presents a system for texture-based probabilistic classification and localisation of 3D objects in 2D digital images and discusses selected applications. The objects are described by local feature vectors computed using the wavelet transform. In the training phase, object features are statistically modelled as normal density functions. In the recognition phase, a maximisation algorithm compares the learned density functions with the feature vectors extracted from a real scene and yields the classes and poses of the objects found in it. Experiments carried out on a real dataset of over 40,000 images demonstrate the robustness of the system in terms of classification and localisation accuracy. Finally, two important application scenarios are discussed, namely the classification of museum artefacts and the classification of metallography images.

    Fine-Grained Classification of Pedestrians in Video: Benchmark and State of the Art

    A video dataset designed to study fine-grained categorisation of pedestrians is introduced. Pedestrians were recorded “in-the-wild” from a moving vehicle. Annotations include bounding boxes, tracks, 14 keypoints with occlusion information and the fine-grained categories of age (5 classes), sex (2 classes), weight (3 classes) and clothing style (4 classes). There are a total of 27,454 bounding box and pose labels across 4,222 tracks. The dataset is designed to train and test algorithms for fine-grained categorisation of people; it is also useful for benchmarking tracking, detection and pose estimation of pedestrians. State-of-the-art algorithms for fine-grained classification and pose estimation were tested on the dataset, and the results are reported as a performance baseline.

    MinkSORT: A 3D deep feature extractor using sparse convolutions to improve 3D multi-object tracking in greenhouse tomato plants

    The agro-food industry is turning to robots to address the challenge of labour shortage. However, agro-food environments pose difficulties for robots due to high variation and occlusions. In the presence of these challenges, accurate world models, with information about object location, shape, and properties, are crucial for robots to perform tasks accurately. Building such models is challenging due to the complex and unique nature of agro-food environments, and errors in the model can lead to task execution issues. In this paper, we propose MinkSORT, a novel method for generating tracking features using a 3D sparse convolutional network in a deepSORT-like approach to improve the accuracy of world models in agro-food environments. We evaluated our feature extractor network using real-world data collected in a tomato greenhouse, where it significantly improved the performance of our baseline model, which tracks tomato positions in 3D using a Kalman filter and Mahalanobis distance. Our deep learning feature extractor improved the HOTA from 42.8% to 44.77%, the association accuracy from 32.55% to 35.55%, and the MOTA from 57.63% to 58.81%. We also evaluated different contrastive loss functions for training our deep learning feature extractor and demonstrated that our approach leads to improved performance in terms of three separate precision and recall detection outcomes. Our method improves world model accuracy, enabling robots to perform tasks such as harvesting and plant maintenance with greater efficiency and accuracy, which is essential for meeting the growing demand for food in a sustainable manner.

    A Model-Based Approach for Compound Leaves Understanding and Identification

    In this paper, we propose a specific method for the identification of compound-leaved tree species, with the aim of integrating it into an educational smartphone application. Our work is based on dedicated shape models for compound leaves, designed to estimate the number and shape of leaflets. A deformable template approach is used to fit these models and produce a high-level interpretation of the image content. The resulting models are later used for the segmentation of leaves in both plain and natural background images, by means of multiple region-based active contours. Combining these models with other botany-inspired descriptors that account for the morphological properties of the leaves, we propose a classification method that makes a semantic interpretation possible. Results are presented over a set of more than 1000 images from 17 European tree species, and integration into the existing mobile application Folia is considered.