196 research outputs found

    Review of Person Re-identification Techniques

    Full text link
    Person re-identification across different surveillance cameras with disjoint fields of view has become one of the most interesting and challenging subjects in the area of intelligent video surveillance. Although several methods have been developed and proposed, certain limitations and unresolved issues remain. In all of the existing re-identification approaches, feature vectors are extracted from segmented still images or video frames. Different similarity or dissimilarity measures have been applied to these vectors. Some methods have used simple constant metrics, whereas others have utilised models to obtain optimised metrics. Some have created models based on local colour or texture information, and others have built models based on the gait of people. In general, the main objective of all these approaches is to achieve a higher-accuracy rate and lowercomputational costs. This study summarises several developments in recent literature and discusses the various available methods used in person re-identification. Specifically, their advantages and disadvantages are mentioned and compared.Comment: Published 201

    Image features for visual teach-and-repeat navigation in changing environments

    Get PDF
    We present an evaluation of standard image features in the context of long-term visual teach-and-repeat navigation of mobile robots, where the environment exhibits significant changes in appearance caused by seasonal weather variations and daily illumination changes. We argue that for long-term autonomous navigation, the viewpoint-, scale- and rotation- invariance of the standard feature extractors is less important than their robustness to the mid- and long-term environment appearance changes. Therefore, we focus our evaluation on the robustness of image registration to variable lighting and naturally-occurring seasonal changes. We combine detection and description components of different image extractors and evaluate their performance on five datasets collected by mobile vehicles in three different outdoor environments over the course of one year. Moreover, we propose a trainable feature descriptor based on a combination of evolutionary algorithms and Binary Robust Independent Elementary Features, which we call GRIEF (Generated BRIEF). In terms of robustness to seasonal changes, the most promising results were achieved by the SpG/CNN and the STAR/GRIEF feature, which was slightly less robust, but faster to calculate

    Multiple human tracking in RGB-depth data: A survey

    Get PDF
    © The Institution of Engineering and Technology. Multiple human tracking (MHT) is a fundamental task in many computer vision applications. Appearance-based approaches, primarily formulated on RGB data, are constrained and affected by problems arising from occlusions and/or illumination variations. In recent years, the arrival of cheap RGB-depth devices has led to many new approaches to MHT, and many of these integrate colour and depth cues to improve each and every stage of the process. In this survey, the authors present the common processing pipeline of these methods and review their methodology based (a) on how they implement this pipeline and (b) on what role depth plays within each stage of it. They identify and introduce existing, publicly available, benchmark datasets and software resources that fuse colour and depth data for MHT. Finally, they present a brief comparative evaluation of the performance of those works that have applied their methods to these datasets

    Automatic Plant Annotation Using 3D Computer Vision

    Get PDF

    Pattern Recognition

    Get PDF
    A wealth of advanced pattern recognition algorithms are emerging from the interdiscipline between technologies of effective visual features and the human-brain cognition process. Effective visual features are made possible through the rapid developments in appropriate sensor equipments, novel filter designs, and viable information processing architectures. While the understanding of human-brain cognition process broadens the way in which the computer can perform pattern recognition tasks. The present book is intended to collect representative researches around the globe focusing on low-level vision, filter design, features and image descriptors, data mining and analysis, and biologically inspired algorithms. The 27 chapters coved in this book disclose recent advances and new ideas in promoting the techniques, technology and applications of pattern recognition

    Pre-processing, classification and semantic querying of large-scale Earth observation spaceborne/airborne/terrestrial image databases: Process and product innovations.

    Get PDF
    By definition of Wikipedia, “big data is the term adopted for a collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications. The big data challenges typically include capture, curation, storage, search, sharing, transfer, analysis and visualization”. Proposed by the intergovernmental Group on Earth Observations (GEO), the visionary goal of the Global Earth Observation System of Systems (GEOSS) implementation plan for years 2005-2015 is systematic transformation of multisource Earth Observation (EO) “big data” into timely, comprehensive and operational EO value-adding products and services, submitted to the GEO Quality Assurance Framework for Earth Observation (QA4EO) calibration/validation (Cal/Val) requirements. To date the GEOSS mission cannot be considered fulfilled by the remote sensing (RS) community. This is tantamount to saying that past and existing EO image understanding systems (EO-IUSs) have been outpaced by the rate of collection of EO sensory big data, whose quality and quantity are ever-increasing. This true-fact is supported by several observations. For example, no European Space Agency (ESA) EO Level 2 product has ever been systematically generated at the ground segment. By definition, an ESA EO Level 2 product comprises a single-date multi-spectral (MS) image radiometrically calibrated into surface reflectance (SURF) values corrected for geometric, atmospheric, adjacency and topographic effects, stacked with its data-derived scene classification map (SCM), whose thematic legend is general-purpose, user- and application-independent and includes quality layers, such as cloud and cloud-shadow. Since no GEOSS exists to date, present EO content-based image retrieval (CBIR) systems lack EO image understanding capabilities. Hence, no semantic CBIR (SCBIR) system exists to date either, where semantic querying is synonym of semantics-enabled knowledge/information discovery in multi-source big image databases. In set theory, if set A is a strict superset of (or strictly includes) set B, then A B. This doctoral project moved from the working hypothesis that SCBIR computer vision (CV), where vision is synonym of scene-from-image reconstruction and understanding EO image understanding (EO-IU) in operating mode, synonym of GEOSS ESA EO Level 2 product human vision. Meaning that necessary not sufficient pre-condition for SCBIR is CV in operating mode, this working hypothesis has two corollaries. First, human visual perception, encompassing well-known visual illusions such as Mach bands illusion, acts as lower bound of CV within the multi-disciplinary domain of cognitive science, i.e., CV is conditioned to include a computational model of human vision. Second, a necessary not sufficient pre-condition for a yet-unfulfilled GEOSS development is systematic generation at the ground segment of ESA EO Level 2 product. Starting from this working hypothesis the overarching goal of this doctoral project was to contribute in research and technical development (R&D) toward filling an analytic and pragmatic information gap from EO big sensory data to EO value-adding information products and services. This R&D objective was conceived to be twofold. First, to develop an original EO-IUS in operating mode, synonym of GEOSS, capable of systematic ESA EO Level 2 product generation from multi-source EO imagery. EO imaging sources vary in terms of: (i) platform, either spaceborne, airborne or terrestrial, (ii) imaging sensor, either: (a) optical, encompassing radiometrically calibrated or uncalibrated images, panchromatic or color images, either true- or false color red-green-blue (RGB), multi-spectral (MS), super-spectral (SS) or hyper-spectral (HS) images, featuring spatial resolution from low (> 1km) to very high (< 1m), or (b) synthetic aperture radar (SAR), specifically, bi-temporal RGB SAR imagery. The second R&D objective was to design and develop a prototypical implementation of an integrated closed-loop EO-IU for semantic querying (EO-IU4SQ) system as a GEOSS proof-of-concept in support of SCBIR. The proposed closed-loop EO-IU4SQ system prototype consists of two subsystems for incremental learning. A primary (dominant, necessary not sufficient) hybrid (combined deductive/top-down/physical model-based and inductive/bottom-up/statistical model-based) feedback EO-IU subsystem in operating mode requires no human-machine interaction to automatically transform in linear time a single-date MS image into an ESA EO Level 2 product as initial condition. A secondary (dependent) hybrid feedback EO Semantic Querying (EO-SQ) subsystem is provided with a graphic user interface (GUI) to streamline human-machine interaction in support of spatiotemporal EO big data analytics and SCBIR operations. EO information products generated as output by the closed-loop EO-IU4SQ system monotonically increase their value-added with closed-loop iterations

    Computer vision based classification of fruits and vegetables for self-checkout at supermarkets

    Get PDF
    The field of machine learning, and, in particular, methods to improve the capability of machines to perform a wider variety of generalised tasks are among the most rapidly growing research areas in today’s world. The current applications of machine learning and artificial intelligence can be divided into many significant fields namely computer vision, data sciences, real time analytics and Natural Language Processing (NLP). All these applications are being used to help computer based systems to operate more usefully in everyday contexts. Computer vision research is currently active in a wide range of areas such as the development of autonomous vehicles, object recognition, Content Based Image Retrieval (CBIR), image segmentation and terrestrial analysis from space (i.e. crop estimation). Despite significant prior research, the area of object recognition still has many topics to be explored. This PhD thesis focuses on using advanced machine learning approaches to enable the automated recognition of fresh produce (i.e. fruits and vegetables) at supermarket self-checkouts. This type of complex classification task is one of the most recently emerging applications of advanced computer vision approaches and is a productive research topic in this field due to the limited means of representing the features and machine learning techniques for classification. Fruits and vegetables offer significant inter and intra class variance in weight, shape, size, colour and texture which makes the classification challenging. The applications of effective fruit and vegetable classification have significant importance in daily life e.g. crop estimation, fruit classification, robotic harvesting, fruit quality assessment, etc. One potential application for this fruit and vegetable classification capability is for supermarket self-checkouts. Increasingly, supermarkets are introducing self-checkouts in stores to make the checkout process easier and faster. However, there are a number of challenges with this as all goods cannot readily be sold with packaging and barcodes, for instance loose fresh items (e.g. fruits and vegetables). Adding barcodes to these types of items individually is impractical and pre-packaging limits the freedom of choice when selecting fruits and vegetables and creates additional waste, hence reducing customer satisfaction. The current situation, which relies on customers correctly identifying produce themselves leaves open the potential for incorrect billing either due to inadvertent error, or due to intentional fraudulent misclassification resulting in financial losses for the store. To address this identified problem, the main goals of this PhD work are: (a) exploring the types of visual and non-visual sensors that could be incorporated into a self-checkout system for classification of fruits and vegetables, (b) determining a suitable feature representation method for fresh produce items available at supermarkets, (c) identifying optimal machine learning techniques for classification within this context and (d) evaluating our work relative to the state-of-the-art object classification results presented in the literature. An in-depth analysis of related computer vision literature and techniques is performed to identify and implement the possible solutions. A progressive process distribution approach is used for this project where the task of computer vision based fruit and vegetables classification is divided into pre-processing and classification techniques. Different classification techniques have been implemented and evaluated as possible solution for this problem. Both visual and non-visual features of fruit and vegetables are exploited to perform the classification. Novel classification techniques have been carefully developed to deal with the complex and highly variant physical features of fruit and vegetables while taking advantages of both visual and non-visual features. The capability of classification techniques is tested in individual and ensemble manner to achieved the higher effectiveness. Significant results have been obtained where it can be concluded that the fruit and vegetables classification is complex task with many challenges involved. It is also observed that a larger dataset can better comprehend the complex variant features of fruit and vegetables. Complex multidimensional features can be extracted from the larger datasets to generalise on higher number of classes. However, development of a larger multiclass dataset is an expensive and time consuming process. The effectiveness of classification techniques can be significantly improved by subtracting the background occlusions and complexities. It is also worth mentioning that ensemble of simple and less complicated classification techniques can achieve effective results even if applied to less number of features for smaller number of classes. The combination of visual and nonvisual features can reduce the struggle of a classification technique to deal with higher number of classes with similar physical features. Classification of fruit and vegetables with similar physical features (i.e. colour and texture) needs careful estimation and hyper-dimensional embedding of visual features. Implementing rigorous classification penalties as loss function can achieve this goal at the cost of time and computational requirements. There is a significant need to develop larger datasets for different fruit and vegetables related computer vision applications. Considering more sophisticated loss function penalties and discriminative hyper-dimensional features embedding techniques can significantly improve the effectiveness of the classification techniques for the fruit and vegetables applications

    Large-area visually augmented navigation for autonomous underwater vehicles

    Get PDF
    Submitted to the Joint Program in Applied Ocean Science & Engineering in partial fulfillment of the requirements for the degree of Doctor of Philosophy at the Massachusetts Institute of Technology and the Woods Hole Oceanographic Institution June 2005This thesis describes a vision-based, large-area, simultaneous localization and mapping (SLAM) algorithm that respects the low-overlap imagery constraints typical of autonomous underwater vehicles (AUVs) while exploiting the inertial sensor information that is routinely available on such platforms. We adopt a systems-level approach exploiting the complementary aspects of inertial sensing and visual perception from a calibrated pose-instrumented platform. This systems-level strategy yields a robust solution to underwater imaging that overcomes many of the unique challenges of a marine environment (e.g., unstructured terrain, low-overlap imagery, moving light source). Our large-area SLAM algorithm recursively incorporates relative-pose constraints using a view-based representation that exploits exact sparsity in the Gaussian canonical form. This sparsity allows for efficient O(n) update complexity in the number of images composing the view-based map by utilizing recent multilevel relaxation techniques. We show that our algorithmic formulation is inherently sparse unlike other feature-based canonical SLAM algorithms, which impose sparseness via pruning approximations. In particular, we investigate the sparsification methodology employed by sparse extended information filters (SEIFs) and offer new insight as to why, and how, its approximation can lead to inconsistencies in the estimated state errors. Lastly, we present a novel algorithm for efficiently extracting consistent marginal covariances useful for data association from the information matrix. In summary, this thesis advances the current state-of-the-art in underwater visual navigation by demonstrating end-to-end automatic processing of the largest visually navigated dataset to date using data collected from a survey of the RMS Titanic (path length over 3 km and 3100 m2 of mapped area). This accomplishment embodies the summed contributions of this thesis to several current SLAM research issues including scalability, 6 degree of freedom motion, unstructured environments, and visual perception.This work was funded in part by the CenSSIS ERC of the National Science Foundation under grant EEC-9986821, in part by the Woods Hole Oceanographic Institution through a grant from the Penzance Foundation, and in part by a NDSEG Fellowship awarded through the Department of Defense
    corecore