429 research outputs found

    Trademark image retrieval by local features

    Get PDF
    The challenge of abstract trademark image retrieval as a test of machine vision algorithms has attracted considerable research interest in the past decade. Current operational trademark retrieval systems involve manual annotation of the images (the current ‘gold standard’). Accordingly, current systems require a substantial amount of time and labour to access, and are therefore expensive to operate. This thesis focuses on the development of algorithms that mimic aspects of human visual perception in order to retrieve similar abstract trademark images automatically. A significant category of trademark images are typically highly stylised, comprising a collection of distinctive graphical elements that often include geometric shapes. Therefore, in order to compare the similarity of such images the principal aim of this research has been to develop a method for solving the partial matching and shape perception problem. There are few useful techniques for partial shape matching in the context of trademark retrieval, because those existing techniques tend not to support multicomponent retrieval. When this work was initiated most trademark image retrieval systems represented images by means of global features, which are not suited to solving the partial matching problem. Instead, the author has investigated the use of local image features as a means to finding similarities between trademark images that only partially match in terms of their subcomponents. During the course of this work, it has been established that the Harris and Chabat detectors could potentially perform sufficiently well to serve as the basis for local feature extraction in trademark image retrieval. Early findings in this investigation indicated that the well established SIFT (Scale Invariant Feature Transform) local features, based on the Harris detector, could potentially serve as an adequate underlying local representation for matching trademark images. There are few researchers who have used mechanisms based on human perception for trademark image retrieval, implying that the shape representations utilised in the past to solve this problem do not necessarily reflect the shapes contained in these image, as characterised by human perception. In response, a ii practical approach to trademark image retrieval by perceptual grouping has been developed based on defining meta-features that are calculated from the spatial configurations of SIFT local image features. This new technique measures certain visual properties of the appearance of images containing multiple graphical elements and supports perceptual grouping by exploiting the non-accidental properties of their configuration. Our validation experiments indicated that we were indeed able to capture and quantify the differences in the global arrangement of sub-components evident when comparing stylised images in terms of their visual appearance properties. Such visual appearance properties, measured using 17 of the proposed metafeatures, include relative sub-component proximity, similarity, rotation and symmetry. Similar work on meta-features, based on the above Gestalt proximity, similarity, and simplicity groupings of local features, had not been reported in the current computer vision literature at the time of undertaking this work. We decided to adopted relevance feedback to allow the visual appearance properties of relevant and non-relevant images returned in response to a query to be determined by example. Since limited training data is available when constructing a relevance classifier by means of user supplied relevance feedback, the intrinsically non-parametric machine learning algorithm ID3 (Iterative Dichotomiser 3) was selected to construct decision trees by means of dynamic rule induction. We believe that the above approach to capturing high-level visual concepts, encoded by means of meta-features specified by example through relevance feedback and decision tree classification, to support flexible trademark image retrieval and to be wholly novel. The retrieval performance the above system was compared with two other state-of-the-art image trademark retrieval systems: Artisan developed by Eakins (Eakins et al., 1998) and a system developed by Jiang (Jiang et al., 2006). Using relevance feedback, our system achieves higher average normalised precision than either of the systems developed by Eakins’ or Jiang. However, while our trademark image query and database set is based on an image dataset used by Eakins, we employed different numbers of images. It was not possible to access to the same query set and image database used in the evaluation of Jiang’s trademark iii image retrieval system evaluation. Despite these differences in evaluation methodology, our approach would appear to have the potential to improve retrieval effectiveness

    Camera-Based Heart Rate Extraction in Noisy Environments

    Get PDF
    Remote photoplethysmography (rPPG) is a non-invasive technique that benefits from video to measure vital signs such as the heart rate (HR). In rPPG estimation, noise can introduce artifacts that distort rPPG signal and jeopardize accurate HR measurement. Considering that most rPPG studies occurred in lab-controlled environments, the issue of noise in realistic conditions remains open. This thesis aims to examine the challenges of noise in rPPG estimation in realistic scenarios, specifically investigating the effect of noise arising from illumination variation and motion artifacts on the predicted rPPG HR. To mitigate the impact of noise, a modular rPPG measurement framework, comprising data preprocessing, region of interest, signal extraction, preparation, processing, and HR extraction is developed. The proposed pipeline is tested on the LGI-PPGI-Face-Video-Database public dataset, hosting four different candidates and real-life scenarios. In the RoI module, raw rPPG signals were extracted from the dataset using three machine learning-based face detectors, namely Haarcascade, Dlib, and MediaPipe, in parallel. Subsequently, the collected signals underwent preprocessing, independent component analysis, denoising, and frequency domain conversion for peak detection. Overall, the Dlib face detector leads to the most successful HR for the majority of scenarios. In 50% of all scenarios and candidates, the average predicted HR for Dlib is either in line or very close to the average reference HR. The extracted HRs from the Haarcascade and MediaPipe architectures make up 31.25% and 18.75% of plausible results, respectively. The analysis highlighted the importance of fixated facial landmarks in collecting quality raw data and reducing noise

    Learning for Coordination of Vision and Action

    Get PDF
    We define the problem of visuomotor coordination and identify bottleneck problems in the implementation of general purpose vision and action systems. We conjecture that machine learning methods provide a general purpose mechanism for combining specific visual and action modules in a task-independent way. We also maintain that successful learning systems reflect realities of the environment, exploit context information, and identify limitations in perceptual algorithms which cannot be captured by the designer. We then propose a multi-step find-and-fetch mobile robot search and retrieval task. This task illustrates where current learning approaches provide solutions and where future research opportunities exist

    Review of Person Re-identification Techniques

    Full text link
    Person re-identification across different surveillance cameras with disjoint fields of view has become one of the most interesting and challenging subjects in the area of intelligent video surveillance. Although several methods have been developed and proposed, certain limitations and unresolved issues remain. In all of the existing re-identification approaches, feature vectors are extracted from segmented still images or video frames. Different similarity or dissimilarity measures have been applied to these vectors. Some methods have used simple constant metrics, whereas others have utilised models to obtain optimised metrics. Some have created models based on local colour or texture information, and others have built models based on the gait of people. In general, the main objective of all these approaches is to achieve a higher-accuracy rate and lowercomputational costs. This study summarises several developments in recent literature and discusses the various available methods used in person re-identification. Specifically, their advantages and disadvantages are mentioned and compared.Comment: Published 201

    Three hypothesis algorithm with occlusion reasoning for multiple people tracking

    Get PDF
    This work proposes a detection-based tracking algorithm able to locate and keep the identity of multiple people, who may be occluded, in uncontrolled stationary environments. Our algorithm builds a tracking graph that models spatio-temporal relationships among attributes of interacting people to predict and resolve partial and total occlusions. When a total occlusion occurs, the algorithm generates various hypotheses about the location of the occluded person considering three cases: (a) the person keeps the same direction and speed, (b) the person follows the direction and speed of the occluder, and (c) the person remains motionless during occlusion. By analyzing the graph, our algorithm can detect trajectories produced by false alarms and estimate the location of missing or occluded people. Our algorithm performs acceptably under complex conditions, such as partial visibility of individuals getting inside or outside the scene, continuous interactions and occlusions among people, wrong or missing information on the detection of persons, as well as variation of the person’s appearance due to illumination changes and background-clutter distracters. Our algorithm was evaluated on test sequences in the field of intelligent surveillance achieving an overall precision of 93%. Results show that our tracking algorithm outperforms even trajectory-based state-of-the-art algorithms

    Robust Global Tracker based on an Online Estimation of Tracklet Descriptor Reliability

    Get PDF
    International audienceThe complex scene conditions such as light change, high density of mobile objects or object occlusion can cause object mis-detections. When a tracker can not recover these mis-detections, the trajectory of an object is fragmented into some short trajectories called tracklets. As a result, tracking quality is reduced remarkably. In this paper, we propose a new approach to improve the tracking quality by a global tracker which merges all tracklets belonging to an object in the whole video. Particularly, we compute descriptor reliability over time based on their discrimination. On the other hand, a motion model is also combined with appearance descriptors in a flexible way to improve the tracking quality. The proposed approach is evaluated on four benchmark datasets. The obtained results show the robustness and effectiveness of our approach compared to tracking as well as tracklet linking approaches from state of the art

    Object recognition based on shape and function

    Get PDF
    This thesis explores a new approach to computational object recognition by borrowing an idea from child language acquisition studies in developmental psychology. Whereas previous image recognition research used shape to recognize and label a target object, the model proposed in this thesis also uses the function of the object resulting in a more accurate recognition. This thesis makes use of new gaming technology, Microsoftñ€ℱs Kinect, in implementing the proposed new object recognition model. A demonstration of the model developed in this project properly infers different names for similarly shaped objects and the same name for differently shaped objects

    Fast protein superfamily classification using principal component null space analysis.

    Get PDF
    The protein family classification problem, which consists of determining the family memberships of given unknown protein sequences, is very important for a biologist for many practical reasons, such as drug discovery, prediction of molecular functions and medical diagnosis. Neural networks and Bayesian methods have performed well on the protein classification problem, achieving accuracy ranging from 90% to 98% while running relatively slowly in the learning stage. In this thesis, we present a principal component null space analysis (PCNSA) linear classifier to the problem and report excellent results compared to those of neural networks and support vector machines. The two main parameters of PCNSA are linked to the high dimensionality of the dataset used, and were optimized in an exhaustive manner to maximize accuracy. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2005 .F74. Source: Masters Abstracts International, Volume: 44-03, page: 1400. Thesis (M.Sc.)--University of Windsor (Canada), 2005

    Uncertainty Management of Intelligent Feature Selection in Wireless Sensor Networks

    Get PDF
    Wireless sensor networks (WSN) are envisioned to revolutionize the paradigm of monitoring complex real-world systems at a very high resolution. However, the deployment of a large number of unattended sensor nodes in hostile environments, frequent changes of environment dynamics, and severe resource constraints pose uncertainties and limit the potential use of WSN in complex real-world applications. Although uncertainty management in Artificial Intelligence (AI) is well developed and well investigated, its implications in wireless sensor environments are inadequately addressed. This dissertation addresses uncertainty management issues of spatio-temporal patterns generated from sensor data. It provides a framework for characterizing spatio-temporal pattern in WSN. Using rough set theory and temporal reasoning a novel formalism has been developed to characterize and quantify the uncertainties in predicting spatio-temporal patterns from sensor data. This research also uncovers the trade-off among the uncertainty measures, which can be used to develop a multi-objective optimization model for real-time decision making in sensor data aggregation and samplin
    • 

    corecore