189 research outputs found

    Multivariate Boosting with Look-Up Tables for Face Processing

    Get PDF
    This thesis proposes a novel unified boosting framework. We apply this framework to the several face processing tasks, face detection, facial feature localisation, and pose classification, and use the same boosting algorithm and the same pool of features (local binary features). This is in contrast with the standard approaches that make use of a variety of features and models, for example AdaBoost, cascades of boosted classifiers and Active Appearance Models. The unified boosting framework covers multivariate classification and regression problems and it is achieved by interpreting boosting as optimization in the functional space of the weak learners. Thus a wide range of smooth loss functions can be optimized with the same algorithm. There are two general optimization strategies we propose that extend recent works on TaylorBoost and Variational AdaBoost. The first proposition is an empirical expectation formulation that minimizes the average loss and the second is a variational formulation that includes an additional penalty for large variations between predictions. These two boosting formulations are used to train real-time models using local binary features. This is achieved using look-up-tables as weak learners and multi-block Local Binary Patterns as features. The resulting boosting algorithms are simple, efficient and easily scalable with the available resources. Furthermore, we introduce a novel coarse-to-fine feature selection method to handle high resolution models and a bootstrapping algorithm to sample representative training data from very large pools of data. The proposed approach is evaluated for several face processing tasks. These tasks include frontal face detection (binary classification), facial feature localization (multivariate regression) and pose estimation (multivariate classification). Several studies are performed to assess different optimization algorithms, bootstrapping parametrizations and feature sharing methods (for the multivariate case). The results show good performance for all of these tasks. In addition to this, two other contributions are presented. First, we propose a context-based model for removing the false alarms generated by a given generic face detector. Second, we propose a new face detector that predicts the Jaccard distance between the current location and the ground truth. This allows us to formulate the face detection problem as a regression task

    Exploring New Computing Paradigms for Data-Intensive Applications

    Get PDF
    L'abstract è presente nell'allegato / the abstract is in the attachmen

    Mathematical modeling for partial object detection.

    Get PDF
    From a computer vision point of view, the image is a scene consisting of objects of interest and a background represented by everything else in the image. The relations and interactions among these objects are the key factors for scene understanding. In this dissertation, a mathematical model is designed for the detection of partially occluded faces captured in unconstrained real life conditions. The proposed model novelty comes from explicitly considering certain objects that are common to occlude faces and embedding them in the face model. This enables the detection of faces in difficult settings and provides more information to subsequent analysis in addition to the bounding box of the face. In the proposed Selective Part Models (SPM), the face is modelled as a collection of parts that can be selected from the visible regular facial parts and some of the occluding objects which commonly interact with faces such as sunglasses, caps, hands, shoulders, and other faces. With the face detection being the first step in the face recognition pipeline, the proposed model does not only detect partially occluded faces efficiently but it also suggests the occluded parts to be excluded from the subsequent recognition step. The model was tested on several recent face detection databases and benchmarks and achieved state of the art performance. In addition, detailed analysis for the performance with respect to different types of occlusion were provided. Moreover, a new database was collected for evaluating face detectors focusing on the partial occlusion problem. This dissertation highlights the importance of explicitly handling the partial occlusion problem in face detection and shows its efficiency in enhancing both the face detection performance and the subsequent recognition performance of partially occluded faces. The broader impact of the proposed detector exceeds the common security applications by using it for human robot interaction. The humanoid robot Nao is used to help in teaching children with autism and the proposed detector is used to achieve natural interaction between the robot and the children by detecting their faces which can be used for recognition or more interestingly for adaptive interaction by analyzing their expressions

    Exploiting Spatio-Temporal Coherence for Video Object Detection in Robotics

    Get PDF
    This paper proposes a method to enhance video object detection for indoor environments in robotics. Concretely, it exploits knowledge about the camera motion between frames to propagate previously detected objects to successive frames. The proposal is rooted in the concepts of planar homography to propose regions of interest where to find objects, and recursive Bayesian filtering to integrate observations over time. The proposal is evaluated on six virtual, indoor environments, accounting for the detection of nine object classes over a total of ∼ 7k frames. Results show that our proposal improves the recall and the F1-score by a factor of 1.41 and 1.27, respectively, as well as it achieves a significant reduction of the object categorization entropy (58.8%) when compared to a two-stage video object detection method used as baseline, at the cost of small time overheads (120 ms) and precision loss (0.92).</p

    Computer vision based classification of fruits and vegetables for self-checkout at supermarkets

    Get PDF
    The field of machine learning, and, in particular, methods to improve the capability of machines to perform a wider variety of generalised tasks are among the most rapidly growing research areas in today’s world. The current applications of machine learning and artificial intelligence can be divided into many significant fields namely computer vision, data sciences, real time analytics and Natural Language Processing (NLP). All these applications are being used to help computer based systems to operate more usefully in everyday contexts. Computer vision research is currently active in a wide range of areas such as the development of autonomous vehicles, object recognition, Content Based Image Retrieval (CBIR), image segmentation and terrestrial analysis from space (i.e. crop estimation). Despite significant prior research, the area of object recognition still has many topics to be explored. This PhD thesis focuses on using advanced machine learning approaches to enable the automated recognition of fresh produce (i.e. fruits and vegetables) at supermarket self-checkouts. This type of complex classification task is one of the most recently emerging applications of advanced computer vision approaches and is a productive research topic in this field due to the limited means of representing the features and machine learning techniques for classification. Fruits and vegetables offer significant inter and intra class variance in weight, shape, size, colour and texture which makes the classification challenging. The applications of effective fruit and vegetable classification have significant importance in daily life e.g. crop estimation, fruit classification, robotic harvesting, fruit quality assessment, etc. One potential application for this fruit and vegetable classification capability is for supermarket self-checkouts. Increasingly, supermarkets are introducing self-checkouts in stores to make the checkout process easier and faster. However, there are a number of challenges with this as all goods cannot readily be sold with packaging and barcodes, for instance loose fresh items (e.g. fruits and vegetables). Adding barcodes to these types of items individually is impractical and pre-packaging limits the freedom of choice when selecting fruits and vegetables and creates additional waste, hence reducing customer satisfaction. The current situation, which relies on customers correctly identifying produce themselves leaves open the potential for incorrect billing either due to inadvertent error, or due to intentional fraudulent misclassification resulting in financial losses for the store. To address this identified problem, the main goals of this PhD work are: (a) exploring the types of visual and non-visual sensors that could be incorporated into a self-checkout system for classification of fruits and vegetables, (b) determining a suitable feature representation method for fresh produce items available at supermarkets, (c) identifying optimal machine learning techniques for classification within this context and (d) evaluating our work relative to the state-of-the-art object classification results presented in the literature. An in-depth analysis of related computer vision literature and techniques is performed to identify and implement the possible solutions. A progressive process distribution approach is used for this project where the task of computer vision based fruit and vegetables classification is divided into pre-processing and classification techniques. Different classification techniques have been implemented and evaluated as possible solution for this problem. Both visual and non-visual features of fruit and vegetables are exploited to perform the classification. Novel classification techniques have been carefully developed to deal with the complex and highly variant physical features of fruit and vegetables while taking advantages of both visual and non-visual features. The capability of classification techniques is tested in individual and ensemble manner to achieved the higher effectiveness. Significant results have been obtained where it can be concluded that the fruit and vegetables classification is complex task with many challenges involved. It is also observed that a larger dataset can better comprehend the complex variant features of fruit and vegetables. Complex multidimensional features can be extracted from the larger datasets to generalise on higher number of classes. However, development of a larger multiclass dataset is an expensive and time consuming process. The effectiveness of classification techniques can be significantly improved by subtracting the background occlusions and complexities. It is also worth mentioning that ensemble of simple and less complicated classification techniques can achieve effective results even if applied to less number of features for smaller number of classes. The combination of visual and nonvisual features can reduce the struggle of a classification technique to deal with higher number of classes with similar physical features. Classification of fruit and vegetables with similar physical features (i.e. colour and texture) needs careful estimation and hyper-dimensional embedding of visual features. Implementing rigorous classification penalties as loss function can achieve this goal at the cost of time and computational requirements. There is a significant need to develop larger datasets for different fruit and vegetables related computer vision applications. Considering more sophisticated loss function penalties and discriminative hyper-dimensional features embedding techniques can significantly improve the effectiveness of the classification techniques for the fruit and vegetables applications

    A Construction Kit for Efficient Low Power Neural Network Accelerator Designs

    Get PDF
    Implementing embedded neural network processing at the edge requires efficient hardware acceleration that couples high computational performance with low power consumption. Driven by the rapid evolution of network architectures and their algorithmic features, accelerator designs are constantly updated and improved. To evaluate and compare hardware design choices, designers can refer to a myriad of accelerator implementations in the literature. Surveys provide an overview of these works but are often limited to system-level and benchmark-specific performance metrics, making it difficult to quantitatively compare the individual effect of each utilized optimization technique. This complicates the evaluation of optimizations for new accelerator designs, slowing-down the research progress. This work provides a survey of neural network accelerator optimization approaches that have been used in recent works and reports their individual effects on edge processing performance. It presents the list of optimizations and their quantitative effects as a construction kit, allowing to assess the design choices for each building block separately. Reported optimizations range from up to 10'000x memory savings to 33x energy reductions, providing chip designers an overview of design choices for implementing efficient low power neural network accelerators

    Pattern Recognition

    Get PDF
    Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one, two or three dimensional, the processing is done in real- time or takes hours and days, some systems look for one narrow object class, others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and comprehends several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. Authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition
    • …
    corecore