446 research outputs found

    Computer vision based classification of fruits and vegetables for self-checkout at supermarkets

    Get PDF
    The field of machine learning, and, in particular, methods to improve the capability of machines to perform a wider variety of generalised tasks are among the most rapidly growing research areas in today’s world. The current applications of machine learning and artificial intelligence can be divided into many significant fields namely computer vision, data sciences, real time analytics and Natural Language Processing (NLP). All these applications are being used to help computer based systems to operate more usefully in everyday contexts. Computer vision research is currently active in a wide range of areas such as the development of autonomous vehicles, object recognition, Content Based Image Retrieval (CBIR), image segmentation and terrestrial analysis from space (i.e. crop estimation). Despite significant prior research, the area of object recognition still has many topics to be explored. This PhD thesis focuses on using advanced machine learning approaches to enable the automated recognition of fresh produce (i.e. fruits and vegetables) at supermarket self-checkouts. This type of complex classification task is one of the most recently emerging applications of advanced computer vision approaches and is a productive research topic in this field due to the limited means of representing the features and machine learning techniques for classification. Fruits and vegetables offer significant inter and intra class variance in weight, shape, size, colour and texture which makes the classification challenging. The applications of effective fruit and vegetable classification have significant importance in daily life e.g. crop estimation, fruit classification, robotic harvesting, fruit quality assessment, etc. One potential application for this fruit and vegetable classification capability is for supermarket self-checkouts. Increasingly, supermarkets are introducing self-checkouts in stores to make the checkout process easier and faster. However, there are a number of challenges with this as all goods cannot readily be sold with packaging and barcodes, for instance loose fresh items (e.g. fruits and vegetables). Adding barcodes to these types of items individually is impractical and pre-packaging limits the freedom of choice when selecting fruits and vegetables and creates additional waste, hence reducing customer satisfaction. The current situation, which relies on customers correctly identifying produce themselves leaves open the potential for incorrect billing either due to inadvertent error, or due to intentional fraudulent misclassification resulting in financial losses for the store. To address this identified problem, the main goals of this PhD work are: (a) exploring the types of visual and non-visual sensors that could be incorporated into a self-checkout system for classification of fruits and vegetables, (b) determining a suitable feature representation method for fresh produce items available at supermarkets, (c) identifying optimal machine learning techniques for classification within this context and (d) evaluating our work relative to the state-of-the-art object classification results presented in the literature. An in-depth analysis of related computer vision literature and techniques is performed to identify and implement the possible solutions. A progressive process distribution approach is used for this project where the task of computer vision based fruit and vegetables classification is divided into pre-processing and classification techniques. Different classification techniques have been implemented and evaluated as possible solution for this problem. Both visual and non-visual features of fruit and vegetables are exploited to perform the classification. Novel classification techniques have been carefully developed to deal with the complex and highly variant physical features of fruit and vegetables while taking advantages of both visual and non-visual features. The capability of classification techniques is tested in individual and ensemble manner to achieved the higher effectiveness. Significant results have been obtained where it can be concluded that the fruit and vegetables classification is complex task with many challenges involved. It is also observed that a larger dataset can better comprehend the complex variant features of fruit and vegetables. Complex multidimensional features can be extracted from the larger datasets to generalise on higher number of classes. However, development of a larger multiclass dataset is an expensive and time consuming process. The effectiveness of classification techniques can be significantly improved by subtracting the background occlusions and complexities. It is also worth mentioning that ensemble of simple and less complicated classification techniques can achieve effective results even if applied to less number of features for smaller number of classes. The combination of visual and nonvisual features can reduce the struggle of a classification technique to deal with higher number of classes with similar physical features. Classification of fruit and vegetables with similar physical features (i.e. colour and texture) needs careful estimation and hyper-dimensional embedding of visual features. Implementing rigorous classification penalties as loss function can achieve this goal at the cost of time and computational requirements. There is a significant need to develop larger datasets for different fruit and vegetables related computer vision applications. Considering more sophisticated loss function penalties and discriminative hyper-dimensional features embedding techniques can significantly improve the effectiveness of the classification techniques for the fruit and vegetables applications

    Cylinders extraction in non-oriented point clouds as a clustering problem

    Get PDF
    Finding geometric primitives in 3D point clouds is a fundamental task in many engineering applications such as robotics, autonomous-vehicles and automated industrial inspection. Among all solid shapes, cylinders are frequently found in a variety of scenes, comprising natural or man-made objects. Despite their ubiquitous presence, automated extraction and fitting can become challenging if performed ”in-the-wild”, when the number of primitives is unknown or the point cloud is noisy and not oriented. In this paper we pose the problem of extracting multiple cylinders in a scene by means of a Game-Theoretic inlier selection process exploiting the geometrical relations between pairs of axis candidates. First, we formulate the similarity between two possible cylinders considering the rigid motion aligning the two axes to the same line. This motion is represented with a unitary dual-quaternion so that the distance between two cylinders is induced by the length of the shortest geodesic path in SE(3). Then, a Game-Theoretical process exploits such similarity function to extract sets of primitives maximizing their inner mutual consensus. The outcome of the evolutionary process consists in a probability distribution over the sets of candidates (ie axes), which in turn is used to directly estimate the final cylinder parameters. An extensive experimental section shows that the proposed algorithm offers a high resilience to noise, since the process inherently discards inconsistent data. Compared to other methods, it does not need point normals and does not require a fine tuning of multiple parameters

    A review of clustering techniques and developments

    Full text link
    © 2017 Elsevier B.V. This paper presents a comprehensive study on clustering: exiting methods and developments made at various times. Clustering is defined as an unsupervised learning where the objects are grouped on the basis of some similarity inherent among them. There are different methods for clustering the objects such as hierarchical, partitional, grid, density based and model based. The approaches used in these methods are discussed with their respective states of art and applicability. The measures of similarity as well as the evaluation criteria, which are the central components of clustering, are also presented in the paper. The applications of clustering in some fields like image segmentation, object and character recognition and data mining are highlighted

    Contrastive Bi-Projector for Unsupervised Domain Adaption

    Full text link
    This paper proposes a novel unsupervised domain adaption (UDA) method based on contrastive bi-projector (CBP), which can improve the existing UDA methods. It is called CBPUDA here, which effectively promotes the feature extractors (FEs) to reduce the generation of ambiguous features for classification and domain adaption. The CBP differs from traditional bi-classifier-based methods at that these two classifiers are replaced with two projectors of performing a mapping from the input feature to two distinct features. These two projectors and the FEs in the CBPUDA can be trained adversarially to obtain more refined decision boundaries so that it can possess powerful classification performance. Two properties of the proposed loss function are analyzed here. The first property is to derive an upper bound of joint prediction entropy, which is used to form the proposed loss function, contrastive discrepancy (CD) loss. The CD loss takes the advantages of the contrastive learning and the bi-classifier. The second property is to analyze the gradient of the CD loss and then overcome the drawback of the CD loss. The result of the second property is utilized in the development of the gradient scaling (GS) scheme in this paper. The GS scheme can be exploited to tackle the unstable problem of the CD loss because training the CBPUDA requires using contrastive learning and adversarial learning at the same time. Therefore, using the CD loss with the GS scheme overcomes the problem mentioned above to make features more compact for intra-class and distinguishable for inter-class. Experimental results express that the CBPUDA is superior to conventional UDA methods under consideration in this paper for UDA and fine-grained UDA tasks

    HandNeRF: Learning to Reconstruct Hand-Object Interaction Scene from a Single RGB Image

    Full text link
    This paper presents a method to learn hand-object interaction prior for reconstructing a 3D hand-object scene from a single RGB image. The inference as well as training-data generation for 3D hand-object scene reconstruction is challenging due to the depth ambiguity of a single image and occlusions by the hand and object. We turn this challenge into an opportunity by utilizing the hand shape to constrain the possible relative configuration of the hand and object geometry. We design a generalizable implicit function, HandNeRF, that explicitly encodes the correlation of the 3D hand shape features and 2D object features to predict the hand and object scene geometry. With experiments on real-world datasets, we show that HandNeRF is able to reconstruct hand-object scenes of novel grasp configurations more accurately than comparable methods. Moreover, we demonstrate that object reconstruction from HandNeRF ensures more accurate execution of a downstream task, such as grasping for robotic hand-over.Comment: 9 pages, 4 tables, 7 figure

    Color image segmentation using saturated RGB colors and decoupling the intensity from the hue

    Get PDF
    Although the RGB space is accepted to represent colors, it is not adequate for color processing. In related works the colors are usually mapped to other color spaces more suitable for color processing, but it may imply an important computational load because of the non-linear operations involved to map the colors between spaces; nevertheless, it is common to find in the state-of-the-art works using the RGB space. In this paper we introduce an approach for color image segmentation, using the RGB space to represent and process colors; where the chromaticity and the intensity are processed separately, mimicking the human perception of color, reducing the underlying sensitiveness to intensity of the RGB space. We show the hue of colors can be processed by training a self-organizing map with chromaticity samples of the most saturated colors, where the training set is small but very representative; once the neural network is trained it can be employed to process any given image without training it again. We create an intensity channel by extracting the magnitudes of the color vectors; by using the Otsu method, we compute the threshold values to divide the intensity range in three classes. We perform experiments with the Berkeley segmentation database; in order to show the benefits of our proposal, we perform experiments with a neural network trained with different colors by subsampling the RGB space, where the chromaticity and the intensity are processed jointly. We evaluate and compare quantitatively the segmented images obtained with both approaches. We claim to obtain competitive results with respect to related works

    Improving the image quality in compressed sensing MRI by the exploitation of data properties

    Get PDF

    Hyperspectral Image Analysis of Food for Nutritional Intake

    Full text link
    The primary object of this dissertation is to investigate the application of hyperspectral technology to accommodate for the growing demand in the automatic dietary assessment applications. Food intake is one of the main factors that contribute to human health. In other words, it is necessary to get information about the amount of nutrition and vitamins that a human body requires through a daily diet. Manual dietary assessments are time-consuming and are also not precise enough, especially when the information is used for the care and treatment of hospitalized patients. Moreover, the data must be analyzed by nutritional experts. Therefore, researchers have developed various semiautomatic or automatic dietary assessment systems; most of them are based on the conventional color images such as RGB. The main disadvantage of such systems is their inability to differentiate foods of similar color or same ingredients in various colors, or different forms such as cooked or mixed forms. Although adding features such as shape, size and texture improve the overall performance, they are sensitive to changes in the illumination, rotation, scale, etc. A balance between quality and quantity of features representation, and system efficiency must also be considered. Hyperspectral technology combines conventional imaging technology with spectroscopy in a three-dimensional data-cube to obtain both the spatial and spectral information of the objects. However, the high dimensionality of hyperspectral data in addition to the redundancy between spectral bands limits performance, especially in online or onboard data processing applications. Thus, various features selection/extraction are also used to select the optimal feature subsets. The results are promising and verify the feasibility of using hyperspectral technology in dietary assessment applications
    • …
    corecore