12 research outputs found

    Statistical methods for fine-grained retail product recognition

    In recent years, computer vision has become a major instrument in automating retail processes with emerging smart applications such as shopper assistance, visual product search (e.g., Google Lens), no-checkout stores (e.g., Amazon Go), real-time inventory tracking, out-of-stock detection, and shelf execution. At the core of these applications lies the problem of product recognition, which poses a variety of new challenges in contrast to generic object recognition. Product recognition is a special instance of fine-grained classification. Considering the sheer diversity of packaged goods in a typical hypermarket, we are confronted with up to tens of thousands of classes, which, particularly if under the same product brand, tend to have only minute visual differences in shape, packaging texture, metric size, etc., making them very difficult to discriminate from one another. Another challenge is the limited number of available datasets, which either have only a few training examples per class that are taken under ideal studio conditions, hence requiring cross-dataset generalization, or are captured from the shelf in an actual retail environment and thus suffer from issues like blur, low resolution, occlusions, unexpected backgrounds, etc. Thus, an effective product classification system requires substantially more information in addition to the knowledge obtained from product images alone. In this thesis, we propose statistical methods for fine-grained retail product recognition. In our first framework, we propose a novel context-aware hybrid classification system for the fine-grained retail product recognition problem. In the second framework, state-of-the-art convolutional neural networks are explored and adapted to fine-grained recognition of products. The third framework, which is the most significant contribution of this thesis, presents a new approach for fine-grained classification of retail products that learns and exploits statistical context information about likely product arrangements on shelves, incorporates visual hierarchies across brands, and returns recognition results as "confidence sets" that are guaranteed to contain the true class at a given confidence level.

    Context-aware confidence sets for fine-grained product recognition

    We present a new approach for fine-grained classification of retail products, which learns and exploits statistical context information about likely product arrangements on shelves, incorporates visual hierarchies across brands, and returns recognition results as “confidence sets” that are guaranteed to contain the true class at a given confidence level. Our system consists of three important components: 1) a nested hierarchy of product classes is automatically constructed based on visual similarities, 2) a confidence set predictor is trained based on class posteriors by using coarse-to-fine binary classifiers to discriminate each nested cluster of the hierarchy from the remainder of classes and a Bayesian network (BN) model that encodes the joint distribution of classifier scores with the fine-level class variable, and 3) a hidden Markov model (HMM) is trained with nested hidden states from the class hierarchy to model spatial transitions across the nodes of the product class hierarchy and resolve errors in the context-free confidence set results. Novel aspects of the proposed method include 1) combining confidence sets and context information via an HMM, 2) applying this concept to fine-grained recognition of products arranged on retail shelves, and 3) presenting experimental results on four large datasets collected from actual retail stores. We compare our approach with existing confidence set approaches and state-of-the-art convolutional neural network classifiers including SENet-154, DenseNet-161, B-CNN, and Inception-Resnet-v2. Our approach performs comparably to or better than state-of-the-art deep classifiers and exhibits high accuracy for relatively small confidence set sizes.
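
To illustrate what a "confidence set" output looks like, here is a minimal sketch of one generic construction: rank classes by posterior and keep adding them until the accumulated probability reaches the requested level. This is not the predictor described in the abstract (which uses coarse-to-fine binary classifiers and a Bayesian network), and it carries no coverage guarantee on its own. The function name `confidence_set` and the product labels are hypothetical.

```python
def confidence_set(posteriors, level=0.9):
    """Return the shortest list of top-ranked classes whose posterior mass reaches `level`.

    posteriors: dict mapping class label -> estimated posterior probability
                (assumed to sum to roughly 1).
    """
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, mass = [], 0.0
    for label, p in ranked:
        chosen.append(label)
        mass += p
        if mass >= level:
            break
    return chosen


# Hypothetical posteriors for one detected product.
example = {"cola_330ml": 0.6, "cola_500ml": 0.3, "cola_zero": 0.08, "lemonade": 0.02}
result = confidence_set(example, level=0.85)
```

A confident classifier yields a small set, while visually ambiguous products yield a larger one, which is why set size is reported alongside accuracy.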

    Context-aware hybrid classification system for fine-grained retail product recognition

    We present a context-aware hybrid classification system for the problem of fine-grained product class recognition in computer vision. Recently, retail product recognition has become an interesting computer vision research topic. We focus on the classification of products on shelves in a store. This is a very challenging classification problem because many product classes are visually similar in terms of shape, color, texture, and metric size. On shelves, the same or similar products are more likely to appear adjacent to each other and to be displayed in certain arrangements rather than at random. The arrangement of the products on the shelves has a spatial continuity in both brand and metric size. By using this context information, the co-occurrence of the products and the adjacency relations between the products can be statistically modeled. The proposed hybrid approach improves the accuracy of context-free image classifiers such as Support Vector Machines (SVMs) by combining them with a probabilistic graphical model such as Hidden Markov Models (HMMs) or Conditional Random Fields (CRFs). The fundamental goal of this paper is to use contextual relationships on retail shelves to improve classification accuracy through a context-aware approach.
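
The idea of combining a context-free classifier with an HMM can be sketched with standard Viterbi decoding over one shelf row, read left to right. The sketch below is an illustration under stated assumptions, not the paper's implementation: per-image classifier posteriors are used directly as emission scores, and the names `viterbi`, `scores`, `transition`, and `prior` are hypothetical.

```python
import math


def _log(p):
    # Floor probabilities so that log(0) never occurs.
    return math.log(max(p, 1e-12))


def viterbi(scores, transition, prior):
    """Most likely label sequence for products along one shelf row.

    scores:     list of dicts, one per shelf position, class -> classifier score
    transition: dict (previous_class, current_class) -> adjacency probability
    prior:      dict class -> probability of the class at the first position
    """
    classes = list(prior)
    best = {c: _log(prior[c]) + _log(scores[0][c]) for c in classes}
    backpointers = []
    for obs in scores[1:]:
        new_best, pointers = {}, {}
        for cur in classes:
            # Best predecessor for `cur` given the scores accumulated so far.
            prev = max(classes, key=lambda q: best[q] + _log(transition[(q, cur)]))
            new_best[cur] = best[prev] + _log(transition[(prev, cur)]) + _log(obs[cur])
            pointers[cur] = prev
        best = new_best
        backpointers.append(pointers)
    # Trace the best path backwards from the best final class.
    path = [max(classes, key=lambda c: best[c])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    return path[::-1]


# Hypothetical two-class shelf where identical products tend to sit side by side.
transition = {("A", "A"): 0.9, ("A", "B"): 0.1, ("B", "A"): 0.1, ("B", "B"): 0.9}
prior = {"A": 0.5, "B": 0.5}
scores = [{"A": 0.9, "B": 0.1}, {"A": 0.4, "B": 0.6}, {"A": 0.9, "B": 0.1}]
labels = viterbi(scores, transition, prior)
```

In this toy input the middle product is weakly scored as "B" by the classifier alone, but the adjacency model pulls it towards "A" because its neighbours are confidently "A".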

    Retail product recognition with a graphical shelf model (Çizgisel raf modeli ile perakende ürün tanıma)

    Recently, retail product recognition has become an interesting computer vision research topic. The classification of products on shelves is a very challenging classification problem because many product classes are visually similar in terms of shape, color, texture, and metric size. On shelves, the same or similar products are more likely to appear adjacent to each other and to be displayed in certain arrangements rather than at random. The arrangement of the products on the shelves has a spatial continuity in both brand and metric size. By using this context information, the co-occurrence of the products and the adjacency relations between the products can be statistically modeled. In this work, we present a context-aware hybrid classification system for the problem of fine-grained product class recognition. The proposed hybrid approach improves the accuracy of context-free image classifiers by combining them with a probabilistic graphical model based on Hidden Markov Models. The fundamental goal of this paper is to use contextual relationships on retail shelves to improve the accuracy of the product classifier.

    Dynamically adaptive real-time disparity estimation hardware using iterative refinement

    The computational complexity of disparity estimation algorithms and their need for large, high-bandwidth external and internal memory make real-time disparity estimation challenging, especially for High Resolution (HR) images. This paper proposes a hardware-oriented adaptive window size disparity estimation (AWDE) algorithm and its real-time reconfigurable hardware implementation that targets HR video with high quality disparity results. Moreover, an enhanced version of the AWDE implementation that uses iterative refinement (AWDE-IR) is presented. The AWDE and AWDE-IR algorithms dynamically adapt the window size considering the local texture of the image to increase the disparity estimation quality. The proposed reconfigurable hardware architectures of the AWDE and AWDE-IR algorithms enable handling 60 frames per second on a Virtex-5 FPGA at a 1024×768 XGA video resolution for a 128-pixel disparity range.

    On comparison of different classification techniques for the fine-grained retail product recognition problem (Farklı sınıflandırma yöntemlerinin çoklu benzer perakende ürünlerin sınıflandırılması problemi için karşılaştırılması)

    Classification systems for retail products have recently been gaining importance. There are many classes of retail products, and the resemblance among these products makes the design of product recognition systems, which have many application areas, more challenging. In this paper, we present a comparison of different classification techniques that are widely used in computer vision for image classification, applied to retail product images taken by smartphones.

    Compressed look-up-table based real-time rectification hardware

    Stereo image rectification is a pre-processing step of disparity estimation intended to remove image distortions and to enable stereo matching along an epipolar line. A real-time disparity estimation system needs to perform real-time rectification, which requires solving the models of lens distortions, image translations and rotations. Look-up-table based rectification algorithms allow image rectification without demanding high complexity operations. However, they require an external memory to store large look-up-tables. In this work, we present an intermediate solution that compresses the rectification information to fit the look-up-table into the on-chip memory of a Virtex-5 FPGA. The low-complexity decompression process requires a negligible amount of hardware resources for its real-time implementation. The proposed image rectification hardware consumes 0.28% of the DFF and 0.32% of the LUT resources of the Virtex-5 XCUVP-110T FPGA, can process 347 frames per second at a 1024×768 pixel image resolution, and does not need an external memory.
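
As a rough illustration of look-up-table based rectification in general, the sketch below precomputes, for every output pixel, the source coordinates to read from, so that rectifying a frame becomes a plain table lookup. It shows only the uncompressed baseline with nearest-pixel sampling; the compression scheme of the paper is not described in the abstract and is not reproduced here. The names `build_lut`, `rectify`, and the toy `mapping` are hypothetical.

```python
def build_lut(width, height, mapping):
    """Precompute source coordinates (sx, sy) for every output pixel (x, y).

    `mapping` stands in for the calibrated distortion/rotation model, which
    is evaluated once offline instead of once per frame.
    """
    return [[mapping(x, y) for x in range(width)] for y in range(height)]


def rectify(image, lut, fill=0):
    """Rectify `image` (list of rows) by reading each pixel through the table."""
    h, w = len(image), len(image[0])
    out = []
    for row in lut:
        out.append([image[sy][sx] if 0 <= sx < w and 0 <= sy < h else fill
                    for sx, sy in row])
    return out


# Toy mapping: every output pixel reads the source pixel one column to its right.
lut = build_lut(3, 2, lambda x, y: (x + 1, y))
rectified = rectify([[1, 2, 3], [4, 5, 6]], lut)
```

The table holds one coordinate pair per pixel, which is why an uncompressed table for a 1024×768 frame is large enough to motivate either external memory or compression.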

    A Hardware-Oriented Dynamically Adaptive Disparity Estimation Algorithm and its Real-Time Hardware

    The computational complexity of disparity estimation algorithms and their need for large, high-bandwidth external and internal memory make real-time disparity estimation challenging, especially for High Resolution (HR) images. This paper proposes a hardware-oriented adaptive window size disparity estimation (AWDE) algorithm and its real-time reconfigurable hardware implementation that targets HR video with high quality disparity results. The proposed algorithm is a hybrid solution involving the Sum of Absolute Differences and the Census cost computation methods to vote and select the most suitable disparity candidates. It utilizes a pixel intensity based refinement step to remove faulty disparity computations. The AWDE algorithm dynamically adapts the window size considering the local texture of the image to increase the disparity estimation quality. The proposed reconfigurable hardware of the AWDE algorithm enables handling 60 frames per second on a Virtex-5 FPGA at a 1024×768 XGA video resolution for a 120-pixel disparity range.
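
The two matching costs named in this abstract can be sketched as follows for a single pixel and a fixed window. This is a simplified illustration, not the AWDE algorithm: the paper uses a voting scheme and an adaptive window, whereas the sketch assumes a fixed radius `r` and simply adds the two costs with a hypothetical `weight`. All function names are illustrative.

```python
def sad(left, right, y, x, d, r=1):
    """Sum of Absolute Differences over a (2r+1)x(2r+1) window at disparity d."""
    return sum(abs(left[y + j][x + i] - right[y + j][x + i - d])
               for j in range(-r, r + 1) for i in range(-r, r + 1))


def census(img, y, x, r=1):
    """Census signature: one flag per neighbour, set when it is darker than the centre."""
    centre = img[y][x]
    return [img[y + j][x + i] < centre
            for j in range(-r, r + 1) for i in range(-r, r + 1)
            if (i, j) != (0, 0)]


def census_cost(left, right, y, x, d, r=1):
    """Hamming distance between the census signatures of the two candidate pixels."""
    a, b = census(left, y, x, r), census(right, y, x - d, r)
    return sum(p != q for p, q in zip(a, b))


def best_disparity(left, right, y, x, max_d, r=1, weight=1.0):
    """Pick the disparity with the lowest combined cost (assumed weighted sum).

    The caller must keep the window inside both images (x - max_d - r >= 0),
    because negative indices would silently wrap around in Python.
    """
    def cost(d):
        return sad(left, right, y, x, d, r) + weight * census_cost(left, right, y, x, d, r)
    return min(range(max_d + 1), key=cost)


# Synthetic pair: the right view is the left view shifted by two pixels.
left = [[(x * x + 3 * y) % 17 for x in range(12)] for y in range(3)]
right = [[left[y][min(x + 2, 11)] for x in range(12)] for y in range(3)]
disparity = best_disparity(left, right, y=1, x=6, max_d=4)
```

SAD compares raw intensities and is sensitive to brightness differences between the cameras, while Census compares only local intensity ordering, which is one common reason for combining the two.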

    Trinocular Adaptive Window Size Disparity Estimation Algorithm and Its Real-Time Hardware

    This paper proposes a hardware-oriented trinocular adaptive window size disparity estimation (T-AWDE) algorithm and the first real-time trinocular disparity estimation (DE) hardware that targets high-resolution images with high-quality disparity results. The proposed trinocular DE hardware is the enhanced version of the recently published binocular AWDE implementation. The T-AWDE hardware generates a very high quality depth map by merging two depth maps obtained from the center-left and center-right camera pairs. The T-AWDE hardware enhances disparity results by applying a double checking scheme, which solves most of the occlusion problems existing in the AWDE implementation while providing correct disparity results even for objects located at the left or right edge of the center image. The proposed T-AWDE hardware architecture enables handling 55 frames per second on a Virtex-7 FPGA at a 1024×768 XGA video resolution for a 128-pixel disparity range.