
    Car make and model recognition system using rear-lamp features and convolutional neural networks

    Recognizing cars based on their features is a difficult task. We propose a solution that uses a convolutional neural network (CNN) combined with an image binarization method for car make and model classification. Unlike many previous works in this area, we combine a feature extraction method with a binarization method. In the first stage of pre-processing we normalize the image and change its size. The image is then used to locate the rear lamps. We extract that region and apply the binarization method. The binarized image is used as input to the CNN, which learns the features of a specific car model. We tested combinations of three neural network architectures and eight binarization methods. The convolutional neural network with the highest quality-metric values is used to find the characteristics of the rear lamps in the binary image. The network is tested with four different gradient-based training algorithms. We evaluated the method on two data sets that differ in how the images were taken. Each data set consists of three subsets of the same cars, scaled to different image dimensions. Compared to related CNN-based works, we use rear-view images taken from different positions and under different light exposure. The proposed method gives better results than most available methods; it is also less complex and faster to train. The proposed approach achieves an average accuracy of 93.9% on the first data set and 84.5% on the second.
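
    As an illustration of the kind of pipeline this abstract describes, the sketch below (Python with OpenCV and Keras) normalizes an image, crops and binarizes a rear-lamp region, and defines a small CNN classifier. The lamp-region coordinates, Otsu thresholding (standing in for one of the eight tested binarization methods), and the network layout are assumptions for illustration, not the paper's actual configuration.

```python
import cv2
import numpy as np
from tensorflow.keras import layers, models

def preprocess(image_path, lamp_box, size=(64, 64)):
    """Normalize the image, crop the rear-lamp region, and binarize it.
    lamp_box is an (x, y, w, h) rectangle assumed to come from a lamp detector."""
    img = cv2.imread(image_path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, (256, 256))                 # normalize input dimensions
    x, y, w, h = lamp_box
    lamp = cv2.resize(img[y:y + h, x:x + w], size)
    # Otsu thresholding stands in for one of the binarization methods tested
    _, binary = cv2.threshold(lamp, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return (binary / 255.0).astype(np.float32)[..., np.newaxis]

def build_cnn(num_models, input_shape=(64, 64, 1)):
    """A small CNN classifier over binarized rear-lamp crops (illustrative layout)."""
    return models.Sequential([
        layers.Conv2D(32, 3, activation="relu", input_shape=input_shape),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dense(num_models, activation="softmax"),
    ])

model = build_cnn(num_models=20)
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
```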

    OCR Applied for Identification of Vehicles with Irregular Documentation Using IoT

    Given the lack of investment in surveillance in remote places, this paper presents a prototype that identifies vehicles in irregular conditions and notifies a group of people, such as a network of neighbors, through a low-cost embedded system based on the Internet of Things (IoT). The developed prototype allows visualization of the location, date and time of the event, and vehicle information such as license plate, make, model, color, city, state, passenger capacity and restrictions. It also offers a responsive interface in two languages: Portuguese and English. The proposed device addresses image-processing concepts such as binarization, analysis of possible characters on the plate, plate border location, perspective transformation, character segmentation, optical character recognition (OCR) and post-processing. The embedded system is based on a Raspberry Pi with support for GPS, solar panels, communication via a 3G modem, Wi-Fi, a camera and motion sensors. Tests were performed regarding the vehicle's positioning and the image-processing accuracy, with vehicles at different angles, speeds and distances. The prototype can be a viable alternative because the results were satisfactory concerning recognition of the license plates, mobility and autonomy.
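
    The following is a minimal sketch of a plate-reading pipeline of the kind listed in the abstract (binarization, border location, perspective transformation, OCR), using OpenCV and pytesseract. The thresholds, the quadrilateral selection heuristic, the output size, and the choice of Tesseract as OCR back end are assumptions, not the paper's implementation.

```python
import cv2
import numpy as np
import pytesseract  # OCR back end; assumed available on the device

def read_plate(frame):
    """Rough license-plate pipeline: binarize, find a plate-like quadrilateral,
    rectify its perspective, and run OCR. Corner ordering is glossed over."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    for cnt in sorted(contours, key=cv2.contourArea, reverse=True):
        approx = cv2.approxPolyDP(cnt, 0.02 * cv2.arcLength(cnt, True), True)
        if len(approx) == 4:                              # candidate plate border
            src = approx.reshape(4, 2).astype(np.float32)
            dst = np.float32([[0, 0], [300, 0], [300, 100], [0, 100]])
            M = cv2.getPerspectiveTransform(src, dst)
            plate = cv2.warpPerspective(gray, M, (300, 100))
            text = pytesseract.image_to_string(plate, config="--psm 7")
            return text.strip()
    return None
```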

    Segmentation of slap fingerprints

    This thesis describes a novel algorithm that segments the individual fingerprints in a multi-print image. The algorithm identifies the distal phalanx portion of each finger that appears in the image and labels it as an index, middle, ring or little finger. The accuracy of this algorithm is compared with the publicly available reference implementation, NFSEG, part of the NIST Biometric Image Software (NBIS) suite developed at the National Institute of Standards and Technology (NIST). The comparison is performed over a large set of fingerprint images captured from unique individuals.
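
    To make the task concrete, here is a toy slap-segmentation sketch: threshold the image, keep the four largest connected components, and label them by horizontal order. The thesis algorithm is more involved (it isolates the distal phalanx of each finger); this only illustrates the basic idea, and the hand/orientation handling is an assumption.

```python
import cv2
import numpy as np

def segment_slap(slap_img, hand="right"):
    """Toy segmentation of a four-finger slap image (grayscale, dark ridges on
    light background). Returns one bounding rectangle per labeled finger."""
    _, binary = cv2.threshold(slap_img, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    n, labels, stats, centroids = cv2.connectedComponentsWithStats(binary)
    # skip label 0 (background) and keep the four largest blobs
    order = np.argsort(stats[1:, cv2.CC_STAT_AREA])[::-1][:4] + 1
    order = sorted(order, key=lambda i: centroids[i][0])   # left to right in the image
    names = ["index", "middle", "ring", "little"]           # ordering depends on capture setup
    if hand == "left":
        names = names[::-1]
    return {name: cv2.boundingRect(np.uint8(labels == i)) for name, i in zip(names, order)}
```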

    An automatic vision guided position controller in a conveyor belt pick and place system

    Thesis (Master)--Izmir Institute of Technology, Computer Engineering, Izmir, 2006. Includes bibliographical references (leaves: 64-65). Text in English; abstract in Turkish and English. xii, 67 leaves.
    An automatic vision-guided position controller system is developed for possible applications, such as handling and packaging, that require position and orientation control. The aim is to minimize the production cycle time and to improve economic performance and system productivity. The system can be partitioned into five major parts: a vision module, a pneumatic automation module, a manipulator, a conveyor belt, and the software that manages and integrates these modules. The developed software captures raw image data from a camera connected to a PC via a USB port. Using image-processing methods, the software determines the coordinates and pose of the moving parts on the conveyor belt in real time. The pick-and-place system places the parts in the packaging area according to each part's predefined orientation. The software communicates with a controller card via a serial port and manages and synchronizes the system's peripherals (conveyor-belt stepper motors, pneumatic valves, etc.). The C programming language is used in the implementation, and the OpenCV library is utilized for image acquisition. The system has the following characteristics: the conveyor belt runs at a constant speed, and objects on it may have arbitrary position and orientation. The vision system detects parts with their position and orientation on the moving conveyor belt relative to a reference position. The manipulator picks the part, corrects its position by comparing the information obtained by the vision system with the predefined position, and places the object in the packaging area. The system can be trained for the desired position of the object.
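
    The thesis implements this in C with OpenCV; the Python sketch below only illustrates how a part's position and orientation could be estimated from its largest contour using a minimum-area rectangle. The thresholding and the serial-port hand-off mentioned in the comment are assumptions for illustration.

```python
import cv2

def part_pose(frame):
    """Estimate the position and orientation of a part on the belt from its
    largest contour; a simplified stand-in for the vision module described."""
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(binary, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    part = max(contours, key=cv2.contourArea)
    (cx, cy), (w, h), angle = cv2.minAreaRect(part)    # center, size, rotation in degrees
    return cx, cy, angle

# The estimated pose could then be compared with a taught reference pose and the
# correction sent to the controller card over the serial port (e.g. with pyserial).
```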

    Document image processing using irregular pyramid structure

    Ph.D. (Doctor of Philosophy) thesis.

    Skeleton-based fingerprint minutiae extraction.

    By Zhao Feng. Thesis (M.Phil.)--Chinese University of Hong Kong, 2002. Includes bibliographical references (leaves 64-68). Abstracts in English and Chinese. The record contains the thesis table of contents: Chapter 1, Introduction (automatic personal identification; biometrics: objectives, operational mode, requirements, performance evaluation, biometric technologies; fingerprint: applications, advantages of fingerprint identification, permanence and uniqueness; thesis overview); Chapter 2, Fingerprint Identification (history of fingerprints; AFIS architecture; fingerprint acquisition, representation, classification and matching; challenges; combination schemes); Chapter 3, Live-Scan Fingerprint Database (live-scan fingerprint sensors; database features; filename description); Chapter 4, Preprocessing for Skeleton-Based Minutiae Extraction (review of minutiae-based methods; preprocessing; validation of bug pixels and minutiae extraction; experimental results); Chapter 5, Post-Processing (review of post-processing methods; H-point; termination/bifurcation duality; post-processing procedure; experimental results); Chapter 6, Conclusions and Future Work.
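
    For context on the technique named in the title and in Chapter 4, the textbook approach to skeleton-based minutiae extraction is the crossing-number method, sketched below. This is the standard formulation, not necessarily the exact variant used in the thesis.

```python
import numpy as np

def crossing_number_minutiae(skeleton):
    """Crossing-number minutiae detection on a 1-pixel-wide ridge skeleton with
    values 0/1. Ridge endings have CN = 1, bifurcations have CN = 3."""
    # 8-neighbourhood visited as a cycle around the centre pixel
    offsets = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]
    minutiae = []
    rows, cols = skeleton.shape
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            if skeleton[r, c] != 1:
                continue
            neigh = [skeleton[r + dr, c + dc] for dr, dc in offsets]
            cn = sum(abs(int(neigh[i]) - int(neigh[(i + 1) % 8])) for i in range(8)) // 2
            if cn == 1:
                minutiae.append((r, c, "termination"))
            elif cn == 3:
                minutiae.append((r, c, "bifurcation"))
    return minutiae
```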

    Vision Based Extraction of Nutrition Information from Skewed Nutrition Labels

    An important component of a healthy diet is the comprehension and retention of nutritional information and an understanding of how different food items and nutritional constituents affect our bodies. In the U.S. and many other countries, nutritional information is primarily conveyed to consumers through nutrition labels (NLs), which can be found on all packaged food products. However, even health-conscious consumers can find it challenging to use the information on these NLs: they may be unfamiliar with nutritional terms, or find it difficult to integrate nutritional data collection into their daily activities due to lack of time, motivation, or training. Automating this data collection and interpretation with computer-vision algorithms that extract nutritional information from NLs improves the user's ability to engage in continuous nutritional data collection and analysis. To make nutritional data collection more manageable and enjoyable for users, we present a Proactive NUTrition Management System (PNUTS). PNUTS seeks to shift current research and clinical practices in nutrition management toward persuasion, automated nutritional information processing, and context-sensitive nutrition decision support. PNUTS consists of two modules. The first is a barcode scanning module that runs on smartphones and is capable of vision-based localization of one-dimensional (1D) Universal Product Code (UPC) and International Article Number (EAN) barcodes with relaxed pitch, roll, and yaw camera alignment constraints. The algorithm localizes barcodes in images by computing Dominant Orientations of Gradients (DOGs) of image segments and grouping smaller segments with similar DOGs into larger connected components. Connected components that pass given morphological criteria are marked as potential barcodes. The algorithm is implemented in a distributed, cloud-based system. The system's front end is a smartphone application that runs on Android 4.2 or higher; the back end is deployed on a five-node Linux cluster where images are processed. The algorithm was evaluated on a corpus of 7,545 images extracted from 506 videos of bags, bottles, boxes, and cans in a supermarket. The DOG algorithm was coupled to our in-place scanner for 1D UPC and EAN barcodes. The scanner receives from the DOG algorithm the rectangular planar dimensions of a connected component and the component's dominant gradient orientation angle, referred to as the skew angle. The scanner draws several scan lines at that skew angle within the component to recognize the barcode in place without any rotations. The scanner coupled to the localizer was tested on the same corpus of 7,545 images. Laboratory experiments indicate that the system can localize and scan barcodes of any orientation in the yaw plane, of up to 73.28 degrees in the pitch plane, and of up to 55.5 degrees in the roll plane. The videos have been made public for all interested research communities to replicate our findings or to use them in their own research. The front-end Android application is available for free download on Google Play under the title NutriGlass. This module is also coupled to a comprehensive NL database, currently containing more than 230,000 products, from which nutritional information can be retrieved on demand.
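
    A rough sketch of the DOG idea described above: per image tile, histogram the gradient orientations and keep tiles whose dominant orientation clearly dominates (as barcode bars do), then group adjacent qualifying tiles into candidate regions. The tile size, bin count, and threshold below are illustrative assumptions, not the paper's parameters.

```python
import cv2
import numpy as np

def dominant_orientation_tiles(gray, tile=20, min_strength=0.6):
    """Mark tiles of a grayscale image dominated by a single gradient orientation
    and group them into connected components (candidate barcode regions)."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    mag, ang = cv2.cartToPolar(gx, gy, angleInDegrees=True)
    h, w = gray.shape
    mask = np.zeros((h // tile, w // tile), np.uint8)
    dominant = np.zeros(mask.shape, np.float32)
    for i in range(mask.shape[0]):
        for j in range(mask.shape[1]):
            m = mag[i * tile:(i + 1) * tile, j * tile:(j + 1) * tile]
            a = ang[i * tile:(i + 1) * tile, j * tile:(j + 1) * tile] % 180
            hist, edges = np.histogram(a, bins=18, range=(0, 180), weights=m)
            if hist.sum() > 0 and hist.max() / hist.sum() >= min_strength:
                mask[i, j] = 1                           # tile dominated by one orientation
                dominant[i, j] = edges[int(hist.argmax())]
    n, labels = cv2.connectedComponents(mask)            # group adjacent qualifying tiles
    return labels, dominant
```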
    The second module of PNUTS is an algorithm whose objective is to determine the text skew angle of an NL image without constraining the angle's magnitude. The horizontal, vertical, and diagonal matrices of the two-dimensional (2D) Haar Wavelet Transform are used to identify 2D points with significant intensity changes. The set of points is bounded with a minimum-area rectangle whose rotation angle is the text's skew. The algorithm's performance is compared with that of five text skew detection algorithms on 1,001 U.S. nutrition label images and 2,200 single- and multi-column document images in multiple languages. To ensure the reproducibility of the reported results, the source code of the algorithm and the image data have been made publicly available. Once the skew angle is estimated correctly, optical character recognition (OCR) techniques can be used to extract the nutrition information.
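
    A minimal sketch of the skew estimator just described, using PyWavelets for the 2D Haar transform: keep points with large detail coefficients, bound them with a minimum-area rectangle, and read off its rotation angle. The combined-energy measure and the quantile threshold are assumptions for illustration.

```python
import cv2
import numpy as np
import pywt  # PyWavelets, assumed available

def estimate_skew(gray, keep=0.05):
    """Estimate text skew from points with strong 2D Haar detail responses."""
    cA, (cH, cV, cD) = pywt.dwt2(gray.astype(np.float32), "haar")
    energy = np.abs(cH) + np.abs(cV) + np.abs(cD)       # combined detail response
    thresh = np.quantile(energy, 1.0 - keep)            # keep the strongest responses
    ys, xs = np.nonzero(energy >= thresh)
    pts = np.column_stack([xs, ys]).astype(np.float32)
    _, _, angle = cv2.minAreaRect(pts)                  # rotation of the bounding rectangle
    return angle
```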

    Automatic Segmentation of Monofilament Testing Sites in Plantar Images for Diabetic Foot Management

    Diabetic peripheral neuropathy is a major complication of diabetes mellitus and the leading cause of foot ulceration and amputation. The Semmes–Weinstein monofilament examination (SWME) is a widely used, low-cost, evidence-based tool for predicting the prognosis of diabetic foot patients. The examination can be quick, but due to the high prevalence of the disease, many healthcare professionals may be assigned to this task several days per month. In an ongoing project, our objective is to minimize human intervention in the SWME by using an automated testing system relying on computer vision. In this paper we present the project's first part, a system for automatically identifying the SWME testing sites from digital images. For this, we created a database of plantar images and developed a segmentation system based on image processing and deep learning, both of which are novelties. Of the 9 testing sites, the system correctly identified 8 in more than 80% of the images, and 3 of the testing sites were correctly identified in more than 97.8% of the images. Partially supported by FCT-UIDB/04730/2020 and FCT-UIDB/50014/2020 projects.
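
    The paper does not detail its model here, but as a hedged illustration of the final step, the sketch below turns per-site probability maps from some segmentation network into one center point per monofilament testing site. The network itself, the map layout, and the threshold are assumptions, not the authors' system.

```python
import cv2
import numpy as np

def testing_site_centres(prob_maps, threshold=0.5):
    """Given per-site probability maps (shape: sites x H x W), return one (x, y)
    centre per detected testing site via binary-mask moments."""
    centres = {}
    for k, prob in enumerate(prob_maps):
        mask = (prob >= threshold).astype(np.uint8)
        M = cv2.moments(mask, binaryImage=True)
        if M["m00"] > 0:
            centres[k] = (M["m10"] / M["m00"], M["m01"] / M["m00"])
    return centres
```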

    Digit Recognition Using Composite Features With Decision Tree Strategy

    At present, check transactions are one of the most common forms of money transfer in the market. The information for check exchange is printed using magnetic ink character recognition (MICR), widely used in the banking industry, primarily for processing check transactions. However, magnetic ink card readers are specialized and expensive, so general accounting departments and bookkeepers often fall back on manual data registration. An organization that deals with parts or corporate services might have to process 300 to 400 checks each day, which would require a considerable amount of labor for the registration process. The cost of a single-sided scanner is only about one tenth that of an MICR reader; hence, using image recognition technology is an economical solution. In this study, we use multiple features for character recognition of the E13B font, which comprises ten digits and four symbols. For the numeric part, we use statistical features such as image density features and geometric features, with a simple decision tree for classification. The symbols of E13B are composed of three distinct rectangles and are classified according to their size and relative position. Using the same sample set, an MLP, LeNet-5, AlexNet, and a hybrid CNN-SVM were trained on the numeric part as experimental control groups to compare the accuracy and speed of the proposed method. The proposed method recognized all test samples correctly, with a recognition rate close to 100%, and achieved a prediction time of less than one millisecond per character (0.03 ms on average), over 50 times faster than the state-of-the-art methods compared. Its accuracy is also better than that of all comparative state-of-the-art methods. The proposed method was also applied to an embedded device to verify that it runs on a CPU rather than requiring a high-end GPU.
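
    As a hedged sketch of the feature-plus-decision-tree approach described, the code below computes simple zone-density features from a binarized character image and feeds them to a scikit-learn decision tree. The grid size, tree depth, and training data are assumptions standing in for the paper's statistical features and decision strategy.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def zone_density_features(glyph, grid=(3, 3)):
    """Split a binarized character image (values 0/1) into a grid and measure
    the ink density of each cell; a stand-in for the statistical features used."""
    h, w = glyph.shape
    gh, gw = grid
    feats = []
    for i in range(gh):
        for j in range(gw):
            cell = glyph[i * h // gh:(i + 1) * h // gh, j * w // gw:(j + 1) * w // gw]
            feats.append(cell.mean())
    return np.array(feats)

# Hypothetical training data: X holds one feature vector per segmented digit image,
# y holds the corresponding digit labels 0-9.
# clf = DecisionTreeClassifier(max_depth=8).fit(X, y)
# prediction = clf.predict([zone_density_features(new_glyph)])
```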