177 research outputs found

    A Robust Algorithm for Emoji Detection in Smartphone Screenshot Images

    Get PDF
    The increasing use of smartphones and social media apps for communication results in a massive number of screenshot images. These images enrich the written language through text and emojis. In this regard, several studies in the image analysis field have considered text. However, they ignored the use of emojis. In this study, a robust two-stage algorithm for detecting emojis in screenshot images is proposed. The first stage localizes the regions of candidate emojis by using the proposed RGB-channel analysis method followed by a connected component method with a set of proposed rules. In the second verification stage, each of the emojis and non-emojis are classified by using proposed features with a decision tree classifier. Experiments were conducted to evaluate each stage independently and assess the performance of the proposed algorithm completely by using a self-collected dataset. The results showed that the proposed RGB-channel analysis method achieved better performance than the Niblack and Sauvola methods. Moreover, the proposed feature extraction method with decision tree classifier achieved more satisfactory performance than the LBP feature extraction method with all Bayesian network, perceptron neural network, and decision table rules. Overall, the proposed algorithm exhibited high efficiency in detecting emojis in screenshot images

    Parking lot monitoring system using an autonomous quadrotor UAV

    Get PDF
    The main goal of this thesis is to develop a drone-based parking lot monitoring system using low-cost hardware and open-source software. Similar to wall-mounted surveillance cameras, a drone-based system can monitor parking lots without affecting the flow of traffic while also offering the mobility of patrol vehicles. The Parrot AR Drone 2.0 is the quadrotor drone used in this work due to its modularity and cost efficiency. Video and navigation data (including GPS) are communicated to a host computer using a Wi-Fi connection. The host computer analyzes navigation data using a custom flight control loop to determine control commands to be sent to the drone. A new license plate recognition pipeline is used to identify license plates of vehicles from video received from the drone

    Recognition of Characters from Streaming Videos

    Get PDF
    Non

    Extraction of Text from Images and Videos

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    A study of holistic strategies for the recognition of characters in natural scene images

    Get PDF
    Recognition and understanding of text in scene images is an important and challenging task. The importance can be seen in the context of tasks such as assisted navigation for the blind, providing directions to driverless cars, e.g. Google car, etc. Other applications include automated document archival services, mining text from images, and so on. The challenge comes from a variety of factors, like variable typefaces, uncontrolled imaging conditions, and various sources of noise corrupting the captured images. In this work, we study and address the fundamental problem of recognition of characters extracted from natural scene images, and contribute three holistic strategies to deal with this challenging task. Scene text recognition (STR) has been a known problem in computer vision and pattern recognition community for over two decades, and is still an active area of research owing to the fact that the recognition performance has still got a lot of room for improvement. Recognition of characters lies at the heart of STR and is a crucial component for a reliable STR system. Most of the current methods heavily rely on discriminative power of local features, such as histograms of oriented gradient (HoG), scale invariant feature transform (SIFT), shape contexts (SC), geometric blur (GB), etc. One of the problems with such methods is that the local features are rasterized in an ad hoc manner to get a single vector for subsequent use in recognition. This rearrangement of features clearly perturbs the spatial correlations that may carry crucial information vis-á-vis recognition. Moreover, such approaches, in general, do not take into account the rotational invariance property that often leads to failed recognition in cases where characters in scene images do not occur in upright position. To eliminate this local feature dependency and the associated problems, we propose the following three holistic solutions: The first one is based on modelling character images of a class as a 3-mode tensor and then factoring it into a set of rank-1 matrices and the associated mixing coefficients. Each set of rank-1 matrices spans the solution subspace of a specific image class and enables us to capture the required holistic signature for each character class along with the mixing coefficients associated with each character image. During recognition, we project each test image onto the candidate subspaces to derive its mixing coefficients, which are eventually used for final classification. The second approach we study in this work lets us form a novel holistic feature for character recognition based on active contour model, also known as snakes. Our feature vector is based on two variables, direction and distance, cumulatively traversed by each point as the initial circular contour evolves under the force field induced by the character image. The initial contour design in conjunction with cross-correlation based similarity metric enables us to account for rotational variance in the character image. Our third approach is based on modelling a 3-mode tensor via rotation of a single image. This is different from our tensor based approach described above in that we form the tensor using a single image instead of collecting a specific number of samples of a particular class. In this case, to generate a 3D image cube, we rotate an image through a predefined range of angles. This enables us to explicitly capture rotational variance and leads to better performance than various local approaches. Finally, as an application, we use our holistic model to recognize word images extracted from natural scenes. Here we first use our novel word segmentation method based on image seam analysis to split a scene word into individual character images. We then apply our holistic model to recognize individual letters and use a spell-checker module to get the final word prediction. Throughout our work, we employ popular scene text datasets, like Chars74K-Font, Chars74K-Image, SVT, and ICDAR03, which include synthetic and natural image sets, to test the performance of our strategies. We compare results of our recognition models with several baseline methods and show comparable or better performance than several local feature-based methods justifying thus the importance of holistic strategies

    IMPROVED LICENSE PLATE LOCALIZATION ALGORITHM BASED ON MORPHOLOGICAL OPERATIONS

    Get PDF
    Automatic License Plate Recognition (ALPR) systems have become an important tool to track stolen cars, access control, and monitor traffic. ALPR system consists of locating the license plate in an image, followed by character detection and recognition. Since the license plate can exist anywhere within an image, localization is the most important part of ALPR and requires greater processing time. Most ALPR systems are computationally intensive and require a high-performance computer. The proposed algorithm differs significantly from those utilized in previous ALPR technologies by offering a fast algorithm, composed of structural elements which more precisely conducts morphological operations within an image, and can be implemented in portable devices with low computation capabilities. The proposed algorithm is able to accurately detect and differentiate license plates in complex images. This method was first tested through MATLAB with an on-line public database of Greek license plates which is a popular benchmark used in previous works. The proposed algorithm was 100% accurate in all clear images, and achieved 98.45% accuracy when using the entire database which included complex backgrounds and license plates obscured by shadow and dirt. Second, the efficiency of the algorithm was tested in devices with low computational processing power, by translating the code to Python, and was 300% faster than previous work
    corecore