425 research outputs found

    Single Slice Grouping Mechanism for Recognition of Cursive Handwritten Courtesy Amounts of Malaysian Bank Cheques

    The single-slice grouping mechanism cuts vertically across an image slice by slice, groups the slices at a certain width, and tests each group for recognition using a trained neural network. The images contain cursive handwritten courtesy amounts from Malaysian bank cheques. A three-layer neural network architecture with a new error function for the backpropagation learning algorithm is used. This approach yields good recognition results with faster convergence rates.
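
The slicing and grouping step described in this abstract can be illustrated with a minimal sketch. It assumes a binary image stored as a list of pixel rows; the function names and the fixed `group_width` are hypothetical choices for illustration, not the authors' implementation, and the trained network that would score each group is omitted.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values; 0 = background, 1 = ink


def vertical_slices(image: Image) -> List[List[int]]:
    """Return the image as a list of one-pixel-wide vertical slices (columns)."""
    if not image:
        return []
    width = len(image[0])
    return [[row[x] for row in image] for x in range(width)]


def group_slices(image: Image, group_width: int) -> List[Image]:
    """Group consecutive vertical slices into sub-images of `group_width` columns.

    The last group may be narrower when the image width is not a multiple
    of `group_width`. (Assumption: non-overlapping groups of fixed width.)
    """
    if group_width <= 0:
        raise ValueError("group_width must be positive")
    width = len(image[0]) if image else 0
    groups = []
    for start in range(0, width, group_width):
        groups.append([row[start:start + group_width] for row in image])
    return groups


# Toy 2x6 image: each group would be passed to a trained classifier.
image = [
    [0, 1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1, 1],
]
groups = group_slices(image, 2)
```

In a full pipeline, each element of `groups` would be normalized and fed to the recognizer; the abstract does not say how the grouping width is chosen.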

    Classification of reduction invariants with improved backpropagation

    Data reduction is a process of feature extraction that transforms the data space into a feature space of much lower dimension than the original data space while retaining most of the intrinsic information content of the data. This can be done using a number of methods, such as principal component analysis (PCA), factor analysis, and feature clustering. Principal components are extracted from a collection of multivariate cases as a way of accounting for as much of the variation in that collection as possible by means of as few variables as possible. Backpropagation networks, on the other hand, have been used extensively in classification problems such as the XOR problem, share price prediction, and pattern recognition. This paper proposes an improved error signal for the backpropagation network for classifying the reduction invariants obtained with principal component analysis, which extracts the bulk of the useful information present in the moment invariants of handwritten digits and leaves the redundant information behind. Higher-order centralised scale-invariants are used to extract features of handwritten digits before PCA, and the reduction invariants are sent to the improved backpropagation model for classification.
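
As a rough illustration of the PCA step mentioned above, the sketch below extracts the first principal component of a small data set using power iteration on the sample covariance matrix. This is a generic textbook procedure, not the paper's method; the function names and the toy data are assumptions made for the example.

```python
import math
from typing import List


def column_means(data: List[List[float]]) -> List[float]:
    n, d = len(data), len(data[0])
    return [sum(row[j] for row in data) / n for j in range(d)]


def covariance_matrix(data: List[List[float]]) -> List[List[float]]:
    """Sample covariance matrix (divides by n - 1)."""
    n, d = len(data), len(data[0])
    means = column_means(data)
    centered = [[row[j] - means[j] for j in range(d)] for row in data]
    return [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_principal_component(data: List[List[float]], iterations: int = 200) -> List[float]:
    """Approximate the dominant eigenvector of the covariance matrix.

    Power iteration fails if the start vector happens to be orthogonal
    to the dominant eigenvector; that corner case is ignored here.
    """
    cov = covariance_matrix(data)
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return v


def project(data: List[List[float]], component: List[float]) -> List[float]:
    """Project mean-centered samples onto one component (d features -> 1)."""
    means = column_means(data)
    return [
        sum((row[j] - means[j]) * component[j] for j in range(len(component)))
        for row in data
    ]


# Toy data lying on the line y = 2x: one component captures all the variance.
samples = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
pc1 = first_principal_component(samples)
reduced = project(samples, pc1)
```

In the setting of the abstract, the rows of `samples` would be moment-invariant feature vectors of handwritten digits, and the reduced features would be passed to the classifier.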

    Feature Extraction Methods for Character Recognition


    Improving Bags-of-Words model for object categorization

    In the past decade, Bags-of-Words (BOW) models have become popular for the task of object recognition, owing to their good performance and simplicity. Some of the most effective recent methods for computer-based object recognition work by detecting and extracting local image features, quantizing them according to a codebook built with a method such as k-means clustering, and classifying the results with conventional classifiers such as Support Vector Machines and Naive Bayes. In this thesis, a Spatial Object Recognition Framework is presented that consists of the four main contributions of the research. The first contribution, frequent keypoint pattern discovery, works by combining pairs and triplets of frequent keypoints in order to discover intermediate representations for object classes. Based on the same frequent keypoints principle, algorithms for locating the region of interest in training images are then discussed. Extensions to the successful Spatial Pyramid Matching scheme are then proposed in order to better capture spatial relationships. The pairs frequency histogram and shapes frequency histogram work by capturing more refined spatial information between local image features. Finally, alternative techniques to Spatial Pyramid Matching for capturing spatial information are presented. The proposed techniques, variations of binned log-polar histograms, divide the image into grids of different scales and orientations, and thus explicitly capture the distribution of image features in both distance and orientation. Evaluations of the framework focus on several recent and popular datasets covering image retrieval, object recognition, and object categorization. Overall, while the effectiveness of the framework is limited on some of the datasets, the proposed contributions are nevertheless powerful improvements to the BOW model.
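
The basic BOW quantization step that this thesis builds on can be sketched as follows. The example assumes the codebook has already been learned (for instance with k-means) and that local descriptors are plain numeric vectors; the function names and the toy two-dimensional descriptors are hypothetical, and none of the thesis's spatial extensions are shown.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(descriptor: Sequence[float],
                     codebook: Sequence[Sequence[float]]) -> int:
    """Index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))


def bow_histogram(descriptors: Sequence[Sequence[float]],
                  codebook: Sequence[Sequence[float]]) -> List[float]:
    """Normalized histogram of visual-word occurrences for one image."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_codeword(d, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]


# Toy example: a two-word codebook and four local descriptors.
codebook = [[0.0, 0.0], [10.0, 10.0]]
descriptors = [[1.0, 0.0], [0.0, 2.0], [9.0, 9.0], [11.0, 10.0]]
histogram = bow_histogram(descriptors, codebook)
```

The resulting histogram is the fixed-length vector that a conventional classifier would consume. It discards all spatial layout, which is the limitation the spatial extensions in the abstract aim to address.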

    New human action recognition scheme with geometrical feature representation and invariant discretization for video surveillance

    Human action recognition is an active research area in computer vision because of its wide application in video surveillance, video retrieval, security systems, video indexing and human-computer interaction. Action recognition is characterized as the analysis of time-varying feature data generated by humans under different viewpoints, with the aim of building a mapping between dynamic image information and semantic understanding. Although a great deal of progress has been made in human action recognition during the last two decades, few proposed approaches are reported in the literature. Further research is therefore needed to address ongoing challenges and to develop more efficient approaches to human action recognition. Feature extraction is the main task in action recognition and represents the core of any action recognition procedure. It involves transforming the input data that describe the shape of the segmented silhouette of a moving person into a set of features representing action poses. In video surveillance, global moment invariants based on the Geometrical Moment Invariant (GMI) are widely used in human action recognition. However, GMI has drawbacks, notably that it lacks a granular interpretation of the invariants relative to the shape. Consequently, the representation of features has not been standardized. Hence, this study proposes a new scheme for human action recognition (HAR) with geometrical moment invariants for feature extraction and supervised invariant discretization for identifying the uniqueness of actions in video sequences. The proposed scheme is tested on video sequences from the IXMAS dataset, in which human poses are non-rigid as a result of drastic illumination changes, changes in pose and erratic motion patterns. The invariance of the proposed scheme is validated using intra-class and inter-class analysis. The proposed scheme yields better performance in action recognition than the conventional scheme, with an average accuracy of more than 99%, while preserving the shape of the human actions in video images.
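
To make the notion of a geometrical moment invariant concrete, the sketch below computes normalized central moments of a binary silhouette and Hu's first invariant, phi1 = eta20 + eta02, which is a standard textbook quantity. It is not the discretization scheme proposed in the abstract; the function names and toy silhouettes are assumptions for illustration.

```python
from typing import List

Image = List[List[int]]  # rows of pixel intensities (1 = silhouette)


def raw_moment(image: Image, p: int, q: int) -> float:
    return float(sum((x ** p) * (y ** q) * v
                     for y, row in enumerate(image)
                     for x, v in enumerate(row)))


def central_moment(image: Image, p: int, q: int) -> float:
    """Moment about the centroid, which makes it translation invariant."""
    m00 = raw_moment(image, 0, 0)
    xc = raw_moment(image, 1, 0) / m00
    yc = raw_moment(image, 0, 1) / m00
    return sum(((x - xc) ** p) * ((y - yc) ** q) * v
               for y, row in enumerate(image)
               for x, v in enumerate(row))


def normalized_central_moment(image: Image, p: int, q: int) -> float:
    """eta_pq = mu_pq / mu_00 ** (1 + (p + q) / 2), the scale normalization."""
    mu00 = central_moment(image, 0, 0)
    return central_moment(image, p, q) / mu00 ** (1 + (p + q) / 2)


def hu_phi1(image: Image) -> float:
    """Hu's first moment invariant."""
    return (normalized_central_moment(image, 2, 0)
            + normalized_central_moment(image, 0, 2))


# An L-shaped silhouette and the same shape shifted by one pixel in x and y.
shape = [
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
]
shifted = [
    [0, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
    [0, 1, 1, 1],
]
phi_original = hu_phi1(shape)
phi_shifted = hu_phi1(shifted)
```

The two values agree up to floating-point error because central moments are computed relative to the centroid. On a pixel grid, invariance to scale and rotation holds only approximately.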

    Boosting Deep Neural Networks with Geometrical Prior Knowledge: A Survey

    While Deep Neural Networks (DNNs) achieve state-of-the-art results in many different problem settings, they are affected by some crucial weaknesses. On the one hand, DNNs depend on exploiting a vast amount of training data, whose labeling process is time-consuming and expensive. On the other hand, DNNs are often treated as black box systems, which complicates their evaluation and validation. Both problems can be mitigated by incorporating prior knowledge into the DNN. One promising field, inspired by the success of convolutional neural networks (CNNs) in computer vision tasks, is to incorporate knowledge about symmetric geometrical transformations of the problem to solve. This promises increased data efficiency and filter responses that are more easily interpretable. In this survey, we try to give a concise overview of different approaches to incorporating geometrical prior knowledge into DNNs. Additionally, we try to connect those methods to the field of 3D object detection for autonomous driving, where we expect promising results from applying those methods.
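
The kind of geometric prior that CNNs already encode can be shown with a small sketch: a convolution layer is equivariant to translation, so shifting the input shifts the output by the same amount. The example uses a one-dimensional circular cross-correlation on integers so that the equality is exact; the function names and signal values are illustrative assumptions, not taken from the survey.

```python
from typing import List


def circular_conv(signal: List[int], kernel: List[int]) -> List[int]:
    """Cross-correlation with wrap-around borders, as used in CNN layers."""
    n = len(signal)
    return [
        sum(kernel[k] * signal[(i + k) % n] for k in range(len(kernel)))
        for i in range(n)
    ]


def shift(signal: List[int], s: int) -> List[int]:
    """Cyclically translate the signal by s positions to the right."""
    n = len(signal)
    return [signal[(i - s) % n] for i in range(n)]


signal = [1, 2, 3, 4, 5]
kernel = [1, 0, -1]

# Translation equivariance: filter-then-shift equals shift-then-filter.
shift_then_filter = circular_conv(shift(signal, 2), kernel)
filter_then_shift = shift(circular_conv(signal, kernel), 2)
```

Approaches of the kind the survey covers aim to extend this property from translations to other transformations, such as rotations, so that the network does not have to learn them from data.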

    The robustness of animated text CAPTCHAs

    PhD Thesis. CAPTCHA is a standard security technology that uses AI techniques to tell computers and humans apart. The most widely used CAPTCHAs are text-based schemes. The robustness and usability of these CAPTCHAs rely mainly on the segmentation resistance mechanism, which provides robustness against individual character recognition attacks. However, many CAPTCHAs have been shown to have critical flaws caused by exploitable invariants in their design, leaving only a few CAPTCHA schemes resistant to attacks, including ReCAPTCHA and the Wikipedia CAPTCHA. Therefore, new alternative approaches add motion to the CAPTCHA, animating the distorted characters and the background to add another dimension that character cracking algorithms must overcome. These approaches are also supported by tracking resistance mechanisms that prevent attackers from identifying the main answer through frame-to-frame attacks. These technologies are used in many of the new CAPTCHA schemes, including the Yahoo CAPTCHA, CAPTCHANIM, KillBot CAPTCHAs, non-standard CAPTCHA and NuCAPTCHA. Our first question: can the animation techniques included in the new CAPTCHA schemes provide the required level of robustness against attacks? Our examination has shown that many of the CAPTCHA schemes that use animated features can be broken through tracking attacks, including schemes that use complicated tracking resistance mechanisms. The second question: can the segmentation resistance mechanism used in the latest standard text-based CAPTCHA schemes still provide the additional required level of resistance against attacks that is missing in animated schemes? Our tests against the latest versions of ReCAPTCHA and the Wikipedia CAPTCHA exposed vulnerabilities to novel attack mechanisms that achieved a high success rate against them. The third question: how much space is available to design an animated text-based CAPTCHA scheme that could provide a good balance between security and usability? We designed a new animated text-based CAPTCHA using guidelines based on the results of our attacks on standard and animated text-based CAPTCHAs, and we then tested its security and usability to answer this question. In this thesis, we put forward different approaches to examining the robustness of animated text-based CAPTCHA schemes and other standard text-based CAPTCHA schemes against segmentation and tracking attacks. Our attacks included several methodologies that required thinking skills in order to distinguish the animated text from the other animated noise, including text distorted by strong tracking resistance mechanisms that displayed it partially as animated segments resembling the noise in other CAPTCHA schemes. These attacks also include novel attack mechanisms and other mechanisms that use a recognition engine supported by attacking methods that exploit the identified invariants to recognise the connected characters at once. Our attacks also provided a guideline for animated text-based CAPTCHAs that could resist tracking and segmentation attacks, which we designed and tested in terms of security and usability, as mentioned before. Our research also contributes a toolbox for breaking CAPTCHAs, in addition to a list of robustness and usability issues in current CAPTCHA design that can be used to better understand how to design a more resistant CAPTCHA scheme.

    Irish Machine Vision and Image Processing Conference Proceedings 2017


    Representation Learning: A Review and New Perspectives

    The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms implementing such priors. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, auto-encoders, manifold learning, and deep networks. This motivates longer-term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation and manifold learning