2,838 research outputs found

    Sub-Classifier Construction for Error Correcting Output Code Using Minimum Weight Perfect Matching

    Multi-class classification is essential for real-world problems, and one of the promising techniques for it is the Error Correcting Output Code. We propose a method for constructing the Error Correcting Output Code that obtains a suitable combination of positive and negative classes encoded to represent binary classifiers. The minimum weight perfect matching algorithm is applied to find the optimal pairs of subsets of classes, using the generalization performance as a weighting criterion. Based on our method, each subset of classes with positive and negative labels is appropriately combined for learning the binary classifiers. Experimental results show that our technique gives significantly higher performance than traditional methods, including the dense random code and the sparse random code, in terms of both accuracy and classification time. Moreover, our method requires a significantly smaller number of binary classifiers while maintaining accuracy compared to One-Versus-One.

    Comment: 7 pages, 3 figures
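
    The pairing step at the heart of this construction can be sketched with an off-the-shelf matching solver. The snippet below is a minimal illustration, not the authors' implementation: it assumes a precomputed table `pair_error` of estimated generalization errors (invented here) and recovers a minimum weight perfect matching by negating the weights for NetworkX's maximum weight matcher.

```python
# Sketch: pair classes for an ECOC code via minimum weight perfect
# matching, assuming `pair_error` holds an estimated generalization
# error for each candidate pair (illustrative data, not the paper's).
import networkx as nx

n_classes = 6  # assumed even so a perfect matching exists
pair_error = {  # hypothetical estimated error per class pair
    (i, j): abs(i - j) / n_classes
    for i in range(n_classes) for j in range(i + 1, n_classes)
}

G = nx.Graph()
for (i, j), err in pair_error.items():
    # Negate the cost: a max-weight matching over negated costs,
    # forced to maximum cardinality, is a min-weight perfect matching.
    G.add_edge(i, j, weight=-err)

matching = nx.max_weight_matching(G, maxcardinality=True)
print(sorted(tuple(sorted(e)) for e in matching))
# Each matched pair defines the positive/negative classes of one
# binary sub-classifier in the output code.
```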

    Iris Indexing and Ear Classification

    To identify an individual using a biometric system, the input biometric data typically has to be compared against that of every identity in the existing database during the matching stage. The response time of the system grows with the number of enrolled individuals (i.e., the database size), which is not acceptable in real-time monitoring or when working on large-scale data. This thesis addresses the problem of reducing the number of database candidates to be considered during matching in the context of iris and ear recognition. For iris, an indexing mechanism based on the Burrows-Wheeler Transform (BWT) is proposed. Experiments on the CASIA version 3 iris database show a significant reduction in both search time and search space, suggesting the potential of this scheme for indexing iris databases. The ear classification scheme proposed in the thesis is based on parameterizing the shape of the ear and assigning it to one of four classes: round, rectangle, oval, and triangle. Experiments on the MAGNA database suggest the potential of this scheme for classifying ear databases.
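
    The Burrows-Wheeler Transform underlying the proposed indexing mechanism is a textbook construction; the sketch below shows the transform itself on a hypothetical binarized iris-code fragment, not the thesis' full indexing pipeline.

```python
# Textbook Burrows-Wheeler Transform, shown only to illustrate the
# primitive behind the proposed iris-indexing scheme.
def bwt(s: str, sentinel: str = "$") -> str:
    assert sentinel not in s
    s += sentinel
    # Sort all cyclic rotations and keep the last column.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)

# A binarized iris-code fragment (hypothetical input):
print(bwt("0110100110"))
```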

    On the performance of helper data template protection schemes

    The use of biometrics looks promising as it is already being applied in electronic passports (ePassports) on a global scale. Because the biometric data has to be stored as a reference template on either a central or personal storage device, its wide-spread use introduces new security and privacy risks such as (i) identity fraud, (ii) cross-matching, (iii) irrevocability and (iv) leaking sensitive medical information. Mitigating these risks is essential to obtain acceptance from the subjects of the biometric systems and thereby facilitate successful implementation on a large-scale basis. A solution to mitigate these risks is to use template protection techniques. The required protection properties of the stored reference template according to ISO guidelines are (i) irreversibility, (ii) renewability and (iii) unlinkability. A known template protection scheme is the helper data system (HDS). The fundamental principle of the HDS is to bind a key with the biometric sample with use of helper data and cryptography, such that the key can be reproduced or released given another biometric sample of the same subject. The identity check is then performed in a secure way by comparing the hash of the key. Hence, the size of the key determines the amount of protection. This thesis extensively investigates the HDS, namely (i) the theoretical classification performance, (ii) the maximum key size, (iii) the irreversibility and unlinkability properties, and (iv) the optimal multi-sample and multi-algorithm fusion method. The theoretical classification performance of the biometric system is determined by assuming that the features extracted from the biometric sample are Gaussian distributed. With this assumption we investigate the influence of the bit extraction scheme on the classification performance. Using the theoretical framework, the maximum size of the key is determined by assuming the error-correcting code to operate on Shannon's bound. We also show three vulnerabilities of the HDS that affect the irreversibility and unlinkability properties and propose solutions. Finally, we study the optimal level of applying multi-sample and multi-algorithm fusion with the HDS at either feature-, score-, or decision-level.
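
    The key-binding principle of the HDS can be illustrated with a toy fuzzy-commitment sketch. Below, a 3x repetition code stands in for a real error-correcting code, and all sizes and samples are invented; this shows the enroll/verify flow described above, not the thesis' actual scheme.

```python
# Toy helper-data sketch: bind a random key to a biometric bit string
# so the key is released only for a close-enough sample. A 3x
# repetition code stands in for a real error-correcting code.
import hashlib
import secrets

REP = 3  # each key bit is repeated REP times

def encode(key_bits):
    return [b for b in key_bits for _ in range(REP)]

def decode(code_bits):
    # Majority vote over each block of REP bits.
    return [int(sum(code_bits[i:i + REP]) > REP // 2)
            for i in range(0, len(code_bits), REP)]

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

def enroll(biometric_bits):
    key = [secrets.randbelow(2) for _ in range(len(biometric_bits) // REP)]
    helper = xor(encode(key), biometric_bits)        # stored helper data
    digest = hashlib.sha256(bytes(key)).hexdigest()  # stored hash of key
    return helper, digest

def verify(biometric_bits, helper, digest):
    key = decode(xor(helper, biometric_bits))
    return hashlib.sha256(bytes(key)).hexdigest() == digest

enrolled = [1, 0, 1, 1, 0, 1, 0, 0, 1]   # hypothetical enrollment sample
helper, digest = enroll(enrolled)
probe = enrolled[:]
probe[2] ^= 1                            # one flipped bit (sensor noise)
print(verify(probe, helper, digest))     # True: the error is corrected
```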

    Pattern Recognition

    A wealth of advanced pattern recognition algorithms is emerging from the interdisciplinary field between technologies of effective visual features and the human-brain cognition process. Effective visual features are made possible through rapid developments in appropriate sensor equipment, novel filter designs, and viable information processing architectures, while the understanding of the human-brain cognition process broadens the ways in which the computer can perform pattern recognition tasks. The present book is intended to collect representative research from around the globe focusing on low-level vision, filter design, features and image descriptors, data mining and analysis, and biologically inspired algorithms. The 27 chapters covered in this book disclose recent advances and new ideas in promoting the techniques, technology and applications of pattern recognition.

    Information-Theoretic Multiclass Classification Based on Binary Classifiers: On Coding Matrix Design, Reliability and Maximum Number of Classes

    In this paper, we consider the multiclass classification problem based on sets of independent binary classifiers. Each binary classifier represents the output of a quantized projection of training data onto a randomly generated orthonormal basis vector, thus producing a binary label. The ensemble of all binary labels forms an analogue of a coding matrix. The properties of such matrices and their impact on the maximum number of uniquely distinguishable classes are analyzed in this paper from an information-theoretic point of view. We also consider a concept of reliability for this kind of coding matrix generation, which can be an alternative to other adaptive training techniques, and investigate its impact on the bit error probability. We demonstrate that it is equivalent to the considered random coding matrix without any bit reliability information in terms of recognition rate.
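
    A minimal sketch of the construction analyzed here: sign-quantized projections onto random orthonormal basis vectors yield one binary label each, the labels form a codeword, and classification is by minimum Hamming distance. Dimensions, class centers, and the probe below are illustrative.

```python
# Sketch: binary labels from sign-quantized projections onto a random
# orthonormal basis, decoded by minimum Hamming distance (toy data).
import numpy as np

rng = np.random.default_rng(0)
d = 16                                 # feature dimension = code length

# Random orthonormal basis via QR: each row is one projection vector.
W, _ = np.linalg.qr(rng.standard_normal((d, d)))

def codeword(x):
    # One binary label per basis vector: the sign quantizer.
    return (W @ x > 0).astype(int)

classes = rng.standard_normal((4, d))              # hypothetical class centers
codebook = np.array([codeword(c) for c in classes])

probe = classes[2] + 0.1 * rng.standard_normal(d)  # noisy sample of class 2
hamming = (codebook != codeword(probe)).sum(axis=1)
print(int(hamming.argmin()))                       # expected: 2
```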

    Methods of Disambiguating and De-anonymizing Authorship in Large Scale Operational Data

    Operational data from software development, social networks and other domains are often contaminated with incorrect or missing values. Examples include misspelled or changed names, multiple emails belonging to the same person, and user profiles that vary across systems. Such digital traces are extensively used in research and practice to study collaborating communities of various kinds. To achieve a realistic representation of the networks that represent these communities, accurate identities are essential. In this work, we aim to identify, model, and correct identity errors in data from open-source software repositories, which include more than 23M developer IDs and nearly 1B Git commits (developer activity records). Our investigation into the nature and prevalence of identity errors in software activity data reveals that they are different from, and occur at much higher rates than in, other domains. Existing techniques relying on string comparisons can only disambiguate Synonyms, but not Homonyms, which are common in software activity traces. Therefore, we introduce measures of behavioral fingerprinting to improve the accuracy of Synonym resolution and to disambiguate Homonyms. Fingerprints are constructed from the traces of developers' activities, such as the style of writing in commit messages, the patterns in files modified and projects participated in by developers, and the patterns related to the timing of the developers' activity. Furthermore, to address the lack of training data necessary for the supervised learning approaches used in disambiguation, we design an active learning procedure that minimizes the manual effort necessary to create training data in the domain of developer identity matching. We extensively evaluate the proposed approach against commercial and recent research approaches, using over 16,000 OpenStack developers in 1,200 projects, and further against recent research on a much larger sample of over 2,000,000 IDs. Results demonstrate that our method is significantly better than both the recent research and commercial methods. We also conduct experiments to demonstrate that such erroneous data have a significant impact on developer networks. We hope that the proposed approach will expedite research progress in the domain of software engineering, especially in applications for which graphs of social networks are critical.
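
    One behavioral fingerprint of the kind described, the timing of a developer's activity, can be sketched as an hourly commit histogram compared by cosine similarity. The feature choice, data, and threshold interpretation below are illustrative, not the paper's full model.

```python
# Sketch: compare two developer IDs by the cosine similarity of their
# hourly commit-time histograms, one possible behavioral fingerprint
# (the feature choice and the toy timestamps are illustrative).
from datetime import datetime
import math

def hour_histogram(commit_times):
    hist = [0.0] * 24
    for t in commit_times:
        hist[t.hour] += 1.0
    return hist

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical commit timestamps for two candidate IDs.
id_a = [datetime(2023, 1, d, h) for d in range(1, 10) for h in (9, 10, 21)]
id_b = [datetime(2023, 2, d, h) for d in range(1, 10) for h in (9, 11, 21)]

score = cosine(hour_histogram(id_a), hour_histogram(id_b))
print(f"fingerprint similarity: {score:.2f}")
# A high score supports merging the IDs (Synonym); combined with
# conflicting name/email evidence it can instead flag a Homonym.
```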

    Multi Criteria Mapping Based on SVM and Clustering Methods

    There are many ways to automate the application process, such as the commercial software used in large organizations to scan bills and forms, but such applications work only for static frames or formats. In our application, we try to automate non-static frames, since the study certificates we receive come from different countries and different universities. Each university has its own certificate format, so we develop a new application that works uniformly across all frames or formats. Since many applicants come from the same university and therefore share a common certificate format, such a tool lets us analyze these certificates simply and in far less time. To make the process more accurate, we implement SVM and clustering methods. With these methods we can accurately map the courses in a certificate to the ASE study path, or otherwise to an exclude list. A grade calculation is performed for courses mapped to the ASE list by separating the data for labs and courses. Finally, we award points covering ASE-related courses, work experience, specialization certificates, and German language skills, and these points are provided to the chair to select applicants for the ASE master's course.
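
    The mapping step can be sketched with standard scikit-learn components: TF-IDF features over course titles and a linear SVM that assigns each title to the ASE study path or the exclude list. The titles and labels below are invented for illustration.

```python
# Sketch: map extracted course titles to the ASE study path or an
# exclude list using TF-IDF features and a linear SVM (toy data).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

courses = [                      # hypothetical titles from certificates
    "Software Engineering", "Automotive Software Architectures",
    "Embedded Systems", "Art History", "Marketing Basics",
    "Real-Time Operating Systems",
]
labels = ["ase", "ase", "ase", "exclude", "exclude", "ase"]

model = make_pipeline(TfidfVectorizer(), LinearSVC())
model.fit(courses, labels)

print(model.predict(["Automotive Embedded Software"]))  # expected: ['ase']
```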

    Decoding algorithms for surface codes

    Quantum technologies have the potential to solve computationally hard problems that are intractable via classical means. Unfortunately, the unstable nature of quantum information makes it prone to errors. For this reason, quantum error correction is an invaluable tool to make quantum information reliable and enable the ultimate goal of fault-tolerant quantum computing. Surface codes currently stand as the most promising candidates to build error-corrected qubits given their two-dimensional architecture, a requirement of only local operations, and high tolerance to quantum noise. Decoding algorithms are an integral component of any error correction scheme, as they are tasked with producing accurate estimates of the errors that affect quantum information, so that it can subsequently be corrected. A critical aspect of decoding algorithms is their speed, since the quantum state will suffer additional errors with the passage of time. This poses a conundrum-like trade-off, where decoding performance is improved at the expense of complexity and vice versa. In this review, a thorough discussion of state-of-the-art surface code decoding algorithms is provided. The core operation of these methods is described along with existing variants that show promise for improved results. In addition, both the decoding performance, in terms of error correction capability, and the decoding complexity are compared. A review of the existing software tools for surface code decoding is also provided.

    Comment: 54 pages, 31 figures
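
    The workhorse decoder covered in such reviews, minimum weight perfect matching, can be demonstrated with the open-source PyMatching library on a 3-qubit repetition code, the one-dimensional analogue of a surface code. This is a standard introductory example, independent of the review itself.

```python
# Sketch: minimum weight perfect matching decoding with PyMatching on
# a 3-qubit repetition code, the 1-D analogue of the surface code.
import numpy as np
import pymatching

# Parity checks on qubit pairs (0,1) and (1,2).
H = np.array([[1, 1, 0],
              [0, 1, 1]])
matching = pymatching.Matching(H)

error = np.array([0, 1, 0])        # X error on the middle qubit
syndrome = H @ error % 2           # both checks fire: [1, 1]
correction = matching.decode(syndrome)
print(correction)                  # expected: [0 1 0], undoing the error
```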