Advances in Character Recognition
This book presents advances in character recognition. It consists of 12 chapters that cover a wide range of topics on different aspects of character recognition. Hopefully, this book will serve as a reference source for academic research, for professionals working in the character recognition field, and for all interested in the subject.
Parallelizing support vector machines for scalable image annotation
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them, Support Vector Machines (SVMs) are used extensively due to their generalization properties. However, SVM training is a computationally intensive process, especially when the training dataset is large.
In this thesis, distributed computing paradigms have been investigated to speed up SVM training by partitioning a large training dataset into small data chunks and processing each chunk in parallel, utilizing the resources of a cluster of computers. A resource aware parallel SVM algorithm is introduced for large scale image annotation using a cluster of computers. A genetic algorithm based load balancing scheme is designed to optimize the performance of the algorithm in heterogeneous computing environments.
SVM was initially designed for binary classifications. However, most classification problems arising in domains such as image annotation usually involve more than two classes. A resource aware parallel multiclass SVM algorithm for large scale image annotation in parallel using a cluster of computers is introduced.
The combination of classifiers leads to a substantial reduction of classification error in a wide range of applications. Among them, SVM ensembles with bagging have been shown to outperform a single SVM in terms of classification accuracy. However, SVM ensemble training is a computationally intensive process, especially when the number of replicated samples based on bootstrapping is large. A distributed SVM ensemble algorithm for image annotation is introduced, which re-samples the training data by bootstrapping and trains an SVM on each sample in parallel using a cluster of computers.
The above algorithms are evaluated in both experimental and simulation environments, showing that the distributed SVM algorithm, the distributed multiclass SVM algorithm, and the distributed SVM ensemble algorithm reduce the training time significantly while maintaining a high level of accuracy in classification.
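The bagging scheme described in this abstract (bootstrap re-sampling, one base learner per replicate trained in parallel, predictions combined by vote) can be illustrated with a minimal sketch. It is not the thesis's algorithm: a nearest-centroid classifier stands in for the SVM, a local thread pool stands in for the cluster, and all function names are hypothetical.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def bootstrap_sample(data, rng):
    """Draw len(data) examples with replacement (one bootstrap replicate)."""
    return [rng.choice(data) for _ in data]


def train_centroid_classifier(sample):
    """Stand-in base learner (NOT an SVM): store the mean feature vector per class."""
    sums, counts = {}, Counter()
    for features, label in sample:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] += 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def predict_one(model, features):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def dist(label):
        return sum((a - b) ** 2 for a, b in zip(model[label], features))
    return min(model, key=dist)


def train_ensemble(data, n_models=5, seed=0, max_workers=4):
    """Build bootstrap replicates, then train one base learner per replicate in parallel."""
    rng = random.Random(seed)
    samples = [bootstrap_sample(data, rng) for _ in range(n_models)]
    # Assumption: a local thread pool replaces the cluster of computers.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(train_centroid_classifier, samples))


def predict_ensemble(models, features):
    """Combine the base learners by majority vote."""
    votes = Counter(predict_one(m, features) for m in models)
    return votes.most_common(1)[0][0]
```

Each replicate is independent of the others, which is why the training step distributes naturally; only the final vote needs all models together.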
An aesthetics of touch: investigating the language of design relating to form
How well can designers communicate qualities of touch?
This paper presents evidence that they have some capability to do so, much of which appears to have been learned, but at present make limited use of such language. Interviews with graduate designer-makers suggest that they are aware of and value the importance of touch and materiality in their work, but lack a vocabulary to fully relate to their detailed explanations of other aspects such as their intent or selection of materials. We believe that more attention should be paid to the verbal dialogue that happens in the design process, particularly as other researchers show that even making-based learning has a strong verbal element to it. However, verbal language alone does not appear to be adequate for a comprehensive language of touch. Graduate designer-makers’ descriptive practices combined non-verbal manipulation within verbal accounts. We thus argue that haptic vocabularies do not simply describe material qualities, but rather are situated competences that physically demonstrate the presence of haptic qualities. Such competences are more important than groups of verbal vocabularies in isolation. Design support for developing and extending haptic competences must take this wide range of considerations into account to comprehensively improve designers’ capabilities.
NBS monograph
From Introduction: "This report is the first of a series intended to provide a selective overview of research and development efforts and requirements in the somewhat overlapping fields of the computer and information sciences and technologies. The projected series of reports will attempt to outline the probable range of R & D activities in the computer and information sciences and technologies through selective reviews of the literature and to develop a reasonable consensus with respect to the opinions of workers in these and potentially related fields as to areas of continuing R & D concern for research program planning or review in these areas.
The Computational Attitude in Music Theory
Music studies’s turn to computation during the twentieth century has engendered particular habits of thought about music, habits that remain in operation long after the music scholar has stepped away from the computer. The computational attitude is a way of thinking about music that is learned at the computer but can be applied away from it. It may be manifest in actual computer use, or in invocations of computationalism, a theory of mind whose influence on twentieth-century music theory is palpable. It may also be manifest in more informal discussions about music, which make liberal use of computational metaphors. In Chapter 1, I describe this attitude, the stakes for considering the computer as one of its instruments, and the kinds of historical sources and methodologies we might draw on to chart its ascendance. The remainder of this dissertation considers distinct and varied cases from the mid-twentieth century in which computers or computationalist musical ideas were used to pursue new musical objects, to quantify and classify musical scores as data, and to instantiate a generally music-structuralist mode of analysis.
I present an account of the decades-long effort to prepare an exhaustive and accurate catalog of the all-interval twelve-tone series (Chapter 2). This problem was first posed in the 1920s but was not solved until 1959, when the composer Hanns Jelinek collaborated with the computer engineer Heinz Zemanek to develop and run a computer program. Recognizing the transformation wrought on modern statistics and communications technology by information theory, I revisit Abraham Moles’s book Information Theory and Esthetic Perception (orig. 1958) and use its vocabulary to contextualize contemporary information-theoretic work on music that variously evokes the computational mind by John R. Pierce and Mary Shannon, Wilhelm Fucks, and Henry Quastler (Chapter 3). I conclude with a detailed look into a score-segmentation algorithm of the influential American music theorist Allen Forte (Chapter 4). Forte was a skilled programmer who spent several years at MIT in the 1960s, with cutting-edge computers and the company of first-rank figures in the nascent fields of computer science and artificial intelligence. Each one of the researchers whose work is treated in these case studies, at some stage in their relationship with music, adopted what I call the computational attitude to music, to varying degrees and for diverse ends. Among the many questions this dissertation seeks to answer: what was gained by adopting such an attitude? What was lost? Having understood these past explorations of the computational attitude to music, we are better suited to ask of ourselves the same questions today.
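To make the cataloguing problem from Chapter 2 concrete: an all-interval series is an ordering of the twelve pitch classes in which the eleven successive intervals (taken modulo 12) are all different. The sketch below is a modern backtracking illustration of that definition, not a reconstruction of the Jelinek and Zemanek program; the function names and the `modulus` parameter are assumptions introduced here.

```python
def is_all_interval(series, modulus=12):
    """Check that the series uses every pitch class once and no interval twice."""
    if sorted(series) != list(range(modulus)):
        return False
    intervals = {(b - a) % modulus for a, b in zip(series, series[1:])}
    return len(intervals) == modulus - 1


def all_interval_series(modulus=12):
    """Yield every all-interval series that starts on pitch class 0, by backtracking."""
    def extend(series, used_pitches, used_intervals):
        if len(series) == modulus:
            yield list(series)
            return
        for pitch in range(modulus):
            interval = (pitch - series[-1]) % modulus
            if pitch in used_pitches or interval in used_intervals:
                continue  # prune: repeated pitch class or repeated interval
            series.append(pitch)
            used_pitches.add(pitch)
            used_intervals.add(interval)
            yield from extend(series, used_pitches, used_intervals)
            series.pop()
            used_pitches.remove(pitch)
            used_intervals.remove(interval)

    yield from extend([0], {0}, set())


# The "wedge" ordering is a simple example of an all-interval series.
print(is_all_interval([0, 1, 11, 2, 10, 3, 9, 4, 8, 5, 7, 6]))
```

The `modulus` parameter lets the same search be tried on smaller systems first, since exhaustively enumerating the twelve-tone case takes noticeably longer.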
Reinstatement of Cortical Outcome Representations during Higher-Order Learning
Naturalistic learning scenarios are characterized by infrequent experience of external feedback to guide behavior. Higher-order learning mechanisms like second-order conditioning (SOC) may allow stimuli that were never experienced together with reinforcement to acquire motivational value. Despite its explanatory potential for real-world learning, surprisingly little is known about the neural mechanism underlying such associative transfer of value in SOC. Here, we used multivariate cross-session, cross-modality searchlight classification on functional magnetic resonance imaging data obtained from humans during SOC. We show that visual first-order conditioned stimuli (CS) reinstate cortical patterns representing previously paired gustatory outcomes in the lateral orbitofrontal cortex (OFC). During SOC, this OFC region showed increased functional covariation with amygdala, where neural pattern similarity between second-order CS and outcomes increased from early to late stages of SOC. Our data suggest a mechanism by which motivational value is conferred to stimuli that were never paired with reinforcement
Discriminative and Generative Learning with Style Information
Conventional machine learning approaches usually assume that the patterns are independent and identically distributed (i.i.d.). However, in many empirical cases this condition is violated because the data carry diverse and inconsistent style information, and the effectiveness of traditional predictors may be limited as a result. In this thesis, we investigate how style information can be used to improve the performance of machine learning models. This is done not only by introducing style information into some state-of-the-art models, but also by designing and implementing new architectures and frameworks that make proper use of it. The main work is summarized as follows. First, the idea of style averaging is introduced through an image-processing-based sunglasses recovery algorithm for robust one-shot facial expression recognition, named the Style Elimination Transformation (SET). By recovering the pixels corrupted by the dark colors of the sunglasses, the proposed algorithm improves the classification performance of several state-of-the-art machine learning classifiers, even in a one-shot training setting. Then style normalization and style neutralization are investigated with discriminative and generative machine learning approaches respectively. In discriminative learning with style information, the style normalization transformation (SNT) is integrated into support vector machines (SVM) for both classification and regression, named field support vector classification (F-SVC) and field support vector regression (F-SVR) respectively.
The SNT can be made nonlinear by mapping sufficiently complicated style information to a high-dimensional reproducing kernel Hilbert space. The learned SNT normalizes the inconsistent style information, producing i.i.d. examples to which the SVM is applied. Furthermore, a self-training-based transductive framework is introduced to handle styles unseen during training. The transductive SNT (T-SNT) is learned by transferring the trained styles to the unknown ones. In generative learning with style information, the style neutralization generative adversarial classifier (SN-GAC) is investigated to incorporate style information when performing classification. As a neural-network-based framework, the SN-GAC provides a nonlinear mapping in a generative manner. As a generalized and novel classification framework, it is capable of synthesizing high-quality, human-understandable, style-neutralized patterns from any style-inconsistent ones. It is first learned with the adversarial training strategy; once style-neutralized examples can be generated well, the final classification performance is further improved by fine-tuning the classifier. Finally, the reverse of the style neutralization task in the SN-GAC model, namely the generation of arbitrary-style patterns, is also investigated in this thesis. By introducing the W-Net, a deep architecture upgraded from the well-known U-Net model for image-to-image translation tasks, few-shot (even one-shot) arbitrary-style Chinese character generation is achieved. Like the SN-GAC model, the W-Net is trained with the adversarial training strategy of the generative adversarial network.
The W-Net architecture is capable of generating any Chinese character in a style similar to that of a few, or even a single, stylized example(s). For all the proposed algorithms, frameworks, and models mentioned above, for both the prediction and generation tasks, the inconsistent style information is taken into appropriate consideration. In the SET, inconsistent sunglasses information is eliminated by an image-processing-based sunglasses recovery algorithm, producing style-consistent patterns, and facial expression recognition is performed on those transformed i.i.d. examples. The SNT is integrated into the SVM model, normalizing the inconsistent style information nonlinearly with the kernelized mapping. The T-SNT further enables field prediction on styles unseen during training. In the SN-GAC model, style neutralization is performed by the neural-network-based upgraded U-Net architecture. Trained in separate steps that include the adversarial optimization strategy, it produces high-quality style-neutralized i.i.d. patterns. The subsequent classifier is learned to produce superior performance with no additional computation involved. The W-Net architecture enables free manipulation of style in data generation with only a few, or even a single, style reference(s) available, making few-shot, or even one-shot, arbitrary-style Chinese character generation possible. Such a property is rarely seen in the literature.
Reliable pattern recognition system with novel semi-supervised learning approach
Over the past decade, there has been considerable progress in the design of statistical machine learning strategies, including Semi-Supervised Learning (SSL) approaches. However, researchers still have difficulties in applying most of these learning strategies when two or more classes overlap, and/or when each class has a bimodal/multimodal distribution. In this thesis, an efficient, robust, and reliable recognition system with a novel SSL scheme has been developed to overcome overlapping problems between two classes and bimodal distribution within each class. This system was based on the nature of category learning and recognition to enhance the system's performance in relevant applications. In the training procedure, besides the supervised learning strategy, the unsupervised learning approach was applied to retrieve the "extra information" that could not be obtained from the images themselves. This approach was very helpful for the classification between two confusing classes. In this SSL scheme, both the training data and the test data were utilized in the final classification. In this thesis, the design of a promising supervised learning model with advanced state-of-the-art technologies is firstly presented, and a novel rejection measurement for verification of rejected samples, namely Linear Discriminant Analysis Measurement (LDAM), is defined. Experiments on CENPARMI's Hindu-Arabic Handwritten Numeral Database, CENPARMI's Numerals Database, and NIST's Numerals Database were conducted in order to evaluate the efficiency of LDAM. Moreover, multiple verification modules, including a Writing Style Verification (WSV) module, have been developed according to four newly defined error categories. The error categorization was based on the different costs of misclassification. 
The WSV module has been developed by the unsupervised learning approach to automatically retrieve the person's writing styles so that the rejected samples can be classified and verified accordingly. As a result, errors on CENPARMI's Hindu-Arabic Handwritten Numeral Database (24,784 training samples, 6,199 testing samples) were reduced drastically from 397 to 59, and the final recognition rate of this HAHNR reached 99.05%, a significantly higher rate compared to other experiments on the same database. When the rejection option was applied on this database, the recognition rate, error rate, and reliability were 97.89%, 0.63%, and 99.28%, respectively.
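The reported 99.05% recognition rate can be checked against the error counts, assuming (as the figures suggest) that both counts refer to the 6,199 testing samples. The helper name below is hypothetical, and the "before" rate is derived here rather than quoted from the thesis.

```python
def recognition_rate(errors, total):
    """Percentage of samples recognized correctly, given the number of errors."""
    return 100.0 * (total - errors) / total


TEST_SAMPLES = 6199  # testing-set size stated in the abstract

before = recognition_rate(397, TEST_SAMPLES)  # derived, not reported in the abstract
after = recognition_rate(59, TEST_SAMPLES)    # should match the reported 99.05%

print(f"before verification: {before:.2f}%")
print(f"after verification:  {after:.2f}%")
```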
Minimally-supervised Methods for Arabic Named Entity Recognition
Named Entity Recognition (NER) has attracted much attention over the past twenty years, as a main task of Information Extraction. The current dominant techniques for addressing NER are supervised methods that can achieve high performance, but require new manually annotated data for every new domain and/or genre change. Our work focuses on approaches that make it possible to tackle new domains with minimal human intervention to identify Named Entities (NEs) in Arabic text. Specifically, we investigate two minimally-supervised methods: semi-supervised learning and distant learning. Our semi-supervised algorithm for identifying NEs does not require annotated training data or gazetteers. It only requires, for each NE type, a seed list of a few instances to initiate the learning process. Novel aspects of our algorithm include (i) a new way to produce and generalise the extraction patterns (ii) a new filtering criterion to remove noisy patterns (iii) a comparison of two ranking measures for determining the most reliable candidate NEs. Next, we present our methodology to exploit Wikipedia structure to automatically develop an Arabic NE annotated corpus. A novel mechanism is introduced, based on the high coverage of Wikipedia, in order to address two challenges particular to tagging NEs in Arabic text: rich morphology and the absence of capitalisation. Neither technique has yet achieved performance levels comparable to those of supervised methods. Semi-supervised algorithms tend to have high precision but comparatively low recall, whereas distant learning tends to achieve higher recall but lower precision. Therefore, we present a novel approach to Arabic NER using a combination of semi-supervised and distant learning techniques. We used a variety of classifier combination schemes, including the Bayesian Classifier Combination (BCC) procedure, recently proposed for sentiment analysis. 
According to our results, the BCC model leads to an increase in performance of 8 percentage points over the best minimally-supervised classifier