
    An analysis of learning in weightless neural systems

    This thesis brings together two strands of neural networks research, weightless systems and statistical learning theory, in an attempt to better understand the learning and generalisation abilities of a class of pattern-classifying machines. The machines under consideration are n-tuple classifiers. Although their analysis falls outside the domain of more widespread neural network methods, the n-tuple method has found considerable application since its first publication in 1959. The larger class of learning systems to which the n-tuple classifier belongs is known as the set of weightless or RAM-based systems, because they store all their modifiable information in the nodes rather than as weights on the connections. The analytical tools used are those of statistical learning theory. Learning methods and machines are considered in terms of a formal learning problem which allows the precise definition of terms such as learning and generalisation (in this context). Results are derived relating the generalisation error to the empirical error of the machine on the training set, the number of training examples, and the complexity of the machine (as measured by the Vapnik-Chervonenkis dimension). In the thesis this theoretical framework is applied for the first time to weightless systems in general and to n-tuple classifiers in particular. Novel theoretical results are used to inspire the design of related learning machines, and empirical tests are used to assess the power of these new machines. Data-independent theoretical results are also compared with data-dependent results to explain apparent anomalies in the n-tuple classifier's behaviour. The thesis takes an original approach to the study of weightless networks, one which gives new insights into their strengths as learning machines. It also allows a new family of learning machines to be introduced and a method for improving generalisation to be applied.
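    To make the mechanism concrete, the following is a minimal sketch, in Python, of the kind of RAM-based n-tuple classifier analysed in the thesis: each node observes a fixed random n-tuple of input bits, stores a single bit per address during training, and classification counts the responding nodes per class. The class name, parameter defaults, and random tuple-sampling scheme are illustrative assumptions, not the thesis's own implementation.

    import numpy as np

    class NTupleClassifier:
        """Minimal sketch of a RAM-based (weightless) n-tuple classifier.
        Inputs are binary vectors; all learned information lives in the RAM nodes."""

        def __init__(self, n_inputs, n_classes, tuple_size=4, n_tuples=100, seed=0):
            rng = np.random.default_rng(seed)
            # Each node observes a fixed random n-tuple of input bit positions.
            self.tuples = rng.integers(0, n_inputs, size=(n_tuples, tuple_size))
            # One binary RAM of 2**tuple_size cells per (class, node).
            self.ram = np.zeros((n_classes, n_tuples, 2 ** tuple_size), dtype=bool)
            self.weights = 2 ** np.arange(tuple_size)  # turns n bits into a RAM address

        def _addresses(self, x):
            # The n bits seen by each node form an integer address into its RAM.
            return x[self.tuples].astype(int) @ self.weights

        def fit(self, X, y):
            for x, label in zip(X, y):
                addr = self._addresses(x)
                # Training only sets bits in the class's RAMs; no weights are adjusted.
                self.ram[label, np.arange(len(addr)), addr] = True

        def predict(self, X):
            out = []
            for x in X:
                addr = self._addresses(x)
                # Score each class by how many of its nodes recognise the pattern.
                scores = self.ram[:, np.arange(len(addr)), addr].sum(axis=1)
                out.append(int(scores.argmax()))
            return np.array(out)

    # Usage (illustrative): clf = NTupleClassifier(n_inputs=784, n_classes=10)
    # clf.fit(X_binary_train, y_train); predictions = clf.predict(X_binary_test)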

    Probabilistic and Deep Learning Algorithms for the Analysis of Imagery Data

    Accurate object classification is a challenging problem for various low- to high-resolution imagery data. This applies to both natural and synthetic image datasets. However, each object recognition dataset poses its own distinct set of domain-specific problems. In order to address these issues, we need to devise intelligent learning algorithms which require a deep understanding and careful analysis of the feature space. In this thesis, we introduce three new learning frameworks for the analysis of both airborne images (the NAIP dataset) and handwritten digit datasets without and with noise (MNIST and n-MNIST respectively). First, we propose a probabilistic framework for the analysis of the NAIP dataset which includes (1) an unsupervised segmentation module based on the Statistical Region Merging algorithm, (2) a feature extraction module that extracts a set of standard hand-crafted texture features from the images, (3) a supervised classification algorithm based on Feedforward Backpropagation Neural Networks, and (4) a structured prediction framework using Conditional Random Fields that integrates the results of the segmentation and classification modules into a single composite model to generate the final class labels. Next, we introduce two new datasets, SAT-4 and SAT-6, sampled from the NAIP imagery and use them to evaluate a multitude of Deep Learning algorithms, including Deep Belief Networks (DBN), Convolutional Neural Networks (CNN) and Stacked Autoencoders (SAE), for generating class labels. Finally, we propose a learning framework that integrates hand-crafted texture features with a DBN. A DBN uses an unsupervised pre-training phase to initialize the parameters of a Feedforward Backpropagation Neural Network to a global error basin, which can then be improved with a round of supervised fine-tuning; these networks can subsequently be used for classification. We show that integrating hand-crafted features with a DBN significantly improves performance compared to traditional DBN models which take raw image pixels as input. We also investigate why this integration proves particularly useful for aerial datasets, using a statistical analysis based on the Distribution Separability Criterion. We then introduce a new dataset called noisy-MNIST (n-MNIST) by adding (1) additive white Gaussian noise (AWGN), (2) motion blur, and (3) reduced contrast combined with AWGN to the MNIST dataset, and present a learning algorithm that combines probabilistic quadtrees and Deep Belief Networks. This dynamic integration of the Deep Belief Network with the probabilistic quadtrees provides significant improvement over traditional DBN models on both the MNIST and the n-MNIST datasets. Finally, we extend our experiments on aerial imagery to the class of general texture images and present a theoretical analysis of Deep Neural Networks applied to texture classification. We derive the size of the feature space of textural features and the Vapnik-Chervonenkis dimension of certain classes of Neural Networks. We also derive some useful results on the intrinsic dimension and relative contrast of texture datasets and use these to highlight the differences between texture datasets and general object recognition datasets.
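    As an illustration of how an n-MNIST-style corpus can be generated, the sketch below applies the three corruptions named above (AWGN, motion blur, and reduced contrast combined with AWGN) to MNIST-like images using NumPy and SciPy. The noise level, blur length, and contrast factor are illustrative assumptions, not the settings used for the actual n-MNIST dataset.

    import numpy as np
    from scipy.ndimage import convolve

    def add_awgn(img, sigma=0.1, rng=None):
        """Additive white Gaussian noise; sigma is an illustrative choice."""
        rng = rng or np.random.default_rng(0)
        return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

    def motion_blur(img, length=5):
        """Horizontal motion blur via a 1 x length averaging kernel (assumed direction)."""
        kernel = np.ones((1, length)) / length
        return convolve(img, kernel, mode='nearest')

    def reduced_contrast_awgn(img, contrast=0.5, sigma=0.1, rng=None):
        """Shrink the dynamic range toward the image mean, then add AWGN."""
        low_contrast = img.mean() + contrast * (img - img.mean())
        return add_awgn(low_contrast, sigma=sigma, rng=rng)

    # Example: for each 28x28 float image x in [0, 1] drawn from MNIST,
    # the three n-MNIST-style variants would be
    # add_awgn(x), motion_blur(x), and reduced_contrast_awgn(x).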

    Ranking for Scalable Information Extraction

    Information extraction systems are complex software tools that discover structured information in natural language text. For instance, an information extraction system trained to extract tuples for an Occurs-in(Natural Disaster, Location) relation may extract the tuple ⟨tsunami, Hawaii⟩ from the sentence: "A tsunami swept the coast of Hawaii." Having information in structured form enables more sophisticated querying and data mining than what is possible over the natural language text. Unfortunately, information extraction is a time-consuming task. For example, a state-of-the-art information extraction system to extract Occurs-in tuples may take up to two hours to process only 1,000 text documents. Since document collections routinely contain millions of documents or more, improving the efficiency and scalability of the information extraction process over these collections is critical. As a significant step towards this goal, this dissertation presents approaches for (i) enabling the deployment of efficient information extraction systems and (ii) scaling the information extraction process to large volumes of text. To enable the deployment of efficient information extraction systems, we have developed two crucial building blocks for this task. As a first contribution, we have created REEL, a toolkit to easily implement, evaluate, and deploy full-fledged relation extraction systems. REEL, in contrast to existing toolkits, effectively modularizes the key components involved in relation extraction systems and can integrate other long-established text processing and machine learning toolkits. To define a relation extraction system for a new relation and text collection, users only need to specify the desired configuration, which makes REEL a powerful framework for both research and application building. As a second contribution, we have addressed the problem of building representative extraction task-specific document samples from collections, a step often required by approaches for efficient information extraction. Specifically, we devised fully automatic document sampling techniques for information extraction that can produce better-quality document samples than the state-of-the-art sampling strategies; furthermore, our techniques are substantially more efficient than the existing alternative approaches. To scale the information extraction process to large volumes of text, we have developed approaches that address the efficiency and scalability of the extraction process by focusing the extraction effort on the collections, documents, and sentences worth processing for a given extraction task. For collections, we have studied both (adaptations of) state-of-the-art approaches for estimating the number of documents in a collection that lead to the extraction of tuples and information extraction-specific approaches. Using these estimations we can identify the collections worth processing and ignore the rest, for efficiency. For documents, we have developed an adaptive document ranking approach that relies on learning-to-rank techniques to prioritize the documents that are likely to produce tuples for an extraction task of choice. Our approach revises the (learned) ranking decisions periodically as the extraction process progresses and new characteristics of the useful documents are revealed. Finally, for sentences, we have developed an approach based on the sparse group selection problem that identifies sentences, modeled as groups of words, that best characterize the extraction task.
Beyond identifying sentences worth processing, our approach aims at selecting sentences that lead to the extraction of unseen, novel tuples. Our approaches are lightweight and efficient, and dramatically improve the efficiency and scalability of the information extraction process. We can often complete the extraction task by focusing on just a very small fraction of the available text, namely, the text that contains relevant information for the extraction task at hand. Our approaches therefore constitute a substantial step towards efficient and scalable information extraction over large volumes of text.
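Whereas the dissertation's adaptive document ranking relies on learning-to-rank techniques, the sketch below substitutes a much simpler pointwise stand-in (a scikit-learn logistic regression over TF-IDF features) to illustrate the overall loop: rank the unprocessed documents, extract from the highest-ranked one, record whether it yielded tuples, and periodically re-learn the ranking from that feedback. The function names, features, and refresh schedule are illustrative assumptions, not the dissertation's actual system.

    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression

    def adaptive_extraction(docs, extract, budget=1000, refresh_every=100):
        """Illustrative pointwise stand-in for adaptive document ranking.
        `extract(text)` is an assumed callable returning the tuples found in a document."""
        vec = TfidfVectorizer(max_features=20000)
        X = vec.fit_transform(docs)
        scores = np.zeros(len(docs))          # start from an uninformed (uniform) ranking
        processed, labels, results = [], [], []

        for step in range(min(budget, len(docs))):
            remaining = [i for i in range(len(docs)) if i not in processed]
            i = max(remaining, key=lambda j: scores[j])   # highest-ranked unprocessed doc
            tuples = extract(docs[i])
            results.extend(tuples)
            processed.append(i)
            labels.append(1 if tuples else 0)             # did this document yield tuples?

            # Periodically revise the learned ranking with the feedback gathered so far.
            if (step + 1) % refresh_every == 0 and len(set(labels)) > 1:
                model = LogisticRegression(max_iter=1000)
                model.fit(X[processed], labels)
                scores = model.predict_proba(X)[:, 1]
        return results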