    When Does a Mixture of Products Contain a Product of Mixtures?

    We derive relations between theoretical properties of restricted Boltzmann machines (RBMs), popular machine learning models which form the building blocks of deep learning models, and several natural notions from discrete mathematics and convex geometry. We give implications and equivalences relating RBM-representable probability distributions, perfectly reconstructible inputs, Hamming modes, zonotopes and zonosets, point configurations in hyperplane arrangements, linear threshold codes, and multi-covering numbers of hypercubes. As a motivating application, we prove results on the relative representational power of mixtures of product distributions and products of mixtures of pairs of product distributions (RBMs) that formally justify widely held intuitions about distributed representations. In particular, we show that a mixture of products requiring an exponentially larger number of parameters is needed to represent the probability distributions which can be obtained as products of mixtures. Comment: 32 pages, 6 figures, 2 tables
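
    The phrase "products of mixtures" can be checked numerically on a tiny binary RBM. The sketch below is not taken from the paper; it assumes the standard RBM energy with weights W, visible biases b and hidden biases c (all values made up for illustration). It compares the unnormalised marginal of a visible vector obtained by summing over every hidden state with the factorised form, in which each hidden unit contributes one two-term mixture factor.

```python
import math
from itertools import product

def unnorm_by_sum(v, W, b, c):
    """Unnormalised p(v): explicit sum over all 2^m hidden states."""
    n, m = len(v), len(c)
    total = 0.0
    for h in product((0, 1), repeat=m):
        energy = sum(b[i] * v[i] for i in range(n))
        energy += sum(c[j] * h[j] for j in range(m))
        energy += sum(v[i] * W[i][j] * h[j] for i in range(n) for j in range(m))
        total += math.exp(energy)
    return total

def unnorm_by_product(v, W, b, c):
    """Unnormalised p(v): product over hidden units of a two-term mixture."""
    n, m = len(v), len(c)
    value = math.exp(sum(b[i] * v[i] for i in range(n)))
    for j in range(m):
        # Hidden unit j is either off (term 1) or on (exponential term).
        value *= 1.0 + math.exp(c[j] + sum(v[i] * W[i][j] for i in range(n)))
    return value

# Hypothetical parameters: 3 visible units, 2 hidden units.
W = [[0.5, -1.0], [1.5, 0.25], [-0.75, 2.0]]
b = [0.1, -0.2, 0.3]
c = [-0.5, 0.4]

agree = all(
    math.isclose(unnorm_by_sum(v, W, b, c), unnorm_by_product(v, W, b, c), rel_tol=1e-9)
    for v in product((0, 1), repeat=3)
)
```

    The explicit sum has 2^m terms while the factorised form has only m factors, which gives a rough feel for why a flat mixture of products may need many more components.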

    Exact Classification with Two-Layer Neural Nets

    This paper considers the classification properties of two-layer networks of McCulloch–Pitts units from a theoretical point of view. In particular we consider their ability to realise exactly, as opposed to approximate, bounded decision regions in R^2. The main result shows that a two-layer network can realise exactly any finite union of bounded polyhedra in R^2 whose bounding lines lie in general position, except for some well-characterised exceptions. The exceptions are those unions whose boundaries contain a line which is “inconsistent,” as described in the text. Some of the results are valid for R^n, n ⩾ 2, and the problem of generalising the main result to higher-dimensional situations is discussed
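
    As a minimal illustration of exact realisation, the sketch below covers only the simplest case, a single convex polygon, and is not the paper's construction for unions of polyhedra. Each hidden McCulloch–Pitts unit tests one bounding half-plane and the output unit fires only when all hidden units fire. The triangle and the names `HIDDEN` and `in_region` are assumptions made for the example.

```python
from typing import List, Tuple

def step(x: float) -> int:
    """McCulloch-Pitts activation: output 1 when the net input is >= 0."""
    return 1 if x >= 0 else 0

# One hidden unit per bounding line, encoded as (a, b, c) for a*x + b*y + c >= 0.
# Closed triangle with vertices (0,0), (1,0), (0,1).
HIDDEN: List[Tuple[float, float, float]] = [
    (1.0, 0.0, 0.0),    # x >= 0
    (0.0, 1.0, 0.0),    # y >= 0
    (-1.0, -1.0, 1.0),  # x + y <= 1
]

def in_region(x: float, y: float) -> int:
    """Two-layer threshold network: hidden half-plane tests, then a logical AND."""
    hidden_out = [step(a * x + b * y + c) for a, b, c in HIDDEN]
    # Output unit: unit weights and threshold len(HIDDEN) implement AND.
    return step(sum(hidden_out) - len(HIDDEN))
```

    For example, `in_region(0.25, 0.25)` evaluates to 1 and `in_region(1.0, 1.0)` to 0; points on the boundary count as inside because the step function fires at exactly zero.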

    Integrating topological features to enhance cardiac disease diagnosis from 3D CMR images

    Bachelor's thesis in Mathematics, Faculty of Mathematics, Universitat de Barcelona, Year: 2023, Supervisors: Carles Casacuberta and Polyxeni Gkontra. Persistent homology is a technique from the field of algebraic topology for the analysis and characterization of the shape and structure of datasets in multiple dimensions. Its use is based on the identification and quantification of topological patterns in the dataset across various scales. In this thesis, persistent homology is applied with the objective of extracting topological descriptors from three-dimensional cardiovascular magnetic resonance (CMR) imaging. Thereafter, topological descriptors are used for the detection of cardiovascular diseases by means of Machine Learning (ML) techniques. Radiomics has been one of the recently proposed approaches for disease diagnosis. This method involves the extraction and subsequent analysis of a significant number of quantitative descriptors from medical images. These descriptors offer a characterization of the spatial distribution, texture, and intensity of the structures present in the images. This study demonstrates that radiomics and topological descriptors achieve comparable results, providing complementary insights into the underlying structures and characteristics of anatomical tissues. Moreover, the combination of these two methods leads to a further improvement of the performance of ML models, thereby enhancing medical diagnosis
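
    To make "topological patterns across various scales" concrete, the sketch below computes 0-dimensional persistence (connected components) for a small point cloud. It is an illustration only, not the thesis pipeline: it assumes a distance-threshold (Vietoris–Rips style) filtration on points rather than on image voxels, and the function name `zero_dim_persistence` is hypothetical. Every point is born at scale 0, and a component dies at the edge length that merges it into another one.

```python
import math
from itertools import combinations

def zero_dim_persistence(points):
    """Return (birth, death) pairs for connected components of a point cloud."""
    n = len(points)
    if n == 0:
        return []
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process all pairwise edges in order of increasing length.
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    pairs = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj               # two components merge at this scale
            pairs.append((0.0, length))   # one of them dies here
    pairs.append((0.0, math.inf))         # the last component never dies
    return pairs

diagram = zero_dim_persistence([(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)])
```

    Long-lived pairs correspond to well-separated clusters; summary statistics of such pairs are one common way to turn a persistence diagram into features for an ML model.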

    Guaranteeing generalisation in neural networks

    Neural networks need to be able to guarantee their intrinsic generalisation abilities if they are to be used reliably. Mitchell's concept and version spaces technique is able to guarantee generalisation in the symbolic concept-learning environment in which it is implemented. Generalisation, according to Mitchell, is guaranteed when there is no alternative concept that is consistent with all the examples presented so far, except the current concept, given the bias of the user. A form of bidirectional convergence is used by Mitchell to recognise when the no-alternative situation has been reached. Mitchell's technique has problems of search and storage feasibility in its symbolic environment. This thesis aims to show that by evolving the technique further in a neural environment, these problems can be overcome. Firstly, the biasing factors which affect the kind of concept that can be learned are explored in a neural network context. Secondly, approaches for abstracting the underlying features of the symbolic technique that enable recognition of the no-alternative situation are discussed. The discussion generates neural techniques for guaranteeing generalisation and culminates in a neural technique which is able to recognise when the best fit neural weight state has been found for a given set of data and topology


    Applications based on three-dimensional object models are today very common, and can be found in many fields, such as design, archeology, medicine, and entertainment. A digital 3D model can be obtained by means of physical object measurements performed by using a 3D scanner. In this approach, an important step of the 3D model building process consists of creating the object's surface representation from a cloud of noisy points sampled on the object itself. This process can be viewed as the estimation of a function from a finite subset of its points. Both in statistics and machine learning this is known as a regression problem. Machine learning views the function estimation as a learning problem to be addressed by using computational intelligence techniques: the points represent a set of examples and the surface to be reconstructed represents the law that has generated them. On the other hand, in many applications the cloud of sampled points may become available only progressively during system operation. The conventional approaches to regression are therefore not suited to deal efficiently with this operating condition. The aim of the thesis is to introduce innovative approaches to the regression problem suited for achieving high reconstruction accuracy, while limiting the computational complexity, and appropriate for online operation. Two classical computational intelligence paradigms have been considered as basic tools to address the regression problem, namely Radial Basis Functions and Support Vector Machines. The original and innovative aspect introduced by this thesis is the extension of these tools toward a multi-scale incremental structure, based on hierarchical schemes and suited for online operation. This allows for obtaining modular, scalable, accurate and efficient modeling procedures with training algorithms appropriate for dealing with online learning. Radial Basis Function Networks have a fast configuration procedure that, operating locally, does not require iterative algorithms. On the other hand, the computational complexity of the configuration procedure of Support Vector Machines is independent of the number of input variables. These two approaches have been considered in order to analyze the advantages and limits of each of them due to the differences in their intrinsic nature
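
    The idea of a multi-scale hierarchy with a local, non-iterative configuration can be sketched in one dimension. This is a rough illustration under stated assumptions, not the thesis algorithm: Gaussian units sit on a regular grid, each unit's weight is a Gaussian-weighted mean of the nearby residuals, the output is normalised, and each new layer halves the width and fits what the previous layers left over. All function names and parameters are made up for the example.

```python
import math

def gaussian(x, c, sigma):
    return math.exp(-((x - c) / sigma) ** 2)

def make_centers(lo, hi, spacing):
    """Regular grid of unit centres covering [lo, hi]."""
    count = int(math.floor((hi - lo) / spacing)) + 1
    return [lo + i * spacing for i in range(count)]

def fit_layer(xs, residuals, centers, sigma):
    """Local weight estimate: Gaussian-weighted mean of residuals near each centre."""
    weights = []
    for c in centers:
        k = [gaussian(x, c, sigma) for x in xs]
        total = sum(k)
        w = sum(ki * r for ki, r in zip(k, residuals)) / total if total > 1e-12 else 0.0
        weights.append(w)
    return weights

def eval_layer(x, centers, weights, sigma):
    """Normalised RBF output of one layer at x."""
    k = [gaussian(x, c, sigma) for c in centers]
    total = sum(k)
    return sum(ki * w for ki, w in zip(k, weights)) / total if total > 1e-12 else 0.0

def fit_hierarchy(xs, ys, sigma0, n_layers):
    """Stack layers of halving width, each fitted to the current residual."""
    lo, hi = min(xs), max(xs)
    residuals = list(ys)
    layers = []
    for level in range(n_layers):
        sigma = sigma0 / (2 ** level)
        centers = make_centers(lo, hi, sigma)
        weights = fit_layer(xs, residuals, centers, sigma)
        layers.append((centers, weights, sigma))
        residuals = [r - eval_layer(x, centers, weights, sigma)
                     for x, r in zip(xs, residuals)]
    return layers

def predict(layers, x):
    return sum(eval_layer(x, c, w, s) for c, w, s in layers)

def rms_error(layers, xs, ys):
    return math.sqrt(sum((predict(layers, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

# Toy data: samples of sin(x) on [0, 2*pi].
xs = [2 * math.pi * i / 39 for i in range(40)]
ys = [math.sin(x) for x in xs]
err_one = rms_error(fit_hierarchy(xs, ys, 2.0, 1), xs, ys)
err_four = rms_error(fit_hierarchy(xs, ys, 2.0, 4), xs, ys)
```

    On this toy data the four-layer model should fit the samples noticeably better than the single coarse layer, since each finer layer only has to correct the detail the coarser ones missed.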

    Feature Driven Learning Techniques for 3D Shape Segmentation

    Segmentation is a fundamental problem in 3D shape analysis and machine learning. The ability to partition a 3D shape into meaningful or functional parts is a vital ingredient of many downstream applications like shape matching, classification and retrieval. Early segmentation methods were based on approaches like fitting primitive shapes to parts or extracting segmentations from feature points. However, such methods had limited success on shapes with more complex geometry. Observing this, research began using geometric features to aid the segmentation, as certain features (e.g. Shape Diameter Function (SDF)) are less sensitive to complex geometry. This trend was also incorporated in the shift to set-wide segmentations, called co-segmentation, which provides a consistent segmentation throughout a shape dataset, meaning similar parts have the same segment identifier. The idea of co-segmentation is that a set of same class shapes (e.g. chairs) contain more information about the class than a single shape would, which could lead to an overall improvement to the segmentation of the individual shapes. Over the past decade many different approaches of co-segmentation have been explored covering supervised, unsupervised and even user-driven active learning. In each of the areas, there has been widely adopted use of geometric features to aid proposed segmentation algorithms, with each method typically using different combinations of features. The aim of this thesis is to explore these different areas of 3D shape segmentation, perform an analysis of the effectiveness of geometric features in these areas and tackle core issues that currently exist in the literature. Initially, we explore the area of unsupervised segmentation, specifically looking at co-segmentation, and perform an analysis of several different geometric features. Our analysis is intended to compare the different features in a single unsupervised pipeline to evaluate their usefulness and determine their strengths and weaknesses. Our analysis also includes several features that have not yet been explored in unsupervised segmentation but have been shown effective in other areas. Later, with the ever increasing popularity of deep learning, we explore the area of supervised segmentation and investigate the current state of Neural Network (NN) driven techniques. We specifically observe limitations in the current state-of-the-art and propose a novel Convolutional Neural Network (CNN) based method which operates on multi-scale geometric features to gain more information about the shapes being segmented. We also perform an evaluation of several different supervised segmentation methods using the same input features, but with varying complexity of model design. This is intended to see if the more complex models provide a significant performance increase. Lastly, we explore the user-driven area of active learning, to tackle the large amounts of inconsistencies in current ground truth segmentation, which are vital for most segmentation methods. Active learning has been used to great effect for ground truth generation in the past, so we present a novel active learning framework using deep learning and geometric features to assist the user in co-segmentation of a dataset. Our method emphasises segmentation accuracy while minimising user effort, providing an interactive visualisation for co-segmentation analysis and the application of automated optimisation tools. In this thesis we explore the effectiveness of different geometric features across varying segmentation tasks, providing an in-depth analysis and comparison of state-of-the-art methods

    Recognising and localising human actions

    Human action recognition in challenging video data is becoming an increasingly important research area. Given the growing number of cameras and robots pointing their lenses at humans, the need for automatic recognition of human actions arises, promising Google-style video search and automatic video summarisation/description. Furthermore, for any autonomous robotic system to interact with humans, it must first be able to understand and quickly react to human actions. Although the best action classification methods aggregate features from the entire video clip in which the action unfolds, this global representation may include irrelevant scene context and movements which are shared amongst multiple action classes. For example, a waving action may be performed whilst walking; however, if the walking movement appears in distinct action classes, then it should not be included in training a waving movement classifier. For this reason, we propose an action classification framework in which more discriminative action subvolumes are learned in a weakly supervised setting, owing to the difficulty of manually labelling massive video datasets. The learned models are used to simultaneously classify video clips and to localise actions to a given space-time subvolume. Each subvolume is cast as a bag-of-features (BoF) instance in a multiple-instance-learning framework, which in turn is used to learn its class membership. We demonstrate quantitatively that even with single fixed-sized subvolumes, the classification performance of our proposed algorithm is superior to our BoF baseline on the majority of performance measures, and shows promise for space-time action localisation on the most challenging video datasets. Exploiting spatio-temporal structure in the video should also improve results, just as deformable part models have proven highly successful in object recognition. However, whereas objects have clear boundaries which means we can easily define a ground truth for initialisation, 3D space-time actions are inherently ambiguous and expensive to annotate in large datasets. Thus, it is desirable to adapt pictorial star models to action datasets without location annotation, and to features invariant to changes in pose such as bag-of-features and Fisher vectors, rather than low-level HoG. Thus, we propose local deformable spatial bag-of-features (LDSBoF) in which local discriminative regions are split into a fixed grid of parts that are allowed to deform in both space and time at test-time. In our experimental evaluation we demonstrate that by using local, deformable space-time action parts, we are able to achieve very competitive classification performance, whilst being able to localise actions even in the most challenging video datasets. A recent trend in action recognition is towards larger and more challenging datasets, an increasing number of action classes and larger visual vocabularies. For the global classification of human action video clips, the bag-of-visual-words pipeline is currently the best performing. However, the strategies chosen to sample features and construct a visual vocabulary are critical to performance, in fact often dominating performance. Thus, we provide a critical evaluation of various approaches to building a vocabulary and show that good practices do have a significant impact. By subsampling and partitioning features strategically, we are able to achieve state-of-the-art results on 5 major action recognition datasets using relatively small visual vocabularies. Another promising approach to recognise human actions first encodes the action sequence via a generative dynamical model. However, using classical distances for their classification does not necessarily deliver good results. Therefore we propose a general framework for learning distance functions between dynamical models, given a training set of labelled videos. The optimal distance function is selected among a family of `pullback' ones, induced by a parametrised mapping of the space of models. We focus here on hidden Markov models and their model space, and show how pullback distance learning greatly improves action recognition performance with respect to base distances. Finally, the action classification systems that use a single global representation for each video clip are tailored for offline batch classification benchmarks. For human-robot interaction however, current systems fall short, either because they can only detect one human action per video frame, or because they assume the video is available ahead of time. In this work we propose an online human action detection system that can incrementally detect multiple concurrent space-time actions. In this way, it becomes possible to learn new action classes on-the-fly, allowing multiple people to actively teach and interact with a robot

    Pattern recognition methods applied to medical imaging: lung nodule detection in computed tomography images

    Lung cancer is one of the main public health issues in developed countries. The overall 5-year survival rate is only 10–16%, although the mortality rate among men in the United States has started to decrease by about 1.5% per year since 1991 and a similar trend for the male population has been observed in most European countries. By contrast, in the case of the female population, the survival rate is still decreasing, although a decline in the mortality of young women has been observed over the last decade. Approximately 70% of lung cancers are diagnosed at too advanced stages for the treatments to be effective. The five-year survival rate for early-stage lung cancers (stage I), which can reach 70%, is considerably higher than for cancers diagnosed at more advanced stages. Lung cancer most commonly manifests itself as non-calcified pulmonary nodules. CT has been shown to be the most sensitive imaging modality for the detection of small pulmonary nodules, particularly since the introduction of the multi-detector-row and helical CT technologies. Screening programs based on Low Dose Computed Tomography (LDCT) may be regarded as a promising technique for detecting small, early-stage lung cancers. The efficacy of screening programs based on CT in reducing the mortality rate for lung cancer has not been fully demonstrated yet, and experts hold differing and opposing opinions on this topic. However, the recent results obtained by the National Lung Screening Trial (NLST), involving 53,454 high risk patients, show a 20% reduction of mortality when the screening program was carried out with helical CT, rather than with a conventional chest X-ray. LDCT settings are currently recommended by the screening trial protocols. However, it is not trivial in this case to identify small pulmonary nodules, due to the noisier appearance of the images in low-dose CT with respect to standard-dose CT. Moreover, thin slices are generally used in screening programs, thus originating datasets of about 300–400 slices per study. Depending on the screening trial protocol they joined, radiologists can be asked to identify even very small lung nodules, which is a very difficult and time-consuming task. Lung nodules are rather spherical objects, characterized by very low CT values and/or low contrast. Nodules may have CT values in the same range as those of blood vessels, airway walls and pleura, and may be strongly connected to them. It has been demonstrated that a large percentage of nodules (20–35%) is actually missed in screening diagnoses. To support radiologists in the identification of early-stage pathological objects, about one decade ago, researchers started to develop CAD methods to be applied to CT examinations. Within this framework, two CAD sub-systems are proposed: CAD for internal nodules (CADI), devoted to the identification of small nodules embedded in the lung parenchyma, i.e. Internal Nodules (INs), and CADJP, devoted to the identification of nodules originating on the pleura surface, i.e. Juxta-Pleural Nodules (JPNs). As the training and validation sets may drastically influence the performance of a CAD system, the presented approaches have been trained, developed and tested on different datasets of CT scans (Lung Image Database Consortium (LIDC), ITALUNG-CT) and finally blindly validated on the ANODE09 dataset. The two CAD sub-systems are implemented in the ITK framework, an open source C++ framework for segmentation and registration of medical images, and the rendering of the obtained results is achieved using VTK, a freely available software system for 3D computer graphics, image processing and visualization. The Support Vector Machines (SVMs) are implemented in SVMLight. The two proposed approaches have been developed to detect solid nodules, since the number of Ground Glass Opacities (GGOs) contained in the available datasets has been considered too low. This thesis is structured as follows: in the first chapter the basic concepts about CT and lung anatomy are explained. The second chapter deals with CAD systems and their evaluation methods. In the third chapter the datasets used for this work are described. In chapter 4 the lung segmentation algorithm is explained in detail, and in chapters 5 and 6 the algorithms to detect internal and juxta-pleural candidates are discussed. In chapter 7 the reduction of false positive findings is explained. In chapter 8 results of the training and validation sessions are shown. Finally in the last chapter the conclusions are drawn