
    Coping with new Challenges in Clustering and Biomedical Imaging

    Recent years have seen a tremendous increase in data acquisition in scientific fields such as molecular biology, bioinformatics and biomedicine. Novel methods are therefore needed for the automatic processing and analysis of these large amounts of data. Data mining is the process of applying methods like clustering or classification to large databases in order to uncover hidden patterns. Clustering is the task of partitioning the points of a data set into distinct groups so as to maximize the intra-cluster similarity and minimize the inter-cluster similarity. In contrast to unsupervised learning tasks like clustering, classification is a supervised learning problem that aims to predict the group membership of data objects on the basis of rules learned from a training set in which the group membership is known. Specialized methods have been proposed for hierarchical and partitioning clustering; however, these methods suffer from several drawbacks.

    In the first part of this work, new clustering methods are proposed that cope with the problems of conventional clustering algorithms. ITCH (Information-Theoretic Cluster Hierarchies) is a hierarchical clustering method based on a hierarchical variant of the Minimum Description Length (MDL) principle, which finds hierarchies of clusters without requiring input parameters. As ITCH may converge only to a local optimum, we propose GACH (Genetic Algorithm for Finding Cluster Hierarchies), which combines the benefits of genetic algorithms with information theory and thereby explores the search space more effectively. Furthermore, we propose INTEGRATE, a novel clustering method for data with mixed numerical and categorical attributes. Supported by the MDL principle, our method integrates the information provided by heterogeneous numerical and categorical attributes and thus naturally balances the influence of both sources of information. A competitive evaluation illustrates that INTEGRATE is more effective than existing clustering methods for mixed-type data. Besides clustering methods for single data objects, we provide a solution for clustering different data sets that are represented by their skylines. The skyline operator is a well-established database primitive for finding database objects which minimize two or more attributes with an unknown weighting between these attributes. In this thesis, we define a similarity measure, called SkyDist, for comparing the skylines of different data sets, which can be integrated directly into data mining tasks such as clustering or classification. The experiments show that SkyDist, in combination with different clustering algorithms, can give useful insights in many applications.

    In the second part, we focus on the analysis of high-resolution magnetic resonance images (MRI), which are clinically relevant and may allow for the early detection and diagnosis of several diseases. In particular, we propose a framework for the classification of Alzheimer's disease in MR images that combines the data mining steps of feature selection, clustering and classification. As a result, a set of highly selective features discriminating between Alzheimer's patients and healthy people has been identified. However, the analysis of the high-dimensional MR images is extremely time-consuming. Therefore, we developed JGrid, a scalable distributed computing solution designed to allow for large-scale analysis of MRI and thus an optimized prediction of diagnosis.
    In another study, we apply efficient motif discovery algorithms to task-fMRI scans in order to identify patterns in the brain that are characteristic of patients with somatoform pain disorder. We find groups of brain compartments that occur frequently within the brain networks and discriminate well between healthy and diseased people.
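
    As an aside, the skyline operator mentioned above has a compact definition: a point belongs to the skyline if no other point dominates it, i.e. is at least as good in every attribute and strictly better in at least one. SkyDist itself is defined in the thesis and is not reproduced here; the following is only a minimal Python sketch of the skyline computation, with invented example data.

        import numpy as np

        def skyline(points: np.ndarray) -> np.ndarray:
            """Return the skyline of `points` (rows = objects, columns =
            attributes to be minimized): all points not dominated by any
            other point. A point q dominates p if q <= p in every attribute
            and q < p in at least one."""
            keep = []
            for i, p in enumerate(points):
                dominated = any(
                    j != i and np.all(q <= p) and np.any(q < p)
                    for j, q in enumerate(points)
                )
                if not dominated:
                    keep.append(i)
            return points[keep]

        # Example: hotels described by (price, distance to beach); the skyline
        # keeps every hotel for which no other is both cheaper and closer.
        hotels = np.array([[50, 8], [80, 2], [60, 5], [90, 1], [70, 6]])
        print(skyline(hotels))  # [[50 8] [80 2] [60 5] [90 1]]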

    Proceedings of the Third International Workshop on Mathematical Foundations of Computational Anatomy - Geometrical and Statistical Methods for Modelling Biological Shape Variability

    Computational anatomy is an emerging discipline at the interface of geometry, statistics and image analysis which aims at modeling and analyzing the biological shape of tissues and organs. The goal is to estimate representative organ anatomies across diseases, populations, species or ages, to model organ development across time (growth or aging), to establish their variability, and to correlate this variability with other functional, genetic or structural information. The Mathematical Foundations of Computational Anatomy (MFCA) workshop aims at fostering interactions between the mathematical community around shapes and the MICCAI community in view of computational anatomy applications. It particularly targets researchers investigating the combination of statistical and geometrical aspects in the modeling of the variability of biological shapes. The workshop is a forum for the exchange of theoretical ideas and aims at being a source of inspiration for new methodological developments in computational anatomy. A special emphasis is put on theoretical developments; applications and results are welcome as illustrations. Following the successful first edition of this workshop in 2006 and the second edition in New York in 2008, the third edition was held in Toronto on September 22, 2011. Contributions were solicited in Riemannian and group-theoretical methods, geometric measurements of the anatomy, advanced statistics on deformations and shapes, metrics for computational anatomy, statistics of surfaces, and modeling of growth and longitudinal shape changes. 22 submissions were each reviewed by three members of the program committee. To guarantee a high-level program, only 11 papers were selected for oral presentation in 4 sessions. Two of these sessions group classical themes of the workshop: statistics on manifolds and diffeomorphisms for surface or longitudinal registration. One session gathers papers exploring new mathematical structures beyond Riemannian geometry, while the last oral session deals with the emerging theme of statistics on graphs and trees. Finally, a poster session of 5 papers addresses more application-oriented work on computational anatomy.

    Gaze-Based Human-Robot Interaction by the Brunswick Model

    We present a new paradigm for human-robot interaction based on social signal processing, and in particular on the Brunswick model. Originally, the Brunswick model describes face-to-face dyadic interaction, assuming that the interactants communicate through a continuous exchange of non-verbal social signals in addition to the spoken messages. Social signals have to be interpreted through a proper recognition phase that considers visual and audio information. The Brunswick model allows one to quantitatively evaluate the quality of the interaction using statistical tools that measure how effective the recognition phase is. In this paper we recast this theory for the case where one of the interactants is a robot; the recognition phases performed by the robot and by the human then have to be revised with respect to the original model. The model is applied to Berrick, a recent open-source, low-cost robotic head platform, where gaze is the social signal under consideration.
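
    The abstract does not spell out the statistical tools, so the sketch below only illustrates the kind of correlation statistics classically associated with Brunswick's lens model: how faithfully a cue (e.g. gaze) externalizes the sender's internal state, and how well the perceiver recovers that state from the cue. All data are synthetic placeholders, not from the paper.

        import numpy as np

        rng = np.random.default_rng(0)
        n = 200
        true_state = rng.normal(size=n)                    # sender's internal state
        cue = true_state + rng.normal(scale=0.5, size=n)   # externalized cue (e.g. gaze)
        judgment = cue + rng.normal(scale=0.5, size=n)     # perceiver's reading of the cue

        ecological_validity = np.corrcoef(true_state, cue)[0, 1]     # cue encodes the state
        cue_utilization = np.corrcoef(cue, judgment)[0, 1]           # perceiver uses the cue
        achievement = np.corrcoef(true_state, judgment)[0, 1]        # overall recognition quality
        print(ecological_validity, cue_utilization, achievement)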

    Harnessing Spatial Intensity Fluctuations for Optical Imaging and Sensing

    Properties of light such as amplitude and phase, temporal and spatial coherence, and polarization are abundantly used for sensing and imaging. Regardless of the passive or active nature of the sensing method, optical intensity fluctuations are always present! While these fluctuations are usually regarded as noise, there are situations where one can harness them to enhance certain attributes of the sensing procedure. In this thesis, we developed different sensing methodologies that use statistical properties of optical fluctuations to extract specific information. We examine this concept in the context of three different aspects of computational optical imaging and sensing. First, we study imposing specific statistical properties on the probing field in order to image or characterize certain properties of an object through a statistical analysis of the spatially integrated scattered intensity. This offers unique capabilities for imaging and sensing techniques operating in highly perturbed environments and low-light conditions. Next, we examine optical sensing in the presence of strong perturbations that preclude any controllable field modification. We demonstrate that inherent properties of diffused coherent fields and fluctuations of integrated intensity can be used to track objects hidden behind obscurants. Finally, we address situations where, due to coherent noise, image accuracy is severely degraded by intensity fluctuations. By taking advantage of the spatial coherence properties of optical fields, we show that this limitation can be effectively mitigated and that a significant improvement in the signal-to-noise ratio can be achieved even in a single-shot measurement. The findings included in this dissertation illustrate different circumstances where optical fluctuations can affect the efficacy of computational optical imaging and sensing. A broad range of applications, including biomedical imaging and remote sensing, could benefit from the approaches described in this dissertation to suppress, enhance, and exploit optical fluctuations.
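
    As one illustration of a fluctuation statistic of this kind (not necessarily the estimator used in the thesis), the contrast (standard deviation over mean) of the spatially integrated intensity of a fully developed speckle pattern decreases as more independent speckle grains are averaged. The sketch below simulates this with synthetic circular complex Gaussian fields.

        import numpy as np

        rng = np.random.default_rng(1)

        def integrated_intensity_contrast(n_grains: int, n_trials: int = 5000) -> float:
            """Contrast of the integrated intensity over n_grains independent
            speckle grains, each modeled as a circular complex Gaussian field."""
            field = (rng.normal(size=(n_trials, n_grains))
                     + 1j * rng.normal(size=(n_trials, n_grains)))
            total = (np.abs(field) ** 2).sum(axis=1)  # spatially integrated intensity
            return total.std() / total.mean()

        for n in (1, 4, 16, 64):
            print(n, integrated_intensity_contrast(n))  # contrast ~ 1/sqrt(n)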

    Machine Learning Approaches and Web-Based System to the Application of Disease Modifying Therapy for Sickle Cell

    Sickle cell disease (SCD) is a common and serious genetic disease with a severe impact due to red blood cell (RBC) abnormalities. According to the World Health Organisation, 7 million newborn babies each year suffer either from a congenital anomaly or from an inherited disease, primarily thalassemia and sickle cell disease. In the case of SCD, recent research has shown the beneficial effects of the drug hydroxyurea/hydroxycarbamide in modifying the disease phenotype. The clinical management of this disease-modifying therapy is difficult and time-consuming for clinical staff; it requires an optimal classifier that can cope with missing values, multi-class datasets, and feature selection. For the classification and discriminant analysis of the SCD datasets, 7 machine learning classifiers representing linear and non-linear methods were selected. Running each of these classifiers as a single model showed that a single classifier can provide effective outcomes in terms of the classification performance metrics. To produce an optimal outcome, this research proposed and designed combined (ensemble) classifiers built from neural network models, the random forest classifier, and the K-nearest-neighbour classifier. In this respect, combining the Levenberg-Marquardt algorithm, the voted perceptron classifier, the radial basis function neural classifier, and the random forest classifier obtained the highest performance and accuracy, and this ensemble classifier achieved better results on both the training and testing sets. Recent technology advances based on smart devices have improved medical facilities and become increasingly popular for real-time health monitoring and remote/personal health care. A web-based system was developed under the supervision of the haematology specialist at Alder Hey Children's Hospital in order to provide an effective and useful system for both patients and clinicians. To sum up, the simulation experiments suggest that machine learning and the web-based system platform represent an alternative procedure that could assist healthcare professionals, particularly specialist nurses and junior doctors, in improving the quality of care for patients with sickle cell disorder.
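
    A minimal sketch of the ensemble idea follows, using scikit-learn stand-ins. The exact models in the thesis (Levenberg-Marquardt training, voted perceptron, radial basis network) are not all available off the shelf, so an MLP approximates the neural component here, and the data are synthetic.

        from sklearn.datasets import make_classification
        from sklearn.ensemble import RandomForestClassifier, VotingClassifier
        from sklearn.model_selection import train_test_split
        from sklearn.neighbors import KNeighborsClassifier
        from sklearn.neural_network import MLPClassifier

        # Synthetic multi-class data standing in for the SCD datasets.
        X, y = make_classification(n_samples=500, n_features=20, n_classes=3,
                                   n_informative=8, random_state=0)
        X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

        # Combine a neural network, random forest and KNN by soft voting
        # (averaging predicted class probabilities).
        ensemble = VotingClassifier(
            estimators=[
                ("mlp", MLPClassifier(max_iter=1000, random_state=0)),
                ("rf", RandomForestClassifier(random_state=0)),
                ("knn", KNeighborsClassifier(n_neighbors=5)),
            ],
            voting="soft",
        )
        ensemble.fit(X_tr, y_tr)
        print("test accuracy:", ensemble.score(X_te, y_te))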

    Using MapReduce Streaming for Distributed Life Simulation on the Cloud

    Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome this complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway's Life according to a general MR streaming pattern. We chose Life because it is simple enough to serve as a testbed for MR's applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms' performance on Amazon's Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
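
    As a sketch of the MR streaming pattern (not the paper's optimized strip-partitioning algorithm, which groups rows into strips to reduce shuffle volume), one generation of Conway's Life can be expressed as a map step that scatters neighbour counts and a reduce step that applies the rules. In Hadoop Streaming the two functions would run as separate stdin/stdout scripts; the local driver below simulates the shuffle phase.

        from itertools import groupby

        def map_cell(line):
            """Map: for a live cell 'x,y', emit an 'alive' marker for the cell
            itself and a neighbour-count contribution to each of its 8 neighbours."""
            x, y = map(int, line.split(","))
            yield f"{x},{y}", "A"
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    if dx or dy:
                        yield f"{x + dx},{y + dy}", "1"

        def reduce_cell(key, values):
            """Reduce: apply Conway's rules to one cell given all its records."""
            alive = "A" in values
            neighbours = sum(v == "1" for v in values)
            if neighbours == 3 or (neighbours == 2 and alive):
                yield key  # the cell is live in the next generation

        # Local driver simulating the shuffle/sort phase on a blinker oscillator.
        live = ["1,0", "1,1", "1,2"]
        records = sorted(kv for line in live for kv in map_cell(line))
        for key, group in groupby(records, key=lambda kv: kv[0]):
            for cell in reduce_cell(key, [v for _, v in group]):
                print(cell)  # prints: 0,1  1,1  2,1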

    Object Recognition

    Vision-based object recognition tasks are very familiar in our everyday activities, such as driving our car in the correct lane, and we perform them effortlessly in real time. In recent decades, with the advancement of computer technology, researchers and application developers have been trying to mimic the human capability of visual recognition. Such a capability would allow machines to free humans from boring or dangerous jobs.