68 research outputs found

    Median topographic maps for biomedical data sets

    Median clustering extends popular neural data analysis methods such as the self-organizing map or neural gas to general data structures given by a dissimilarity matrix only. This offers flexible and robust global data inspection methods which are particularly suited to the variety of data that occurs in biomedical domains. In this chapter, we give an overview of median clustering and its properties and extensions, with a particular focus on efficient implementations adapted to large-scale data analysis.
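    As an illustration of the idea of clustering from a dissimilarity matrix only, the following is a minimal sketch of median k-means, the simplest relative of the median self-organizing map and median neural gas (it omits their neighborhood cooperation). It is not the chapter's implementation: the function name `median_kmeans`, its parameters, and the choice to pick each prototype among its own cluster members are assumptions made for this example.

```python
import numpy as np

def median_kmeans(D, k, n_iter=20, seed=0):
    """Batch median clustering on an (n, n) dissimilarity matrix D.

    Prototypes are restricted to data points and stored as row indices,
    so only pairwise dissimilarities are needed, never coordinates.
    (Hypothetical helper written for illustration.)
    """
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    prototypes = rng.choice(n, size=k, replace=False)
    for _ in range(n_iter):
        # Assign every point to its closest prototype.
        labels = np.argmin(D[:, prototypes], axis=1)
        new_prototypes = prototypes.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size == 0:
                continue  # keep the old prototype for an empty cluster
            # Generalized median: the member with the smallest summed
            # dissimilarity to all members of its cluster.
            costs = D[np.ix_(members, members)].sum(axis=1)
            new_prototypes[j] = members[np.argmin(costs)]
        if np.array_equal(new_prototypes, prototypes):
            break  # converged
        prototypes = new_prototypes
    labels = np.argmin(D[:, prototypes], axis=1)
    return prototypes, labels

# Usage on a toy example: two well-separated groups on a line.
points = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
D = np.abs(points[:, None] - points[None, :])
prototypes, labels = median_kmeans(D, k=2)
```

    Because the update only reads entries of `D`, the same sketch applies to any symmetric dissimilarity measure, for instance alignment scores between sequences.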

    Stratification of patient subgroups using high-dimensional and time-series observations

    Precision medicine and patient stratification are expanding as a result of innovations in high-throughput technologies applied to clinical medicine. Stratification can explain differences in disease trajectories and outcomes in heterogeneous cohorts. Thus, approaches employed for patient treatment can be tailored by taking into account individual variabilities and specificities. This thesis focuses on clustering approaches and how they can be applied to both single-time-point and time-series high-dimensional data for the identification of disease subtypes defined by distinct mechanisms, also called endotypes, in complex and/or heterogeneous diseases. Multiple carefully selected clustering strategies were compared to highlight which would produce the most relevant stratification in terms of mathematical robustness and biological meaning, both of which were quantified using standardised methods. More specifically, this strategy was applied to time-series multi-omics data from a cohort of patients with acute pancreatitis, an inflammatory disease of the pancreas. Using this high-dimensional multi-omics data as well as routine lab and clinical measurements, the cohort was stratified into four subgroups. Findings from the analysis of acute pancreatitis data showed that two of the four subgroups could be detected in another syndrome, acute respiratory distress syndrome, suggesting that inflammatory signatures are comparable between diseases. With the aim of applying these principles to other diseases, and building on preliminary results from other studies suggesting that relevant subgroups might be highlighted, data from inflammatory bowel disease and Parkinson's disease cohorts were analysed. Results from our analyses confirmed that disease knowledge could be gained using this approach. Work from this thesis provides novel approaches for the application and evaluation of stratification methods.
Furthermore, results may constitute a basis for the development of tailored treatment approaches for acute pancreatitis, acute respiratory distress syndrome, inflammatory bowel disease and Parkinson’s disease. Also, the observation of commonalities between distinct inflammatory diseases will broaden the perspectives when analysing disease data and, more specifically, in biomarker discovery and drug development processes.

    Projection-Based Clustering through Self-Organization and Swarm Intelligence

    It covers aspects of unsupervised machine learning used for knowledge discovery in data science and introduces a data-driven approach to cluster analysis, the Databionic swarm (DBS). DBS consists of the 3D landscape visualization and clustering of data. The 3D landscape enables 3D printing of high-dimensional data structures. The clustering and the number of clusters, or an absence of cluster structure, are verified by the 3D landscape at a glance. DBS is the first swarm-based technique that shows emergent properties while exploiting concepts of swarm intelligence, self-organization and the Nash equilibrium concept from game theory. It results in the elimination of a global objective function and the setting of parameters. By downloading the R package, DBS can be applied to data drawn from diverse research fields and used even by non-professionals in the field of data mining.

    Clustering and its Application in Requirements Engineering

    Large-scale software systems challenge almost every activity in the software development life-cycle, including tasks related to eliciting, analyzing, and specifying requirements. Fortunately, many of these complexities can be addressed through clustering the requirements in order to create abstractions that are meaningful to human stakeholders. For example, the requirements elicitation process can be supported through dynamically clustering incoming stakeholders’ requests into themes. Cross-cutting concerns, which have a significant impact on the architectural design, can be identified through the use of fuzzy clustering techniques and metrics designed to detect when a theme cross-cuts the dominant decomposition of the system. Finally, traceability techniques, required in critical software projects by many regulatory bodies, can be automated and enhanced by the use of cluster-based information retrieval methods. Unfortunately, despite a significant body of work describing document clustering techniques, there is almost no prior work which directly addresses the challenges, constraints, and nuances of requirements clustering. As a result, the effectiveness of software engineering tools and processes that depend on requirements clustering is severely limited. This report directly addresses the problem of clustering requirements through surveying standard clustering techniques and discussing their application to the requirements clustering process.

    Projection-Based Clustering through Self-Organization and Swarm Intelligence: Combining Cluster Analysis with the Visualization of High-Dimensional Data

    Cluster Analysis; Dimensionality Reduction; Swarm Intelligence; Visualization; Unsupervised Machine Learning; Data Science; Knowledge Discovery; 3D Printing; Self-Organization; Emergence; Game Theory; Advanced Analytics; High-Dimensional Data; Multivariate Data; Analysis of Structured Dat

    Relational data clustering algorithms with biomedical applications


    Enhanced Learning Strategies for Tactile Shape Estimation and Grasp Planning of Unknown Objects

    Grasping is one of the key capabilities for a robot operating and interacting with humans in a real environment. The conventional approaches require accurate information on both object shape and robotic system modeling. The performance, therefore, can be easily influenced by any noisy sensor data or modeling errors. Moreover, identifying the shape of an unknown object under some vision-denied conditions is still a challenging problem in the robotics field. To address this issue, this thesis investigates the estimation of unknown object shape using tactile exploration and the task-oriented grasp planning for a novel object using enhanced learning techniques. In order to rapidly estimate the shape of an unknown object, this thesis presents a novel multi-fidelity-based optimal sampling method which attempts to improve the existing shape estimation via tactile exploration. Gaussian process regression is used for implicit surface modeling with a sequential sampling strategy. The main objective is to make the process of sample point selection more efficient and systematic such that the unknown shape can be estimated quickly and accurately with highly limited sample points (e.g., less than 1% of the data set for the true shape). Specifically, we propose to select the next best sample point based on two optimization criteria: 1) the mutual information (MI) for uncertainty reduction, and 2) the local curvature for fidelity enhancement. The combination of these two objectives leads to an optimal sampling process that balances the exploration of the whole shape against the exploitation of the local area where higher fidelity (or more sampling) is required. Simulation and experimental results successfully demonstrate the advantage of the proposed method in terms of estimation speed and accuracy over the conventional one, which allows us to reconstruct recognizable 3D shapes using only around 0.4% of the original data set, optimally selected.
With the available object shape, this thesis also introduces a knowledge-based approach to quickly generate a task-oriented grasp for a novel object. A comprehensive training dataset which consists of specific tasks and geometrical and physical knowledge of grasping is built up from physical experiments. To analyze and efficiently utilize the training data, a multi-step clustering algorithm is developed based on a self-organizing map. A number of representative grasps are then selected from the entire training dataset and used to generate a suitable grasp for a novel object. The number of representative grasps is automatically determined using the proposed auto-growing method. In addition, to improve the accuracy and efficiency of the proposed clustering algorithm, we also develop a novel method to localize the initial centroids while capturing the outliers. The simulation results illustrate that the proposed initialization method and the auto-growing method outperform some conventional approaches in terms of accuracy and efficiency. Furthermore, the proposed knowledge-based grasp planning is also validated on a real robot. The results demonstrate the effectiveness of this approach in generating task-oriented grasps for novel objects.