
    Genetic heterogeneity analysis using genetic algorithm and network science

    Through genome-wide association studies (GWAS), genetic variables associated with disease susceptibility can be identified by comparing the genetic data of individuals with and without a specific disease. However, discovering these associations is a significant challenge because of genetic heterogeneity and feature interactions. Genetic variables intertwined with these effects often exhibit lower effect sizes and can therefore be difficult to detect using machine learning feature selection methods. To address these challenges, this paper introduces a novel feature selection mechanism for GWAS, named Feature Co-selection Network (FCS-Net). FCS-Net is designed to extract heterogeneous subsets of genetic variables from a network constructed from multiple independent feature selection runs based on a genetic algorithm (GA), an evolutionary learning algorithm. We employ a non-linear machine learning algorithm to detect feature interactions. We introduce the Community Risk Score (CRS), a synthetic feature designed to quantify the collective disease association of each variable subset. Our experiment on synthetic data demonstrates the effectiveness of the GA-based feature selection method in identifying feature interactions. Furthermore, we apply our approach to a case-control colorectal cancer GWAS dataset. The resulting synthetic features are then used to explain the genetic heterogeneity in an additional case-only GWAS dataset.
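The idea of building a network from multiple independent feature selection runs can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `min_count` threshold, the toy SNP labels, and the use of connected components as a stand-in for community detection are all assumptions made for illustration. The abstract does not define how the Community Risk Score is computed, so it is left out.

```python
from collections import Counter
from itertools import combinations


def co_selection_edges(runs, min_count=2):
    """Count how often each feature pair is selected together across runs.

    `runs` is a list of feature sets, one per independent selection run.
    Only pairs co-selected at least `min_count` times are kept
    (the threshold is an assumption, not taken from the paper).
    """
    counts = Counter()
    for selected in runs:
        for a, b in combinations(sorted(set(selected)), 2):
            counts[(a, b)] += 1
    return {pair: c for pair, c in counts.items() if c >= min_count}


def connected_components(edges):
    """Group features linked by retained edges.

    Connected components are a simple stand-in for community detection.
    """
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, groups = set(), []
    for start in sorted(adjacency):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(adjacency[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


# Hypothetical output of four independent feature selection runs.
runs = [
    {"snp1", "snp2", "snp7"},
    {"snp1", "snp2", "snp5"},
    {"snp5", "snp7", "snp9"},
    {"snp5", "snp9", "snp3"},
]
edges = co_selection_edges(runs, min_count=2)
communities = connected_components(edges)
print(communities)
```

In this toy input only the pairs (snp1, snp2) and (snp5, snp9) are co-selected twice, so they form two separate feature subsets.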

    Emergent relational schemas for RDF


    Mining a Small Medical Data Set by Integrating the Decision Tree and t-test

    Although several researchers have used statistical methods to show that aspiration followed by the injection of 95% ethanol left in situ (retention) is an effective treatment for ovarian endometriomas, very few discuss the different conditions that could lead to different recovery rates among patients. Therefore, this study combines statistical methods and decision tree techniques to analyze the postoperative status of ovarian endometriosis patients under different conditions. Because our collected data set is small, containing only 212 records, we use all of the data as training data. Consequently, instead of using the resulting tree to generate rules directly, we first use the value at each node as a cut point to generate all possible rules from the tree. Then, once all possible rules have been generated, we verify them with a t-test to discover useful descriptive rules. Experimental results show that our approach can find new and interesting knowledge about recurrent ovarian endometriomas under different conditions.
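The step of using a node value as a cut point and then checking the resulting rule with a t-test can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which t-test variant was used, so Welch's unequal-variance statistic is assumed here, and the field names (`size`, `outcome`), the cut point of 5.0, and the records are hypothetical, not taken from the study.

```python
from math import sqrt
from statistics import mean, variance


def split_by_cut_point(records, feature, cut, outcome):
    """Split outcome values into two groups using a tree node's cut point."""
    below = [r[outcome] for r in records if r[feature] <= cut]
    above = [r[outcome] for r in records if r[feature] > cut]
    return below, above


def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom.

    Welch's variant is an assumption; the source only says "t-test".
    """
    na, nb = len(sample_a), len(sample_b)
    va, vb = variance(sample_a), variance(sample_b)
    se2 = va / na + vb / nb  # squared standard error of the mean difference
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2)
    dof = se2 ** 2 / ((va / na) ** 2 / (na - 1) + (vb / nb) ** 2 / (nb - 1))
    return t, dof


# Hypothetical records: one feature and one numeric outcome per patient.
records = [
    {"size": 2.0, "outcome": 1.0},
    {"size": 3.0, "outcome": 2.0},
    {"size": 4.0, "outcome": 3.0},
    {"size": 6.0, "outcome": 4.0},
    {"size": 7.0, "outcome": 5.0},
    {"size": 8.0, "outcome": 6.0},
]
below, above = split_by_cut_point(records, "size", 5.0, "outcome")
t_stat, dof = welch_t(below, above)
print(round(t_stat, 3), round(dof, 1))
```

A candidate rule such as "size <= 5.0" would be kept only if the statistic is significant at the chosen level for the computed degrees of freedom; looking up the p-value is left out to keep the sketch dependency-free.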

    Projection-Based Clustering through Self-Organization and Swarm Intelligence

    This work covers aspects of unsupervised machine learning used for knowledge discovery in data science and introduces a data-driven approach to cluster analysis, the Databionic swarm (DBS). DBS consists of the 3D landscape visualization and clustering of data. The 3D landscape enables 3D printing of high-dimensional data structures. The clustering and the number of clusters, or the absence of cluster structure, can be verified at a glance using the 3D landscape. DBS is the first swarm-based technique that shows emergent properties while exploiting concepts of swarm intelligence, self-organization and the Nash equilibrium concept from game theory. This eliminates the need for a global objective function and for setting parameters. By downloading the R package, DBS can be applied to data drawn from diverse research fields and used even by non-professionals in the field of data mining.

    Contributions for the exploitation of Semantic Technologies in Industry 4.0

    120 p. This research promotes the use of semantic technologies in the Industry 4.0 environment through three contributions focused on topics in smart manufacturing: enriched descriptions of components, data visualization and analysis, and the implementation of Industry 4.0 in SMEs. The first contribution is an ontology called ExtruOnt, which contains semantic descriptions of one type of manufacturing machine (the extruder). The ontology describes the components, their spatial connections, their features, their three-dimensional representations and, finally, the sensors used to capture the data. The second contribution is a visual query system that uses the ExtruOnt ontology and a 2D representation of the extruder to help domain experts visualize and extract knowledge about the manufacturing process quickly and easily. The third contribution is a methodology for implementing Industry 4.0 in SMEs, oriented to the customer life cycle and enhanced by the use of semantic technologies and 3D rendering technologies. The contributions were developed, applied and validated in a real manufacturing scenario.

    Projection-Based Clustering through Self-Organization and Swarm Intelligence: Combining Cluster Analysis with the Visualization of High-Dimensional Data

    Cluster Analysis; Dimensionality Reduction; Swarm Intelligence; Visualization; Unsupervised Machine Learning; Data Science; Knowledge Discovery; 3D Printing; Self-Organization; Emergence; Game Theory; Advanced Analytics; High-Dimensional Data; Multivariate Data; Analysis of Structured Dat

    The robust optimization of non-linear requirements models

    Solutions to non-linear requirements engineering problems may be brittle; i.e., small changes may dramatically alter solution effectiveness. Hence, it is not enough to just generate solutions to requirements problems; we must also assess solution robustness. This thesis aims to address two concerns: (a) Is demonstrating robustness a time-consuming task? and (b) Is it necessary that solution quality be traded off against solution robustness? Using a Bayesian ranking heuristic, the KEYS2 algorithm fixes a small number of important variables, rapidly pushing the search into a stable, optimal plateau. By design, KEYS2 generates decision ordering diagrams (in time experimentally shown to be O(N^2)). Once generated, these diagrams can confirm solution robustness in linear time. When assessed in terms of reducing inference times, increasing solution quality, and decreasing the variance of the generated solution, KEYS2 outperforms other search algorithms (simulated annealing, A*, MaxWalkSat).

    Towards a resilient networked service system

    Large service systems today have highly networked structures. In this thesis, these large service systems are called networked service systems. The networked nature of these systems has no doubt enabled mass-customized services, but it has also created challenges in managing their safety. The safety of service systems is an important issue because of their critical influence on the functioning of society. Traditional safety engineering methods focus on maintaining service systems in a safe state, in particular aiming to keep systems reliable and robust. However, many recent disasters in society show that resilience cannot be left out of safety. The goal of this thesis is to improve the resilience of networked service systems. Four major pieces of work were performed to achieve this goal. First, a unified definition of service systems was proposed and its relationship to other system concepts was explained. Based on the new definition, a domain model of service systems was established using an FCBPSS framework, followed by the development of a computational model. Second, a definition of resilience for service systems was proposed, based on which the relationship among three safety properties (i.e., reliability, robustness and resilience) was clarified, followed by the development of a framework for resilience analysis. Third, a methodology for measuring the resilience of service systems was proposed, built on four measurement axioms along with corresponding mathematical models. The methodology focused on the potential ability of a service system to create optimal rebalancing solutions. Two typical service systems, a transportation system and an enterprise information system, were used to validate the methodology. Fourth, a methodology for enhancing the resilience of service systems was proposed by integrating three types of system reconfiguration, namely design, planning and management, along with the corresponding mathematical model. This methodology was validated with a transportation system example. Several conclusions can be drawn from the work above: (1) a service system has the unique characteristic that it meets human demand directly, and its safety relies on the balance between supply and demand; (2) unlike reliability and robustness, the resilience of a service system concerns its ability to rebalance from imbalanced situations; (3) it makes sense to measure the resilience of a service system only for a particular imbalanced situation and based on an evaluation of rebalancing solutions; and (4) integrating design, planning and management is an effective approach to improving the resilience of a service system. The contributions of this thesis can be summarized as follows. Scientifically, this thesis has improved our understanding of service systems and their resilience property; furthermore, it has advanced the state of knowledge in safety science, in particular by successfully responding to two questions: is a service system safe, and how can a service system be made safer? Technologically and methodologically, the work has advanced the knowledge for modeling and optimizing networked service systems, in particular with multiple-layer models along with algorithms for integrated decision making on design, planning, and management.