218 research outputs found

    Clustering Algorithms: Their Application to Gene Expression Data

    Gene expression data hide vital information required to understand the biological processes that take place in a particular organism in relation to its environment. Deciphering the hidden patterns in gene expression data offers a valuable opportunity to strengthen the understanding of functional genomics. The complexity of biological networks and the number of genes involved make it harder to comprehend and interpret the resulting mass of data, which consists of millions of measurements; these data also exhibit vagueness, imprecision, and noise. Clustering techniques are therefore a first step toward addressing these challenges, and an essential part of the data mining process for revealing natural structures and identifying interesting patterns in the underlying data. The clustering of gene expression data has proven useful for revealing the natural structure inherent in the data, understanding gene functions, cellular processes, and subtypes of cells, mining useful information from noisy data, and understanding gene regulation. Another benefit of clustering gene expression data is the identification of homology, which is very important in vaccine design. This review examines the various clustering algorithms applicable to gene expression data in order to identify the clustering technique that will guarantee stability and a high degree of accuracy in the analysis procedure.

    Relational data clustering algorithms with biomedical applications


    On the Synthesis of fuzzy neural systems.

    by Chung, Fu Lai.
    Thesis (Ph.D.)--Chinese University of Hong Kong, 1995.
    Includes bibliographical references (leaves 166-174).

    Acknowledgement
    Abstract
    Chapter 1. Introduction
        1.1 Integration of Fuzzy Systems and Neural Networks
        1.2 Objectives of the Research
            1.2.1 Fuzzification of Competitive Learning Algorithms
            1.2.2 Capacity Analysis of FAM and FRNS Models
            1.2.3 Structure and Parameter Identifications of FRNS
        1.3 Outline of the Thesis
    Chapter 2. A Fuzzy System Primer
        2.1 Basic Concepts of Fuzzy Sets
        2.2 Fuzzy Set-Theoretic Operators
        2.3 Linguistic Variable, Fuzzy Rule and Fuzzy Inference
        2.4 Basic Structure of a Fuzzy System
            2.4.1 Fuzzifier
            2.4.2 Fuzzy Knowledge Base
            2.4.3 Fuzzy Inference Engine
            2.4.4 Defuzzifier
        2.5 Concluding Remarks
    Chapter 3. Categories of Fuzzy Neural Systems
        3.1 Introduction
        3.2 Fuzzification of Neural Networks
            3.2.1 Fuzzy Membership Driven Models
            3.2.2 Fuzzy Operator Driven Models
            3.2.3 Fuzzy Arithmetic Driven Models
        3.3 Layered Network Implementation of Fuzzy Systems
            3.3.1 Mamdani's Fuzzy Systems
            3.3.2 Takagi and Sugeno's Fuzzy Systems
            3.3.3 Fuzzy Relation Based Fuzzy Systems
        3.4 Concluding Remarks
    Chapter 4. Fuzzification of Competitive Learning Networks
        4.1 Introduction
        4.2 Crisp Competitive Learning
            4.2.1 Unsupervised Competitive Learning Algorithm
            4.2.2 Learning Vector Quantization Algorithm
            4.2.3 Frequency Sensitive Competitive Learning Algorithm
        4.3 Fuzzy Competitive Learning
            4.3.1 Unsupervised Fuzzy Competitive Learning Algorithm
            4.3.2 Fuzzy Learning Vector Quantization Algorithm
            4.3.3 Fuzzy Frequency Sensitive Competitive Learning Algorithm
        4.4 Stability of Fuzzy Competitive Learning
        4.5 Controlling the Fuzziness of Fuzzy Competitive Learning
        4.6 Interpretations of Fuzzy Competitive Learning Networks
        4.7 Simulation Results
            4.7.1 Performance of Fuzzy Competitive Learning Algorithms
            4.7.2 Performance of Monotonically Decreasing Fuzziness Control Scheme
            4.7.3 Interpretation of Trained Networks
        4.8 Concluding Remarks
    Chapter 5. Capacity Analysis of Fuzzy Associative Memories
        5.1 Introduction
        5.2 Fuzzy Associative Memories (FAMs)
        5.3 Storing Multiple Rules in FAMs
        5.4 A High Capacity Encoding Scheme for FAMs
        5.5 Memory Capacity
        5.6 Rule Modification
        5.7 Inference Performance
        5.8 Concluding Remarks
    Chapter 6. Capacity Analysis of Fuzzy Relational Neural Systems
        6.1 Introduction
        6.2 Fuzzy Relational Equations and Fuzzy Relational Neural Systems
        6.3 Solving a System of Fuzzy Relational Equations
        6.4 New Solvable Conditions
            6.4.1 Max-t Fuzzy Relational Equations
            6.4.2 Min-s Fuzzy Relational Equations
        6.5 Approximate Resolution
        6.6 System Capacity
        6.7 Inference Performance
        6.8 Concluding Remarks
    Chapter 7. Structure and Parameter Identifications of Fuzzy Relational Neural Systems
        7.1 Introduction
        7.2 Modelling Nonlinear Dynamic Systems by Fuzzy Relational Equations
        7.3 A General FRNS Identification Algorithm
        7.4 An Evolutionary Computation Approach to Structure and Parameter Identifications
            7.4.1 Guided Evolutionary Simulated Annealing
            7.4.2 An Evolutionary Identification (EVIDENT) Algorithm
        7.5 Simulation Results
        7.6 Concluding Remarks
    Chapter 8. Conclusions
        8.1 Summary of Contributions
            8.1.1 Fuzzy Competitive Learning
            8.1.2 Capacity Analysis of FAM and FRNS
            8.1.3 Numerical Identification of FRNS
        8.2 Further Investigations
    Appendix A. Publication List of the Candidate
    Bibliography


    The k-means clustering algorithm (k-means for short) provides a method of finding structure in input examples. It is also called the Lloyd–Forgy algorithm, as it was independently introduced by both Stuart Lloyd and Edward Forgy. k-means, like other algorithms you will study in this part of the book, is an unsupervised learning algorithm and, as such, does not require labels associated with input examples. Recall that unsupervised learning algorithms provide a way of discovering some inherent structure in the input examples. This is in contrast with supervised learning algorithms, which require input examples and associated labels so as to fit a hypothesis function that maps input examples to one or more output variables.
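
    The abstract above does not spell out the procedure, so the following is only a minimal sketch of the standard Lloyd-style iteration (assign each example to its nearest centroid, then move each centroid to the mean of its members). The function names, the random-sample initialization, and the toy data are assumptions made for illustration, not taken from the book.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, max_iters=100, seed=0):
    """Cluster unlabeled points into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    # Assumed initialization: pick k distinct examples as starting centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the iteration has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, labels) if a == j]
            if members:  # leave an empty cluster's centroid where it is
                centroids[j] = mean_point(members)
    return centroids, labels


# Toy data (hypothetical): two well-separated groups, no labels supplied.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 8.0), (8.2, 7.9), (7.8, 8.1)]
centroids, labels = k_means(points, k=2)
print(labels)
```

    Note that no labels are passed in: the grouping comes only from the distances between examples, which is what makes the algorithm unsupervised. Which group receives label 0 depends on the initialization.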



    Advanced and novel modeling techniques for simulation, optimization and monitoring chemical engineering tasks with refinery and petrochemical unit applications

    Engineers predict, optimize, and monitor processes to improve safety and profitability. Models automate these tasks and determine precise solutions. This research studies and applies advanced and novel modeling techniques to automate and aid engineering decision-making. Advancements in computational ability have improved modeling software’s ability to mimic industrial problems. Simulations are increasingly used to explore new operating regimes and design new processes. In this work, we present a methodology for creating structured mathematical models, useful tips to simplify models, and a novel repair method to improve convergence by populating quality initial conditions for the simulation’s solver. A crude oil refinery application is presented, including simulation, simplification tips, and the repair strategy implementation. A crude oil scheduling problem is also presented, which can be integrated with production unit models. Recently, stochastic global optimization (SGO) has been shown to succeed at finding global optima of complex nonlinear processes. When performing SGO on simulations, model convergence can become an issue. The computational load can be decreased by 1) simplifying the model and 2) finding a synergy between the model solver repair strategy and the optimization routine by using the initial conditions formulated as points to perturb the neighborhood being searched. Here, a simplifying technique for merging the crude oil scheduling problem and the vertically integrated online refinery production optimization is demonstrated. To optimize the refinery production, a stochastic global optimization technique is employed. Process monitoring has been vastly enhanced through a data-driven modeling technique, Principal Component Analysis. As opposed to first-principle models, which make assumptions about the structure of the model describing the process, data-driven techniques make no assumptions about the underlying relationships. Data-driven techniques search for a projection of the data into a space that is easier to analyze. Feature extraction techniques, commonly dimensionality reduction techniques, have been explored extensively to better capture nonlinear relationships. These techniques can extend data-driven modeling’s process-monitoring use to nonlinear processes. Here, we employ a novel nonlinear process-monitoring scheme, which utilizes Self-Organizing Maps. The novel techniques and implementation methodology are applied to the publicly studied Tennessee Eastman Process and an industrial polymerization unit.

    Validação de heterogeneidade estrutural em dados de Crio-ME por comitês de agrupadores (Validation of structural heterogeneity in cryo-EM data by cluster ensembles)

    Advisors: Fernando José Von Zuben, Rodrigo Villares Portugal. Dissertation (master's), Universidade Estadual de Campinas, Faculdade de Engenharia Elétrica e de Computação.
    Abstract: Single Particle Analysis is a technique that allows the study of the three-dimensional structure of proteins and other macromolecular assemblies of biological interest. Its primary data consist of transmission electron microscopy images of multiple copies of the molecule in random orientations. Such images are very noisy due to the low electron dose employed. A reconstruction of the macromolecule can be obtained by averaging many images of particles in similar orientations and estimating their relative angles. However, heterogeneous conformational states often coexist in the sample, because the molecular complexes can be flexible and may also interact with other particles. Heterogeneity poses a challenge to the reconstruction of reliable 3D models and degrades their resolution. Among the most popular algorithms used for structural classification are k-means clustering, hierarchical clustering, self-organizing maps, and maximum-likelihood estimators. Such approaches are usually interlaced with the reconstruction of the 3D models. Nevertheless, recent work indicates that it is possible to infer information about the structure of the molecules directly from the dataset of 2D projections. Among these findings is the relationship between structural variability and manifolds in a multidimensional feature space. This dissertation investigates whether an ensemble of unsupervised classification algorithms is able to separate these "conformational manifolds". Ensemble or "consensus" methods tend to provide more accurate classification and may achieve satisfactory performance across a wide range of datasets, when compared with individual algorithms. We investigate the behavior of six clustering algorithms, both individually and combined in ensembles, for the task of structural heterogeneity classification. The approach was tested on synthetic and real datasets containing a mixture of projection images of the Mm-cpn chaperonin in the "open" and "closed" states. It is shown that cluster ensembles can provide useful information for validating structural partitionings independently of 3D reconstruction methods.
    Degree level: Master's. Area: Computer Engineering. Degree: Master in Electrical Engineering.
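
    The abstract does not say how the individual clusterings are combined, so the sketch below shows just one common construction, a co-association matrix, as an assumption rather than the dissertation's method. Each entry records the fraction of base partitions that place two items in the same cluster, and items are then grouped when that fraction exceeds a threshold. The function names, the threshold of 0.5, and the toy partitions are hypothetical.

```python
def co_association(partitions):
    """Fraction of partitions in which each pair of items shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    total = len(partitions)
    return [[c / total for c in row] for row in counts]


def consensus_clusters(matrix, threshold=0.5):
    """Group items connected by co-association values above the threshold."""
    n = len(matrix)
    labels = [-1] * n  # -1 marks an item not yet assigned to a group
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Depth-first search over the graph whose edges are strong agreements.
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and matrix[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three hypothetical base clusterings of six items. Label ids are arbitrary
# per partition; the second misplaces item 2 and the third splits off item 5.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 2],
]
matrix = co_association(partitions)
consensus = consensus_clusters(matrix)
print(consensus)
```

    Because the matrix only compares whether two items share a label within a partition, it is unaffected by the arbitrary label ids that different algorithms assign, and a single algorithm's mistake can be outvoted by the others.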

    Traveling Salesman Problem

    This book is a collection of current research on the application of evolutionary algorithms and other optimization algorithms to solving the TSP. It brings together researchers with applications in Artificial Immune Systems, Genetic Algorithms, Neural Networks, and the Differential Evolution Algorithm. Hybrid systems, such as Fuzzy Maps, Chaotic Maps, and Parallelized TSP, are also presented. Most importantly, this book presents both theoretical and practical applications of the TSP, which will be a vital tool for researchers and graduate entry students in the fields of Applied Mathematics, Computing Science, and Engineering.