    OctaSOM - An octagonal based SOM lattice structure for biomedical problems

    In this study, an octagonal-based lattice structure for self-organizing networks is proposed to allow more exploration and exploitation when updating the weights, for better mapping and classification performance. The neighborhood of the octagonal-based lattice structure provides more nodes for weight updating than the standard hexagonal-based lattice structure. In our experiments, the octagonal-based lattice structure performs better than the standard hexagonal lattice structure on biomedical datasets for classification problems. This indicates that the proposed algorithm is an alternative lattice structure for self-organizing networks that offers more insight into classification problems, especially in the biomedical domain.

    Computational Methods for Conformational Sampling of Biomolecules

    Augmenting Latent Dirichlet Allocation and Rank Threshold Detection with Ontologies

    In an increasingly data-rich environment, actionable information must be extracted, filtered, and correlated from massive amounts of disparate, often free-text sources. The usefulness of the retrieved information depends on how we accomplish these steps and present the most relevant information to the analyst. One method for extracting information from free text is Latent Dirichlet Allocation (LDA), a document categorization technique that classifies documents into cohesive topics. Although LDA accounts for some implicit relationships such as synonymy (same meaning), it often ignores other semantic relationships such as polysemy (different meanings), hyponymy (subordinate), meronymy (part of), and troponymy (manner). To compensate for this deficiency, we incorporate explicit word ontologies, such as WordNet, into the LDA algorithm to account for various semantic relationships. Experiments over the 20 Newsgroups, NIPS, OHSUMED, and IED document collections demonstrate that incorporating such knowledge improves the perplexity measure over LDA alone for given parameters. In addition, the same ontology augmentation improves recall and precision results for user queries.
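    For readers unfamiliar with the perplexity measure used in this abstract, here is a minimal sketch of the standard definition: perplexity is the exponential of the negative mean per-token log-likelihood on held-out text, so lower values indicate a better model. The function name `perplexity` and the toy input are illustrative assumptions and are not taken from the paper.

```python
import math


def perplexity(token_log_probs):
    """Return exp(-mean log-likelihood) over held-out tokens.

    token_log_probs: natural-log probabilities the model assigns to each
    held-out token (hypothetical input format, chosen for illustration).
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_log_likelihood = sum(token_log_probs) / len(token_log_probs)
    return math.exp(-mean_log_likelihood)


# A model that spreads probability uniformly over 4 words is exactly as
# uncertain as a fair 4-way choice at every token.
uniform_over_four = [math.log(0.25)] * 8
print(perplexity(uniform_over_four))
```

    Under this definition, a topic model that assigns higher probability to the held-out words yields a lower perplexity, which is the sense in which the ontology-augmented model is reported to improve on LDA alone.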

    Structural aspects of molecular recognition

    This thesis describes the design, implementation and application of a novel docking algorithm. Chapter 1 reviews some important facts about proteins and protein structure. Several molecular recognition systems are examined in detail. This chapter also reviews a representative set of recent protein/protein docking methods and discusses their relative merits. Chapter 2 sets out the aims of the new docking algorithm, called DAPMatch, and gives full details of its implementation on a parallel architecture computer. The testing of the algorithm is also discussed. Subsequent chapters describe the application of the DAPMatch algorithm to a number of docking problems. DAPMatch is used to reconstruct the known structures of three antibody/lysozyme complexes, using the unbound structure of lysozyme. For the first time, a model of the D1.3 antibody is used as a target molecule for a docking algorithm. These results are presented in Chapter 3 and analysed in detail to demonstrate their significance; non-native solutions are also examined. Chapter 4 describes the practical use of the DAPMatch algorithm in a modelling situation, to construct a hypothetical structure for the high molecular weight epidermal growth factor complex. Chapter 5 describes the adaptation of the DAPMatch algorithm to investigate α-helix/α-helix docking, and presents the results obtained. Chapter 6 presents the conclusions drawn from this work and suggests possible future enhancements to the algorithm.

    Path finding on a spherical self-organizing map using distance transformations

    Spatialization methods create visualizations that allow users to analyze high-dimensional data in an intuitive manner and facilitate the extraction of meaningful information. Just as geographic maps are simplified representations of geographic spaces, these visualizations are essentially maps of abstract data spaces that are created through dimensionality reduction. While we are familiar with geographic maps for path planning/finding applications, research into using maps of high-dimensional spaces for such purposes has been largely neglected. However, the literature has shown that it is possible to use these maps to track temporal and state changes within a high-dimensional space. A popular dimensionality reduction method that produces a mapping for these purposes is the Self-Organizing Map. By using its topology-preserving capabilities with a colour-based visualization method known as the U-Matrix, state transitions can be visualized as trajectories on the resulting mapping. Through these trajectories, one can gather information on the transition path between two points in the original high-dimensional state space. This raises the interesting question of whether or not the Self-Organizing Map can be used to discover the transition path between two points in an n-dimensional space. In this thesis, we use a spherically structured Self-Organizing Map called the Geodesic Self-Organizing Map for dimensionality reduction and the creation of a topological mapping that approximates the n-dimensional space. We first present an intuitive method for a user to navigate the surface of the Geodesic SOM. A new application of the distance transformation algorithm is then proposed to compute the path between two points on the surface of the SOM, which correspond to two points in the data space. A discussion then follows on how this application could be improved using some form of surface shape analysis. The new approach presented in this thesis is then evaluated by analyzing the results of using the Geodesic SOM for manifold embedding and by carrying out data analyses using carbon dioxide emissions data.
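    The general idea of path finding by distance transformation can be sketched as follows. This is an illustration only and not the thesis's implementation: it assumes a flat, 4-connected grid with blocked cells rather than the spherical Geodesic SOM lattice, and the names `distance_transform` and `trace_path` are hypothetical. A wavefront is propagated outward from the goal to label every reachable cell with its step distance, and the path is then recovered by walking downhill from the start.

```python
from collections import deque

# 4-connected neighbourhood on a flat grid (an assumption; the thesis
# works on a spherical lattice with a different neighbourhood).
STEPS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def distance_transform(grid, goal):
    """Label each free cell (value 0) with its step distance to goal.

    Unreachable or blocked cells keep the label None.
    The goal is assumed to be a free cell.
    """
    rows, cols = len(grid), len(grid[0])
    dist = [[None] * cols for _ in range(rows)]
    dist[goal[0]][goal[1]] = 0
    queue = deque([goal])
    while queue:
        r, c = queue.popleft()
        for dr, dc in STEPS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and grid[nr][nc] == 0 and dist[nr][nc] is None):
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist


def trace_path(dist, start):
    """Walk from start to the goal by always stepping to a neighbour
    whose distance label is one smaller. Returns None if unreachable."""
    rows, cols = len(dist), len(dist[0])
    r, c = start
    if dist[r][c] is None:
        return None
    path = [(r, c)]
    while dist[r][c] > 0:
        for dr, dc in STEPS:
            nr, nc = r + dr, c + dc
            if (0 <= nr < rows and 0 <= nc < cols
                    and dist[nr][nc] == dist[r][c] - 1):
                r, c = nr, nc
                break
        path.append((r, c))
    return path


# Toy map: 1 marks a blocked cell, so the path must go around the wall.
grid = [
    [0, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
dist = distance_transform(grid, goal=(0, 2))
print(trace_path(dist, start=(0, 0)))
```

    On a SOM, the blocked cells and step costs would presumably be derived from the map itself (for example from U-Matrix values), but that mapping is not described in the abstract and is left out of this sketch.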
