785 research outputs found

Graph-Based Approaches to Protein Structure Comparison - From Local to Global Similarity

Author: Mernberger Marco
Publication venue: Philipps-Universität Marburg
Publication date: 01/01/2011

The comparative analysis of protein structure data is a central aspect of structural bioinformatics. Drawing upon structural information allows the inference of function for unknown proteins even in cases where no apparent homology can be found on the sequence level. Regarding the function of an enzyme, the overall fold topology might be less important than the specific structural conformation of the catalytic site or the surface region of a protein, where the interaction with other molecules, such as binding partners, substrates and ligands, occurs. Thus, a comparison of these regions is especially interesting for functional inference, since structural constraints imposed by the demands of the catalyzed biochemical function make them more likely to exhibit structural similarity. Moreover, the comparative analysis of protein binding sites is of special interest in pharmaceutical chemistry, in order to predict cross-reactivities and gain a deeper understanding of the catalysis mechanism. From an algorithmic point of view, the comparison of structured data, or, more generally, complex objects, can be attempted based on different methodological principles. Global methods aim at comparing structures as a whole, while local methods transfer the problem to multiple comparisons of local substructures. In the context of protein structure analysis, it is not clear a priori which strategy is more suitable. In this thesis, several conceptually different algorithmic approaches have been developed, based on local, global and semi-global strategies, for the task of comparing protein structure data, more specifically protein binding pockets. The use of graphs for the modeling of protein structure data has a long-standing tradition in structural bioinformatics. Recently, graphs have been used to model the geometric constraints of protein binding sites.
The algorithms developed in this thesis are based on this modeling concept; hence, from a computer scientist's point of view, they can also be regarded as global, local and semi-global approaches to graph comparison. The developed algorithms were mainly designed to allow for a more approximate comparison of protein binding sites, in order to account for the molecular flexibility of the protein structures. A main motivation was to allow for the detection of more remote similarities, which are not apparent when using more rigid methods. Subsequently, the developed approaches were applied to different problems typically encountered in the field of structural bioinformatics in order to assess and compare their performance and suitability for different problems. Each of the approaches developed during this work was capable of improving upon the performance of existing methods in the field. Another major aspect of the experiments was the question of which methodological concept, local, global or a combination of both, offers the most benefits for the specific task of protein binding site comparison, a question that is addressed throughout this thesis.

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg

Training Optimization for Artificial Neural Networks

Author: ALEJO ELEUTERIO ROBERTO
RODRIGUEZ MENDEZ BENJAMIN GONZALO
Toribio Luna Primitivo
VALDOVINOS ROSAS ROSA MARIA
Publication venue: Universidad Autonoma del Estado de Mexico
Publication date: 29/04/2010

Owing to their ability to model complex problems, Artificial Neural Networks (NN) are currently very popular in Pattern Recognition, Data Mining and Machine Learning. However, their main drawback is the high computational cost of the training phase when large databases are used. To reduce this computational cost and improve the convergence of the NN, this work analyzes the benefit of preprocessing the data sets. Specifically, it evaluates the relative neighborhood graph (RNG), the Gabriel graph (GG) and the method based on surrounding neighbors (k-NCN). The experimental results show the feasibility and the multiple advantages of these methodologies for addressing the problems described above.
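The two proximity graphs named in the abstract above have standard geometric definitions, which the following minimal Python sketch illustrates. It shows only how the graphs are built; how the authors use them to preprocess training sets is not described in the abstract. The function names (`sq_dist`, `gabriel_edges`, `rng_edges`) are hypothetical, and the sketch assumes the common convention that a third point blocks an edge only when it lies strictly inside the relevant region.

```python
from itertools import combinations


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def gabriel_edges(points):
    """Edges (i, j) of the Gabriel graph.

    Points i and j are joined unless some third point k lies strictly
    inside the ball whose diameter is the segment i-j, which is the case
    when d(i,k)^2 + d(j,k)^2 < d(i,j)^2.
    """
    edges = []
    for i, j in combinations(range(len(points)), 2):
        d_ij = sq_dist(points[i], points[j])
        blocked = any(
            sq_dist(points[i], points[k]) + sq_dist(points[j], points[k]) < d_ij
            for k in range(len(points))
            if k not in (i, j)
        )
        if not blocked:
            edges.append((i, j))
    return edges


def rng_edges(points):
    """Edges (i, j) of the relative neighborhood graph.

    Points i and j are joined unless some third point k is strictly
    closer to both of them than they are to each other.
    """
    edges = []
    for i, j in combinations(range(len(points)), 2):
        d_ij = sq_dist(points[i], points[j])
        blocked = any(
            max(sq_dist(points[i], points[k]), sq_dist(points[j], points[k])) < d_ij
            for k in range(len(points))
            if k not in (i, j)
        )
        if not blocked:
            edges.append((i, j))
    return edges


# Illustrative toy data (not from the paper): three points in the plane.
sample = [(0.0, 0.0), (2.0, 0.0), (1.0, 1.5)]
gg = gabriel_edges(sample)
rng = rng_edges(sample)
```

On this toy input the relative neighborhood graph is a subgraph of the Gabriel graph, which holds in general. Both functions use a brute-force check over all point triples, so they are suitable only for small illustrative inputs.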

Red Mexicana de Repositorios Institucionales

Repositorio Institucional de la Universidad Autónoma del Estado de México

Improving Variable Selection and Mammography-based Machine Learning Classifiers for Breast Cancer CADx

Author: Noel Pérez Pérez
Publication date: 11/06/2015

Repositório Aberto da Universidade do Porto

Predictive Pattern Discovery in Dynamic Data Systems

Author: Zhang Wenjing
Publication venue: e-Publications@Marquette
Publication date: 01/01/2013

This dissertation presents novel methods for analyzing nonlinear time series in dynamic systems. The purpose of the newly developed methods is to address the event prediction problem through modeling of predictive patterns. Firstly, a novel categorization mechanism is introduced to characterize different underlying states in the system. A new hybrid method was developed utilizing both generative and discriminative models to address the event prediction problem through optimization in multivariate systems. Secondly, in addition to modeling temporal dynamics, a Bayesian approach is employed to model the first-order Markov behavior in the multivariate data sequences. Experimental evaluations demonstrated superior performance over conventional methods, especially when the underlying system is chaotic and has heterogeneous patterns during state transitions. Finally, the concept of adaptive parametric phase space is introduced. The equivalence between time-domain phase space and the associated parametric space is theoretically analyzed.

epublications@Marquette

Trends in Nearest Feature Classification for Face Recognition: Achievements and Perspectives

Author: Cé
Mauricio Orozco-Alzate
Publication venue: IntechOpen
Publication date: 01/01/2009

IntechOpen

Biometric Authentication using Nonparametric Methods

Author: Radhika K. R.
Sheela S. V.
Publication venue: Academy and Industry Research Collaboration Center (AIRCC)
Publication date: 07/06/2010

Physiological and behavioral traits are employed to develop biometric authentication systems. The proposed work deals with the authentication of iris and signature based on minimum variance criteria. The iris patterns are preprocessed based on the area of the connected components. The segmented image used for authentication consists of the region with large variations in the gray-level values. The image region is split into quadtree components. The components with minimum variance are determined from the training samples. Hu moments are applied to the components. The summation of moment values corresponding to the minimum variance components is provided as an input vector to k-means and fuzzy k-means classifiers. The best performance was obtained for the MMU database consisting of 45 subjects. The number of subjects with zero False Rejection Rate [FRR] was 44 and the number of subjects with zero False Acceptance Rate [FAR] was 45. This paper also addresses the reduction of computational load in off-line signature verification based on minimal features using k-means, fuzzy k-means, k-nn, fuzzy k-nn and novel average-max approaches. An FRR of 8.13% and an FAR of 10% were achieved using the k-nn classifier. The signature is a biometric in which variation between genuine instances is naturally expected: certain parts of a genuine signature vary from one instance to another. The system aims to be simple, fast and robust, using fewer features than state-of-the-art works.
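The step of splitting an image region into quadtree components and selecting the one with minimum variance can be illustrated with a minimal Python sketch. It performs a single level of subdivision on a grayscale image stored as a list of rows. The function names (`quadrants`, `min_variance_quadrant`) and the toy image are hypothetical, and the abstract does not state the subdivision depth or how variances are aggregated over training samples, so this is an assumption-laden illustration rather than the authors' procedure.

```python
from statistics import pvariance


def quadrants(image):
    """Split a 2-D grayscale image (list of rows) into four quadrants."""
    mid_row = len(image) // 2
    mid_col = len(image[0]) // 2
    return {
        "top_left": [row[:mid_col] for row in image[:mid_row]],
        "top_right": [row[mid_col:] for row in image[:mid_row]],
        "bottom_left": [row[:mid_col] for row in image[mid_row:]],
        "bottom_right": [row[mid_col:] for row in image[mid_row:]],
    }


def min_variance_quadrant(image):
    """Return (name, variance) of the quadrant with the lowest gray-level variance."""
    variances = {
        name: pvariance([pixel for row in block for pixel in row])
        for name, block in quadrants(image).items()
    }
    best = min(variances, key=variances.get)
    return best, variances[best]


# Illustrative 4x4 toy image (not from the paper): the top-left block is uniform.
toy_image = [
    [5, 5, 0, 9],
    [5, 5, 9, 0],
    [1, 2, 3, 4],
    [2, 1, 4, 3],
]
best_name, best_variance = min_variance_quadrant(toy_image)
```

A full quadtree would apply `quadrants` recursively to each block; one level is shown here to keep the example short.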

arXiv.org e-Print Archive

Crossref

Computational Analysis of Structure-Activity Relationships: From Prediction to Visualization Methods

Author: Wassermann Anne Mai
Publication venue: Universitäts- und Landesbibliothek Bonn

Understanding how structural modifications affect the biological activity of small molecules is one of the central themes in medicinal chemistry. By no means is structure-activity relationship (SAR) analysis a priori dependent on computational methods. However, as molecular data sets grow in size, we quickly approach our limits to access and compare structures and associated biological properties, so that computational data processing and analysis often become essential. Here, different types of approaches of varying complexity for the analysis of SAR information are presented, which can be applied in the context of screening and chemical optimization projects. The first part of this thesis is dedicated to machine-learning strategies that aim at de novo ligand prediction and the preferential detection of potent hits in virtual screening. Strong emphasis is put on benchmarking of different strategies and a thorough evaluation of their utility in practical applications. However, an often claimed disadvantage of these prediction methods is their "black box" character, because they do not necessarily reveal which structural features are associated with biological activity. Therefore, these methods are complemented by more descriptive SAR analysis approaches showing a higher degree of interpretability. Concepts from information theory are adapted to identify activity-relevant structure-derived descriptors. Furthermore, compound data mining methods exploring prespecified properties of available bioactive compounds on a large scale are designed to systematically relate molecular transformations to activity changes. Finally, these approaches are complemented by graphical methods that primarily help to access and visualize SAR data in congeneric series of compounds and allow the formulation of intuitive SAR rules applicable to the design of new compounds. The compendium of SAR analysis tools introduced in this thesis investigates SARs from different perspectives.

bonndoc – Der Publikationsserver der Universität Bonn