12 research outputs found

    MIANN models of networks of biochemical reactions, ecosystems, and U.S. Supreme Court with Balaban-Markov indices

    [Abstract] Artificial Neural Networks (ANNs) and graph Topological Indices (TIs) can be used to seek structure-property relationships. Balaban's J index is one of the classic TIs in chemoinformatics studies. Here we used Markov chains to generalize the J index and apply it to bioinformatics, systems biology, and social sciences. We sought new ANN models to show the discrimination power of the new indices at node level in three proof-of-concept experiments. First, we calculated more than 1,000,000 values of the new Balaban-Markov centralities Jk(i) and other indices for all nodes in >100 complex networks. In the three experiments, we found new MIANN models with >80% Specificity (Sp) and Sensitivity (Sn) in training and validation series for Metabolic Reaction Networks (MRNs) of 42 organisms (bacteria, yeast, nematode and plants), 73 Biological Interaction Webs or Networks (BINs), and 43 sub-networks of U.S. Supreme Court citations in different decades from 1791 to 2005. This work may open a new route for the application of TIs to unravel hidden structure-property relationships in complex bio-molecular, ecological, and social networks.
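
For orientation, the classic Balaban J index that the abstract above takes as its starting point can be sketched as follows. This is a minimal sketch of the standard graph-level definition, J = m/(mu + 1) * sum over edges (i, j) of (s_i * s_j)^(-1/2), where m is the number of edges, mu = m - n + 1 is the cyclomatic number, and s_i is the sum of shortest-path distances from node i to all other nodes. It is not the Markov-chain generalization Jk(i), whose formula the abstract does not give. The function names and the adjacency-dict input format are illustrative assumptions.

```python
from collections import deque
from math import sqrt


def distance_sums(adj):
    """Return {node: sum of shortest-path distances to every other node}.

    `adj` maps each node to a list of its neighbours (simple, undirected,
    connected graph; this input format is an assumption of the sketch).
    """
    sums = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from `source`
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) != len(adj):
            raise ValueError("graph must be connected")
        sums[source] = sum(dist.values())
    return sums


def balaban_j(adj):
    """Classic Balaban J index of a simple connected undirected graph."""
    n = len(adj)
    # Each undirected edge is stored once, regardless of listing order.
    edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    m = len(edges)
    mu = m - n + 1  # cyclomatic number
    s = distance_sums(adj)
    total = 0.0
    for edge in edges:
        u, v = edge
        total += 1.0 / sqrt(s[u] * s[v])
    return m / (mu + 1) * total
```

For a three-node path such as `{0: [1], 1: [0, 2], 2: [1]}`, the distance sums are 3, 2, and 3, so `balaban_j` returns 4/sqrt(6), roughly 1.633.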

    Graphs and networks theory

    This chapter discusses graph and network theory.

    Herramientas informáticas y de inteligencia artificial para el meta-análisis en la frontera entre la bioinformática y las ciencias jurídicas

    [Abstract] QSPR (Quantitative Structure-Property Relationships) computer models can predict properties of complex systems, reducing experimental costs in terms of time, human resources, material resources, and/or the use of laboratory animals in bio-molecular, technical, social, and/or legal sciences. Artificial Neural Networks (ANNs) are one of the most powerful tools for finding QSPR models. To do so, ANNs can take as input variables numerical parameters that quantify the structure of the system; those known as Topological Indices (TIs) are among the most versatile. TIs are calculated in Graph Theory from a representation of any system as a network of interconnected nodes, from molecules to biological, technological, and social networks. The first aim of this thesis is to review and/or develop new TIs, TI calculation software, and QSPR models using ANNs for bio-molecular, biological, commercial, social, and legal networks in which nodes represent bio-molecules, organisms, populations, products, tax laws, or contributing causes of crimes. Moreover, the interaction between ICTs, Biomolecular Sciences, and Law needs a framework of legal certainty that allows the proper development of ICTs and their applications in Biomolecular Sciences. Therefore, the second aim of this thesis is to review the legal framework protecting QSAR/QSPR models of molecular systems. This research work seeks to demonstrate the usefulness of these models for predicting characteristics and properties of these complex systems.

    Computational techniques for cell signaling

    Cells can be viewed as sophisticated machines that organize their constituent components and molecules to receive, process, and respond to signals. The goal of the scientist is to uncover both the individual operations underlying these processes and the mechanisms behind the emergent properties that give rise to phenomena such as disease, development, recovery, or aging. Cell signaling plays a crucial role in all of these areas. The complexity of biological processes, coupled with the physical limitations of experiments in observing individual molecular components across small to large scales, limits the knowledge that can be gleaned from direct observation. Mathematical modeling can be used to estimate parameters that are hidden or too difficult to observe in experiments, and it can make qualitative predictions that distinguish between hypotheses of interest. Statistical analysis can be employed to explore the large amounts of data generated by modern experimental techniques such as sequencing and high-throughput screening, and it can integrate the observations from many individual experiments or even separate studies to generate new hypotheses. This dissertation employs mathematical and statistical analyses for three prominent aspects of cell signaling: the physical transfer of signaling molecules between cells, the intracellular protein machinery that organizes into pathways to process these signals, and changes in gene expression in response to cell signaling. Computational biology can be described as an applied discipline in that it aims to further the knowledge of a discipline that is distinct from itself. However, the richness of the problems encountered in biology requires continuous development of better methods equipped to handle the complexity, size, or uncertainty of the data, and to build in constraints motivated by the reality of the underlying biological system.
In addition, better computational and mathematical methods are needed to model the emergent behavior that arises from many components. The work presented in this dissertation fulfills both of these roles. We apply existing techniques to analyse experimental data and provide biological meaning, and we also develop new statistical and mathematical models that add to the knowledge and practice of computational biology. Much of cell signaling is initiated by signal transduction from the exterior, either by sensing environmental conditions or by the reception of specific signals from other cells. The phenomena of most immediate concern to our species, those of human health and disease, are usually also generated from, and manifest in, our tissues and organs due to the interaction and signaling between cells. A modality of inter-cellular communication that was earlier regarded as an obscure phenomenon but has more recently come to the attention of the scientific community is that of tunneling nanotubes (TNs). TNs have been observed as thin extensions (of the order of 100 nanometers) from one cell to another nearby cell. The formation of such structures, along with the intercellular exchange of molecules through them and their interaction with the cytoskeleton, could be involved in many important processes, such as tissue formation and cancer growth. We describe a simple model of passive transport of molecules between cells through TNs. Building on a few basic assumptions, we derive parametrized, closed-form expressions to describe the concentration of transported molecules as a function of distance from a population of TN-forming cells. Our model predicts how the perfusion of molecules through the TNs is affected by the size of the transferred molecules, the length and stability of nanotube formation, and the differences between membrane-bound and cytosolic proteins.
To our knowledge, this is the first published mathematical model of intercellular transfer through tunneling nanotubes. We envision that experimental observations will be able to confirm or improve the assumptions made in our model. Furthermore, quantifying the form of inter-cellular communication in the basic scenario envisioned in our model can help suggest ways to measure and investigate cases of possible regulation of either the formation of tunneling nanotubes or transport through them. The next problem we focus on is uncovering how the interactions between the genes and proteins in a cell organize into pathways to process cell signals or perform other tasks. The ability to accurately model and deeply understand gene and protein interaction networks of various kinds can be very powerful for prioritizing candidate genes and predicting their role in various signaling pathways and processes. A popular technique for gene prioritization and function prediction is the graph diffusion kernel. We show how the graph diffusion kernel is mathematically similar to the Ising spin model, a model popular in statistical physics but not usually employed on biological interaction networks. We develop a new method for calculating gene association based on the Ising spin model, which differs from the methods common in either bioinformatics or statistical physics. We show that our method performs better than both the graph diffusion kernel and its commonly used equivalent in the Ising model. We present a theoretical argument for understanding its performance based on ideas of phase transitions on networks. We measure its performance by applying our method to link prediction on protein interaction networks. Unlike candidate gene prioritization or function prediction, link prediction does not depend on the existing annotation or characterization of genes for ground truth. This helps us avoid the confounding noise and uncertainty in the network and annotation data.
As a purely network analysis problem, it is well suited for comparing network analysis methods. Once we know that we are accurately modeling the interaction network, we can employ our model to solve other problems, such as gene prioritization using interaction data. We also apply statistical analysis to a specific instance of a cell signaling process: the drought response in Brassica napus, a plant of scientific and economic importance. Important changes in the physiology of guard cells are initiated by abscisic acid, a phytohormone that signals water deficit stress. We analyse RNA-seq reads resulting from the sequencing of mRNA extracted from protoplasts treated with abscisic acid. We employ sequence analysis, statistical modeling, and the integration of cross-species network data to uncover genes, pathways, and interactions important in this process. We confirm what is known from other species and generate new gene and interaction candidates. By associating functional and sequence modification, we are also able to uncover evidence of the evolution of gene specialization, a process that is likely widespread in polyploid genomes. This work has developed new computational methods and applied existing tools for understanding cellular signaling and pathways. We have applied statistical analysis to integrate expression, interactome, pathway, regulatory element, and homology data to infer Brassica napus genes involved in drought response and their roles. For many of these findings, we found supporting literature from other species based on independent experiments. By relating the changes in regulatory elements, our RNA-seq results, and common gene ancestry, we present evidence of its evolution in the context of polyploidy. Our work can provide a scientific basis for the pursuit of certain genes as targets of breeding and genetic engineering efforts for the development of drought-tolerant oil crops.
Building on ideas from statistical physics, we developed a new model of gene associations in networks. Using link prediction as a metric for the accuracy of modeling the underlying structure of a real network, we show that our model achieves improved performance on real protein interaction networks. Our model of gene associations can be used to prioritize candidate genes for a disease or phenotype of interest. We also develop a mathematical model for a novel inter-cellular mode of biomolecule transfer. We relate hypotheses about the dynamics of TN formation, stability, and nature of molecular transport to quantitative predictions that may be tested by suitable experiments. In summary, this work demonstrates the application and development of computational analysis of cell signaling at the level of the transcriptome, the interactome, and physical transport.
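
The graph diffusion kernel that the abstract above uses as its baseline can be sketched in a few lines. This is a minimal sketch of the standard form K = exp(-beta * L), with L = D - A the graph Laplacian, and not the Ising-based method the dissertation develops (the abstract does not specify it). The function names, the dense list-of-lists adjacency matrix, the parameter `beta`, and the truncated Taylor series are all assumptions chosen to keep the example dependency-free; for real networks a library routine such as `scipy.linalg.expm` would be the more usual choice.

```python
def laplacian(adjacency):
    """Return the combinatorial Laplacian L = D - A as a list of lists.

    `adjacency` is a symmetric 0/1 matrix without self-loops (assumed).
    """
    n = len(adjacency)
    return [
        [(sum(adjacency[i]) if i == j else 0) - adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]


def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def diffusion_kernel(adjacency, beta, terms=40):
    """Approximate K = exp(-beta * L) with a truncated Taylor series.

    Adequate only for small graphs and moderate beta (illustrative choice).
    """
    n = len(adjacency)
    lap = laplacian(adjacency)
    step = [[-beta * lap[i][j] for j in range(n)] for i in range(n)]
    # Start from the identity matrix, the k = 0 term of the series.
    kernel = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in kernel]
    for k in range(1, terms):
        # term now holds (-beta * L)^k / k!
        term = [[x / k for x in row] for row in mat_mul(term, step)]
        for i in range(n):
            for j in range(n):
                kernel[i][j] += term[i][j]
    return kernel
```

An entry K[i][j] can be read as a similarity between nodes i and j that accounts for all paths between them, with longer paths damped according to `beta`. Ranking candidate genes by their kernel similarity to known genes is the usual way such a kernel is applied to prioritization.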

    Diffusion and Supercritical Spreading Processes on Complex Networks

    The large number of datasets that have become available in recent years has made it possible to study human-driven as well as biological complex systems empirically to an unprecedented extent. In parallel, the prediction and control of epidemic outbreaks have become very important for public health. In this thesis, we investigate some important aspects of diffusion phenomena and spreading processes unfolding on networks. We study three different problems related to spreading processes in the supercritical regime. First, we study reaction-diffusion on ensembles of random networks characterized by the observed Lévy-flight properties of human mobility. The second problem is the estimation of the arrival times of global pandemics. To this end, we derive and identify suitable hidden geometries of network-driven spreading processes, leveraging random-walk theory. Through the definition of network effective distances, the problem of complex spatiotemporal patterns is reduced to simple, homogeneous wave propagation patterns. Third, by embedding nodes in the hidden space defined by network effective distances, we introduce a novel network centrality, called ViralRank, which quantifies how close a node is, on average, to the other nodes. These three studies constitute a unified framework to characterize diffusion and spreading processes unfolding on complex networks in very general settings, and provide new approaches to challenging theoretical problems that can be used to benchmark future models.
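
The abstract above does not spell out how its effective distances are computed. As a rough illustration of the general idea only, the sketch below uses a simpler shortest-path form: a link that carries a fraction P of the flux leaving a node gets effective length 1 - ln(P), and the effective distance between two nodes is the smallest total length over all directed paths. This is an assumption made for illustration and not the random-walk-based definition the thesis derives; the function name and the nested-dict input are likewise hypothetical.

```python
import heapq
from math import log


def effective_distances(flux_fraction, source):
    """Shortest-path effective distance from `source` to every reachable node.

    `flux_fraction[n][m]` is the fraction P (0 < P <= 1) of the flux leaving
    node n that goes to node m. Each link has length 1 - ln(P), so links that
    carry little flux are "long". Dijkstra's algorithm finds the minimum.
    """
    dist = {source: 0.0}
    heap = [(0.0, source)]
    finished = set()
    while heap:
        d, node = heapq.heappop(heap)
        if node in finished:
            continue  # stale heap entry
        finished.add(node)
        for nxt, p in flux_fraction.get(node, {}).items():
            candidate = d + 1.0 - log(p)
            if candidate < dist.get(nxt, float("inf")):
                dist[nxt] = candidate
                heapq.heappush(heap, (candidate, nxt))
    return dist
```

With `{"a": {"b": 0.9, "c": 0.1}, "b": {"a": 0.1, "c": 0.9}, "c": {"a": 0.5, "b": 0.5}}` and source `"a"`, node `"c"` is effectively closer through `"b"` (two strong links) than through the weak direct link, which is the kind of reordering that turns irregular geographic spreading patterns into simpler wave-like ones.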

    Development of an R-language tool to enhance in silico drug discovery from ethnopharmacologically used plant sources: The example of androgenetic alopecia.

    Herbal medicines have been, are, and will always be a major asset in drug discovery. A freely available, user-friendly suite of computational tools has been developed to enhance a variety of in silico drug discovery processes. The investigation focused on discovering novel leads for the treatment of androgenetic alopecia (AGA) from natural sources already used ethnopharmacologically. A set of twenty-two R code snippets (termed ‘Tool Services’) was created. Tool Services focus on collecting, manipulating, and analysing data from a variety of sources. These sources include general information, ethnopharmacology, chemistry, pharmacology, targets, diseases, pathways and predictive QSAR modelling data. Sixty-nine plants with established use in AGA were studied and their 2,157 phytochemical ingredients recorded. Taxonomically, more than a third of these plants belong to four families that share many similarities in terms of DNA and phytochemical content. Structural similarity studies on 34 phytochemicals, chosen based on their frequency of occurrence in the plants, revealed similarities among them and with UV-protectants, vascular protectants and anti-inflammatory agents. Seven drugs currently marketed as monotherapy for AGA were structurally compared against our phytochemicals and were also assessed for drug-drug interactions, side effects, adverse drug reactions, and their metabolic fate. During this phase of the study, studies of alopecia as a side effect and as an adverse drug reaction of drugs were also prepared. A study of 48 targets revealed a strong relation with pathways that are implicated in hair follicle growth and development. More than half of these genes were linked to diseases such as hypotrichosis. Finally, the actual binding sites of these targets and the binding affinities of chemicals for these targets were revealed. Undoubtedly, the androgen receptor (AR) is one of the most studied targets in AGA.
A QSAR classification model for AR antagonism was built using 206 active and 1600 inactive compounds, utilising both Random Forest and Naïve Bayes algorithms.