2,611 research outputs found

    An Overview of the Use of Neural Networks for Data Mining Tasks

    In recent years the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as research activity, aimed at developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies whose classification, prediction and pattern recognition capabilities have been used successfully in many areas, including science, engineering, medicine, business, banking, and telecommunication. This paper highlights, from a data mining perspective, the implementation of NN using supervised and unsupervised learning for pattern recognition, classification, prediction and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks.

    Techniques for automated parameter estimation in computational models of probabilistic systems

    The main contribution of this dissertation is the design of two new algorithms for automatically synthesizing values of numerical parameters of computational models of complex stochastic systems such that the resultant model meets user-specified behavioral specifications. These algorithms are designed to operate on probabilistic systems – systems that, in general, behave differently under identical conditions. The algorithms work using an approach that combines formal verification and mathematical optimization to explore a model's parameter space. The problem of determining whether a model instantiated with a given set of parameter values satisfies the desired specification is first defined using formal verification terminology, and then reformulated in terms of statistical hypothesis testing. Parameter space exploration involves determining the outcome of the hypothesis testing query for each parameter point and is guided using simulated annealing. The first algorithm uses the sequential probability ratio test (SPRT) to solve the hypothesis testing problems, whereas the second algorithm uses an approach based on Bayesian statistical model checking (BSMC). The SPRT-based parameter synthesis algorithm was used to validate that a given model of glucose-insulin metabolism has the capability of representing diabetic behavior by synthesizing values of three parameters that ensure that the glucose-insulin subsystem spends at least 20 minutes in a diabetic scenario. The BSMC-based algorithm was used to discover the values of parameters in a physiological model of the acute inflammatory response that guarantee a set of desired clinical outcomes. These two applications demonstrate how our algorithms use formal verification, statistical hypothesis testing and mathematical optimization to automatically synthesize parameters of complex probabilistic models in order to meet user-specified behavioral properties.
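The hypothesis-testing step described in this abstract can be illustrated with the textbook form of Wald's sequential probability ratio test. The sketch below is not the dissertation's algorithm: the function name `sprt`, the `sample` callback (assumed to run one simulation of the model and report whether the trace satisfied the specification), and the thresholds `p0`, `p1`, `alpha`, `beta` are all hypothetical choices for illustration, and the simulated-annealing search over parameters is left out.

```python
import math

def sprt(sample, p0, p1, alpha=0.05, beta=0.05, max_samples=100_000):
    """Textbook Wald SPRT for H0: p <= p0 versus H1: p >= p1, with p0 < p1.

    `sample` is a hypothetical zero-argument callable returning True when one
    simulated trace satisfies the specification and False otherwise.
    """
    upper = math.log((1 - beta) / alpha)  # crossing this accepts H1
    lower = math.log(beta / (1 - alpha))  # crossing this accepts H0
    llr = 0.0                             # running log-likelihood ratio
    for n in range(1, max_samples + 1):
        if sample():
            llr += math.log(p1 / p0)
        else:
            llr += math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "accept_H1", n
        if llr <= lower:
            return "accept_H0", n
    return "undecided", max_samples

# Usage with a stand-in "model" that always satisfies the specification.
decision, used = sprt(lambda: True, p0=0.4, p1=0.6)
```

The appeal of a sequential test in this setting is that the number of simulations adapts to how clear-cut the evidence is, which matters when the test is repeated at every parameter point visited by the search.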

    Nature as a Network of Morphological Infocomputational Processes for Cognitive Agents

    This paper presents a view of nature as a network of infocomputational agents organized in a dynamical hierarchy of levels. It provides a framework for unifying currently disparate understandings of natural, formal, technical, behavioral and social phenomena, based on information as a structure (differences in one system that cause differences in another system) and computation as its dynamics, i.e. the physical process of morphological change in the informational structure. We address some of the frequent misunderstandings regarding natural/morphological computational models and their relationships to physical systems, especially cognitive systems such as living beings. Natural morphological infocomputation as a conceptual framework necessitates generalizing models of computation beyond the traditional Turing machine model of symbol manipulation, and requires agent-based, concurrent, resource-sensitive models of computation in order to cover the whole range of phenomena from physics to cognition. The central role of agency, particularly material vs. cognitive agency, is highlighted.

    The Extended Edit Distance Metric

    Similarity search is an important problem in information retrieval, and it relies on a distance between objects. Symbolic representation of time series has attracted many researchers recently, since it reduces the dimensionality of these high-dimensional data objects. We propose a new distance metric that is applied to symbolic data objects and we test it on time series databases in a classification task. We compare it to other distances that are well known in the literature for symbolic data objects. We also prove mathematically that our distance is a metric. Comment: Technical report.
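The abstract does not define the extended edit distance itself, so the sketch below shows only the classic (Levenshtein) edit distance on symbol strings, which is assumed here to be one of the well-known baselines for symbolic data. The function name `edit_distance` and the example strings are illustrative.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance between two symbol sequences.

    Counts the minimum number of insertions, deletions and substitutions
    needed to turn `a` into `b`, using two rows of the dynamic-programming table.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]                  # turning a[:i] into "" takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

# Two symbolic (discretized) time series over the alphabet {a, b, c}.
d = edit_distance("aabcc", "abbcc")
```

This baseline is zero only for identical sequences, is symmetric, and satisfies the triangle inequality. These are the properties the abstract says are proved for the proposed distance.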

    Formal methods applied to the analysis of phylogenies: Phylogenetic model checking

    Phylogenetic trees are useful abstractions for modelling and characterizing the evolution of a set of species or populations over time. Proposing, verifying and generalizing hypotheses about an inferred phylogenetic tree play an important role in the study and understanding of evolutionary relationships. Currently, one of the main scientific goals is to extract or discover the implicit biological messages and the underlying structural properties of a phylogeny. For example, integrating genetic information into a phylogeny helps to discover genes conserved in all or part of the tree, to identify covarying positions in the DNA, or to estimate the divergence dates between species. Consequently, trees help to understand the mechanism that governs evolutionary drift. Nowadays, the wide spectrum of heterogeneous methods and tools for analysing phylogenies obscures and hinders their use, as does the tight coupling between the specification of properties and the algorithms used to evaluate them (mainly ad hoc scripts). This problem is the starting point of this thesis, which analyses as a solution the possibility of introducing a formal hypothesis-verification framework that automatically and modularly checks the truth of such properties, defined in a generic and independent language (in an associated formal logic), on one of the many software tools prepared for this purpose. The main contribution of the thesis is the proposal of a formal framework for describing, verifying and manipulating causal relationships between species independently of the code used to evaluate them.
To this end, we explore the features of model checking techniques, a paradigm in which a specification expressed in temporal logic is verified against a model of the system that represents an implementation at a certain level of detail. Having emerged from computer science, it has been applied successfully in industry for system modelling and verification. The specific contributions of the thesis are: A) The identification and interpretation of phylogenetic trees as models of evolution, adapted to the setting of model checking techniques. B) The definition of a temporal logic that captures the usual phylogenetic properties, together with a method for constructing properties. C) The classification of phylogenetic properties, identifying categories of properties according to whether they focus on the structure of the tree, on the sequences, or are hybrid. D) The extension of the logics and models to cover quantitative properties of time, probability and distance. E) The development of a framework for verifying Boolean, quantitative and parametric properties. F) The establishment of the principles for the symbolic manipulation of phylogenetic objects, e.g., clades. G) The exploitation of existing model checking tools, detecting their problems and shortcomings in the field of phylogeny and proposing improvements. H) The development of ad hoc techniques to reduce complexity on two fronts: distribution of computations and data, and the use of information systems. Points A-F focus on the conceptual contributions of our approach, while points G-H emphasize the tooling and implementation side. The contents of the thesis have been validated by the scientific community through the following publications in international conferences and journals.
The introduction of model checking as a formal framework for analysing biological properties (points A-C) led to the publication of our first conference paper [1]. In [2], we develop the verification of phylogenetic hypotheses on an example tree built from the relationships imposed by a set of proteins encoded by human mitochondrial DNA (mtDNA). In that example, we use an automatic, generic model checking tool (point G). The journal article [7] summarizes the essentials of the earlier conference papers and extends the application of temporal logics to phylogenetic properties not considered until now. The articles cited here cover the contents presented in Parts I-II of the thesis. The enormous size of the trees and the considerable amount of information associated with the states (e.g., the DNA string) force the introduction of special adaptations in model checking tools in order to maintain reasonable performance when verifying properties and also to alleviate the state explosion problem (points G-H). The conference paper [3] presents the advantages of slicing the DNA associated with the states, partitioning the phylogeny into small subtrees, and distributing them among several machines. In addition, the original idea of sliced model checking is complemented by the inclusion of an external database for storing sequences. The journal article [4] brings together the notions introduced in [3] with the implementation and preliminary results presented in [5]. This topic corresponds to what is presented in Part III of the thesis. Finally, the thesis reuses the extensions of temporal logics with explicit time and probabilities in order to manipulate and query the tree about quantitative information.
The conference paper [6] illustrates the need to introduce probabilities and discrete time for the phylogenetic analysis of a real phenotype, in this case the distribution ratio of lactose intolerance among various populations rooted at the leaves of the phylogeny. This corresponds to Chapter 13, which falls within Parts IV-V. Parts IV-V extend the concepts presented in that conference paper to other application domains, such as tree scoring, and to continuous time (points E-F). The introduction of parameters into phylogenetic hypotheses is proposed as future work. References [1] Roberto Blanco, Gregorio de Miguel Casado, José Ignacio Requeno, and José Manuel Colom. Temporal logics for phylogenetic analysis via model checking. In Proceedings IEEE International Workshop on Mining and Management of Biological and Health Data, pages 152-157. IEEE, 2010. [2] José Ignacio Requeno, Roberto Blanco, Gregorio de Miguel Casado, and José Manuel Colom. Phylogenetic analysis using an SMV tool. In Miguel P. Rocha, Juan M. Corchado Rodríguez, Florentino Fdez-Riverola, and Alfonso Valencia, editors, Proceedings 5th International Conference on Practical Applications of Computational Biology and Bioinformatics, volume 93 of Advances in Intelligent and Soft Computing, pages 167-174. Springer, Berlin, 2011. [3] José Ignacio Requeno, Roberto Blanco, Gregorio de Miguel Casado, and José Manuel Colom. Sliced model checking for phylogenetic analysis. In Miguel P. Rocha, Nicholas Luscombe, Florentino Fdez-Riverola, and Juan M. Corchado Rodríguez, editors, Proceedings 6th International Conference on Practical Applications of Computational Biology and Bioinformatics, volume 154 of Advances in Intelligent and Soft Computing, pages 95-103. Springer, Berlin, 2012. [4] José Ignacio Requeno and José Manuel Colom. Model checking software for phylogenetic trees using distribution and database methods.
Journal of Integrative Bioinformatics, 10(3):229-233, 2013. [5] José Ignacio Requeno and José Manuel Colom. Speeding up phylogenetic model checking. In Mohd Saberi Mohamad, Loris Nanni, Miguel P. Rocha, and Florentino Fdez-Riverola, editors, Proceedings 7th International Conference on Practical Applications of Computational Biology and Bioinformatics, volume 222 of Advances in Intelligent Systems and Computing, pages 119-126. Springer, Berlin, 2013. [6] José Ignacio Requeno and José Manuel Colom. Timed and probabilistic model checking over phylogenetic trees. In Miguel P. Rocha et al., editors, Proceedings 8th International Conference on Practical Applications of Computational Biology and Bioinformatics, Advances in Intelligent and Soft Computing. Springer, Berlin, 2014. [7] José Ignacio Requeno, Gregorio de Miguel Casado, Roberto Blanco, and José Manuel Colom. Temporal logics for phylogenetic analysis via model checking. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 10(4):1058-1070, 2013.
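To make the idea of checking temporal-logic properties over a phylogenetic tree concrete, here is a minimal sketch that treats each tree node as a state labelled with a sequence and evaluates two CTL-style operators by recursion. It is an illustrative assumption, not the thesis's logic or tooling: the `Node` class, the operator functions, and the toy sequences are all hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Node:
    """One state of the tree: a taxon (or ancestor) and its sequence."""
    label: str
    sequence: str
    children: list = field(default_factory=list)

def exists_finally(node, prop):
    """EF prop: the node or at least one of its descendants satisfies prop."""
    return prop(node) or any(exists_finally(c, prop) for c in node.children)

def always_globally(node, prop):
    """AG prop: the node and every one of its descendants satisfy prop."""
    return prop(node) and all(always_globally(c, prop) for c in node.children)

# Toy tree with made-up sequences (not real data).
root = Node("ancestor", "ACGT", [
    Node("taxon_1", "ACGA"),
    Node("clade", "ACTT", [Node("taxon_2", "ACTT"), Node("taxon_3", "ACTA")]),
])

# "Position 0 is conserved as A throughout the tree."
conserved = always_globally(root, lambda n: n.sequence[0] == "A")
# "Some state in the tree ends in A."
reachable = exists_finally(root, lambda n: n.sequence.endswith("A"))
```

Separating the property (the lambda) from the evaluation procedure mirrors the decoupling the abstract argues for. The same checker can evaluate any property expressed over node labels.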

    Evolving cell models for systems and synthetic biology

    This paper proposes a new methodology for the automated design of cell models for systems and synthetic biology. Our modelling framework is based on P systems, a discrete, stochastic and modular formal modelling language. The automated design of biological models, comprising the optimization of the model structure and its stochastic kinetic constants, is performed using an evolutionary algorithm. The evolutionary algorithm evolves model structures by combining different modules taken from a predefined module library and then fine-tunes the associated stochastic kinetic constants. We investigate four alternative objective functions for the fitness calculation within the evolutionary algorithm: (1) equally weighted sum method, (2) normalization method, (3) randomly weighted sum method, and (4) equally weighted product method. The effectiveness of the methodology is tested on four case studies of increasing complexity, including negative and positive autoregulation as well as two gene networks implementing a pulse generator and a bandwidth detector. We provide a systematic analysis of the evolutionary algorithm’s results as well as of the resulting evolved cell models.
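The abstract names four ways of combining per-target errors into a single fitness value but does not give their formulas. The sketch below shows one common reading of each name, purely as an assumption: the function names, the use of a per-target `scales` list for normalization, and the geometric-mean form of the product are illustrative and may differ from the paper's definitions.

```python
import math
import random

def equally_weighted_sum(errors):
    """Average of the per-target errors (every target has the same weight)."""
    return sum(errors) / len(errors)

def normalized_sum(errors, scales):
    """Average of errors after dividing each by a typical scale for its target."""
    return sum(e / s for e, s in zip(errors, scales)) / len(errors)

def randomly_weighted_sum(errors, rng):
    """Convex combination of the errors with freshly drawn random weights."""
    weights = [rng.random() for _ in errors]
    total = sum(weights)
    return sum(w / total * e for w, e in zip(weights, errors))

def equally_weighted_product(errors):
    """Geometric mean of the errors; one near-zero error dominates the result."""
    return math.prod(errors) ** (1 / len(errors))

# Hypothetical errors of one candidate model against two target time series.
errors = [1.0, 4.0]
rng = random.Random(0)
scores = (
    equally_weighted_sum(errors),
    normalized_sum(errors, scales=[1.0, 4.0]),
    randomly_weighted_sum(errors, rng),
    equally_weighted_product(errors),
)
```

The choice between these aggregations matters when targets have very different magnitudes: a plain sum lets the largest error dominate, while normalization or a product weighs them more evenly.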

    The Best Model of a Cat Is Several Cats

    Modern biotechnology is emerging at the intersection of engineering, biology, physics, and computer science. As such, it carries with it history from several disparate fields of research, including a strong tradition in deductive reasoning primarily derived from discovery-focused molecular biology and physics. Engineering biological systems is a complex undertaking requiring a broader set of epistemic tools and methods than what is usually applied in today's discovery-based research. Inductive reasoning as commonly used in computer science has proven to be a very efficient approach to building knowledge about complex megadimensional datasets, including synthetic biology applications. The authors conclude that the multi-heuristic nature of modern biotechnology makes it an engineering field primed for inductive reasoning to complement the dominating deductive tradition.

    Categorical Ontology of Complex Systems, Meta-Systems and Theory of Levels: The Emergence of Life, Human Consciousness and Society

    Single cell interactomics in simpler organisms, as well as somatic cell interactomics in multicellular organisms, involve biomolecular interactions in complex signalling pathways that were recently represented in modular terms by quantum automata with ‘reversible behavior’ representing normal cell cycling and division. Other implications of such quantum automata, modular modeling of signaling pathways and cell differentiation during development are in the fields of neural plasticity and brain development leading to quantum-weave dynamic patterns and specific molecular processes underlying extensive memory, learning, anticipation mechanisms and the emergence of human consciousness during the early brain development in children. Cell interactomics is here represented for the first time as a mixture of ‘classical’ states that determine molecular dynamics subject to Boltzmann statistics and ‘steady-state’, metabolic (multi-stable) manifolds, together with ‘configuration’ spaces of metastable quantum states emerging from complex quantum dynamics of interacting networks of biomolecules, such as proteins and nucleic acids that are now collectively defined as quantum interactomics. On the other hand, the time dependent evolution over several generations of cancer cells --that are generally known to undergo frequent and extensive genetic mutations and, indeed, suffer genomic transformations at the chromosome level (such as extensive chromosomal aberrations found in many colon cancers)-- cannot be correctly represented in the ‘standard’ terms of quantum automaton modules, as the normal somatic cells can. This significant difference at the cancer cell genomic level is therefore reflected in major changes in cancer cell interactomics often from one cancer cell ‘cycle’ to the next, and thus it requires substantial changes in the modeling strategies, mathematical tools and experimental designs aimed at understanding cancer mechanisms. 
Novel solutions to this important problem in carcinogenesis are proposed and experimental validation procedures are suggested. From a medical research and clinical standpoint, this approach has important consequences for addressing and preventing the development of cancer resistance to medical therapy in ongoing clinical trials involving stage III cancer patients, as well as improving the designs of future clinical trials for cancer treatments.
KEYWORDS: Emergence of Life and Human Consciousness; Proteomics; Artificial Intelligence; Complex Systems Dynamics; Quantum Automata models and Quantum Interactomics; quantum-weave dynamic patterns underlying human consciousness; specific molecular processes underlying extensive memory, learning, anticipation mechanisms and human consciousness; emergence of human consciousness during the early brain development in children; Cancer cell ‘cycling’; interacting networks of proteins and nucleic acids; genetic mutations and chromosomal aberrations in cancers, such as colon cancer; development of cancer resistance to therapy; ongoing clinical trials involving stage III cancer patients; possible improvements of the designs for future clinical trials and cancer treatments.