26 research outputs found

    Fusion of Domain Knowledge for Dynamic Learning in Transcriptional Networks

    A critical challenge of the postgenomic era is to understand how genes are differentially regulated even when they belong to a given network. Because the fundamental mechanism controlling gene expression operates at the level of transcription initiation, computational techniques have been developed that identify cis-regulatory features and map such features into differential expression patterns. The fact that such co-regulated genes may be differentially regulated suggests that subtle differences in the shared cis-acting regulatory elements are likely significant. Thus, we carry out an exhaustive description of cis-acting regulatory features including the orientation, location and number of binding sites for a regulatory protein, the presence of binding site submotifs, the class and number of RNA polymerase sites, as well as gene expression data, which is treated as one feature among many. These features, derived from different domain sources, are analyzed concurrently, and dynamic relations are recognized to generate profiles, which are groups of promoters sharing common features. We apply this method to probe the regulatory networks governed by the PhoP/PhoQ two-component system in the enteric bacteria Escherichia coli and Salmonella enterica. Our analysis uncovered novel members of the PhoP regulon, and the resulting profiles group genes that share underlying biological properties that characterize the system kinetics. The predictions were experimentally validated to establish that the PhoP protein uses multiple mechanisms to control gene transcription and is a central element in a highly connected network.
    Funding: Ministerio de Ciencia y Tecnología BIO2004-0270-

    Classification of Gene Expression Profiles: Comparison of K-means and Expectation Maximization Algorithms

    Biomedical research has been revolutionized by high-throughput techniques and the enormous amount of data they are able to generate. In particular, microarray technology has the capacity to monitor changes in RNA abundance for thousands of genes simultaneously. Interest in microarray analysis methods has risen rapidly. Clustering is widely used in the analysis of microarray data to group genes of interest targeted from microarray experiments on the basis of similarity of expression patterns. In this work we apply two clustering algorithms, K-means and Expectation Maximization, to a particular problem, and we compare the groupings obtained on the basis of the cohesiveness of the gene products associated with the genes in each cluster.
    Funding: Ministerio de Ciencia y Tecnología TIN-2006-12879; Junta de Andalucía TIC-0278
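
The contrast between the two algorithms named in this abstract can be sketched on toy data. The following is only a minimal illustration, not the authors' implementation: the expression values, the initial centers, and the function names `kmeans_1d` and `em_gmm_1d` are all assumptions made for the example. K-means assigns each value to exactly one cluster, whereas Expectation Maximization (here for a one-dimensional Gaussian mixture) computes soft responsibilities and only takes the most likely cluster at the end.

```python
import math

def kmeans_1d(values, centers, iters=50):
    """K-means with hard assignment: each value joins its nearest center."""
    centers = list(centers)
    labels = [0] * len(values)
    for _ in range(iters):
        labels = [min(range(len(centers)), key=lambda k: abs(v - centers[k]))
                  for v in values]
        for k in range(len(centers)):
            members = [v for v, lab in zip(values, labels) if lab == k]
            if members:  # keep the previous center if a cluster becomes empty
                centers[k] = sum(members) / len(members)
    return labels, centers

def em_gmm_1d(values, means, iters=50, min_var=1e-3):
    """EM for a 1-D Gaussian mixture with soft assignment (responsibilities)."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        # E-step: posterior probability of each component for each value
        resp = []
        for v in values:
            dens = [weights[j]
                    * math.exp(-(v - means[j]) ** 2 / (2 * variances[j]))
                    / math.sqrt(2 * math.pi * variances[j])
                    for j in range(k)]
            total = sum(dens) or 1e-300  # guard against numeric underflow
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means and variances
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue
            means[j] = sum(r[j] * v for r, v in zip(resp, values)) / nj
            var = sum(r[j] * (v - means[j]) ** 2
                      for r, v in zip(resp, values)) / nj
            variances[j] = max(var, min_var)  # floor avoids a collapsed component
            weights[j] = nj / len(values)
    labels = [max(range(k), key=lambda j: r[j]) for r in resp]
    return labels, means

# Hypothetical log-expression values for eight genes (illustrative only).
values = [0.1, 0.2, 0.15, 0.3, 2.9, 3.1, 3.0, 3.2]
km_labels, km_centers = kmeans_1d(values, centers=[0.0, 3.0])
em_labels, em_means = em_gmm_1d(values, means=[0.0, 3.0])
print("K-means labels:", km_labels)
print("EM labels:     ", em_labels)
```

On well-separated data like this the two groupings agree; differences tend to appear when clusters overlap or have unequal spread, which is where a comparison of cluster cohesiveness becomes informative.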

    Optimization of multi-classifiers for computational biology: application to gene finding and expression

    Genomes of many organisms have been sequenced over the last few years. However, transforming such raw sequence data into knowledge remains a hard task. A great number of prediction programs have been developed to address part of this problem: the location of genes along a genome and their expression. We propose a multi-objective methodology that combines state-of-the-art algorithms into an aggregation scheme in order to obtain optimal aggregations of methods. The results show a major improvement in sensitivity when our methodology is compared with the performance of individual methods on gene finding and gene expression problems. The methodology proposed here is an automatic method generator and a step toward exploiting all existing methods, providing alternative optimal aggregations of methods to answer concrete queries for a given biological problem with maximized prediction accuracy. As more approaches are integrated for each of the presented problems, de novo accuracy can be expected to improve further.
    Funding: Ministerio de Ciencia y Tecnología TIN2006-12879; Junta de Andalucía TIC-0278

    Use of the LEGO and Arduino Platforms in Teaching Programming (original title: Uso de las plataformas LEGO y Arduino en la enseñanza de la programación)

    It is increasingly common for engineering and science degrees to include programming in their curricula. These courses are a real challenge for the lecturers in charge, since many students struggle considerably in their first encounter with programming. Innovative teaching approaches now exist that can help with this task, and physical computing is one of the most promising. It brings programming concepts into the real world so that students can interact with them. Using this paradigm, we have developed a set of teaching resources for programming courses in science and engineering. We prepared a set of demonstrations for use in lectures and several modules for students to use in the laboratory. The lecture and laboratory activities rely on the Arduino platform (an open-hardware microcontroller board) and the LEGO platform (an educational robotics platform). The material was evaluated in a programming course within the Biology degree and with volunteer first-year Mathematics students. The results were positive: the number of students who learn to program successfully and enjoy programming increased. These results indicate that using this teaching resource as a complement to traditional teaching improves student learning and eases the lecturer's work.
    Summary: Engineers and scientists increasingly rely on computers for their work. As a consequence, most science and engineering degrees have introduced a computer programming course in their curricula. However, lecturers face a complex task when teaching this subject: students consider the subject to be unrelated to their core interests and often feel uncomfortable when learning to program for the first time. Several studies have proposed the use of the physical computing paradigm. This paradigm takes the computational concepts “out of the screen” and into the real world so that the student can interact with them. Using this paradigm we have designed and implemented several learning modules for an introductory programming course in science and engineering. These modules are to be used in lectures and laboratory sessions. We selected the Arduino board (an electronic board) and LEGO (a robotic platform) as the hardware platforms. The effectiveness of the modules was assessed by comparing two programming courses: in one the teacher used traditional methods; in the other he complemented these with the modules. We evaluated the modules in a programming course for Biology students and found that they were highly effective: more students learned to program and more students enjoyed programming. These results suggest that the physical computing paradigm involves the student more effectively in the learning process.

    Cis-cop: Multiobjective identification of cis-regulatory modules based on constrains

    Gene expression regulation is an intricate, dynamic phenomenon essential for all biological functions. The necessary instructions for gene expression are encoded in cis-regulatory elements that work together and interact with the RNA polymerase to confer specific spatial and temporal patterns of transcription. Therefore, the identification of these elements is currently an active area of research in the computational analysis of regulatory sequences. However, the problem is difficult since the combinatorial interactions between the regulating factors can be very complex. Here we present a web server, Cis-cop, that identifies cis-regulatory modules given a set of transcription factor binding sites and, additionally, RNA polymerase sites for a group of genes.

    Optimal Selection of Microarray Analysis Methods Using a Conceptual Clustering Algorithm

    The rapid development of methods that select over- or under-expressed genes from microarray experiments has not yet met the need for tools that identify informational profiles differentiating between experimental conditions such as time, treatment and phenotype. Uncertainty arises when methods devoted to identifying significantly expressed genes are evaluated: do all microarray analysis methods yield similar results from the same input dataset? Do different microarray datasets require distinct analysis methods? We performed a detailed evaluation of several microarray analysis methods, finding that none of these methods alone identifies all observable differential profiles, nor subsumes the results obtained by the other methods. Consequently, we propose a procedure that, given certain user-defined preferences, generates an optimal suite of statistical methods. These solutions are optimal in the sense that they constitute partially ordered subsets of all possible method associations, bounded by both the most specific and the most sensitive available solutions.
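
One common way to read "optimal" solutions bounded by the most specific and the most sensitive ones is as a Pareto (non-dominated) set over sensitivity and specificity. The sketch below illustrates that reading only; it is an assumption, not the procedure described in the paper, and the method names and scores are invented for the example.

```python
def dominates(a, b):
    """True if score a is no worse than b on every objective and better on one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(candidates):
    """Names of candidates whose (sensitivity, specificity) is not dominated."""
    return sorted(
        name for name, score in candidates.items()
        if not any(dominates(other, score)
                   for other_name, other in candidates.items()
                   if other_name != name)
    )

# Hypothetical method associations with (sensitivity, specificity) scores.
candidates = {
    "A": (0.90, 0.60),
    "B": (0.70, 0.85),
    "C": (0.65, 0.80),    # dominated by B on both objectives
    "A+B": (0.95, 0.55),
    "B+C": (0.60, 0.95),
}

front = pareto_front(candidates)
most_sensitive = max(front, key=lambda n: candidates[n][0])
most_specific = max(front, key=lambda n: candidates[n][1])
print(front, most_sensitive, most_specific)
```

In this toy set, the front runs from the most sensitive association to the most specific one, and a user preference (for example, a minimum specificity) would pick a point between those two bounds.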

    Mining Structural Databases: An Evolutionary Multi-Objetive Conceptual Clustering Methodology

    The increased availability of biological databases containing representations of complex objects permits access to vast amounts of data. In spite of the recent renewed interest in knowledge-discovery techniques (or data mining), there is a dearth of data analysis methods intended to facilitate understanding of the represented objects and related systems by their most representative features and the relationships derived from these features (i.e., structural data). In this paper we propose a conceptual clustering methodology termed EMO-CC, for Evolutionary Multi-Objective Conceptual Clustering, that uses multi-objective and multi-modal optimization techniques based on Evolutionary Algorithms to uncover representative substructures from structural databases. In addition, EMO-CC provides annotations of the uncovered substructures and, based on them, applies an unsupervised classification approach to retrieve new members of previously discovered substructures. We apply EMO-CC to the Gene Ontology database to recover interesting substructures that describe problems from different points of view, and use them to explain immuno-inflammatory responses measured in terms of gene expression profiles derived from the analysis of longitudinal blood expression profiles of human volunteers treated with intravenous endotoxin compared to placebo.

    A Multiobjective Evolutionary Conceptual Clustering Methodology for Gene Annotation Within Structural Databases: A Case of Study on the Gene Ontology Database

    Current tools and techniques devoted to examining the content of large databases are often hampered by their inability to support searches based on criteria that are meaningful to their users. These shortcomings are particularly evident in data banks storing representations of structural data such as biological networks. Conceptual clustering techniques have proved appropriate for uncovering relationships between features that characterize objects in structural data. However, typical conceptual clustering approaches normally recover the most obvious relations but fail to discover the less frequent but more informative underlying data associations. The combination of evolutionary algorithms with multiobjective and multimodal optimization techniques constitutes a suitable tool for solving this problem. We propose a novel conceptual clustering methodology termed evolutionary multiobjective conceptual clustering (EMO-CC), relying on the NSGA-II multiobjective (MO) genetic algorithm. We apply this methodology to identify conceptual models in structural databases generated from gene ontologies. These models can explain and predict phenotypes in the immunoinflammatory response problem, similar to those provided by gene expression or other genetic markers. The analysis of these results reveals that our approach uncovers cohesive clusters, even those comprising a small number of observations explained by several features, which allows describing objects and their interactions from different perspectives and at different levels of detail.
    Funding: Ministerio de Ciencia y Tecnología TIC-2003-00877; Ministerio de Ciencia y Tecnología BIO2004-0270E; Ministerio de Ciencia y Tecnología TIN2006-1287

    Use of the LEGO and Arduino Platforms in Teaching Programming (original title: Uso de las plataformas LEGO y Arduino en la enseñanza de la programación)

    It is increasingly common for engineering and science degrees to include programming in their curricula. These courses are a real challenge for the lecturers in charge, since many students struggle considerably in their first encounter with programming. Innovative teaching approaches now exist that can help with this task, and physical computing is one of the most promising. It brings programming concepts into the real world so that students can interact with them. Using this paradigm, we have developed a set of teaching resources for programming courses in science and engineering. We prepared a set of demonstrations for use in lectures and several modules for students to use in the laboratory. The lecture and laboratory activities rely on the Arduino platform (an open-hardware microcontroller board) and the LEGO platform (an educational robotics platform). The material was evaluated in a programming course within the Biology degree and with volunteer first-year Mathematics students. The results were positive: the number of students who learn to program successfully and enjoy programming increased. These results indicate that using this teaching resource as a complement to traditional teaching improves student learning and eases the lecturer's work.
    Engineers and scientists increasingly rely on computers for their work. As a consequence, most science and engineering degrees have introduced a computer programming course in their curricula. However, lecturers face a complex task when teaching this subject: students consider the subject to be unrelated to their core interests and often feel uncomfortable when learning to program for the first time. Several studies have proposed the use of the physical computing paradigm. This paradigm takes the computational concepts “out of the screen” and into the real world so that the student can interact with them. Using this paradigm we have designed and implemented several learning modules for an introductory programming course in science and engineering. These modules are to be used in lectures and laboratory sessions. We selected the Arduino board (an electronic board) and LEGO (a robotic platform) as the hardware platforms. The effectiveness of the modules was assessed by comparing two programming courses: in one the teacher used traditional methods; in the other he complemented these with the modules. We evaluated the modules in a programming course for Biology students and found that they were highly effective: more students learned to program and more students enjoyed programming. These results suggest that the physical computing paradigm involves the student more effectively in the learning process.
    This work was supported by the Universidad de Granada through project PID/13-54.

    Identifying promoter features of co-regulated genes with similar network motifs

    Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2008, Philadelphia, PA, USA. 3–5 November 2008.
    Background: A large amount of computational and experimental work has been devoted to uncovering network motifs in gene regulatory networks. The leading hypothesis is that evolutionary processes independently selected recurrent architectural relationships among regulators and target genes (motifs) to produce characteristic expression patterns of their members. However, even with the same architecture, the genes may still be differentially expressed. Therefore, to define fully the expression of a group of genes, the strength of the connections in a network motif must be specified, and the cis-promoter features that participate in the regulation must be determined.
    Results: We have developed a model-based approach to analyze proteobacterial genomes for promoter features that is specifically designed to account for the variability in sequence, location and topology intrinsic to differential gene expression. We provide methods for annotating regulatory regions by detecting their underlying cis-features. This includes identifying binding sites for a transcriptional regulator, distinguishing between activation and repression sites, between direct and reverse orientation, and among sequences that weakly reflect a particular pattern; binding sites for the RNA polymerase, characterizing their different classes and their locations relative to the transcription factor binding sites; the presence of riboswitches in the 5'UTR; and binding sites for other transcription factors. We applied our approach to characterize network motifs controlled by the PhoP/PhoQ regulatory system of Escherichia coli and Salmonella enterica serovar Typhimurium. We identified key features that enable the PhoP protein to control its target genes, and found that distinct features may produce different expression patterns even within the same network motif.
    Conclusion: Global transcriptional regulators control multiple promoters by a variety of network motifs. This is clearly the case for the regulatory protein PhoP. In this work, we studied this regulatory protein and demonstrated that understanding gene expression requires identifying not only a set of connexions or network motif, but also the cis-acting elements participating in each of these connexions.
    This research was supported in part by the Spanish Ministry of Science and Technology under project TIN2006-12879 and by Consejería de Innovación, Investigación y Ciencia de la Junta de Andalucía under project TIC02788.