1,312 research outputs found

    Discovery of Functional Genes for Systemic Acquired Resistance in Arabidopsis thaliana through Integrated Data Mining

    Various data mining techniques combined with sequence motif information in the promoter region of genes were applied to discover functional genes that are involved in the defense mechanism of systemic acquired resistance (SAR) in Arabidopsis thaliana. A series of K-means clusterings with difference-in-shape as the distance measure was initially applied. A stability measure was used to validate this clustering process. A decision tree algorithm with the discover-and-mask technique was used to identify a group of the most informative genes. The appearance and abundance of various transcription factor binding sites in the promoter region of the genes were studied. Through the combination of these techniques, we were able to identify 24 candidate genes involved in the SAR defense mechanism. The candidate genes fell into two highly resolved categories, each showing significantly unique profiles of regulatory elements in their promoter regions. This study demonstrates the strength of such integration methods and suggests a broader application of this approach.
    Peer reviewed: Yes. NRC publication: Yes.

    MorphDB : prioritizing genes for specialized metabolism pathways and gene ontology categories in plants

    Recent times have seen an enormous growth of "omics" data, of which high-throughput gene expression data are arguably the most important from a functional perspective. Despite huge improvements in computational techniques for the functional classification of gene sequences, common similarity-based methods often fall short of providing full and reliable functional information. Recently, the combination of comparative genomics with approaches in functional genomics has received considerable interest for gene function analysis, leveraging both gene-expression-based guilt-by-association methods and annotation efforts in closely related model organisms. Besides the identification of missing genes in pathways, these methods also typically enable the discovery of biological regulators (i.e., transcription factors or signaling genes). A previously built guilt-by-association method is MORPH, which was proven to be an efficient algorithm that performs particularly well in identifying and prioritizing missing genes in plant metabolic pathways. Here, we present MorphDB, a resource where MORPH-based candidate genes for large-scale functional annotations (Gene Ontology, MapMan bins) are integrated across multiple plant species. Besides a gene-centric query utility, we present a comparative network approach that enables researchers to efficiently browse MORPH predictions across functional gene sets and species, facilitating gene discovery and candidate gene prioritization. MorphDB is available at http://bioinformatics.psb.ugent.be/webtools/morphdb/morphDB/index/. We also provide a toolkit, named "MORPH bulk" (https://github.com/arzwa/morph-bulk), for running MORPH in bulk mode on novel data sets, enabling researchers to apply MORPH to their own species of interest.
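
    The guilt-by-association idea mentioned in this abstract can be illustrated with a minimal sketch. This is not the MORPH algorithm, whose details the abstract does not give; it only ranks candidate genes by their mean Pearson correlation with a set of known pathway genes. The function names, gene names, and expression values are all hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length profiles.

    Assumes neither profile is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def rank_candidates(expression, pathway_genes):
    """Rank non-pathway genes by mean correlation with the pathway genes."""
    scores = {}
    for gene, profile in expression.items():
        if gene in pathway_genes:
            continue  # known members are not candidates
        correlations = [pearson(profile, expression[p]) for p in pathway_genes]
        scores[gene] = sum(correlations) / len(correlations)
    # Highest mean correlation first
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical toy expression profiles across four conditions
toy = {
    "known1": [1.0, 2.0, 3.0, 4.0],
    "known2": [2.0, 4.0, 6.0, 9.0],
    "cand_up": [0.5, 1.0, 1.5, 2.5],
    "cand_down": [4.0, 3.0, 2.0, 1.0],
}
ranking = rank_candidates(toy, ["known1", "known2"])
```

    In this toy data the candidate that rises with the known genes is ranked ahead of the one that falls.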

    Mining biological information from 3D short time-series gene expression data: the OPTricluster algorithm

    Background: Nowadays, it is possible to collect expression levels of a set of genes from a set of biological samples during a series of time points. Such data have three dimensions: gene-sample-time (GST). Thus they are called 3D microarray gene expression data. To take advantage of the 3D data collected, and to fully understand the biological knowledge hidden in the GST data, novel subspace clustering algorithms have to be developed to effectively address the biological problem in the corresponding space. Results: We developed a subspace clustering algorithm called Order Preserving Triclustering (OPTricluster), for 3D short time-series data mining. OPTricluster is able to identify 3D clusters with coherent evolution from a given 3D dataset using a combinatorial approach on the sample dimension, and the order preserving (OP) concept on the time dimension. The fusion of the two methodologies allows one to study similarities and differences between samples in terms of their temporal expression profile. OPTricluster has been successfully applied to four case studies: immune response in mice infected by malaria (Plasmodium chabaudi), systemic acquired resistance in Arabidopsis thaliana, similarities and differences between inner and outer cotyledon in Brassica napus during seed development, and to Brassica napus whole seed development. These studies showed that OPTricluster is robust to noise and is able to detect the similarities and differences between biological samples. Conclusions: Our analysis showed that OPTricluster generally outperforms other well known clustering algorithms such as the TRICLUSTER, gTRICLUSTER and K-means; it is robust to noise and can effectively mine the biological knowledge hidden in the 3D short time-series gene expression data.
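
    The order-preserving concept on the time dimension can be illustrated with a small sketch. It is not the OPTricluster implementation; it only groups genes, within a single sample, whose expression values rank the time points in the same order. The function names and toy values are hypothetical.

```python
from collections import defaultdict

def order_pattern(profile):
    """Return the time-point indices sorted by increasing expression.

    Ties are broken by time index, which is a simplification.
    """
    return tuple(sorted(range(len(profile)), key=lambda t: profile[t]))

def group_by_order(expression):
    """Group gene names whose profiles induce the same ordering of time points."""
    groups = defaultdict(list)
    for gene, profile in expression.items():
        groups[order_pattern(profile)].append(gene)
    return dict(groups)

# Hypothetical expression values at three time points in one sample
toy = {
    "geneA": [0.1, 0.5, 0.9],
    "geneB": [1.0, 2.0, 4.0],
    "geneC": [0.9, 0.5, 0.1],
}
groups = group_by_order(toy)
```

    Here geneA and geneB share the increasing pattern even though their magnitudes differ, while geneC falls into a separate group. Comparing such groups across samples is one way to picture the combinatorial step on the sample dimension.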

    The Effect of Metal Composition and Particle Size on Nanostructure-Toxicity in Plants

    Silver nanoparticles (AgNPs) have consistently been shown to have a detrimental effect on bacteria, fungi, and plants. The interaction of AgNPs with plants has received considerable scientific attention, because it is potentially through plants that these structures can enter the food chain and bioaccumulate in humans and animals. To determine the effects of AgNPs on plants, Arabidopsis thaliana seedlings were chronically exposed to sublethal levels of AgNPs using a standardized method. To gain insight into the mechanism of phytotoxicity, the seedlings were exposed to low concentrations of Ag+ (in the form of silver nitrate), AgNPs, or gold nanoparticles (AuNPs). To test whether NP size influenced the plant's response, AgNPs and AuNPs were tested at both 20 nm and 80 nm sizes. Exposure to AgNO3 altered the expression of several genes, but exposure to AuNPs did not cause any measurable changes in the Arabidopsis transcriptome. Exposure of plants to 20 nm and 80 nm AgNPs, on the other hand, caused the differential expression of 226 and 212 genes, respectively, indicative of cell wall reorganization and response to oxidative and biotic stress. The size of the AgNPs had little influence on gene expression patterns. Root length measurements were taken to quantify the phytotoxicity of the various NPs. While AgNO3 increased root elongation, the NPs, irrespective of metal composition and size, did not cause significant differences in root length. Taken together, my data suggest that the chemical nature of the metal core is the major determinant of AgNP phytotoxicity in chronically exposed plants.

    A predicted protein interactome identifies conserved global networks and disease resistance subnetworks in maize

    Interactomes are genome-wide roadmaps of protein-protein interactions. They have been produced for humans, yeast, the fruit fly, and Arabidopsis thaliana and have become invaluable tools for generating and testing hypotheses. A predicted interactome for Zea mays (PiZeaM) is presented here as an aid to the research community for this valuable crop species. PiZeaM was built using a proven method of interologs (interacting orthologs) that were identified using both one-to-one and many-to-many orthology between genomes of maize and reference species. Where both maize orthologs occurred for an experimentally determined interaction in the reference species, we predicted a likely interaction in maize. A total of 49,026 unique interactions for 6,004 maize proteins were predicted. These interactions are enriched for processes that are evolutionarily conserved, but include many otherwise poorly annotated proteins in maize. The predicted maize interactions were further analyzed by comparing annotation of interacting proteins, including different layers of ontology. A map of pairwise gene co-expression was also generated and compared to predicted interactions. Two global subnetworks were constructed for highly conserved interactions. These subnetworks showed clear clustering of proteins by function. Another subnetwork was created for disease response using a bait-and-prey strategy to capture interacting partners for proteins that respond to other organisms. Closer examination of this subnetwork revealed the connectivity between biotic and abiotic hormone stress pathways. We believe PiZeaM will provide a useful tool for the prediction of protein function and analysis of pathways for Z. mays researchers and is presented in this paper as a reference tool for the exploration of protein interactions in maize.
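
    The interolog rule described in this abstract (predict an interaction in maize when both partners of a reference interaction have maize orthologs) can be sketched as follows. This is an illustration under simple assumptions, not the PiZeaM pipeline; the function name, the protein identifiers, and the orthology table are hypothetical.

```python
from itertools import product

def predict_interologs(reference_interactions, orthologs):
    """Transfer reference-species interactions to a target species.

    reference_interactions: iterable of (protein_a, protein_b) pairs.
    orthologs: dict mapping a reference protein to the set of its target
               orthologs (covers one-to-one and many-to-many orthology).
    """
    predicted = set()
    for a, b in reference_interactions:
        # product() yields nothing if either partner lacks an ortholog,
        # so a pair is predicted only when BOTH partners are mapped.
        for x, y in product(orthologs.get(a, ()), orthologs.get(b, ())):
            predicted.add(frozenset((x, y)))  # unordered pair
    return predicted

# Hypothetical reference interactions and orthology assignments
ref = [("AT1", "AT2"), ("AT2", "AT3")]
orth = {"AT1": {"Zm1"}, "AT2": {"Zm2a", "Zm2b"}}
pairs = predict_interologs(ref, orth)
```

    With many-to-many orthology, one reference interaction can yield several predicted pairs, as with the two orthologs of AT2 here; the AT2-AT3 interaction yields none because AT3 has no ortholog in the table.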

    Transcriptional Regulation: a Genomic Overview

    The availability of the Arabidopsis thaliana genome sequence allows a comprehensive analysis of transcriptional regulation in plants using novel genomic approaches and methodologies. Such a genomic view of transcription first necessitates the compilation of lists of elements. Transcription factors are the most numerous of the different types of proteins involved in transcription in eukaryotes, and the Arabidopsis genome codes for more than 1,500 of them, or approximately 6% of its total number of genes. A genome-wide comparison of transcription factors across the three eukaryotic kingdoms reveals the evolutionary generation of diversity in the components of the regulatory machinery of transcription. However, as illustrated by Arabidopsis, transcription in plants follows basic principles and logic similar to those in animals and fungi. A global view and understanding of transcription at a cellular and organismal level requires the characterization of the Arabidopsis transcriptome and promoterome, as well as of the interactome, the localizome, and the phenome of the proteins involved in transcription.