
    Discovery of Functional Genes for Systemic Acquired Resistance in Arabidopsis Thaliana through Integrated Data Mining

    Various data mining techniques, combined with sequence motif information from gene promoter regions, were applied to discover functional genes involved in the systemic acquired resistance (SAR) defense mechanism of Arabidopsis thaliana. A series of K-means clustering runs, with difference-in-shape as the distance measure, was applied first, and a stability measure was used to validate the clustering. A decision tree algorithm with the discover-and-mask technique was used to identify a group of the most informative genes. The occurrence and abundance of various transcription factor binding sites in the promoter regions of the genes were studied. Combining these techniques identified 24 candidate genes involved in the SAR defense mechanism. The candidate genes fell into 2 highly resolved categories, each showing a significantly distinct profile of regulatory elements in its promoter regions. This study demonstrates the strength of such integration methods and suggests a broader application of this approach.

    Peer reviewed: Yes
    NRC publication: Yes
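
The clustering and stability-validation steps named in the abstract can be illustrated with a minimal sketch. The abstract defines neither the difference-in-shape distance nor the stability measure, so both are assumptions here: shape difference is taken as Euclidean distance between mean-centered expression profiles, and stability as the average Rand index between K-means runs started from different seeds. The function names and the toy profiles are hypothetical, not taken from the study.

```python
import random
from itertools import combinations

def center(profile):
    """Remove the mean level so only the shape of the profile remains (assumed definition)."""
    mean = sum(profile) / len(profile)
    return [v - mean for v in profile]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(profiles, k, seed, n_iter=50):
    """Plain Lloyd's K-means on mean-centered profiles; returns one label per profile."""
    rng = random.Random(seed)
    data = [center(p) for p in profiles]
    centroids = rng.sample(data, k)
    labels = [0] * len(data)
    for _ in range(n_iter):
        # Assign each profile to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in data]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(data, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def rand_index(a, b):
    """Fraction of item pairs on which two partitions agree (same vs. different cluster)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)

def stability(profiles, k, seeds=range(5)):
    """Average pairwise Rand index across runs with different seeds (assumed stability measure)."""
    runs = [kmeans(profiles, k, s) for s in seeds]
    scores = [rand_index(x, y) for x, y in combinations(runs, 2)]
    return sum(scores) / len(scores)

# Hypothetical toy data: three rising and three falling profiles at different levels.
profiles = [[1, 2, 3], [11, 12, 13], [5, 6, 7],
            [3, 2, 1], [13, 12, 11], [7, 6, 5]]
labels = kmeans(profiles, k=2, seed=0)
score = stability(profiles, k=2)
```

Because each profile is centered before clustering, genes with the same expression trend end up together regardless of their absolute expression level, which is the property a shape-based distance is meant to capture.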