14 research outputs found

    Finding biological process modifications in cancer tissues by mining gene expression correlations

    BACKGROUND: DNA microarrays make it possible to obtain quantitative measurements of the expression of thousands of genes from a biological sample. This technology yields a global view of gene expression that can be used in several ways. Functional insight into expression profiles is routinely obtained by using the Gene Ontology (GO) terms associated with the cellular genes. In this paper, we deal with functional data mining from expression profiles, proposing a novel approach that studies the correlations between genes and their relations to GO. With this "functional correlations comparison" we explore all possible pairs of genes and identify the affected biological processes by analyzing gene expression patterns in a pair-wise manner and linking correlated pairs with GO terms. RESULTS: We apply this "functional correlations comparison" approach to identify the existing correlations in hepatocarcinoma (161 microarray experiments) and to reveal functional differences between normal liver and cancer tissues. The number of well-correlated pairs in each GO term highlights several differences in genetic interactions between cancer and normal tissues. We performed a bootstrap analysis to compute false detection rates (FDR) and confidence limits. CONCLUSION: Experimental results show the main advantage of the applied method: it picks up both general and specific GO terms, with particularly fine resolution in the specific ones. The results obtained by this novel method are highly coherent with those reported by other cancer biology studies. In addition, they highlight the most specific and interesting GO terms, helping biologists to focus their studies on the most relevant biological processes.
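
The pair-wise counting described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Pearson coefficient, the 0.9 threshold, the rule that a pair is credited to every term shared by both genes, and all gene names, term labels and expression values are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def correlated_pairs_per_term(expression, go_terms, threshold=0.9):
    """Count gene pairs with |r| >= threshold under each term shared by both genes."""
    counts = {}
    for g1, g2 in combinations(sorted(expression), 2):
        shared = go_terms.get(g1, set()) & go_terms.get(g2, set())
        if not shared:
            continue  # assumption: only pairs annotated with a common term count
        if abs(pearson(expression[g1], expression[g2])) >= threshold:
            for term in shared:
                counts[term] = counts.get(term, 0) + 1
    return counts


# Hypothetical annotations and expression profiles (four experiments per gene).
go = {
    "geneA": {"TERM_1"},
    "geneB": {"TERM_1"},
    "geneC": {"TERM_1", "TERM_2"},
    "geneD": {"TERM_2"},
}
normal = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],
    "geneC": [4, 3, 2, 1],
    "geneD": [1, 3, 2, 4],
}
tumor = dict(normal, geneB=[3, 1, 4, 2])  # geneB loses its correlations

normal_counts = correlated_pairs_per_term(normal, go)
tumor_counts = correlated_pairs_per_term(tumor, go)
print(normal_counts, tumor_counts)
```

Comparing the per-term counts between the two conditions is the step that would flag a term as affected; the bootstrap analysis mentioned in the abstract is not reproduced here.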

    I2AM: a Semi-Automatic System for Data Interpretation in Petroleum Geology

    Artificial intelligence, data mining techniques and statistical methods are widely used in reservoir modelling, for instance in the prediction of sedimentary facies. The aim of this work was to define and implement a suite of tools for the interpretation of image logs and large datasets of subsurface data coming from geological exploration.

    Automatic Cluster Selection Using Index Driven Search Strategy

    Clustering is the task of categorizing objects into different classes in an unsupervised way. Hierarchical clustering algorithms are usually very effective in detecting the underlying structure of a dataset. However, they do not create clusters; they only compute a hierarchical representation of the dataset. It is therefore desirable to turn them into an automatic pre-processing step for algorithms that operate on the selected clusters. To this end, we present an algorithm that finds the best clustering partition according to clustering validity indexes. In particular, our automatic approach performs a validity index-driven search through a clustering tree. The best partition is then selected by cutting the tree in a non-horizontal way. The algorithm was implemented in a software tool and tested on different datasets. The overall system thus makes hierarchical clustering an automatic step, in which no user interaction is needed to select clusters from a hierarchical cluster representation.
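
A minimal sketch of what an index-driven, non-horizontal cut could look like follows. The abstract does not name the validity indexes or the search strategy, so the mean silhouette width, the greedy top-down search, the hand-written tree and the 1-D points are all assumptions, not the published algorithm.

```python
def leaves(node):
    """Return the data points under a tree node (a leaf is a bare float)."""
    if isinstance(node, tuple):
        return leaves(node[0]) + leaves(node[1])
    return [node]


def silhouette(clusters):
    """Mean silhouette width of a partition of 1-D points (0.0 for one cluster)."""
    if len(clusters) < 2:
        return 0.0
    widths = []
    for i, own in enumerate(clusters):
        others = [c for j, c in enumerate(clusters) if j != i]
        for p in own:
            if len(own) == 1:
                widths.append(0.0)  # usual convention for singleton clusters
                continue
            a = sum(abs(p - q) for q in own) / (len(own) - 1)
            b = min(sum(abs(p - q) for q in c) / len(c) for c in others)
            widths.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(widths) / len(widths)


def index_driven_cut(root):
    """Greedily split tree nodes while the validity index keeps improving."""
    cut = [root]
    best = silhouette([leaves(n) for n in cut])
    improved = True
    while improved:
        improved = False
        for i, node in enumerate(cut):
            if not isinstance(node, tuple):
                continue  # a leaf cannot be split further
            candidate = cut[:i] + [node[0], node[1]] + cut[i + 1:]
            score = silhouette([leaves(n) for n in candidate])
            if score > best:
                cut, best, improved = candidate, score, True
                break  # restart the scan from the new cut
    return [leaves(n) for n in cut], best


# Hypothetical dendrogram as nested tuples: the left subtree is one tight
# group, while the right subtree holds two well-separated groups.
tree = (((0.0, 0.1), 0.2), ((5.0, 5.1), (9.0, 9.1)))
clusters, score = index_driven_cut(tree)
print(clusters, round(score, 3))
```

Because each node is evaluated on its own, the right subtree is split one level deeper than the left one, which no single horizontal cut of this tree can produce.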

    di4g: Uno strumento di clustering per l’analisi integrata di dati geologici (di4g: a clustering tool for the integrated analysis of geological data)

    di4g (data integrator for geology) is a tool developed for the analysis of geological data, in particular for petroleum geology. This discipline seeks and evaluates the key elements in the formation of a hydrocarbon reservoir in a sedimentary basin. Typically, probes lowered into exploration wells yield large amounts of heterogeneous data, consisting of electrical logs, represented as curves, and image logs (known as FMI) that depict the structure of the borehole walls. From these images one can derive information about rock texture, porosity type and the presence of fractures (which appear as sinusoids). The expert geologist analyses these logs visually to identify the various features present in the images. This analysis, however, is complex, subjective in its interpretation and time-consuming. For this reason I2AM (Intelligent Image Analysis and Mapping) was developed: software for the semi-automatic interpretation of images from oil wells (1) and for identifying the visual features they contain (6, 8). Because the available data are heterogeneous and not all of them are handled by I2AM, di4g was developed to allow different datasets to be aligned and merged. di4g merges the feature table produced by I2AM with the available electrical logs and also supports the integrated analysis of data from different wells. To provide a first classification of the input data, di4g applies a clustering algorithm that identifies similar zones of the well and groups them into clusters.

    The main advantages to use the integration between geology and artificial intelligence techniques to Interpret Image Logs. And Example from Algeria.

    Image logs hold important information about subsurface sequences: they describe the spatial distribution and characteristics of bedding and of faults and fractures, and they can give insight into rock texture, textural organization, and porosity types and distribution. To reduce the subjectivity of the interpretation and cut the interpretation time, we developed and tested a new semi-automatic process for image log interpretation using new software. This process led to the development of an expert system (called I2AM) that exploits image processing algorithms and clustering techniques to analyze and classify borehole images. The system extracts as much information as possible from the image logs by considering not only the surfaces that cut the borehole but also the textural features of the images. Once the image logs are analysed, applying clustering techniques to the values extracted from the borehole images supplies a consistent classification of the images and propagates this classification along the logged section. In this way, we can automatically extract rock property information with two main advantages: (i) avoiding the subjectivity of the interpretation and (ii) reducing the interpretation time. The final result of this process is a set of “image facies” identified along the image log by a largely automated log interpretation, although some level of human interaction and correction is still necessary. We define the clustering application as semi-automatic because the interpreter can decide, based on his or her geological background and on the geological characteristics of the logged section, to keep the clusters/classes proposed by the system or to modify the number of clusters/classes.
    The clustering process and the propagation of the classes along the logged section are very fast (30 seconds). This allows an interactive approach that produces several scenarios with different numbers of classes and/or a quick update of the image log interpretation once more data or knowledge is acquired. This approach was tested on 5 wells from North Africa for which a previous image log interpretation had been performed. The new interpretation based on this system, made 3 years later with more data and information, produced more refined results in a very short time.
    Proceedings of the 9th Middle East Geosciences Conference and Exhibition (GEO 2010, Manama, Bahrain), March 2010.
    Denis Ferraretti; Raffaele Di Cuia; Giacomo Gamberoni; Erick Portier

    Bayesian Networks Learning for Gene Expression Datasets

    DNA arrays yield a global view of gene expression and can be used to build genetic network models in order to study relations between genes. The literature proposes Bayesian networks as an appropriate tool for developing such models. In this paper, we exploit the contribution of two Bayesian network learning algorithms to generate genetic networks from microarray datasets of experiments performed on Acute Myeloid Leukemia (AML). In the results, we present an analysis protocol used to synthesize knowledge about the most interesting gene interactions and compare the networks learned by the two algorithms. We also compared the relations found in these models with those found by biological studies performed on AML.

    Marker Analysis with APRIORI-based Algorithms

    In genetic studies, complex diseases are often analyzed by searching for marker patterns that play a significant role in susceptibility to the disease. In this paper we consider a dataset on periodontitis that includes the analysis of nine genetic markers for 148 individuals. We analyze these data using two APRIORI-based algorithms: APRIORI-SD and APRIORI with filtering. The discovered rules (especially those found by APRIORI with filtering) confirmed the published results on periodontitis.
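
For readers unfamiliar with the base algorithm, the following is a minimal sketch of plain APRIORI frequent-itemset mining on marker data. It does not implement APRIORI-SD or the filtering variant used in the paper, and the marker names, genotypes, case/control labels and support threshold are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate itemset.
        level = {}
        for cand in current:
            support = sum(1 for t in transactions if cand <= t) / n
            if support >= min_support:
                level[cand] = support
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        current = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def confidence(frequent, antecedent, consequent):
    """Confidence of the rule antecedent -> consequent from stored supports."""
    a = frozenset(antecedent)
    return frequent[a | frozenset(consequent)] / frequent[a]


# Hypothetical individuals: two marker genotypes plus disease status.
individuals = [
    {"m1=A", "m2=G", "case"},
    {"m1=A", "m2=G", "case"},
    {"m1=A", "m2=T", "control"},
    {"m1=C", "m2=G", "control"},
]
frequent = apriori(individuals, min_support=0.5)
rule_conf = confidence(frequent, {"m1=A", "m2=G"}, {"case"})
print(len(frequent), rule_conf)
```

Rules such as "m1=A and m2=G implies case" are then ranked by support and confidence; the variants named in the abstract add their own rule-selection steps on top of this basic scheme.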