
    Differential gene expression graphs: A data structure for classification in DNA microarrays

    This paper proposes an innovative data structure to be used as a backbone in designing microarray phenotype sample classifiers. The data structure is based on graphs and it is built from a differential analysis of the expression levels of healthy and diseased tissue samples in a microarray dataset. The proposed data structure is built in such a way that, by construction, it shows a number of properties that are perfectly suited to address several problems like feature extraction, clustering, and classification

    A Genome-Wide Analysis Reveals Significant Overlap of Transcription and DNA Repair in Stationary Phase Yeast

    The association between transcription and DNA repair is acknowledged as a player in the generation of mutations in a non-random fashion in prokaryotes and eukaryotes. Previous studies demonstrated that the transcription complex is capable of directing DNA repair to sites of transcription. This process is especially important to growth-arrested cells, in which many DNA repair capacities are diminished; it may also lead to mutations preferentially in transcribed genes. Using microarray analysis of growth-arrested yeast cultures, we demonstrated, on a genomic scale, the co-localization of a DNA-turnover marker, indicative of DNA-repair-associated DNA synthesis, with genes persistently transcribed during stationary phase. This may serve as a clue regarding the non-random manner in which non-dividing cells may potentially mutate in the absence of replication, solely as a result of their inherent transcriptional stress response

    Genomic alterations in primary gastric adenocarcinomas correlate with clinicopathological characteristics and survival.

    Background & aims: Pathogenesis of gastric cancer is driven by an accumulation of genetic changes that to a large extent occur at the chromosomal level. In order to investigate the patterns of chromosomal aberrations in gastric carcinomas, we performed genome-wide microarray based comparative genomic hybridisation (microarray CGH). With this recently developed technique chromosomal aberrations can be studied with high resolution and sensitivity. Methods: Array CGH was applied to a series of 35 gastric adenocarcinomas using a genome-wide scanning array with 2275 BAC and P1 clones spotted in triplicate. Each clone contains at least one STS for linkage to the sequence of the human genome. These arrays provide an average resolution of 1.4 Mb across the genome. DNA copy number changes were correlated with clinicopathological tumour characteristics as well as survival. Results: All thirty-five cancers showed chromosomal aberrations and 16 of the 35 tumours showed one or more amplifications. The most frequent aberrations are gains of 8q24.2, 8q24.1, 20q13.12, 20q13.2, 7p11.2, 1q32.3, 8p23.1-p23.3, losses of 5q14.1, 18q22.1, 19p13.12-p13.3, 9p21.3-p24.3, 17p13.1-p13.3, 13q31.1, 16q22.1, 21q21.3, and amplifications of 7q21-q22, and 12q14.1-q21.1. These aberrations were correlated to clinicopathological characteristics and survival. Gain of 1q32.3 was significantly correlated with lymph node status (p=0.007). Tumours with loss of 18q22.1, as well as tumours with amplifications, were associated with poor survival (p=0.02, both). Conclusions: Microarray CGH has revealed several chromosomal regions that have not been described before in gastric cancer at this frequency and resolution, such as amplifications at 7q21-q22 and 12q14.1-q21.1, as well as gains at 1q32.3, 7p11.2, and losses at 13q13.1.
Interestingly, gain of 1q32.3 and loss of 18q22.1 are associated with a bad prognosis, indicating that these regions could harbour gene(s) that may determine aggressive tumour behaviour and poor clinical outcome

    Assessing similarity of feature selection techniques in high-dimensional domains

    Recent research efforts attempt to combine multiple feature selection techniques instead of using a single one. However, this combination is often made on an “ad hoc” basis, depending on the specific problem at hand, without considering the degree of diversity/similarity of the involved methods. Moreover, though it is recognized that different techniques may return quite dissimilar outputs, especially in high dimensional/small sample size domains, few direct comparisons exist that quantify these differences and their implications on classification performance. This paper aims to provide a contribution in this direction by proposing a general methodology for assessing the similarity between the outputs of different feature selection methods in high dimensional classification problems. Using as benchmark the genomics domain, an empirical study has been conducted to compare some of the most popular feature selection methods, and useful insight has been obtained about their pattern of agreement
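One simple way to quantify how much two feature selection methods agree is to compare the top-k subsets they pick. The sketch below uses the Jaccard index for that comparison. The abstract does not say which similarity measure the paper uses, so the Jaccard index, the function names, and the toy scores are all assumptions made for illustration.

```python
def top_k(scores, k):
    """Return the set of indices of the k highest-scoring features."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard index |a & b| / |a | b|; defined here as 1.0 for two empty sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical relevance scores given to six features by two selection methods.
scores_method_a = [0.9, 0.1, 0.8, 0.3, 0.7, 0.2]
scores_method_b = [0.2, 0.1, 0.9, 0.8, 0.7, 0.3]

subset_a = top_k(scores_method_a, 3)      # features 0, 2 and 4
subset_b = top_k(scores_method_b, 3)      # features 2, 3 and 4
similarity = jaccard(subset_a, subset_b)  # 2 shared features out of 4 distinct
```

A value near 1 means the two methods select nearly the same features, and a value near 0 means they largely disagree.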

    A graph-based representation of Gene Expression profiles in DNA microarrays

    This paper proposes a new and very flexible data model, called gene expression graph (GEG), for gene expression analysis and classification. Three features differentiate GEGs from other available microarray data representation structures: (i) the memory occupation of a GEG is independent of the number of samples used to build it; (ii) a GEG more clearly expresses relationships among expressed and non-expressed genes in both healthy and diseased tissue experiments; (iii) GEGs make it easy to implement very efficient classifiers. The paper also presents a simple classifier for sample-based classification to show the flexibility and user-friendliness of the proposed data structure

    Transcriptional response of Burkholderia cenocepacia J2315 sessile cells to treatments with high doses of hydrogen peroxide and sodium hypochlorite

    Background: Burkholderia cepacia complex bacteria are opportunistic pathogens, which can cause severe respiratory tract infections in patients with cystic fibrosis (CF). As treatment of infected CF patients is problematic, multiple preventive measures are taken to reduce the infection risk. Besides a stringent segregation policy to prevent patient-to-patient transmission, clinicians also advise patients to clean and disinfect their respiratory equipment on a regular basis. However, problems regarding the efficacy of several disinfection procedures for the removal and/or killing of B. cepacia complex bacteria have been reported. In order to unravel the molecular mechanisms involved in the resistance of biofilm-grown Burkholderia cenocepacia cells against high concentrations of reactive oxygen species (ROS), the present study focussed on the transcriptional response in sessile B. cenocepacia J2315 cells following exposure to high levels of H2O2 or NaOCl. Results: The exposure to H2O2 and NaOCl resulted in an upregulation of the transcription of 315 (4.4%) and 386 (5.4%) genes, respectively. Transcription of 185 (2.6%) and 331 (4.6%) genes was decreased in response to the respective treatments. Many of the upregulated genes in the NaOCl- and H2O2-treated biofilms are involved in oxidative stress as well as general stress response, emphasizing the importance of the efficient neutralization and scavenging of ROS. In addition, multiple upregulated genes encode proteins that are necessary to repair ROS-induced cellular damage. Unexpectedly, a prolonged treatment with H2O2 also resulted in an increased transcription of multiple phage-related genes. A closer inspection of hybridisation signals obtained with probes targeting intergenic regions led to the identification of a putative 6S RNA. Conclusion: Our results reveal that the transcription of a large fraction of B. cenocepacia J2315 genes is altered upon exposure of sessile cells to ROS. 
These observations have highlighted that B. cenocepacia may alter several pathways in response to exposure to ROS and they have led to the identification of many genes not previously implicated in the stress response of this pathogen

    Identification of disease-causing genes using microarray data mining and gene ontology

    Background: One of the best and most accurate methods for identifying disease-causing genes is monitoring gene expression values in different samples using microarray technology. One of the shortcomings of microarray data is that they provide a small quantity of samples with respect to the number of genes. This problem reduces the classification accuracy of the methods, so gene selection is essential to improve the predictive accuracy and to identify potential marker genes for a disease. Among numerous existing methods for gene selection, support vector machine-based recursive feature elimination (SVMRFE) has become one of the leading methods, but its performance can be reduced because of the small sample size, noisy data and the fact that the method does not remove redundant genes. Methods: We propose a novel framework for gene selection which uses the advantageous features of conventional methods and addresses their weaknesses. In fact, we have combined the Fisher method and SVMRFE to utilize the advantages of a filtering method as well as an embedded method. Furthermore, we have added a redundancy reduction stage to address the weakness of the Fisher method and SVMRFE. In addition to gene expression values, the proposed method uses Gene Ontology which is a reliable source of information on genes. The use of Gene Ontology can compensate, in part, for the limitations of microarrays, such as having a small number of samples and erroneous measurement results. Results: The proposed method has been applied to colon, Diffuse Large B-Cell Lymphoma (DLBCL) and prostate cancer datasets. The empirical results show that our method has improved classification performance in terms of accuracy, sensitivity and specificity. In addition, the study of the molecular function of selected genes strengthened the hypothesis that these genes are involved in the process of cancer growth. 
Conclusions: The proposed method addresses the weakness of conventional methods by adding a redundancy reduction stage and utilizing Gene Ontology information. It predicts marker genes for colon, DLBCL and prostate cancer with a high accuracy. The predictions made in this study can serve as a list of candidates for subsequent wet-lab verification and might help in the search for a cure for cancers
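As a rough illustration of the filtering stage, the sketch below ranks genes by a two-class Fisher score, computed as the squared difference of the class means divided by the sum of the class variances. This is one common definition and may differ from the exact formulation in the paper. The SVMRFE, redundancy reduction and Gene Ontology stages are not shown, and the gene names and expression values are hypothetical.

```python
from statistics import mean, pvariance


def fisher_score(values, labels):
    """Two-class Fisher score: (mean1 - mean0)^2 / (var1 + var0)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    spread = pvariance(pos) + pvariance(neg)
    if spread == 0:
        # No within-class variation: perfectly separating or uninformative.
        return float("inf") if mean(pos) != mean(neg) else 0.0
    return (mean(pos) - mean(neg)) ** 2 / spread


def rank_genes(expression, labels):
    """Return gene names ordered from highest to lowest Fisher score."""
    scores = {gene: fisher_score(vals, labels) for gene, vals in expression.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical expression values for six samples (0 = healthy, 1 = diseased).
expression = {
    "gene_a": [1.0, 1.2, 0.8, 3.0, 3.2, 2.8],  # class means differ clearly
    "gene_b": [2.0, 2.1, 1.9, 2.0, 2.2, 1.8],  # class means are about equal
}
labels = [0, 0, 0, 1, 1, 1]

ranking = rank_genes(expression, labels)
```

In a filter-then-embed pipeline of the kind described, only the top-ranked genes from a step like this would be passed on to the more expensive selection stage.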

    Features of mammalian microRNA promoters emerge from polymerase II chromatin immunoprecipitation data

    Background: MicroRNAs (miRNAs) are short, non-coding RNA regulators of protein coding genes. miRNAs play a very important role in diverse biological processes and various diseases. Many algorithms are able to predict miRNA genes and their targets, but their transcription regulation is still under investigation. It is generally believed that intragenic miRNAs (located in introns or exons of protein coding genes) are co-transcribed with their host genes and that most intergenic miRNAs are transcribed from their own RNA polymerase II (Pol II) promoter. However, the length of the primary transcripts and the promoter organization are currently unknown. Methodology: We performed Pol II chromatin immunoprecipitation (ChIP)-chip using a custom array surrounding regions of known miRNA genes. To identify the true core transcription start sites of the miRNA genes we developed a new tool (CPPP). We showed that miRNA genes can be transcribed from promoters located several kilobases away and that their promoters share the same general features as those of protein coding genes. Finally, we found evidence that as many as 26% of the intragenic miRNAs may be transcribed from their own unique promoters. Conclusion: miRNA promoters have similar features to those of protein coding genes, but miRNA transcript organization is more complex. © 2009 Corcoran et al