23 research outputs found

    Exhaustive prediction of disease susceptibility to coding base changes in the human genome

    Background: Single nucleotide polymorphisms (SNPs) are the most abundant form of genomic variation and can cause phenotypic differences between individuals, including diseases. Bases are subject to various levels of selection pressure, reflected in their inter-species conservation. Results: We propose a method that is not dependent on transcription information to score each coding base in the human genome, reflecting the disease probability associated with its mutation. Twelve factors likely to be associated with disease alleles were chosen as the input for a support vector machine prediction algorithm. The analysis yielded 83% sensitivity and 84% specificity in segregating disease-like alleles, as found in the Human Gene Mutation Database, from non-disease-like alleles, as found in the Database of Single Nucleotide Polymorphisms. This algorithm was subsequently applied to each base within all known human genes, exhaustively confirming that interspecies conservation is the strongest factor for disease association. For each gene, the length-normalized average disease potential score was calculated. Out of the 30 genes with the highest scores, 21 are directly associated with a disease. In contrast, out of the 30 genes with the lowest scores, only one is associated with a disease as found in published literature. The results strongly suggest that the highest-scoring genes are enriched for those that might contribute to disease if mutated. Conclusion: This method provides valuable information to researchers to identify sensitive positions in genes that have a high disease probability, enabling them to optimize experimental designs and interpret data emerging from genetic and epidemiological studies.
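
As a rough illustration of the two quantities the abstract reports, the sketch below computes sensitivity and specificity from true and predicted labels, and a length-normalized average score for one gene. The function names and the toy inputs are assumptions made for this example; the abstract does not describe the authors' implementation, and the support vector machine itself is not reproduced here.

```python
def sensitivity_specificity(labels, predictions):
    """Return (sensitivity, specificity).

    labels and predictions are equal-length sequences of booleans,
    where True means "disease-like allele" (an assumed encoding).
    """
    tp = fn = tn = fp = 0
    for truth, predicted in zip(labels, predictions):
        if truth and predicted:
            tp += 1
        elif truth and not predicted:
            fn += 1
        elif not truth and not predicted:
            tn += 1
        else:
            fp += 1
    # Sensitivity: fraction of disease-like alleles recovered.
    # Specificity: fraction of non-disease-like alleles correctly rejected.
    return tp / (tp + fn), tn / (tn + fp)


def gene_score(base_scores):
    """Length-normalized average: sum of per-base scores over the number of coding bases."""
    return sum(base_scores) / len(base_scores)
```

Ranking genes by `gene_score` would then correspond to the "highest 30" and "lowest 30" comparison described in the abstract, under the assumption that each coding base has already been assigned a score.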

    Functional characterisation of the TSC1–TSC2 complex to assess multiple TSC2 variants identified in single families affected by tuberous sclerosis complex

    BACKGROUND: Tuberous sclerosis complex (TSC) is an autosomal dominant disorder characterised by seizures, mental retardation and the development of hamartomas in a variety of organs and tissues. The disease is caused by mutations in either the TSC1 gene on chromosome 9q34, or the TSC2 gene on chromosome 16p13.3. The TSC1 and TSC2 gene products, TSC1 and TSC2, interact to form a protein complex that inhibits signal transduction to the downstream effectors of the mammalian target of rapamycin (mTOR). METHODS: We have used a combination of different assays to characterise the effects of a number of pathogenic TSC2 amino acid substitutions on TSC1-TSC2 complex formation and mTOR signalling. RESULTS: We used these assays to compare the effects of nine different TSC2 variants (S132C, F143L, A196T, C244R, Y598H, I820del, T993M, L1511H and R1772C) identified in individuals with symptoms of TSC from four different families. In each case we were able to identify the pathogenic mutation. CONCLUSION: Functional characterisation of TSC2 variants can help identify pathogenic changes in individuals with TSC, and assist in the diagnosis and genetic counselling of the index cases and/or other family members.

    Genome-Wide Comparative Gene Family Classification

    Correct classification of genes into gene families is important for understanding gene function and evolution. Although gene families of many species have been resolved both computationally and experimentally with high accuracy, gene family classification in most newly sequenced genomes has not been done to the same high standard. This project was designed to develop a strategy to classify gene families across genomes effectively and accurately. We first examine and compare the performance of computer programs developed for automated gene family classification. We demonstrate that some programs, including the hierarchical average-linkage clustering algorithm MC-UPGMA and the popular Markov clustering algorithm TRIBE-MCL, can reconstruct manual curation of gene families accurately. However, their performance is highly sensitive to parameter settings, i.e. different gene families require different program parameters for correct resolution. To circumvent the problem of parameterization, we have developed a comparative strategy for gene family classification. This strategy takes advantage of existing curated gene families of reference species to find suitable parameters for classifying genes in related genomes. To demonstrate the effectiveness of this novel strategy, we use TRIBE-MCL to classify chemosensory and ABC transporter gene families in C. elegans and its four sister species. We conclude that fully automated programs can establish biologically accurate gene families if parameterized accordingly. Comparative gene family classification finds optimal parameters automatically, thus allowing rapid insights into gene families of newly sequenced species.
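
The comparative idea in this abstract (choose the clustering parameter that best reproduces curated families in a reference species, then reuse it for related genomes) can be sketched as follows. This is only an illustration under stated assumptions: `cluster` is a toy threshold-based connected-components routine standing in for a real program such as TRIBE-MCL, the agreement measure is a simple pairwise Rand index, and all names and data layouts are hypothetical rather than taken from the project.

```python
from itertools import combinations


def cluster(similarities, genes, threshold):
    """Toy stand-in for a clustering program (NOT TRIBE-MCL).

    Genes joined by an edge with similarity >= threshold end up in the
    same cluster (connected components via union-find).
    Returns a dict mapping each gene to a cluster representative.
    """
    parent = {g: g for g in genes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path halving
            g = parent[g]
        return g

    for (a, b), score in similarities.items():
        if score >= threshold:
            parent[find(a)] = find(b)
    return {g: find(g) for g in genes}


def pair_agreement(assignment, curated):
    """Fraction of gene pairs on which clustering and curation agree (Rand index)."""
    pairs = list(combinations(sorted(curated), 2))
    agree = sum(
        (assignment[a] == assignment[b]) == (curated[a] == curated[b])
        for a, b in pairs
    )
    return agree / len(pairs)


def best_parameter(similarities, curated, candidates):
    """Pick the candidate parameter whose clustering best matches the curated families."""
    genes = sorted(curated)
    return max(
        candidates,
        key=lambda t: pair_agreement(cluster(similarities, genes, t), curated),
    )
```

In this sketch, `curated` maps each reference-species gene to its curated family, and the value returned by `best_parameter` would then be passed to the same clustering routine when classifying genes of a related, newly sequenced genome.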