30 research outputs found

    PDA: Pooled DNA analyzer

    BACKGROUND: Association mapping using abundant single nucleotide polymorphisms (SNPs) is a powerful tool for identifying disease susceptibility genes for complex traits and for exploring genetic diversity. Genotyping large numbers of SNPs individually is performed routinely but is cost prohibitive for large-scale genetic studies. DNA pooling is a reliable and cost-saving alternative genotyping method. However, no software has been developed for complete pooled-DNA analyses, including data standardization, allele frequency estimation, and single/multipoint DNA pooling association tests. This motivated the development of the software 'PDA' (Pooled DNA Analyzer) to analyze pooled DNA data. RESULTS: We developed the software PDA for the analysis of pooled-DNA data. PDA was originally implemented in the MATLAB® language, but it can also be executed on a Windows system without installing MATLAB®. PDA provides estimates of the coefficient of preferential amplification and of allele frequency. PDA implements an extended single-point association test, which can compare allele frequencies between two DNA pools constructed under different experimental conditions. Moreover, PDA provides novel chromosome-wide multipoint association tests based on p-value combinations and a sliding-window concept. This new multipoint testing procedure overcomes a computational bottleneck of conventional haplotype-oriented multipoint methods in DNA pooling analyses and can handle data sets having a large pool size and/or large numbers of polymorphic markers. All PDA functions are illustrated in four bona fide examples. CONCLUSION: PDA is simple to operate and does not require that users have a strong statistical background. The software is available at
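The idea of a sliding-window multipoint test built from p-value combinations can be illustrated with a minimal Python sketch. The abstract does not say which combination rule PDA uses, so Fisher's method is assumed here purely for illustration, and the function names `fisher_combine` and `sliding_window_scan` are hypothetical. Fisher's method also assumes independent tests, which neighboring markers in linkage disequilibrium generally violate, so this is a simplification and not PDA's procedure.

```python
import math


def fisher_combine(p_values):
    """Combine p-values in (0, 1] with Fisher's method (assumed rule, not PDA's)."""
    k = len(p_values)
    # Fisher statistic X = -2 * sum(ln p) is chi-square with 2k df under H0.
    half_stat = -sum(math.log(p) for p in p_values)  # X / 2
    # Closed-form chi-square survival function for even degrees of freedom:
    # P(X > x) = exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return min(1.0, math.exp(-half_stat) * total)


def sliding_window_scan(p_values, window=3):
    """Return one combined p-value per window of consecutive markers."""
    return [
        fisher_combine(p_values[start:start + window])
        for start in range(len(p_values) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical single-point p-values along a chromosome.
    single_point = [0.50, 0.01, 0.02, 0.60, 0.90]
    for start, p in enumerate(sliding_window_scan(single_point, window=2)):
        print(f"markers {start}-{start + 1}: combined p = {p:.4g}")
```

Because each window needs only the single-point p-values, the cost grows linearly with the number of markers, which is consistent with the abstract's point about avoiding haplotype-based computation.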

    Modeling expression quantitative trait loci in data combining ethnic populations

    BACKGROUND: Combining data from different ethnic populations in a study can increase the efficacy of methods designed to identify expression quantitative trait loci (eQTL) compared to analyzing each population independently. In such studies, however, the diversity of minor allele frequencies among populations has rarely been taken into account. Because both allele frequency diversity and population-level expression differences are present, no consensus has been reached on the optimal statistical approach for eQTL analysis in data combining different populations. RESULTS: In this report, we explored the applicability of a constrained two-way model to identify eQTL in combined ethnic data that might contain genetic diversity among ethnic populations. In addition, gene expression differences resulting from allele frequency diversity between populations were directly estimated and analyzed by the constrained two-way model. Through simulation, we investigated the effects of genetic diversity on eQTL identification by examining gene expression data pooled after normal quantile transformation within each population. Using the constrained two-way model to reanalyze data from Caucasian and Asian individuals available from HapMap, we identified a large number of eQTL with similar genetic effects on gene expression levels in these two populations. Furthermore, 19 single nucleotide polymorphisms with inter-population differences in both genotype frequency and genotype-directed gene expression levels were identified and reflected a clear distinction between Caucasian and Asian individuals. CONCLUSIONS: This study illustrates the influence of minor allele frequencies on common eQTL identification using either separate or combined population data. Our findings are important for future eQTL studies in which different datasets are combined to increase the power of eQTL identification.
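The step of pooling expression data after a normal quantile transformation within each population can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the rank offset `(rank - 0.5) / n`, the arbitrary tie handling, the function names, and the population labels are all hypothetical choices.

```python
from statistics import NormalDist


def normal_quantile_transform(values):
    """Map values to standard-normal quantiles by rank (ties broken by input order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    std_normal = NormalDist()
    transformed = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        # Assumed offset: (rank - 0.5) / n keeps probabilities strictly inside (0, 1).
        transformed[idx] = std_normal.inv_cdf((rank - 0.5) / n)
    return transformed


def pool_populations(expression_by_population):
    """Transform each population separately, then pool as (label, z-score) pairs."""
    pooled = []
    for label, values in expression_by_population.items():
        for z in normal_quantile_transform(values):
            pooled.append((label, z))
    return pooled


if __name__ == "__main__":
    # Hypothetical expression values for one gene in two populations.
    data = {
        "population_1": [7.1, 6.8, 7.9, 7.4],
        "population_2": [9.0, 8.2, 8.8],
    }
    for label, z in pool_populations(data):
        print(f"{label}: {z:+.3f}")
```

Transforming within each population puts both groups on the same scale before pooling, so a population-level shift in mean expression is removed rather than being mistaken for a genotype effect.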

    A genome-wide scan using tree-based association analysis for candidate loci related to fasting plasma glucose levels

    BACKGROUND: In the analysis of complex traits such as fasting plasma glucose levels, researchers often adjust the trait for important covariates before assessing gene susceptibility, and may at times encounter confounding between the covariates and the susceptibility genes. Previously, the tree-based method has been employed to accommodate the heterogeneity in complex traits. In this study, we performed a genome-wide screen on fasting glucose levels in the offspring generation of the Framingham Heart Study provided by the Genetic Analysis Workshop 13. We defined one quantitative trait and converted it to a dichotomous trait based on a predetermined cut-off value, and performed association analyses using regression and classification trees for the two traits, respectively. A marker was interpreted as positive if at least one of its alleles exhibited association in both analyses. Our purpose was to identify candidate genes associated with fasting glucose levels in the presence of other covariates. The covariates entered in the analysis included sex, body mass index, and lipids (total plasma cholesterol, high density lipoprotein cholesterol, and triglycerides) of the subjects, and those of their parents. RESULTS: Four out of seven positive regions in chromosomes 1, 2, 6, 11, 16, 18, and 19 from our analyses harbored or were very close to previously reported diabetes-related genes or potential candidate genes. CONCLUSION: This screening method, which employed tree-based association, showed promise for identifying candidate loci in the presence of covariates in genome scans for complex traits.

    Construction of endophenotypes for complex diseases in the presence of heterogeneity

    Endophenotypes such as behavior disorders have been increasingly adopted in genetic studies of complex traits. For efficient gene mapping, it is essential that an endophenotype is associated with the disease of interest and is heritable or co-segregates within families. In this study, we proposed a strategy to construct endophenotypes to analyze the Genetic Analysis Workshop 14 simulated dataset. Initially, generalized estimating equation models were employed to identify phenotypes that were correlated with the disease (affected status), taking the family structures in the data into account. Endophenotypes were then constructed, with consideration of heterogeneity, as functions of the identified phenotypes. Genome scans on the constructed endophenotypes were carried out using family-based association analysis. For comparison, genome scans were also performed with the original affected status. The family-based association analysis using the endophenotypes correctly identified the same susceptibility gene in about 80 of the 100 replicates.

    A genome-wide scanning and fine mapping study of COGA data

    A thorough genetic mapping study was performed to identify predisposing genes for alcohol dependence (AD) using the Collaborative Study on the Genetics of Alcoholism (COGA) data. The procedure comprised whole-genome linkage and confirmation analyses, single-locus and haplotype fine mapping analyses, and gene × environment haplotype regression. Stratified analysis was used to reduce ethnic heterogeneity, and family-based and case-control study designs were applied simultaneously to detect potential genetic signals. By using different methods and markers, we found high linkage signals at D1S225 (253.7 cM), D1S547 (279.2 cM), D2S1356 (64.6 cM), and D7S2846 (56.8 cM) with nonparametric linkage scores of 3.92, 4.10, 4.44, and 3.55, respectively. We also conducted haplotype and odds ratio analyses, in which the response was the dichotomous status of alcohol dependence, the explanatory variables were the inferred individual haplotypes, and the three statistically significant covariates were age, gender, and max drink (the maximum number of drinks consumed in a 24-hour period). The final model identified important AD-related haplotypes within a candidate region of NRXN1 at 2p21 and a few others in intergenic regions. The relative magnitude of risk for the identified risk and protective haplotypes was elucidated.

    A large-scale survey of genetic copy number variations among Han Chinese residing in Taiwan

    BACKGROUND: Copy number variations (CNVs) have recently been recognized as important structural variations in the human genome. CNVs can affect gene expression and thus may contribute to phenotypic differences. The copy number inferring tool (CNIT) is an effective hidden Markov model-based algorithm for estimating allele-specific copy number and predicting chromosomal alterations from single nucleotide polymorphism microarrays. The CNIT algorithm, which was constructed using data from 270 HapMap multi-ethnic individuals, was applied to identify CNVs in 300 unrelated Han Chinese individuals in Taiwan. RESULTS: Using stringent selection criteria, 230 regions with variable copy numbers were identified in the Han Chinese population; 133 (57.83%) had been reported previously, and 64 displayed a CNV allele frequency greater than 1%. The CNV regions averaged 322 kb in size (range 1.48 kb to 5.68 Mb) and together covered 2.47% of the human genome. A total of 196 of the CNV regions were simple deletions and 27 were simple amplifications. There were 449 genes and 5 microRNAs within these CNV regions; some of these genes are known to be associated with diseases. CONCLUSION: The identified CNVs are characteristic of the Han Chinese population and should be considered when genetic studies are conducted. The CNV distribution in the human genome is still poorly characterized, and there is much diversity among different ethnic populations.