24 research outputs found

    Data Integration in Genetics and Genomics: Methods and Challenges

    Due to rapid technological advances, various types of genomic and proteomic data with different sizes, formats, and structures have become available. Among them are gene expression, single nucleotide polymorphism, copy number variation, and protein-protein/gene-gene interactions. Each of these distinct data types provides a different, partly independent and complementary, view of the whole genome. However, understanding functions of genes, proteins, and other aspects of the genome requires more information than any single dataset provides. Integrating data from different sources is, therefore, an important part of current research in genomics and proteomics. Data integration also plays important roles in combining clinical, environmental, and demographic data with high-throughput genomic data. Nevertheless, the concept of data integration is not well defined in the literature and it may mean different things to different researchers. In this paper, we first propose a conceptual framework for integrating genetic, genomic, and proteomic data. The framework captures fundamental aspects of data integration and is developed from the key steps in genetic, genomic, and proteomic data fusion. Second, we provide a review of some of the most commonly used current methods and approaches for combining genomic data, with a focus on the statistical aspects

    Successful identification of rare variants using oligogenic segregation analysis as a prioritizing tool for whole-exome sequencing studies

    We aim to identify rare variants that have large effects on trait variance using a cost-efficient strategy. We use an oligogenic segregation analysis as a prioritizing tool for whole-exome sequencing studies to identify families more likely to harbor rare variants, by estimating the mean number of quantitative trait loci (QTLs) in each family. We hypothesize that families with additional QTLs, relative to the other families, are more likely to segregate functional rare variants. We test the association of rare variants with the traits only in regions where at least modest evidence of linkage with the trait is observed, thereby reducing the number of tests performed. We found that family 7 harbored an estimated two, one, and zero additional QTLs for traits Q1, Q2, and Q4, respectively. Two rare variants (C4S4935 and C6S2981) segregating in family 7 were associated with Q1 and explained a substantial proportion of the observed linkage signal. These rare variants have 31 and 22 carriers, respectively, in the 128-member family and entered through a single but different founder. For Q2, we found one rare variant unique to family 7 that showed small effect and weak evidence of association; this was a false positive. These results are a proof of principle that prioritizing the sequencing of carefully selected extended families is a simple and cost-efficient design strategy for sequencing studies aiming at identifying functional rare variants

    Reduced proportions of natural killer T cells are present in the relatives of lupus patients and are associated with autoimmunity

    Introduction: Systemic lupus erythematosus is a genetically complex disease. Currently, the precise allelic polymorphisms associated with this condition remain largely unidentified. In part this reflects the fact that multiple genes, each having a relatively minor effect, act in concert to produce disease. Given this complexity, analysis of subclinical phenotypes may aid in the identification of susceptibility alleles. Here, we used flow cytometry to investigate whether some of the immune abnormalities that are seen in the peripheral blood lymphocyte population of lupus patients are also seen in their first-degree relatives. Methods: Peripheral blood mononuclear cells were isolated from the subjects, stained with fluorochrome-conjugated monoclonal antibodies to identify various cellular subsets, and analyzed by flow cytometry. Results: We found reduced proportions of natural killer T (NKT) cells among 367 first-degree relatives of lupus patients as compared with 102 control individuals. There were also slightly increased proportions of memory B and T cells, suggesting increased chronic low-grade activation of the immune system in first-degree relatives. However, only the deficiency of NKT cells was associated with a positive anti-nuclear antibody test and clinical autoimmune disease in family members. There was a significant association between mean parental, sibling, and proband values for the proportion of NKT cells, suggesting that this is a heritable trait. Conclusions: The findings suggest that analysis of cellular phenotypes may enhance the ability to detect subclinical lupus and that genetically determined altered immunoregulation by NKT cells predisposes first-degree relatives of lupus patients to the development of autoimmunity

    Genome-wide association analysis of cardiovascular-related quantitative traits in the Framingham Heart Study

    Multivariate linear growth curves were used to model high-density lipoprotein (HDL), low-density lipoprotein (LDL), triglycerides (TG), and systolic blood pressure (SBP) measured during four exams from 1659 independent individuals from the Framingham Heart Study. The slopes and intercepts from each of two phenotype models were tested for association with 348,053 autosomal single-nucleotide polymorphisms (SNPs) from the Affymetrix GeneChip 500K set. Three regions were associated with LDL intercept, TG slope, and SBP intercept (p < 1.44 × 10^-7). We observed results consistent with previously reported associations between rs599839, on chromosome 1p13, and LDL. We note that the association is significant with LDL intercept but not slope. Markers on chromosome 17q25 were associated with TG slope, and a single-nucleotide polymorphism on chromosome 7p11 was associated with SBP intercept. Growth curve models can be used to gain more insight into the relationships between SNPs and traits than traditional association analysis when longitudinal data have been collected. The power to detect association with changes over time may be limited if the subjects are not followed over a long enough time period
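
    The abstract does not say how the significance threshold was chosen. As a hedged illustration only, the sketch below checks whether a plain Bonferroni correction over the 348,053 SNPs would give a value of that size. The 0.05 family-wise error rate and the function name `bonferroni_threshold` are assumptions made for this example, not details from the study.

```python
# Illustrative check only: the abstract does not state how its
# significance threshold was derived.

N_SNPS = 348_053  # autosomal SNPs tested, as reported in the abstract
ALPHA = 0.05      # ASSUMED family-wise error rate (not stated in the abstract)


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Return the per-test threshold under a Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be a positive integer")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    return alpha / n_tests


threshold = bonferroni_threshold(ALPHA, N_SNPS)
print(f"Per-SNP threshold: {threshold:.2e}")
```

    Under these assumptions the per-SNP threshold is 0.05 / 348,053, which rounds to the same 1.44 × 10^-7 quoted in the abstract. That agreement suggests, but does not confirm, a Bonferroni-style correction.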

    Transmission-Ratio Distortion and Allele Sharing in Affected Sib Pairs: A New Linkage Statistic with Reduced Bias, with Application to Chromosome 6q25.3

    We studied the effect of transmission-ratio distortion (TRD) on tests of linkage based on allele sharing in affected sib pairs. We developed and implemented a discrete-trait allele-sharing test statistic, S(ad), analogous to the S(pairs) test statistic of Whittemore and Halpern, that evaluates an excess sharing of alleles at autosomal loci in pairs of affected siblings, as well as a lack of sharing in phenotypically discordant relative pairs, where available. Under the null hypothesis of no linkage, nuclear families with at least two affected siblings and one unaffected sibling have a contribution to S(ad) that is unbiased, with respect to the effects of TRD independent of the disease under study. If more distantly related unaffected individuals are studied, the bias of S(ad) is generally reduced compared with that of S(pairs), but not completely. Moreover, S(ad) has higher power, in some circumstances, because of the availability of unaffected relatives, who are ignored in affected-only analyses. We discuss situations in which it may be an efficient use of resources to genotype unaffected relatives, which would give insights for promising study designs. The method is applied to a sample of pedigrees ascertained for asthma in a chromosomal region in which TRD has been reported. Results are consistent with the presence of transmission distortion in that region

    Specific Variants in the MLH1 Gene Region May Drive DNA Methylation, Loss of Protein Expression, and MSI-H Colorectal Cancer

    Background: We previously identified an association between a promoter SNP (rs1800734) in the mismatch repair gene MLH1 and microsatellite unstable (MSI-H) colorectal cancers (CRCs) in two samples. The current study expanded on this finding as we explored the genetic basis of DNA methylation in this region of chromosome 3. We hypothesized that specific polymorphisms in the MLH1 gene region predispose it to DNA methylation, resulting in the loss of MLH1 gene expression and mismatch-repair function, and consequently in genome-wide microsatellite instability. Methodology/Principal Findings: We first tested our hypothesis in one sample from Ontario (901 cases, 1,097 controls) and replicated major findings in two additional samples from Newfoundland and Labrador (479 cases, 336 controls) and from Seattle (591 cases, 629 controls). Logistic regression was used to test for association between SNPs in the region of MLH1 and CRC, MSI-H CRC, MLH1 gene expression in CRC, and DNA methylation in CRC. The association between rs1800734 and MSI-H CRCs, previously reported in Ontario and Newfoundland, was replicated in the Seattle sample. Two additional SNPs, in strong linkage disequilibrium with rs1800734, showed strong associations with MLH1 promoter methylation, loss of MLH1 protein, and MSI-H CRC in all three samples. The logistic regression model of MSI-H CRC that included MLH1 promoter methylation status and MLH1 immunohistochemistry (IHC) status fit most parsimoniously in all three samples combined. When rs1800734 was added to this model, its effect was not statistically significant (P-value = 0.72 vs. 2.3 × 10^-4 when the SNP was examined alone). Conclusions/Significance: The observed association of rs1800734 with MSI-H CRC occurs through its effect on MLH1 promoter methylation, MLH1 IHC deficiency, or both

    Shwachman-Diamond Syndrome with Exocrine Pancreatic Dysfunction and Bone Marrow Failure Maps to the Centromeric Region of Chromosome 7

    Shwachman-Diamond syndrome (SDS) is an autosomal recessive disorder characterized by exocrine pancreatic insufficiency and hematologic and skeletal abnormalities. A genomewide scan of families with SDS was terminated at ∼50% completion, with the identification of chromosome 7 markers that showed linkage with the disease. Finer mapping revealed significant linkage across a broad interval that included the centromere. The maximum two-point LOD score was 8.7, with D7S473, at a recombination fraction of 0. The maximum multipoint LOD score was 10, in the interval between D7S499 and D7S482 (5.4 cM on the female map and 0 cM on the male map), a region delimited by recombinant events detected in affected children. Evidence from all 15 of the multiplex families analyzed provided support for the linkage, consistent with a single locus for SDS. However, the presence of several different mutations is suggested by the heterogeneity of disease-associated haplotypes in the candidate region
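
    For readers unfamiliar with LOD scores, here is a minimal sketch of the two-point statistic in its simplest textbook form. It assumes fully informative, phase-known meioses that can each be scored as recombinant or non-recombinant. The study itself analyzed real pedigrees, which require full pedigree likelihoods, so this is not the authors' computation. The function name `two_point_lod` and its parameters are hypothetical.

```python
import math


def two_point_lod(n_meioses: int, n_recombinants: int, theta: float) -> float:
    """Two-point LOD score for phase-known, fully informative meioses.

    LOD(theta) = log10( L(theta) / L(0.5) ), where
    L(theta) = theta**r * (1 - theta)**(n - r).
    This is a simplified model; real pedigree data need likelihood software.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if not 0 <= n_recombinants <= n_meioses:
        raise ValueError("n_recombinants must lie in [0, n_meioses]")

    n_nonrecombinants = n_meioses - n_recombinants

    if theta == 0.0:
        # Any observed recombinant is impossible under complete linkage.
        if n_recombinants > 0:
            return -math.inf
        log_likelihood = 0.0  # log10(1): every meiosis is non-recombinant
    else:
        log_likelihood = (
            n_recombinants * math.log10(theta)
            + n_nonrecombinants * math.log10(1.0 - theta)
        )

    # Null hypothesis of no linkage: theta = 0.5 for every meiosis.
    log_null = n_meioses * math.log10(0.5)
    return log_likelihood - log_null


# Each non-recombinant meiosis adds log10(2) to the LOD at theta = 0.
print(two_point_lod(n_meioses=10, n_recombinants=0, theta=0.0))
```

    In this simplified model, a recombination fraction of 0 with no observed recombinants gives the largest possible score for a given number of meioses. A single recombinant would drive the score at a recombination fraction of 0 to negative infinity.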