3 research outputs found

    Block clustering of Binary Data with Gaussian Co-variables

    The simultaneous grouping of rows and columns is an important technique that is increasingly used in large-scale data analysis. In this paper, we present a novel co-clustering method that uses co-variables in its construction. It is based on a latent block model that addresses the problem of grouping variables and clustering individuals by integrating the information given by sets of co-variables. Numerical experiments on simulated data sets and an application to real genetic data highlight the interest of this approach.

    Candidate gene family-based and case-control studies of susceptibility to high Schistosoma mansoni worm burden in African children: a protocol

    Background: Approximately 25% of the risk of Schistosoma mansoni is associated with host genetic variation. We will test 24 candidate genes, mainly in the Th2 and Th17 pathways, for association with S. mansoni infection intensity in four African countries, using family-based and case-control approaches. Methods: Children aged 5-15 years will be recruited in S. mansoni endemic areas of Ivory Coast, Cameroon, Uganda and the Democratic Republic of Congo (DRC). We will use family-based (study 1) and case-control (study 2) designs. Study 1 will take place in Ivory Coast, Cameroon, Uganda and the DRC. We aim to recruit 100 high worm burden families from each country except Uganda, where a previous study recruited at least 40 families. For phenotyping, cases will be defined as the 20% of children in each community with heaviest worm burdens as measured by the circulating cathodic antigen (CCA) assay. Study 2 will take place in Uganda. We will recruit 500 children in a highly endemic community. For phenotyping, cases will be defined as the 20% of children with heaviest worm burdens as measured by the CAA assay, while controls will be the 20% of infected children with the lightest worm burdens. Deoxyribonucleic acid (DNA) will be genotyped on the Illumina H3Africa SNP (single nucleotide polymorphisms) chip and genotypes will be converted to sets of haplotypes that span the gene region for analysis. We have selected 24 genes for genotyping that are mainly in the Th2 and Th17 pathways and that have variants that have been demonstrated to be or could be associated with Schistosoma infection intensity. Analysis: In the family-based design, we will identify SNP haplotypes disproportionately transmitted to children with high worm burden. Case-control analysis will detect overrepresentation of haplotypes in extreme phenotypes with correction for relatedness by using whole genome principal components.
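
The extreme-phenotype definition in the protocol above (heaviest 20% as cases, lightest 20% of infected children as controls) can be illustrated with a minimal sketch. The function name, the dictionary input, the rounding of the group size and the handling of ties are all assumptions made for illustration; the protocol does not specify them.

```python
from typing import Dict, List, Tuple


def extreme_phenotypes(
    burden: Dict[str, float], fraction: float = 0.2
) -> Tuple[List[str], List[str]]:
    """Split infected children into cases and controls by worm burden.

    burden   -- hypothetical mapping: child ID -> assay-measured worm burden,
                assumed to contain infected children only
    fraction -- share of children taken from each tail (0.2 in the protocol)
    """
    # Rank child IDs from lightest to heaviest burden.
    ranked = sorted(burden, key=burden.get)
    # Group size is rounded down here; this is an assumption of the sketch.
    k = int(len(ranked) * fraction)
    controls = ranked[:k]                 # lightest burdens
    cases = ranked[-k:] if k else []      # heaviest burdens
    return cases, controls


# Toy data: ten children with burdens 0..9 (illustrative values only).
toy = {"child%d" % i: float(i) for i in range(10)}
cases, controls = extreme_phenotypes(toy)
```

With ten children and a fraction of 0.2, each group contains two children; everyone in between is left out of the case-control comparison.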

    Detecting multi-way epistasis in family-based association studies

    The era of genome-wide association studies (GWAS) has led to the discovery of numerous genetic variants associated with disease. A better understanding of whether these or other variants interact, leading to differential risk compared with individual marker effects, will increase our understanding of the genetic architecture of disease, which may be investigated using the family-based study design. We present M-TDT (the multi-locus transmission disequilibrium test), a tool for detecting family-based multi-locus multi-allelic effects for qualitative or quantitative traits, extended from the original transmission disequilibrium test (TDT). Tests to handle the comparison between additive and epistatic models, lack of independence between markers and multiple offspring are described. Performance of M-TDT is compared with a multifactor dimensionality reduction (MDR) approach designed for investigating families in the hypothesis-free genome-wide setting (the multifactor dimensionality reduction pedigree disequilibrium test, MDR-PDT). Other methods derived from the TDT or MDR to investigate genetic interaction in the family-based design are also discussed. The case of three independent biallelic loci is illustrated using simulations for one- to three-locus alternative hypotheses. M-TDT identified joint-locus effects and distinguished effectively between additive and epistatic models. We showed a practical example of M-TDT based on three genes already known to be implicated in malaria susceptibility. Our findings demonstrate the value of M-TDT in a hypothesis-driven context to test for multi-way epistasis underlying common disease etiology, whereas MDR-PDT-based methods are more appropriate in a hypothesis-free genome-wide setting.
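
For orientation, the original single-locus, biallelic TDT that M-TDT extends can be sketched as below. This is the classic McNemar-type statistic only, not the M-TDT method described in the abstract; the function name and the example counts are illustrative assumptions.

```python
from math import erfc, sqrt
from typing import Tuple


def tdt_statistic(b: int, c: int) -> Tuple[float, float]:
    """Classic biallelic TDT on transmissions from heterozygous parents.

    b -- number of heterozygous parents transmitting the allele of interest
    c -- number of heterozygous parents transmitting the other allele
    Returns (chi-square statistic with 1 degree of freedom, p-value).
    """
    if b + c == 0:
        raise ValueError("no informative (heterozygous) transmissions")
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom:
    # P(X > x) = erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Illustrative counts only: 60 transmissions of the allele versus 40 of the other.
chi2, p = tdt_statistic(60, 40)
```

Under the null hypothesis of no linkage or association, each allele is transmitted with probability one half, so a large imbalance between `b` and `c` gives a large statistic.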