41 research outputs found

    Beyond Support in Two-Stage Variable Selection

    Numerous variable selection methods rely on a two-stage procedure, where a sparsity-inducing penalty is used in the first stage to predict the support, which is then conveyed to the second stage for estimation or inference purposes. In this framework, the first stage screens variables to find a set of possibly relevant variables, and the second stage operates on this set of candidate variables to improve estimation accuracy or to assess the uncertainty associated with the selection of variables. We advocate that more information can be conveyed from the first stage to the second one: we use the magnitude of the coefficients estimated in the first stage to define an adaptive penalty that is applied at the second stage. We give two examples of procedures that can benefit from the proposed transfer of information, in estimation and inference problems respectively. Extensive simulations demonstrate that this transfer is particularly efficient when each stage operates on distinct subsamples. This separation plays a crucial role in the computation of calibrated p-values, making it possible to control the False Discovery Rate. In this setup, the proposed transfer results in sensitivity gains ranging from 50% to 100% compared to state-of-the-art
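
    The two-stage idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the form of the adaptive penalty, so the sketch assumes a lasso first stage on one half of the sample and a ridge second stage on the other half, with a hypothetical per-coefficient weight `l2_scale / |b1_j|`. The names `coordinate_descent`, `two_stage`, `simulate`, `l1` and `l2_scale` are illustrative only.

```python
import random


def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def coordinate_descent(X, y, l1=0.0, l2=None, n_iter=200):
    """Minimise (1/2n)*||y - Xb||^2 + l1*sum|b_j| + 0.5*sum(l2[j]*b_j^2)."""
    n, p = len(X), len(X[0])
    l2 = l2 if l2 is not None else [0.0] * p
    b = [0.0] * p
    resid = list(y)  # residual y - Xb, with b = 0 initially
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the partial residual (b_j removed).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * b[j]) for i in range(n)) / n
            new = soft_threshold(rho, l1) / (col_sq[j] + l2[j])
            delta = new - b[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                b[j] = new
    return b


def two_stage(X, y, l1, l2_scale):
    """Stage 1: lasso on the first half. Stage 2: weighted ridge on the second half."""
    half = len(X) // 2
    p = len(X[0])
    b1 = coordinate_descent(X[:half], y[:half], l1=l1)
    support = [j for j in range(p) if b1[j] != 0.0]
    b2 = [0.0] * p
    if support:
        X2 = [[row[j] for j in support] for row in X[half:]]
        # Assumed adaptive penalty: small first-stage coefficients are shrunk more.
        weights = [l2_scale / abs(b1[j]) for j in support]
        for j, value in zip(support, coordinate_descent(X2, y[half:], l2=weights)):
            b2[j] = value
    return b1, b2


def simulate(n=80, p=6, seed=0):
    """Toy data: only the first two variables are relevant (assumed setup)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [3.0 * row[0] - 2.0 * row[1] + rng.gauss(0.0, 0.5) for row in X]
    return X, y
```

    Calling `two_stage(*simulate(), l1=0.3, l2_scale=0.1)` returns the first-stage and second-stage coefficient vectors. Variables screened out in the first stage stay at zero in the second, and the second-stage penalty on each retained variable depends on its first-stage magnitude rather than on the support alone.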

    What do we learn from a Genome Wide Association Study performed on HIV-1 infected Long Term Non Progressors individuals?

    Previous Genome Wide Association Studies performed on Elite Controllers and control HIV-1 infected individuals have shown that the MHC locus is predominantly responsible for containing plasma viremia below a threshold of detection. Here we performed a GWAS on a cohort of 160 HIV-1 infected Caucasian Long Term Non Progressors (LTNP) from the EC-funded European-African "GISHEAL" Consortium in order to explore whether novel genetic factors could account for the LTNP phenotype (i.e. maintenance of CD4 T cell counts >500 cells/μl and good health conditions without therapy).

    Patterns of chromosomal copy-number alterations in intrahepatic cholangiocarcinoma

    Background: Intrahepatic cholangiocarcinomas (ICC) are relatively rare malignant tumors associated with a poor prognosis. Recent studies using genome-wide sequencing technologies have mainly focused on identifying new driver mutations. There is nevertheless a need to investigate the spectrum of copy number aberrations in order to identify potential target genes in the altered chromosomal regions. The aim of this study was to characterize the patterns of chromosomal copy-number alterations (CNAs) in ICC. Methods: Fifty-three patients with ICC and available frozen material were selected. In 47 cases, DNA hybridization was performed on a genome-wide SNP array. A procedure with a segmentation step and a calling step classified genomic regions into copy-number aberration states. We identified the exclusively amplified and deleted recurrent genomic areas. These areas are those showing the highest estimated propensity level for copy loss (resp. copy gain) together with the lowest level for copy gain (resp. copy loss). We investigated ICC clustering. We analyzed the relationships between CNAs and clinico-pathological characteristics. Results: The overall genomic profile of ICC showed many alterations, with higher rates for the deletions. Exclusively deleted genomic areas were 1p, 3p and 14q. The main exclusively amplified genomic areas were 1q, 7p, 7q and 8q. Based on the exclusively deleted/amplified genomic areas, a clustering analysis identified three tumor groups: the first group characterized by copy loss of 1p and copy gain of 7p, the second group characterized by 1p and 3p copy losses without 7p copy gain, and the last group characterized mainly by very few CNAs. From univariate analyses, the number of tumors, the size of the largest tumor and the stage were significantly associated with shorter time to recurrence. We found no relationship between the number of altered cytobands or tumor groups and time to recurrence.
    Conclusion: This study describes the spectrum of chromosomal aberrations across the whole genome. Some of the recurrent exclusive CNAs harbor candidate target genes. Despite the absence of correlation between CNAs and clinico-pathological characteristics, the co-occurrence of 7p gain and 1p loss in a subgroup of patients may suggest a differential activation of EGFR and its downstream pathways, which may have a potential effect on targeted therapies.

    Genetics of VEGF Serum Variation in Human Isolated Populations of Cilento: Importance of VEGF Polymorphisms

    Vascular Endothelial Growth Factor (VEGF) is the main player in angiogenesis. Because of its crucial role in this process, the study of the genetic factors controlling VEGF variability may be of particular interest for many angiogenesis-associated diseases. Although some polymorphisms in the VEGF gene have been associated with a susceptibility to several disorders, no genome-wide search on VEGF serum levels has been reported so far. We carried out a genome-wide linkage analysis in three isolated populations and we detected a strong linkage between VEGF serum levels and the 6p21.1 VEGF region in all samples. A new locus on chromosome 3p26.3 significantly linked to VEGF serum levels was also detected in a combined population sample. Sequencing of the gene followed by an association study identified three common single nucleotide polymorphisms (SNPs) influencing VEGF serum levels in one population (Campora): two already reported in the literature (rs3025039, rs25648) and one new signal (rs3025020). A fourth SNP (rs41282644) was found to affect VEGF serum levels in another population (Cardile). All the identified SNPs contribute to the related population linkages (35% of the linkage explained in Campora and 15% in Cardile). Interestingly, none of the SNPs influencing VEGF serum levels in one population was found to be associated in the two other populations. These results allow us to exclude the hypothesis that the common variants located in the exons, intron-exon junctions, promoter and regulatory regions of the VEGF gene may have a causal effect on the VEGF variation. The data support the alternative hypothesis of a multiple rare variant model, possibly consisting of distinct variants in different populations, influencing VEGF serum levels.

    A constrained polynomial regression procedure for estimating the local False Discovery Rate

    Background: In the context of genomic association studies, for which a large number of statistical tests are performed simultaneously, the local False Discovery Rate (lFDR), which quantifies the evidence of a specific gene association with a clinical or biological variable of interest, is a relevant criterion for taking into account the multiple testing problem. The lFDR not only allows an inference to be made for each gene through its specific value, but also provides an estimate of Benjamini-Hochberg's False Discovery Rate (FDR) for subsets of genes. Results: In the framework of estimating procedures without any distributional assumption under the alternative hypothesis, a new and efficient procedure for estimating the lFDR is described. The results of a simulation study indicated good performance for the proposed estimator in comparison to four published ones. The five different procedures were applied to real datasets. Conclusion: A novel and efficient procedure for estimating lFDR was developed and evaluated.
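
    The link between per-gene lFDR values and the FDR of a gene subset can be made concrete with a short sketch. It assumes the lFDR values have already been estimated by some procedure (the constrained polynomial regression itself is not reproduced here) and uses the common estimate of a subset's FDR as the mean of its lFDR values. The function names and the example values are hypothetical.

```python
def fdr_of_subset(lfdr_values):
    """Estimate the FDR of a gene subset as the mean of its local FDR values."""
    if not lfdr_values:
        raise ValueError("the subset must contain at least one gene")
    return sum(lfdr_values) / len(lfdr_values)


def largest_subset_under(lfdr_by_gene, alpha):
    """Return the largest set of top-ranked genes whose estimated FDR is <= alpha.

    Genes are added in increasing lFDR order, so the running mean never
    decreases and the loop can stop at the first violation.
    """
    ranked = sorted(lfdr_by_gene.items(), key=lambda item: item[1])
    selected, total = [], 0.0
    for gene, value in ranked:
        if (total + value) / (len(selected) + 1) > alpha:
            break
        selected.append(gene)
        total += value
    return selected


# Hypothetical lFDR estimates for five genes (illustrative values only).
example_lfdr = {"g1": 0.01, "g2": 0.03, "g3": 0.20, "g4": 0.60, "g5": 0.95}
example_selection = largest_subset_under(example_lfdr, alpha=0.10)
```

    With these illustrative values, the first three genes are kept: their lFDR values average 0.08, while adding the fourth would raise the mean above 0.10. Note that a gene with an individual lFDR of 0.20 can still belong to a subset whose overall FDR is below 0.10, which is the distinction between the two criteria that the abstract draws.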

    Genomic aberrations associated with outcome in anaplastic oligodendroglial tumors treated within the EORTC phase III trial 26951

    Despite similar morphological aspects, anaplastic oligodendroglial tumors (AOTs) form a heterogeneous clinical subgroup of gliomas. The chromosome arms 1p/19q codeletion has been shown to be a relevant biomarker in AOTs and to be mutually exclusive with EGFR amplification in gliomas. To identify new genomic regions associated with prognosis, 60 AOTs from the EORTC trial 26951 were analyzed retrospectively using BAC-array-based comparative genomic hybridization. The data were processed using a binary tree method. Thirty-three BACs with prognostic value were identified, distinguishing four genomic subgroups of AOTs with different prognosis (p < 0.0001). Type I tumors (25%) were characterized by: (1) an EGFR amplification, (2) a poor prognosis, (3) a higher rate of necrosis, and (4) an older age of patients. Type II tumors (21.7%) had: (1) loss of prognostic BACs located on 1p tightly associated with 19q deletion, (2) a longer survival, (3) an oligodendroglioma phenotype, and (4) a frontal location in the brain. Type III AOTs (11.7%) exhibited: (1) a deletion of prognostic BACs located on 21q, and (2) a short survival. Finally, type IV tumors (41.7%) had genomic patterns and prognosis different from those of type I, II and III AOTs. Multivariate analysis showed that genomic type provides additional prognostic data to clinical, imaging and pathological features. Similar results were obtained in the cohort of 45 centrally reviewed–validated cases of AOTs. Whole genome analysis appears useful to screen the numerous genomic abnormalities observed in AOTs and to propose new biomarkers, particularly in the non-1p/19q codeleted AOTs.

    Distinct Genetic Loci Control Plasma HIV-RNA and Cellular HIV-DNA Levels in HIV-1 Infection: The ANRS Genome Wide Association 01 Study

    Previous studies of HIV-1 disease have shown that HLA and chemokine receptor genetic variants influence disease progression and early viral load. We performed a Genome Wide Association study in a cohort of 605 HIV-1-infected seroconverters for detection of novel genetic factors that influence plasma HIV-RNA and cellular HIV-DNA levels. Most of the SNPs strongly associated with HIV-RNA levels were localised in the 6p21 major histocompatibility complex (MHC) region and were in the vicinity of class I and III genes. Moreover, protective alleles for four disease-associated SNPs in the MHC locus (rs2395029, rs13199524, rs12198173 and rs3093662) were strikingly over-represented among forty-five Long Term HIV controllers. Furthermore, we show that the HIV-DNA levels (reflecting the HIV reservoir) are associated with the same four SNPs, but also with two additional SNPs on chromosome 17 (rs6503919; intergenic region flanked by the DDX40 and YPEL2 genes) and chromosome 8 (rs2575735; within the Syndecan 2 gene). Our data provide evidence that the MHC controls both HIV replication and HIV reservoir. They also indicate that two additional genomic loci may influence the HIV reservoir.