A Solve-RD ClinVar-based reanalysis of 1522 index cases from ERN-ITHACA reveals common pitfalls and misinterpretations in exome sequencing
Purpose
Within the Solve-RD project (https://solve-rd.eu/), the European Reference Network for Intellectual disability, TeleHealth, Autism and Congenital Anomalies aimed to investigate whether a reanalysis of exomes from unsolved cases based on ClinVar annotations could establish additional diagnoses. We present the results of the “ClinVar low-hanging fruit” reanalysis, reasons for the failure of previous analyses, and lessons learned.
Methods
Data from the first 3576 exomes (1522 probands and 2054 relatives) collected from the European Reference Network for Intellectual disability, TeleHealth, Autism and Congenital Anomalies were reanalyzed by the Solve-RD consortium for the presence of single-nucleotide variants and small insertions and deletions already reported as (likely) pathogenic in ClinVar. Variants were filtered according to frequency, genotype, and mode of inheritance and reinterpreted.
Results
We identified causal variants in 59 cases (3.9%), 50 of them also raised by other approaches and 9 leading to new diagnoses, highlighting interpretation challenges: variants in genes not known to be involved in human disease at the time of the first analysis, misleading genotypes, or variants undetected by local pipelines (variants in off-target regions, low quality filters, low allelic balance, or high frequency).
Conclusion
The “ClinVar low-hanging fruit” analysis represents an effective, fast, and easy approach to recover causal variants from exome sequencing data, thereby contributing to the reduction of the diagnostic deadlock.
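The filtering step described in the Methods above (ClinVar status, then frequency, genotype, and mode of inheritance) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Variant` fields, the label set, and the 1% frequency cutoff are hypothetical and are not taken from the Solve-RD pipeline, and the recessive check counts heterozygous hits per gene without phasing them.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical cutoff and labels, chosen for illustration only.
MAX_POPULATION_AF = 0.01
PATHOGENIC_LABELS = {
    "Pathogenic",
    "Likely_pathogenic",
    "Pathogenic/Likely_pathogenic",
}


@dataclass
class Variant:
    gene: str
    clinvar_significance: str  # ClinVar clinical significance label
    population_af: float       # allele frequency in a reference population
    genotype: str              # "het" or "hom" in the proband
    gene_inheritance: str      # "AD" (dominant) or "AR" (recessive)


def select_candidates(variants: list[Variant]) -> list[Variant]:
    """Keep rare (likely) pathogenic variants whose genotype fits the gene's mode of inheritance."""
    rare_pathogenic = [
        v for v in variants
        if v.clinvar_significance in PATHOGENIC_LABELS
        and v.population_af <= MAX_POPULATION_AF
    ]
    # Count heterozygous hits per gene as a crude stand-in for compound
    # heterozygosity (real pipelines would phase them, e.g. with parental data).
    het_per_gene = Counter(v.gene for v in rare_pathogenic if v.genotype == "het")

    selected = []
    for v in rare_pathogenic:
        if v.gene_inheritance == "AD":
            selected.append(v)  # one allele is enough under a dominant model
        elif v.genotype == "hom" or het_per_gene[v.gene] >= 2:
            selected.append(v)  # recessive model needs two hits
    return selected
```

A hard frequency cutoff like the one assumed here is the kind of filter that, according to the Results above, can hide a causal variant with high frequency, so any such threshold deserves review during reanalysis.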
Impact of transcriptional deregulation linked to the production of chimeric transcripts initiated from LINE-1 repeat elements in gliomas
LINE-1 (L1) elements are an abundant class of retrotransposons that make up 17% of the human genome. The 5’ UTR of the youngest L1 subfamilies (L1PA1 to L1PA6) contains a bidirectional promoter that comprises, in addition to the internal sense promoter, an antisense promoter called ASP. In normal cells, the main defense mechanism against the deleterious effects of L1 activity is DNA methylation of the L1 promoter. A hallmark of cancer genomes is global DNA hypomethylation, which particularly affects L1 promoters. In tumors, evidence suggests that this hypomethylation can induce transcription from the ASP of aberrant L1 chimeric transcripts (LCTs), composed of the L1 5’ end followed by the adjacent genomic sequence. To investigate the genome-wide extent of this transcriptional deregulation and its role in tumoral processes, a dedicated bioinformatics tool, CLIFinder, was designed to select putative LCTs from stranded paired-end RNA-seq reads. RNA-seq data from 13 gliomas, the most common brain cancers in adults, and 3 control brain tissues were analyzed. CLIFinder identified 2675 chimeras in gliomas, of which 84% involve recent full-length L1 elements (L1PA1 to L1PA7) presumed to possess a functional ASP, and 50% were detected specifically in tumor samples. Of these, 78 chimeras correspond to LCTs already described in the literature. In addition, CLIFinder analysis of RNA-seq data from other tumor types (the MCF7 cell line and ovarian metastases) identified shared chimeras, suggesting that some of them are recurrent. Analysis of a group of chimeras by 5’ walking RT-PCR validated that 89% (56/63) of the chimeras involving recent L1 elements (L1PA1 to L1PA7) are initiated in the ASP region and therefore correspond to LCTs, whereas all tested chimeras involving an L1PA8 element are initiated upstream of this region.
RT-qPCR studies on a larger cohort of 51 gliomas show that all 56 tested LCTs, including those identified by CLIFinder as “tumor specific”, are expressed not only in tumors but also in controls. Nevertheless, 70% of the “tumor specific” LCTs are significantly overexpressed in tumors. These results suggest that, despite L1 5’ UTR methylation, some ASPs are active in normal tissues and drive basal LCT expression there, and that the LCT-linked transcriptional deregulation in gliomas takes the form of overexpression. To determine the mechanisms underlying the increased transcriptional activity of the ASP, two hypotheses were tested. The first involves hypomethylation of the L1 promoter. My results tend to refute this hypothesis, because no decrease in DNA methylation was found at the promoter region of the L1 elements linked to overexpressed LCTs. On the other hand, genes associated with LCTs whose expression is deregulated in tumors are deregulated in the same direction as the LCT. Moreover, variations in gene expression correlate systematically with those of the corresponding LCTs. This suggests that increased transcriptional activity at the LCT loci is responsible for LCT overexpression. Finally, 2 overexpressed candidate LCTs with potential as predictive biomarkers of patient survival could play a functional role in tumor initiation, progression, and/or aggressiveness. In conclusion, my work has validated CLIFinder as a useful tool for identifying, genome-wide, the LCTs expressed in different tumor types from stranded paired-end RNA-seq data. The observed recurrence and tumoral overexpression of some LCTs suggest that they may play a functional role in tumorigenesis.
Study of the impact of transcriptional deregulation linked to chimeric transcripts initiated from LINE-1 repeat elements in glial tumorigenesis
LINE-1 (L1) elements are an abundant class of retrotransposons that make up 17% of the human genome. The 5’ UTR of the youngest L1 subfamilies (L1PA1 to L1PA6) contains a bidirectional promoter that comprises, in addition to the internal sense promoter, an antisense promoter called ASP. In normal cells, the main defense mechanism against the deleterious effects of L1 activity is DNA methylation of the L1 promoter. A hallmark of cancer genomes is global DNA hypomethylation, which particularly affects L1 promoters. In tumors, evidence suggests that this hypomethylation can induce transcription from the ASP of aberrant L1 chimeric transcripts (LCTs), composed of the L1 5’ end followed by the adjacent genomic sequence. To investigate the genome-wide extent of this transcriptional deregulation and its role in tumoral processes, a dedicated bioinformatics tool, CLIFinder, was designed to select putative LCTs from stranded paired-end RNA-seq reads. RNA-seq data from 13 gliomas, the most common brain cancers in adults, and 3 control brain tissues were analyzed. CLIFinder identified 2675 chimeras in gliomas, of which 84% involve recent full-length L1 elements (L1PA1 to L1PA7) presumed to possess a functional ASP, and 50% were detected specifically in tumor samples. Of these, 78 chimeras correspond to LCTs already described in the literature. In addition, CLIFinder analysis of RNA-seq data from other tumor types (the MCF7 cell line and ovarian metastases) identified shared chimeras, suggesting that some of them are recurrent. Analysis of a group of chimeras by 5’ walking RT-PCR validated that 89% (56/63) of the chimeras involving recent L1 elements (L1PA1 to L1PA7) are initiated in the ASP region and therefore correspond to LCTs, whereas all tested chimeras involving an L1PA8 element are initiated upstream of this region.
RT-qPCR studies on a larger cohort of 51 gliomas show that all 56 tested LCTs, including those identified by CLIFinder as “tumor specific”, are expressed not only in tumors but also in controls. Nevertheless, 70% of the “tumor specific” LCTs are significantly overexpressed in tumors. These results suggest that, despite L1 5’ UTR methylation, some ASPs are active in normal tissues and drive basal LCT expression there, and that the LCT-linked transcriptional deregulation in gliomas takes the form of overexpression. To determine the mechanisms underlying the increased transcriptional activity of the ASP, two hypotheses were tested. The first involves hypomethylation of the L1 promoter. My results tend to refute this hypothesis, because no decrease in DNA methylation was found at the promoter region of the L1 elements linked to overexpressed LCTs. On the other hand, genes associated with LCTs whose expression is deregulated in tumors are deregulated in the same direction as the LCT. Moreover, variations in gene expression correlate systematically with those of the corresponding LCTs. This suggests that increased transcriptional activity at the LCT loci is responsible for LCT overexpression. Finally, 2 overexpressed candidate LCTs with potential as predictive biomarkers of patient survival could play a functional role in tumor initiation, progression, and/or aggressiveness. In conclusion, my work has validated CLIFinder as a useful tool for identifying, genome-wide, the LCTs expressed in different tumor types from stranded paired-end RNA-seq data. The observed recurrence and tumoral overexpression of some LCTs suggest that they may play a functional role in tumorigenesis.
CLIFinder: Identification of LINE-1 Chimeric Transcripts in RNA-seq data
L1 chimeric transcripts (LCTs) are initiated by repeated LINE-1 element antisense promoters and include the L1 5'UTR sequence in antisense orientation followed by the adjacent genomic region. LCTs have been characterized mainly using bioinformatics approaches to query dbEST. To take advantage of NGS data to unravel the transcriptome composition, we developed Chimeric LIne Finder (CLIFinder), a new bioinformatics tool. Using oriented paired-end RNA-seq data, we demonstrated that CLIFinder can identify genome-wide transcribed chimera sequences corresponding to potential LCTs. Moreover, CLIFinder can be adapted to study transcription from other repeat types.
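The idea of spotting a potential LCT in oriented paired-end data, with one mate in the L1 element in antisense orientation and the other in the adjacent genomic region, can be sketched as below. This is a simplified illustration and not CLIFinder's actual algorithm: the `Interval` type, the `MAX_GAP` distance, and the assumption that each mate already carries the inferred transcript strand are all hypothetical.

```python
from dataclasses import dataclass

MAX_GAP = 5000  # hypothetical maximum distance (bp) between the L1 and the flanking mate


@dataclass(frozen=True)
class Interval:
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    strand: str  # "+" or "-"; for mates, the inferred transcript strand (assumption)


def overlaps(a: Interval, b: Interval) -> bool:
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def in_antisense_flank(other: Interval, l1: Interval, max_gap: int) -> bool:
    """True if `other` lies in the flank reached by an antisense transcript leaving the L1 5' end."""
    if other.chrom != l1.chrom:
        return False
    if l1.strand == "+":
        # 5' end of the L1 is at l1.start; antisense transcription runs to lower coordinates.
        return other.end <= l1.start and l1.start - other.end <= max_gap
    # L1 on the minus strand: 5' end at l1.end; antisense transcription runs to higher coordinates.
    return other.start >= l1.end and other.start - l1.end <= max_gap


def is_putative_chimera(mate1: Interval, mate2: Interval,
                        l1_elements: list[Interval], max_gap: int = MAX_GAP) -> bool:
    """One mate antisense inside an L1, the other outside any L1 in the matching flank."""
    for l1_mate, other in ((mate1, mate2), (mate2, mate1)):
        if any(overlaps(other, e) for e in l1_elements):
            continue  # the flanking mate must not fall in an annotated L1
        if other.strand != l1_mate.strand:
            continue  # both mates should report the same transcript strand
        for l1 in l1_elements:
            if (overlaps(l1_mate, l1)
                    and l1_mate.strand != l1.strand
                    and in_antisense_flank(other, l1, max_gap)):
                return True
    return False
```

A real tool would additionally need to handle multi-mapping reads within repeats and to cluster supporting pairs per locus; those steps are deliberately left out of this sketch.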
L1 chimeric transcripts are expressed in healthy brain and their deregulation in glioma follows that of their host locus
Besides the consequences of retrotransposition, long interspersed element 1 (L1) retrotransposons can affect the host genome through their antisense promoter. In addition to the sense promoter, the evolutionarily recent L1 retrotransposons, which are present in several thousand copies, also possess an antisense promoter that can produce L1 chimeric transcripts (LCT) composed of the L1 5′ UTR followed by the adjacent genomic sequence. The full extent to which LCT expression occurs in a given tissue, and whether disruption of the defense mechanisms that normally control L1 retrotransposons affects their expression and function in cancer cells, remain to be established. By using CLIFinder, a dedicated bioinformatics tool, we found that LCT expression was widespread in normal brain and aggressive glioma samples, and that approximately 17% of recent L1 retrotransposons, from the L1PA1 to L1PA7 subfamilies, were involved in their production. Importantly, the transcriptional activities of the L1 antisense promoters and of their host loci were coupled. Accordingly, we detected LCT-producing L1 retrotransposons mainly in transcriptionally active genes and genomic loci. Moreover, changes in the host genomic locus expression level in glioma were associated with a similar change in LCT expression level, regardless of the L1 promoter methylation status. Our findings support a model in which the host genomic locus transcriptional activity is the main driving force of LCT expression. We hypothesize that this model is more applicable when host gene and LCT are transcribed from the same strand.
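One simple way to examine the coupling described above is to compare, per locus, the tumor-versus-control log fold change of an LCT with that of its host locus. The sketch below is an assumption about how such a check could look, not the analysis used in the study; the function names and the use of a Pearson coefficient are illustrative choices.

```python
import math


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation between paired values (e.g. LCT and host-locus log fold changes)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def same_direction_fraction(lct_lfc: list[float], host_lfc: list[float]) -> float:
    """Fraction of loci where the LCT and its host locus change in the same direction."""
    if len(lct_lfc) != len(host_lfc) or not lct_lfc:
        raise ValueError("need two non-empty sequences of equal length")
    same = sum(1 for a, b in zip(lct_lfc, host_lfc) if a * b > 0)
    return same / len(lct_lfc)
```

A high correlation together with a high same-direction fraction would be consistent with the model in which host locus activity drives LCT expression, although correlation alone does not establish the direction of the effect.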
Rare predicted loss-of-function variants of type I IFN immunity genes are associated with life-threatening COVID-19
Background
We previously reported that impaired type I IFN activity, due to inborn errors of TLR3- and TLR7-dependent type I interferon (IFN) immunity or to autoantibodies against type I IFN, accounts for 15-20% of cases of life-threatening COVID-19 in unvaccinated patients. Therefore, the determinants of life-threatening COVID-19 remain to be identified in ~80% of cases.
Methods
We report here a genome-wide rare variant burden association analysis in 3269 unvaccinated patients with life-threatening COVID-19 and 1373 unvaccinated SARS-CoV-2-infected individuals without pneumonia. Among the 928 patients tested for autoantibodies against type I IFN, a quarter (234) were positive and were excluded.
Results
No gene reached genome-wide significance. Under a recessive model, the most significant gene with at-risk variants was TLR7, with an OR of 27.68 (95% CI 1.5-528.7, P = 1.1 × 10⁻⁴) for biochemically loss-of-function (bLOF) variants. We replicated the enrichment in rare predicted LOF (pLOF) variants at 13 influenza susceptibility loci involved in TLR3-dependent type I IFN immunity (OR = 3.70 [95% CI 1.3-8.2], P = 2.1 × 10⁻⁴). This enrichment was further strengthened by (1) adding the recently reported TYK2 and TLR7 COVID-19 loci, particularly under a recessive model (OR = 19.65 [95% CI 2.1-2635.4], P = 3.4 × 10⁻³), and (2) also treating as pLOF the branchpoint variants with potentially strong impacts on splicing at the 15 loci (OR = 4.40 [95% CI 2.3-8.4], P = 7.7 × 10⁻⁸). Finally, the patients with pLOF/bLOF variants at these 15 loci were significantly younger (mean age [SD] = 43.3 [20.3] years) than the other patients (56.0 [17.3] years; P = 1.68 × 10⁻⁵).
Conclusions
Rare variants of TLR3- and TLR7-dependent type I IFN immunity genes can underlie life-threatening COVID-19, particularly with recessive inheritance, in patients under 60 years old.
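To make the reported odds ratios and their wide confidence intervals easier to read, the sketch below computes an odds ratio with a 95% CI from a 2×2 carrier table. It uses the textbook Woolf (logit) interval with a Haldane-Anscombe 0.5 correction for empty cells; the abstract does not state which estimator the authors used, so this is a stand-in technique, and any counts passed to it are the caller's own.

```python
import math


def odds_ratio_ci(case_carriers: float, case_noncarriers: float,
                  control_carriers: float, control_noncarriers: float,
                  z: float = 1.96) -> tuple[float, float, float]:
    """Odds ratio and Woolf confidence interval for a 2x2 table.

    Adds 0.5 to every cell (Haldane-Anscombe correction) when any cell is zero,
    which is common for rare variants and one reason intervals become very wide.
    """
    a, b, c, d = case_carriers, case_noncarriers, control_carriers, control_noncarriers
    if min(a, b, c, d) < 0:
        raise ValueError("counts must be non-negative")
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

With very few carriers among controls, the standard error of the log odds ratio is dominated by the smallest cell, which is why an interval can span two or three orders of magnitude while still excluding 1.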