15 research outputs found
Functional microRNA screening using a comprehensive lentiviral human microRNA expression library
ABSTRACT: BACKGROUND: MicroRNAs (miRNAs) are a class of small regulatory RNAs that target sequences in messenger RNAs (mRNAs) to inhibit their protein output. Dissecting the complexities of miRNA function continues to prove challenging as miRNAs are predicted to have thousands of targets, and mRNAs can be targeted by dozens of miRNAs. RESULTS: To systematically address biological function of miRNAs, we constructed and validated a lentiviral miRNA expression library containing 660 currently annotated and 422 candidate human miRNA precursors. The miRNAs are expressed from their native genomic backbone, ensuring physiological processing. The arrayed layout of the library renders it ideal for high-throughput screens, but also allows pooled screening and hit picking. We demonstrate its functionality in both short- and long-term assays, and are able to corroborate previously described results of well-studied miRNAs. CONCLUSIONS: With the miRNA expression library we provide a versatile tool for the systematic elucidation of miRNA function.
Fusion transcripts and their genomic breakpoints in polyadenylated and ribosomal RNA-minus RNA sequencing data
BACKGROUND: Fusion genes are typically identified by RNA sequencing (RNA-seq) without elucidating the causal genomic breakpoints. However, non–poly(A)-enriched RNA-seq contains large proportions of intronic reads that also span genomic breakpoints. RESULTS: We have developed an algorithm, Dr. Disco, that searches for fusion transcripts by taking an entire reference genome into account as search space. This includes exons but also introns, intergenic regions, and sequences that do not meet splice junction motifs. Using 1,275 RNA-seq samples, we investigated to what extent genomic breakpoints can be extracted from RNA-seq data and their implications regarding poly(A)-enriched and ribosomal RNA–minus RNA-seq data. Comparison with whole-genome sequencing data revealed that most genomic breakpoints are not, or minimally, transcribed while, in contrast, the genomic breakpoints of all 32 TMPRSS2-ERG–positive tumours were present at RNA level. We also revealed tumours in which the ERG breakpoint was located before ERG, which co-existed with additional deletions and messenger RNA that incorporated intergenic cryptic exons. In breast cancer we identified rearrangement hot spots near CCND1 and in glioma near CDK4 and MDM2 and could directly associate this with increased expression. Furthermore, in all datasets we find fusions to intergenic regions, often spanning multiple cryptic exons that potentially encode neo-antigens. Thus, fusion transcripts other than classical gene-to-gene fusions are prominently present and can be identified using RNA-seq. CONCLUSION: By using the full potential of non–poly(A)-enriched RNA-seq data, sophisticated analysis can reliably identify expressed genomic breakpoints and their transcriptional effects.
Consensus molecular subtype classification of colorectal adenomas
Consensus molecular subtyping is an RNA expression-based classification system for colorectal cancer (CRC). Genomic alterations accumulate during CRC pathogenesis, including the premalignant adenoma stage, leading to changes in RNA expression. Only a minority of adenomas progress to malignancies, a transition that is associated with specific DNA copy number aberrations or microsatellite instability (MSI). We aimed to investigate whether colorectal adenomas can already be stratified into consensus molecular subtype (CMS) classes, and whether specific CMS classes are related to the presence of specific DNA copy number aberrations associated with progression to malignancy. RNA sequencing was performed on 62 adenomas and 59 CRCs. MSI status was determined with polymerase chain reaction-based methodology. DNA copy number was assessed by low-coverage DNA sequencing (n = 30) or array-comparative genomic hybridisation (n = 32). Adenomas were classified into CMS classes together with CRCs from the study cohort and from The Cancer Genome Atlas (n = 556), by use of the established CMS classifier. As a result, 54 of 62 (87%) adenomas were classified according to the CMS. The CMS3 'metabolic subtype', which was least common among CRCs, was most prevalent among adenomas (n = 45; 73%). One of the two adenomas showing MSI was classified as CMS1 (2%), the 'MSI immune' subtype. Eight adenomas (13%) were classified as the 'canonical' CMS2. No adenomas were classified as the 'mesenchymal' CMS4, consistent with the fact that adenomas lack invasion-associated stroma. The distribution of the CMS classes among adenomas was confirmed in an independent series. CMS3 was enriched with adenomas at low risk of progressing to CRC, whereas relatively more high-risk adenomas were observed in CMS2. We conclude that adenomas can be stratified into the CMS classes. Considering that CMS1 and CMS2 expression signatures may mark adenomas at increased risk of progression, the distribution of the CMS classes among adenomas is consistent with the proportion of adenomas expected to progress to CRC.
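The counts reported in the abstract above can be checked against its percentages with a few lines of arithmetic. The following is a minimal sketch, not part of the original study; the variable and function names are illustrative, and only the counts (1, 8, 45 and 0 of 62 adenomas) come from the abstract.

```python
# Counts reported in the abstract above (62 adenomas in total).
# All names here are illustrative and do not come from the study.
TOTAL_ADENOMAS = 62
cms_counts = {"CMS1": 1, "CMS2": 8, "CMS3": 45, "CMS4": 0}


def percent(count, total):
    """Return count/total as a percentage rounded to the nearest integer."""
    return round(100 * count / total)


# Number of adenomas that received any CMS label.
classified = sum(cms_counts.values())

# Share of all adenomas assigned to each CMS class.
percentages = {label: percent(n, TOTAL_ADENOMAS) for label, n in cms_counts.items()}

print(classified, percent(classified, TOTAL_ADENOMAS))
print(percentages)
```

The per-class counts sum to the 54 classified adenomas, and each rounded share matches the figure quoted in the abstract.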
Identification of differentially expressed splice variants by the proteogenomic pipeline splicify
Proteogenomics, i.e. comprehensive integration of genomics and proteomics data, is a powerful approach for identifying novel protein biomarkers. This is especially the case for proteins that differ structurally between disease and control conditions. As tumor development is associated with aberrant splicing, we focus on this rich source of cancer-specific biomarkers. To this end, we developed a proteogenomic pipeline, Splicify, which can detect differentially expressed protein isoforms. Splicify is based on integrating RNA massive parallel sequencing data and tandem mass spectrometry proteomics data to identify protein isoforms resulting from differential splicing between two conditions. Proof of concept was obtained by applying Splicify to RNA sequencing and mass spectrometry data obtained from colorectal cancer cell line SW480, before and after siRNA-mediated downmodulation of the splicing factors SF3B1 and SRSF1. These analyses revealed 2172 and 149 differentially expressed isoforms, respectively, with peptide confirmation upon knock-down of SF3B1 and SRSF1 compared with their controls. Splice variants identified included RAC1, OSBPL3, MKI67, and SYK. One additional sample was analyzed by PacBio Iso-Seq full-length transcript sequencing after SF3B1 downmodulation. This analysis verified the alternative splicing identified by Splicify and in addition identified novel splicing events that were not represented in the human reference genome annotation. Therefore, Splicify offers a validated proteogenomic data analysis pipeline for identification of disease-specific protein biomarkers resulting from mRNA alternative splicing. Splicify is publicly available on GitHub (https://github.com/NKI-TGO/SPLICIFY) and suitable to address basic research questions using pre-clinical model systems as well as translational research questions using patient-derived samples, e.g. to identify clinically relevant biomarkers.
Genetic Profiling of Colorectal Carcinomas of Patients with Primary Sclerosing Cholangitis and Inflammatory Bowel Disease
BACKGROUND: Patients with primary sclerosing cholangitis (PSC) and inflammatory bowel disease (IBD) run a 10-fold increased risk of developing colorectal cancer (CRC) compared to patients with IBD only. The aim of this study was to perform an extensive screen of known carcinogenic genomic alterations in patients with PSC-IBD, and to investigate whether such changes occur already in nondysplastic mucosa. METHODS: Archival cancer tissue and nondysplastic mucosa from resection specimens of 19 patients with PSC-IBD-CRC were characterized, determining DNA copy-number variations, microsatellite instability (MSI), mutations on 48 cancer genes, and CpG island methylator phenotype (CIMP). Genetic profiles were compared with 2 published cohorts of IBD-associated CRC (IBD-CRC; n = 11) and sporadic CRC (s-CRC; n = 100). RESULTS: Patterns of chromosomal aberrations in PSC-IBD-CRC were similar to those observed in IBD-CRC and s-CRC; MSI occurred only once. Mutation frequencies were comparable between the groups, except for mutations in KRAS, which were less frequent in PSC-IBD-CRC (5%) versus IBD-CRC (38%) and s-CRC (31%; P = .034), and in APC, which were less frequent in PSC-IBD-CRC (5%) and IBD-CRC (0%) versus s-CRC (50%; P < .001). Cases of PSC-IBD-CRC were frequently CIMP positive (44%), at similar levels to cases of s-CRC (34%; P = .574) but less frequent than in cases with IBD-CRC (90%; P = .037). Similar copy number aberrations and mutations were present in matched cancers and adjacent mucosa in 5/15 and 7/11 patients, respectively. CONCLUSIONS: The excess risk of CRC in patients with PSC-IBD was not explained by copy number aberrations, mutations, MSI, or CIMP status, either in cancer tissue or in adjacent mucosa. These findings set the stage for further exome-wide and epigenetic studies.
DNA hypermethylation analysis in sputum of asymptomatic subjects at risk for lung cancer participating in the NELSON trial: Argument for maximum screening interval of 2 years
Aims Lung cancer is the major contributor to cancer mortality due to metastasised disease at time of presentation. The current study investigated DNA hypermethylation of biomarkers RASSF1A, APC, cytoglobin, 3OST2, FAM19A4, PHACTR3 and PRDM14 in sputum of asymptomatic high-risk individuals from the NELSON lung cancer low-dose spiral CT screening trial to detect lung cancer at preclinical stage. Methods Subjects were selected with (i) lung cancer in follow-up (cases; n=65), (ii) minor cytological aberrations (controls; n=120) and (iii) a random selection of subjects without cytological aberrations (controls; n=99). Median follow-up time for controls was 80 months. Cut-off values were based on high specificity to assess diagnostic value of the biomarkers. Results RASSF1A may denote presence of invasive cancer because of its high specificity (93% (95% CI 89% to 96%); sensitivity 17% (95% CI 4% to 31%)), with best performance in a screening interval of 2 years. The panel of RASSF1A, 3OST2 and PRDM14 detected 28% (95% CI 11% to 44%) of lung cancer cases within 2 years, with specificity of 90% (95% CI 86% to 94%). Sputum cytology did not detect any lung cancers. Conclusions In a lung cancer screening setting with maximum screening interval of 2 years, DNA hypermethylation analysis in sputum may play a role in the detection of preclinical disease, but complementary diagnostic markers are needed to improve sensitivity.
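Sensitivity and specificity, the two measures reported in the abstract above, follow directly from the counts of correctly and incorrectly classified subjects. The sketch below only illustrates those standard definitions. The abstract does not give the underlying counts, so the numbers used here are hypothetical and are not the study's data; the function names are likewise illustrative.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of cases (subjects with disease) that test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of controls (subjects without disease) that test negative."""
    return true_neg / (true_neg + false_pos)


# HYPOTHETICAL counts chosen only for illustration; not taken from the study.
true_pos, false_neg = 3, 7     # 10 cases, 3 detected by the marker
true_neg, false_pos = 90, 10   # 100 controls, 10 falsely marker-positive

sens = sensitivity(true_pos, false_neg)
spec = specificity(true_neg, false_pos)

print(f"sensitivity = {sens:.0%}, specificity = {spec:.0%}")
```

Setting a cut-off for high specificity, as the abstract describes, keeps false positives among controls low at the cost of missing more cases, which is why the reported sensitivities are modest.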