16 research outputs found

    Google Goes Cancer: Improving Outcome Prediction for Cancer Patients by Network-Based Ranking of Marker Genes

    Predicting the clinical outcome of cancer patients based on the expression of marker genes in their tumors has received increasing interest in the past decade. Accurate predictors of outcome and response to therapy could be used to personalize and thereby improve therapy. However, state-of-the-art methods have often found marker genes with limited prediction accuracy, limited reproducibility, and unclear biological relevance. To address this problem, we developed a novel computational approach to identify genes prognostic for outcome that couples gene expression measurements from primary tumor samples with a network of known relationships between the genes. Our approach ranks genes according to their prognostic relevance using both expression and network information in a manner similar to Google's PageRank. We applied this method to gene expression profiles that we obtained from 30 patients with pancreatic cancer, and identified seven candidate marker genes prognostic for outcome. Compared to genes found with state-of-the-art methods, such as Pearson correlation of gene expression with survival time, we improve the prediction accuracy by up to 7%. Accuracies were assessed using support vector machine classifiers and Monte Carlo cross-validation. We then validated the prognostic value of our seven candidate markers using immunohistochemistry on an independent set of 412 pancreatic cancer samples. Notably, signatures derived from our candidate markers were independently predictive of outcome and superior to established clinical prognostic factors such as grade, tumor size, and nodal status. As the amount of genomic data on individual tumors grows rapidly, our algorithm meets the need for powerful computational approaches that are key to exploiting these data for personalized cancer therapies in clinical practice.
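The abstract above describes a ranking that combines a per-gene expression relevance score with network links "in a manner similar to Google's PageRank". A minimal sketch of that general idea, written as a personalised PageRank iteration, might look as follows. The function name `network_rank`, the damping value, the toy genes `A`, `B`, `C`, and their relevance scores are all assumptions made for illustration; the paper's exact formula and parameters are not given in this listing.

```python
def network_rank(links, relevance, damping=0.3, tol=1e-10, max_iter=1000):
    """Personalised-PageRank-style ranking (illustrative sketch only).

    links     -- dict mapping each gene to the list of genes it points to
    relevance -- dict mapping each gene to a non-negative relevance score,
                 e.g. |correlation of expression with survival| (assumed input)
    damping   -- weight given to the network term (hypothetical default)
    """
    # Normalise the relevance scores so that they sum to 1.
    total = sum(relevance.values())
    s = {gene: value / total for gene, value in relevance.items()}

    rank = dict(s)
    for _ in range(max_iter):
        # Every gene keeps a share of its own relevance ...
        new_rank = {gene: (1.0 - damping) * s[gene] for gene in s}
        # ... and receives an equal share of the rank of each gene linking to it.
        for source, targets in links.items():
            if not targets:
                continue
            share = damping * rank[source] / len(targets)
            for target in targets:
                new_rank[target] += share
        change = sum(abs(new_rank[gene] - rank[gene]) for gene in s)
        rank = new_rank
        if change < tol:
            break
    return rank


# Toy network: A <-> B <-> C, with made-up relevance scores.
links = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
relevance = {"A": 0.1, "B": 0.1, "C": 0.8}
ranks = network_rank(links, relevance)
for gene in sorted(ranks, key=ranks.get, reverse=True):
    print(gene, round(ranks[gene], 3))
```

With `damping=0` the ranking reduces to the normalised relevance scores alone; larger values let a gene's score depend more heavily on its neighbours in the network.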

    Solve-RD: systematic pan-European data sharing and collaborative analysis to solve rare diseases.

    For the first time in Europe, hundreds of rare disease (RD) experts are teaming up to actively share and jointly analyse existing patient data. Solve-RD is a Horizon 2020-supported EU flagship project bringing together >300 clinicians, scientists, and patient representatives from 51 sites in 15 countries. Solve-RD is built upon a core group of four European Reference Networks (ERNs; ERN-ITHACA, ERN-RND, ERN-Euro NMD, ERN-GENTURIS), which annually see more than 270,000 RD patients with respective pathologies. The main ambition is to solve unsolved rare diseases for which a molecular cause is not yet known. This is achieved through an innovative clinical research environment that introduces novel ways to organise expertise and data. Two major approaches are being pursued: (i) massive data re-analysis of >19,000 unsolved rare disease patients and (ii) novel combined -omics approaches. The minimum requirement to be eligible for the analysis activities is an inconclusive exome that can be shared with controlled access. The first preliminary data re-analysis has already diagnosed 255 cases from 8393 exome/genome datasets. This unprecedented degree of collaboration, focused on sharing data and expertise, is expected to identify many new disease genes and enable the diagnosis of many so far undiagnosed patients from all over Europe.

    Solving unsolved rare neurological diseases-a Solve-RD viewpoint.

    Funder: Dutch Princess Beatrix Muscle Fund
    Funder: Dutch Spieren voor Spieren Muscle Fund
    Funder: University of Tübingen Medical Faculty PATE program
    Funder: European Reference Network for Rare Neurological Diseases | 739510
    Funder: European Joint Program on Rare Diseases (EJP-RD COFUND-EJP) | 44140962

    Twist exome capture allows for lower average sequence coverage in clinical exome sequencing

    Background: Exome and genome sequencing are the predominant techniques in the diagnosis and research of genetic disorders. Sufficient, uniform, and reproducible sequence coverage is a main determinant of the sensitivity to detect single-nucleotide variants (SNVs) and copy number variants (CNVs). Here we compared the ability of recent exome capture kits and genome sequencing techniques to obtain comprehensive exome coverage. Results: We compared three widely used enrichment kits (Agilent SureSelect Human All Exon V5, Agilent SureSelect Human All Exon V7, and Twist Bioscience) as well as short-read and long-read WGS. We show that the Twist exome capture significantly improves complete coverage and coverage uniformity across coding regions compared to the other exome capture kits. Twist performance is comparable to that of both short- and long-read whole genome sequencing. Additionally, we show that even at a reduced average coverage of 70× there is only minimal loss in sensitivity for SNV and CNV detection. Conclusion: We conclude that exome sequencing with Twist represents a significant improvement and could be performed at lower sequence coverage compared to other exome capture techniques.

    Solving patients with rare diseases through programmatic reanalysis of genome-phenome data.

    Funder: EC | EC Seventh Framework Programme | FP7 Health (FP7-HEALTH - Specific Programme "Cooperation": Health); doi: https://doi.org/10.13039/100011272; Grant(s): 305444
    Funder: Ministerio de Economía y Competitividad (Ministry of Economy and Competitiveness); doi: https://doi.org/10.13039/501100003329
    Funder: Generalitat de Catalunya (Government of Catalonia); doi: https://doi.org/10.13039/501100002809
    Funder: EC | European Regional Development Fund (Europski Fond za Regionalni Razvoj); doi: https://doi.org/10.13039/501100008530
    Funder: Instituto Nacional de Bioinformática ELIXIR Implementation Studies Centro de Excelencia Severo Ochoa
    Reanalysis of inconclusive exome/genome sequencing data increases the diagnostic yield for patients with rare diseases. However, the cost and effort required for reanalysis prevent its routine implementation in research and clinical environments. The Solve-RD project aims to reveal the molecular causes underlying undiagnosed rare diseases. One of the goals is to implement innovative approaches to reanalyse the exomes and genomes from thousands of well-studied undiagnosed cases. The raw genomic data are submitted to Solve-RD through the RD-Connect Genome-Phenome Analysis Platform (GPAP) together with standardised phenotypic and pedigree data. We have developed a programmatic workflow to reanalyse genome-phenome data. It uses the RD-Connect GPAP's Application Programming Interface (API) and relies on the big-data technologies upon which the system is built. We have applied the workflow to prioritise rare known pathogenic variants from 4411 undiagnosed cases. The queries returned an average of 1.45 variants per case, which were first evaluated in bulk by a panel of disease experts and then individually by the submitter of each case. A total of 120 index cases (21.2% of prioritised cases, 2.7% of all exome/genome-negative samples) have already been solved, and others are under investigation. Implementing solutions such as the one described here provides the technical framework to enable periodic case-level data re-evaluation in clinical settings, as recommended by the American College of Medical Genetics.
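The percentages quoted in the abstract above can be checked with a few lines of arithmetic. The sketch below uses only the figures stated there (120 solved index cases, 4411 undiagnosed cases, 21.2% of prioritised cases). The number of prioritised cases is not stated in the abstract, so the value computed here is only what the rounded percentage implies, not a reported result; the variable names are hypothetical.

```python
solved_index_cases = 120      # stated in the abstract
undiagnosed_cases = 4411      # stated in the abstract
solved_share_of_prioritised = 0.212  # 21.2%, stated in the abstract

# Share of all exome/genome-negative cases that were solved, in percent.
share_of_all_cases = 100 * solved_index_cases / undiagnosed_cases

# Number of prioritised cases implied by the rounded 21.2% figure
# (an inference from the abstract, not a number it reports).
implied_prioritised = solved_index_cases / solved_share_of_prioritised

print(f"solved / all cases: {share_of_all_cases:.1f}%")
print(f"implied prioritised cases: about {implied_prioritised:.0f}")
```

The first figure agrees with the 2.7% quoted in the abstract; the second suggests that roughly one in eight of the 4411 cases had at least one prioritised variant.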

    A Solve-RD ClinVar-based reanalysis of 1522 index cases from ERN-ITHACA reveals common pitfalls and misinterpretations in exome sequencing

    Purpose: Within the Solve-RD project (https://solve-rd.eu/), the European Reference Network for Intellectual disability, TeleHealth, Autism and Congenital Anomalies aimed to investigate whether a reanalysis of exomes from unsolved cases based on ClinVar annotations could establish additional diagnoses. We present the results of the “ClinVar low-hanging fruit” reanalysis, reasons for the failure of previous analyses, and lessons learned. Methods: Data from the first 3576 exomes (1522 probands and 2054 relatives) collected from the European Reference Network for Intellectual disability, TeleHealth, Autism and Congenital Anomalies were reanalyzed by the Solve-RD consortium by screening for single-nucleotide variants and small insertions and deletions already reported as (likely) pathogenic in ClinVar. Variants were filtered according to frequency, genotype, and mode of inheritance and reinterpreted. Results: We identified causal variants in 59 cases (3.9%), 50 of them also raised by other approaches and 9 leading to new diagnoses, highlighting interpretation challenges: variants in genes not known to be involved in human disease at the time of the first analysis, misleading genotypes, or variants undetected by local pipelines (variants in off-target regions, low quality filters, low allelic balance, or high frequency). Conclusion: The “ClinVar low-hanging fruit” analysis represents an effective, fast, and easy approach to recover causal variants from exome sequencing data, thereby helping to reduce the diagnostic deadlock.

    Signature to predict risk in patients with and without adjuvant therapy.

    (A) Signature to predict risk in patients with adjuvant therapy. The signature was developed with patients receiving adjuvant therapy separated by their median survival into two groups, a high risk group with shorter survival and a low risk group with longer survival. A classifier trained with the signature using leave-one-out cross-validation shows a significant difference between the predicted low and high risk group (logrank test). (B) Signature to predict risk in patients without adjuvant therapy. The signature was developed with patients not receiving adjuvant therapy separated by their median survival into two groups, a high risk group with shorter survival and a low risk group with longer survival. A classifier trained with the signature using leave-one-out cross-validation shows a significant difference between the predicted low and high risk group (logrank test).

    Clinical characteristics of patients used in this study.

    The screening dataset (genome-wide gene expression profiling) comprises 30 samples of surgically resected pancreatic ductal adenocarcinoma from patients without adjuvant chemotherapy. The validation dataset (immunohistochemistry of seven marker candidates) comprises samples from 412 patients, of which 172 had received adjuvant therapy and 240 had not. Significant differences between the adjuvant and no adjuvant therapy subgroups were found for regional lymph nodes status (Fisher's exact test) and for the stage groupings (Fisher's exact test). Differences in all other variables were not significant.
    † Based on postsurgical histopathological assessment (indicated by the p prefix).
    ‡ Stage was assessed by the American Joint Committee on Cancer 2006 guidelines.

    NetRank feature selection outperforms standard feature selection methods.

    (A) The accuracy of different feature selection methods for predicting patient outcome was tested on the screening dataset. The NetRank feature selection using a transcription factor network is shown in red. For smaller training set sizes, our method is superior to all other feature selection methods, reaching an accuracy of 72% in a Monte Carlo cross-validation. (B) Markers found with NetRank are more accurate than markers described in literature.
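The caption above reports accuracy from a Monte Carlo cross-validation, meaning accuracy averaged over many random train/test splits. A minimal sketch of that evaluation scheme is given below. It is not the authors' pipeline: the paper uses support vector machine classifiers, whereas this sketch swaps in a simple nearest-centroid classifier on a single feature to stay dependency-free, and the function names, split fraction, number of splits, and toy data are all assumptions for illustration.

```python
import random


def monte_carlo_cv(samples, labels, fit, predict,
                   n_splits=100, test_fraction=0.3, seed=0):
    """Mean test accuracy over repeated random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    n_test = max(1, int(round(test_fraction * len(samples))))
    accuracies = []
    for _ in range(n_splits):
        rng.shuffle(indices)  # draw a fresh random split each round
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        model = fit([samples[i] for i in train_idx],
                    [labels[i] for i in train_idx])
        hits = sum(predict(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(hits / n_test)
    return sum(accuracies) / len(accuracies)


def fit_centroids(xs, ys):
    """Stand-in classifier (not an SVM): per-class mean of one feature."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {label: sum(values) / len(values) for label, values in groups.items()}


def predict_centroid(model, x):
    """Assign the class whose centroid is closest to x."""
    return min(model, key=lambda label: abs(model[label] - x))


# Made-up, well-separated toy data: one expression value per sample.
toy_samples = [0.10, 0.20, 0.30, 0.15, 0.90, 1.00, 1.10, 0.95]
toy_labels = ["low"] * 4 + ["high"] * 4
accuracy = monte_carlo_cv(toy_samples, toy_labels, fit_centroids, predict_centroid)
print(f"mean accuracy over random splits: {accuracy:.2f}")
```

Unlike k-fold cross-validation, the test sets of different rounds may overlap, which is why the training set size can be varied freely, as in the comparison the caption describes.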

    Regulatory network around signature genes.

    (A) All direct neighbors for the seven candidates STAT3, FOS, JUN, SP1, CDX2, CEBPA, and BRCA1 (marked yellow). Transcription factors are marked with a dot. Genes reported in the literature associated with pancreatic cancer survival according to GoGene are represented with larger circles. The absolute correlation coefficient of gene expression with survival in the screening dataset is shown in red. (B) Selection of the network showing genes that are regulated by FOS and SP1. It contains many literature-associated and highly correlated genes. (C) Protein–protein interactions among all signature genes, representing physical interactions between the transcription factors SP1, STAT3, JUN, FOS and the transcription coactivator BRCA1.