
    Tumour break load is a biologically relevant feature of genomic instability with prognostic value in colorectal cancer

    BACKGROUND: Clinically implemented prognostic biomarkers are lacking for the 80% of colorectal cancers (CRCs) that exhibit chromosomal instability (CIN). CIN is characterised by chromosome segregation errors and double-strand break repair defects that lead to somatic copy number aberrations (SCNAs) and chromosomal rearrangement-associated structural variants (SVs), respectively. We hypothesised that the number of SVs is a distinct feature of genomic instability and defined a new measure to quantify SVs: the tumour break load (TBL). The present study aimed to characterise the biological impact and clinical relevance of TBL in CRC. METHODS: Disease-free survival and SCNA data were obtained from The Cancer Genome Atlas and two independent CRC studies. TBL was defined as the sum of SCNA-associated SVs. RNA gene expression data of microsatellite stable (MSS) CRC samples were used to train an RNA-based TBL classifier. Dichotomised DNA-based TBL data were used for survival analysis. RESULTS: TBL shows large variation in CRC with poor correlation to tumour mutational burden and fraction of genome altered. The impact of TBL on tumour biology was illustrated by the high accuracy of classifying cancers as TBL-high or TBL-low (area under the receiver operating characteristic curve [AUC]: 0.88; p < 0.01). High TBL was associated with disease recurrence in 85 stage II-III MSS CRCs from The Cancer Genome Atlas (hazard ratio [HR]: 6.1; p = 0.007) and in two independent validation series of 57 untreated stage II-III (HR: 4.1; p = 0.012) and 74 untreated stage II MSS CRCs (HR: 2.4; p = 0.01). CONCLUSION: TBL is a prognostic biomarker in patients with non-metastatic MSS CRC with great potential to be implemented in routine molecular diagnostics.
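The abstract defines TBL as the sum of SCNA-associated SVs and uses a dichotomised TBL for survival analysis, but gives no computational details. The following is a minimal sketch of one plausible reading, not the authors' pipeline: it assumes segmented copy-number data and counts a break wherever two adjacent segments on the same chromosome have different copy-number states. The function names, the segment tuple layout, the state labels, and the threshold are all hypothetical.

```python
from collections import defaultdict


def tumour_break_load(segments):
    """Count copy-number breakpoints in one sample.

    `segments` is an iterable of (chrom, start, end, state) tuples.
    This layout is an assumption made for illustration only.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, state in segments:
        by_chrom[chrom].append((start, end, state))

    breaks = 0
    for segs in by_chrom.values():
        segs.sort()  # order segments by start coordinate
        # A break is a change of state between neighbouring segments.
        for (_, _, prev_state), (_, _, cur_state) in zip(segs, segs[1:]):
            if prev_state != cur_state:
                breaks += 1
    return breaks


def dichotomise(tbl_by_sample, threshold):
    """Label each sample TBL-high or TBL-low (threshold is hypothetical)."""
    return {
        sample: "TBL-high" if tbl > threshold else "TBL-low"
        for sample, tbl in tbl_by_sample.items()
    }


# Toy input: two state changes on chr1, none on chr2.
segments = [
    ("chr1", 0, 1000, "neutral"),
    ("chr1", 1000, 2000, "gain"),
    ("chr1", 2000, 3000, "gain"),
    ("chr1", 3000, 4000, "loss"),
    ("chr2", 0, 5000, "neutral"),
]
tbl = tumour_break_load(segments)
labels = dichotomise({"sample_a": tbl, "sample_b": 10}, threshold=5)
```

In practice the cut-off separating TBL-high from TBL-low would have to come from the study itself; the value used here is only a placeholder.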

    The FAIR Guiding Principles for scientific data management and stewardship

    There is an urgent need to improve the infrastructure supporting the reuse of scholarly data. A diverse set of stakeholders, representing academia, industry, funding agencies, and scholarly publishers, have come together to design and jointly endorse a concise and measurable set of principles that we refer to as the FAIR Data Principles. The intent is that these may act as a guideline for those wishing to enhance the reusability of their data holdings. Distinct from peer initiatives that focus on the human scholar, the FAIR Principles put specific emphasis on enhancing the ability of machines to automatically find and use the data, in addition to supporting its reuse by individuals. This Comment is the first formal publication of the FAIR Principles, and includes the rationale behind them and some exemplar implementations in the community.

    Tailor-made multiple sequence alignments using the PRALINE 2 alignment toolkit

    SUMMARY: PRALINE 2 is a toolkit for custom multiple sequence alignment workflows. It can be used to incorporate sequence annotations, such as secondary structure or (DNA) motifs, into the alignment scoring, as well as to customize many other aspects of a progressive multiple alignment workflow. AVAILABILITY AND IMPLEMENTATION: PRALINE 2 is implemented in Python and available as open source software on GitHub: https://github.com/ibivu/PRALINE/. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

    SeRenDIP: SEquential REmasteriNg to DerIve profiles for fast and accurate predictions of PPI interface positions

    MOTIVATION: Interpretation of ubiquitous protein sequence data has become a bottleneck in biomolecular research, due to a lack of structural and other experimental annotation data for these proteins. Prediction of protein interaction sites from sequence may be a viable substitute. We therefore recently developed a sequence-based random-forest method for protein-protein interface prediction, which yielded significantly better performance than other methods on both homomeric and heteromeric protein-protein interactions. Here we present a webserver that implements this method efficiently. RESULTS: With the aim of accelerating our previous approach, we obtained sequence conservation profiles by re-mastering the alignment of homologous sequences found by PSI-BLAST. This yielded a more than ten-fold speedup and at least the same accuracy as reported previously for our method; these results allowed us to offer the method as a webserver. The webserver interface is targeted to the non-expert user. The input is simply a sequence of the protein of interest, and the output is a table with scores indicating the likelihood of having an interaction interface at a certain position. As the method is sequence-based and not sensitive to the type of protein interaction, we expect this webserver to be of interest to many biological researchers in academia and in industry. AVAILABILITY: Webserver, source code and datasets are available at www.ibi.vu.nl/programs/serendipwww/
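The abstract mentions sequence conservation profiles derived from an alignment of homologous sequences but does not describe the re-mastering step itself. As a generic illustration only, and not the SeRenDIP procedure, the sketch below computes a per-column conservation score from an already aligned set of sequences using Shannon entropy. The function name, the gap character, and the normalisation by log2(20) are assumptions.

```python
import math
from collections import Counter


def conservation_profile(alignment, gap="-"):
    """Return one conservation score in [0, 1] per alignment column.

    `alignment` is a list of equal-length aligned sequences (strings).
    A score of 1.0 means the column holds a single residue type;
    gap-only columns are scored 0.0. Both choices are illustrative.
    """
    n_cols = len(alignment[0])
    if any(len(row) != n_cols for row in alignment):
        raise ValueError("all aligned sequences must have the same length")

    max_entropy = math.log2(20)  # assumes the 20 standard amino acids
    profile = []
    for column in zip(*alignment):
        residues = [c for c in column if c != gap]
        if not residues:
            profile.append(0.0)
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum(
            (n / total) * math.log2(n / total) for n in counts.values()
        )
        profile.append(1.0 - entropy / max_entropy)
    return profile


# Toy alignment: columns 1 and 2 are fully conserved, column 3 is mixed.
toy_alignment = ["ACD", "ACE", "A-D"]
profile = conservation_profile(toy_alignment)
```

A profile of this kind yields one value per sequence position, which matches the shape of the per-position score table the abstract describes as the webserver output, although the scores there come from a random-forest model rather than from conservation alone.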