22 research outputs found

    Finishing the euchromatic sequence of the human genome

    The sequence of the human genome encodes the genetic instructions for human physiology, as well as rich information about human evolution. In 2001, the International Human Genome Sequencing Consortium reported a draft sequence of the euchromatic portion of the human genome. Since then, the international collaboration has worked to convert this draft into a genome sequence with high accuracy and nearly complete coverage. Here, we report the result of this finishing process. The current genome sequence (Build 35) contains 2.85 billion nucleotides interrupted by only 341 gaps. It covers ∼99% of the euchromatic genome and is accurate to an error rate of ∼1 event per 100,000 bases. Many of the remaining euchromatic gaps are associated with segmental duplications and will require focused work with new methods. The near-complete sequence, the first for a vertebrate, greatly improves the precision of biological analyses of the human genome including studies of gene number, birth and death. Notably, the human genome seems to encode only 20,000-25,000 protein-coding genes. The genome sequence reported here should serve as a firm foundation for biomedical research in the decades ahead.

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research.

    Hereditary diffuse gastric cancer: What the clinician should know


    Prognostic factors in extrapulmonary small cell carcinomas.


    Towards proactive palliative care in oncology: developing an explainable EHR-based machine learning model for mortality risk prediction

    Background: Ex-ante identification of the last year in life facilitates a proactive palliative approach. Machine learning models trained on electronic health records (EHR) demonstrate promising performance in cancer prognostication. However, gaps in the literature include incomplete reporting of model performance, inadequate alignment of model formulation with the implementation use-case, and insufficient explainability, which hinders trust and adoption in clinical settings. Hence, we aim to develop an explainable machine learning EHR-based model that prompts palliative care processes by predicting 365-day mortality risk among patients with advanced cancer within an outpatient setting. Methods: Our cohort consisted of 5,926 adults diagnosed with Stage 3 or 4 solid organ cancer between July 1, 2017, and June 30, 2020 and receiving ambulatory cancer care within a tertiary center. The classification problem was modelled using Extreme Gradient Boosting (XGBoost) and aligned to our envisioned use-case: “Given a prediction point that corresponds to an outpatient cancer encounter, predict for mortality within 365-days from prediction point, using EHR data up to 365-days prior.” The model was trained with 75% of the dataset (n = 39,416 outpatient encounters) and validated on a 25% hold-out dataset (n = 13,122 outpatient encounters). To explain model outputs, we used Shapley Additive Explanations (SHAP) values. Clinical characteristics, laboratory tests and treatment data were used to train the model. Performance was evaluated using area under the receiver operating characteristic curve (AUROC) and area under the precision-recall curve (AUPRC), while model calibration was assessed using the Brier score. Results: In total, 17,149 of the 52,538 prediction points (32.6%) had a mortality event within the 365-day prediction window. The model demonstrated an AUROC of 0.861 (95% CI 0.856–0.867) and AUPRC of 0.771. The Brier score was 0.147, indicating slight overestimations of mortality risk. Explanatory diagrams utilizing SHAP values allowed visualization of feature impacts on predictions at both the global and individual levels. Conclusion: Our machine learning model demonstrated good discrimination and precision-recall in predicting 365-day mortality risk among individuals with advanced cancer. It has the potential to provide personalized mortality predictions and facilitate earlier integration of palliative care.

    HER2 expression, copy number variation and survival outcomes in HER2-low non-metastatic breast cancer: an international multicentre cohort study and TCGA-METABRIC analysis

    Background: HER2-low breast cancer (BC) is currently an area of active interest. This study evaluated the impact of low expression of HER2 on survival outcomes in HER2-negative non-metastatic BC. Methods: Patients with HER2-negative non-metastatic BC from 6 centres within the Asian Breast Cancer Cooperative Group (ABCCG) (n = 28,280) were analysed. HER2-low was defined as immunohistochemistry (IHC) 1+ or 2+ and in situ hybridization non-amplified (ISH-) and HER2-zero as IHC 0. Relapse-free survival (RFS) and overall survival (OS) by hormone receptor status and HER2 IHC 0, 1+ and 2+ ISH- status were the main outcomes. A combined TCGA-BRCA and METABRIC cohort (n = 1967) was also analysed to explore the association between HER2 expression, ERBB2 copy number variation (CNV) status and RFS. Results: ABCCG cohort median follow-up was 6.6 years; there were 12,260 (43.4%) HER2-low BC and 16,020 (56.6%) HER2-zero BC. The outcomes were better in HER2-low BC than in HER2-zero BC (RFS: centre-adjusted hazard ratio (HR) 0.88, 95% CI 0.82-0.93, P < 0.001; OS: centre-adjusted HR 0.82, 95% CI 0.76-0.89, P < 0.001). On multivariable analysis, HER2-low status was prognostic (RFS: HR 0.90, 95% CI 0.85-0.96, P = 0.002; OS: HR 0.86, 95% CI 0.79-0.93, P < 0.001). These differences remained significant in hormone receptor-positive tumours and for OS in hormone receptor-negative tumours. Superior outcomes were observed for HER2 IHC1+ BC versus HER2-zero BC (RFS: HR 0.89, 95% CI 0.83-0.96, P = 0.001; OS: HR 0.85, 95% CI 0.78-0.93, P = 0.001). No significant differences were seen between HER2 IHC2+ ISH- and HER2-zero BCs. In the TCGA-BRCA and METABRIC cohorts, ERBB2 CNV status was an independent RFS prognostic factor (neutral versus non-neutral HR 0.71, 95% CI 0.59-0.86, P < 0.001); no differences in RFS by ERBB2 mRNA expression levels were found. Conclusions: HER2-low BC had a superior prognosis compared to HER2-zero BC in the non-metastatic setting, though absolute differences were modest and driven by HER2 IHC 1+ BC. ERBB2 CNV merits further investigation in HER2-negative BC.

    Multi-center evaluation of artificial intelligent imaging and clinical models for predicting neoadjuvant chemotherapy response in breast cancer

    Background: Neoadjuvant chemotherapy (NAC) plays an important role in the management of locally advanced breast cancer. It allows for downstaging of tumors, potentially allowing for breast conservation. NAC also allows for in-vivo testing of the tumors’ response to chemotherapy and provides important prognostic information. There are currently no clearly defined clinical models that incorporate imaging with clinical data to predict response to NAC. Thus, the aim of this work is to develop a predictive AI model based on routine CT imaging and clinical parameters to predict response to NAC. Methods: The CT scans of 324 patients with NAC from multiple centers in Singapore were used in this study. Four different radiomics models were built for predicting pathological complete response (pCR): the first two were based on textural features extracted from peri-tumoral and tumoral regions, the third was based on novel space-resolved radiomics, which extracts feature maps using voxel-based radiomics, and the fourth was based on deep learning (DL). Clinical parameters were included to build a final prognostic model. Results: The best performing models were based on space-resolved and DL approaches. Space-resolved radiomics improved the clinical AUC of pCR prediction from 0.743 (0.650 to 0.831) to 0.775 (0.685 to 0.860), and our DL model improved it from 0.743 (0.650 to 0.831) to 0.772 (0.685 to 0.853). The tumoral radiomics model performed the worst, with no improvement of the AUC over the clinical model. The peri-tumoral combined model gave moderate performance with an AUC of 0.765 (0.671 to 0.855). Conclusions: Radiomics features extracted from diagnostic CT augment the predictive ability of pCR when combined with clinical features. The novel space-resolved radiomics and DL radiomics approaches outperformed conventional radiomics techniques. W.L.N. is supported by the National Medical Research Council Fellowship (NMRC/MOH-000166-00).

    State-of-the-Art Reviews and Analyses of Emerging Research Findings and Achievements of Thermoelectric Materials over the Past Years
