79 research outputs found

    Extracting detailed oncologic history and treatment plan from medical oncology notes with large language models

    Both medical care and observational studies in oncology require a thorough understanding of a patient's disease progression and treatment history, often elaborately documented in clinical notes. Despite their vital role, no current oncology information representation and annotation schema fully encapsulates the diversity of information recorded within these notes. Although large language models (LLMs) have recently exhibited impressive performance on various medical natural language processing tasks, due to the current lack of comprehensively annotated oncology datasets, an extensive evaluation of LLMs in extracting and reasoning with the complex rhetoric in oncology notes remains understudied. We developed a detailed schema for annotating textual oncology information, encompassing patient characteristics, tumor characteristics, tests, treatments, and temporality. Using a corpus of 10 de-identified breast cancer progress notes at University of California, San Francisco, we applied this schema to assess the abilities of three recently released LLMs (GPT-4, GPT-3.5-turbo, and FLAN-UL2) to perform zero-shot extraction of detailed oncological history from two narrative sections of clinical progress notes. Our team annotated 2750 entities, 2874 modifiers, and 1623 relationships. The GPT-4 model exhibited overall best performance, with an average BLEU score of 0.69, an average ROUGE score of 0.72, and an average accuracy of 67% on complex tasks (expert manual evaluation). Notably, it was proficient in tumor characteristic and medication extraction, and demonstrated superior performance in inferring symptoms due to cancer and considerations of future medications. The analysis demonstrates that GPT-4 is potentially already usable to extract important facts from cancer progress notes needed for clinical research, complex population management, and documenting quality patient care. Comment: Source code available at: https://github.com/MadhumitaSushil/OncLLMExtractio

    On-Sky Operations with the ALES Integral Field Spectrograph

    The integral field spectrograph configuration of the LMIRCam science camera within the Large Binocular Telescope Interferometer (LBTI) facilitates 2 to 5 μm spectroscopy of directly imaged gas-giant exoplanets. The mode, dubbed ALES, comprises magnification optics, a lenslet array, and direct-vision prisms, all of which are included within filter wheels in LMIRCam. Our observing approach includes manual adjustments to filter wheel positions to optimize alignment, on/off nodding to track sky-background variations, and wavelength calibration using narrow band filters in series with ALES optics. For planets with separations outside our 1"x1" field of view, we use a three-point nod pattern to visit the primary, secondary and sky. To minimize overheads we select the longest exposure times and nod periods given observing conditions, especially sky brightness and variability. Using this strategy we collected several datasets of low-mass companions to nearby stars

    Identification of a pan-cancer oncogenic microRNA superfamily anchored by a central core seed motif

    MicroRNAs modulate tumorigenesis through suppression of specific genes. As many tumour types rely on overlapping oncogenic pathways, a core set of microRNAs may exist, which consistently drives or suppresses tumorigenesis in many cancer types. Here we integrate The Cancer Genome Atlas (TCGA) pan-cancer data set with a microRNA target atlas composed of publicly available Argonaute Crosslinking Immunoprecipitation (AGO-CLIP) data to identify pan-tumour microRNA drivers of cancer. Through this analysis, we show a pan-cancer, coregulated oncogenic microRNA ‘superfamily’ consisting of the miR-17, miR-19, miR-130, miR-93, miR-18, miR-455 and miR-210 seed families, which cotargets critical tumour suppressors via a central GUGC core motif. We subsequently define mutations in microRNA target sites using the AGO-CLIP microRNA target atlas and TCGA exome-sequencing data. These combined analyses identify pan-cancer oncogenic cotargeting of the phosphoinositide 3-kinase, TGFβ and p53 pathways by the miR-17-19-130 superfamily members

    Integrated Genomic Analysis of the 8q24 Amplification in Endometrial Cancers Identifies ATAD2 as Essential to MYC-Dependent Cancers

    Chromosome 8q24 is the most commonly amplified region across multiple cancer types, and the typical length of the amplification suggests that it may target additional genes to MYC. To explore the roles of the genes most frequently included in 8q24 amplifications, we analyzed the relation between copy number alterations and gene expression in three sets of endometrial cancers (N = 252); and in glioblastoma, ovarian, and breast cancers profiled by TCGA. Among the genes neighbouring MYC, expression of the bromodomain-containing gene ATAD2 was the most associated with amplification. Bromodomain-containing genes have been implicated as mediators of MYC transcriptional function, and indeed ATAD2 expression was more closely associated with expression of genes known to be upregulated by MYC than was MYC itself. Amplifications of 8q24, expression of genes downstream from MYC, and overexpression of ATAD2 predicted poor outcome and increased from primary to metastatic lesions. Knockdown of ATAD2 and MYC in seven endometrial and 21 breast cancer cell lines demonstrated that cell lines that were dependent on MYC also depended upon ATAD2. These same cell lines were also the most sensitive to the histone deacetylase (HDAC) inhibitor Trichostatin-A, consistent with prior studies identifying bromodomain-containing proteins as targets of inhibition by HDAC inhibitors. Our data indicate high ATAD2 expression is a marker of aggressive endometrial cancers, and suggest specific inhibitors of ATAD2 may have therapeutic utility in these and other MYC-dependent cancers

    Characterizing genomic alterations in cancer by complementary functional associations.

    Systematic efforts to sequence the cancer genome have identified large numbers of mutations and copy number alterations in human cancers. However, elucidating the functional consequences of these variants, and their interactions to drive or maintain oncogenic states, remains a challenge in cancer research. We developed REVEALER, a computational method that identifies combinations of mutually exclusive genomic alterations correlated with functional phenotypes, such as the activation or gene dependency of oncogenic pathways or sensitivity to a drug treatment. We used REVEALER to uncover complementary genomic alterations associated with the transcriptional activation of β-catenin and NRF2, MEK-inhibitor sensitivity, and KRAS dependency. REVEALER successfully identified both known and new associations, demonstrating the power of combining functional profiles with extensive characterization of genomic alterations in cancer genomes

    Digitization Workflows for Flat Sheets and Packets of Plants, Algae, and Fungi

    Effective workflows are essential components in the digitization of biodiversity specimen collections. To date, no comprehensive, community-vetted workflows have been published for digitizing flat sheets and packets of plants, algae, and fungi, even though latest estimates suggest that only 33% of herbarium specimens have been digitally transcribed, 54% of herbaria use a specimen database, and 24% are imaging specimens. In 2012, iDigBio, the U.S. National Science Foundation’s (NSF) coordinating center and national resource for the digitization of public, nonfederal U.S. collections, launched several working groups to address this deficiency. Here, we report the development of 14 workflow modules with 7–36 tasks each. These workflows represent the combined work of approximately 35 curators, directors, and collections managers representing more than 30 herbaria, including 15 NSF-supported plant-related Thematic Collections Networks and collaboratives. The workflows are provided for download as Portable Document Format (PDF) and Microsoft Word files. Customization of these workflows for specific institutional implementation is encouraged

    Comprehensive Molecular Portraits of Invasive Lobular Breast Cancer

    Invasive lobular carcinoma (ILC) is the second most prevalent histologic subtype of invasive breast cancer. Here, we comprehensively profiled 817 breast tumors, including 127 ILC, 490 ductal (IDC), and 88 mixed IDC/ILC. Besides E-cadherin loss, the best known ILC genetic hallmark, we identified mutations targeting PTEN, TBX3 and FOXA1 as ILC enriched features. PTEN loss associated with increased AKT phosphorylation, which was highest in ILC among all breast cancer subtypes. Spatially clustered FOXA1 mutations correlated with increased FOXA1 expression and activity. Conversely, GATA3 mutations and high expression characterized Luminal A IDC, suggesting differential modulation of ER activity in ILC and IDC. Proliferation and immune-related signatures determined three ILC transcriptional subtypes associated with survival differences. Mixed IDC/ILC cases were molecularly classified as ILC-like and IDC-like revealing no true hybrid features. This multidimensional molecular atlas sheds new light on the genetic bases of ILC and provides potential clinical options