7 research outputs found

    Harnessing large language models (LLMs) for candidate gene prioritization and selection.

    BACKGROUND: Feature selection is a critical step for translating advances afforded by systems-scale molecular profiling into actionable clinical insights. While data-driven methods are commonly utilized for selecting candidate genes, knowledge-driven methods must contend with the challenge of efficiently sifting through extensive volumes of biomedical information. This work aimed to assess the utility of large language models (LLMs) for knowledge-driven gene prioritization and selection. METHODS: In this proof of concept, we focused on 11 blood transcriptional modules associated with an Erythroid cells signature. We evaluated four leading LLMs across multiple tasks. Next, we established a workflow leveraging LLMs. The steps consisted of: (1) Selecting one of the 11 modules; (2) Identifying functional convergences among constituent genes using the LLMs; (3) Scoring candidate genes across six criteria capturing the gene's biological and clinical relevance; (4) Prioritizing candidate genes and summarizing justifications; (5) Fact-checking justifications and identifying supporting references; (6) Selecting a top candidate gene based on validated scoring justifications; and (7) Factoring in transcriptome profiling data to finalize the selection of the top candidate gene. RESULTS: Of the four LLMs evaluated, OpenAI's GPT-4 and Anthropic's Claude demonstrated the best performance and were chosen for the implementation of the candidate gene prioritization and selection workflow. This workflow was run in parallel for each of the 11 erythroid cell modules by participants in a data mining workshop. Module M9.2 served as an illustrative use case. The 30 candidate genes forming this module were assessed, and the top five scoring genes were identified as BCL2L1, ALAS2, SLC4A1, CA1, and FECH. Researchers carefully fact-checked the summarized scoring justifications, after which the LLMs were prompted to select a top candidate based on this information. GPT-4 initially chose BCL2L1, while Claude selected ALAS2. When transcriptional profiling data from three reference datasets were provided for additional context, GPT-4 revised its initial choice to ALAS2, whereas Claude reaffirmed its original selection for this module. CONCLUSIONS: Taken together, our findings highlight the ability of LLMs to prioritize candidate genes with minimal human intervention. This suggests the potential of this technology to boost productivity, especially for tasks that require leveraging extensive biomedical knowledge.
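
Steps (3) and (4) of the workflow in this abstract, scoring candidate genes across six criteria and then prioritizing them, can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not name the six criteria, the scoring scale, or how scores are combined, so the criterion labels, the unweighted sum, the tie-breaking rule, and the `GeneScore` and `prioritize` names below are all assumptions made for illustration.

```python
from dataclasses import dataclass

# Hypothetical labels: the abstract says there are six criteria but does not name them.
CRITERIA = tuple(f"criterion_{i}" for i in range(1, 7))


@dataclass(frozen=True)
class GeneScore:
    """Per-criterion scores for one candidate gene (for example, as returned by an LLM)."""
    gene: str
    scores: dict

    def total(self) -> float:
        # Assumption: criteria are combined as an unweighted sum.
        return sum(self.scores[c] for c in CRITERIA)


def prioritize(candidates, top_n=5):
    """Rank candidates by total score, highest first; ties are broken by gene name."""
    ranked = sorted(candidates, key=lambda g: (-g.total(), g.gene))
    return ranked[:top_n]


# Placeholder genes and scores, not data from the study.
candidates = [
    GeneScore("GENE_A", dict(zip(CRITERIA, [5, 4, 3, 2, 1, 0]))),
    GeneScore("GENE_B", dict(zip(CRITERIA, [1, 1, 1, 1, 1, 1]))),
    GeneScore("GENE_C", dict(zip(CRITERIA, [10, 0, 0, 0, 0, 5]))),
]
top_two = [g.gene for g in prioritize(candidates, top_n=2)]
```

In the workflow described above, the ranked list would still go through fact-checking by researchers and comparison against transcriptome profiling data before a final candidate is selected; this sketch covers only the ranking step.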

    Machine learning identifies a common signature for anti-SSA/Ro60 antibody expression across autoimmune diseases

    Anti-Ro autoantibodies are among the most frequently detected extractable nuclear antigen autoantibodies, mainly associated with primary Sjögren's syndrome (pSS), systemic lupus erythematosus (SLE) and undifferentiated connective tissue disease (UCTD). Is there a common signature to all patients expressing anti-Ro60 autoantibodies regardless of their disease phenotype? Using high-throughput multi-omics data collected within the cross-sectional cohort from the PRECISESADS IMI project (genetic, epigenomic, transcriptomic, combined with flow cytometric data, multiplexed cytokines, classical serology and clinical data), we assessed by machine learning the integrated molecular profiling of 520 anti-Ro60-positive (anti-Ro60+) compared to 511 anti-Ro60-negative (anti-Ro60-) patients with pSS, SLE and UCTD, and 279 healthy controls (HCs). The selected features for RNA-Seq, DNA methylation and GWAS data allowed a clear separation between anti-Ro60+ and anti-Ro60- patients. The different features selected by machine learning from the anti-Ro60+ patients constitute specific signatures when compared to anti-Ro60- patients and HCs. Remarkably, the transcript z-score of three genes (ATP10A, MX1 and PARP14), presenting an overexpression associated with a hypomethylation and genetic variation, and independently identified by the Boruta algorithm, was clearly higher in anti-Ro60+ patients compared to anti-Ro60- patients in all the diseases. We demonstrate that these signatures, enriched in interferon stimulated genes, were also found in anti-Ro60+ patients with rheumatoid arthritis and systemic sclerosis, remained stable over time and were not influenced by treatment. Anti-Ro60+ patients present a specific inflammatory signature regardless of their disease, suggesting that a dual therapeutic approach targeting both Ro-associated RNAs and anti-Ro60 autoantibodies should be considered.
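
The transcript z-score mentioned in this abstract can be illustrated with a small sketch. The abstract does not state how the score was computed, so the choices here are assumptions: each gene is standardized against a reference group (for instance healthy controls), and the per-gene z-scores are averaged into one value per sample. The function names and the numbers in the usage lines are hypothetical.

```python
from statistics import mean, stdev


def z_scores(values, reference):
    """Standardize values against the mean and sample standard deviation of a reference group."""
    mu = mean(reference)
    sigma = stdev(reference)
    if sigma == 0:
        raise ValueError("reference group has zero variance")
    return [(v - mu) / sigma for v in values]


def composite_z(sample, references):
    """Average the per-gene z-scores of one sample (assumed way of combining genes)."""
    per_gene = [z_scores([sample[g]], references[g])[0] for g in sample]
    return mean(per_gene)


# Placeholder expression values, not data from the study.
reference = {"gene_1": [1.0, 2.0, 3.0], "gene_2": [1.0, 2.0, 3.0]}
sample = {"gene_1": 4.0, "gene_2": 2.0}
score = composite_z(sample, reference)
```

Under these assumptions, a higher composite score means the sample expresses the selected genes above the reference group on average.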

    Antiphospholipid antibodies in lupus: an update


    Machine Learning for the Identification of a Common Signature for Anti-SSA/Ro 60 Antibody Expression Across Autoimmune Diseases.

    Anti-Ro autoantibodies are among the most frequently detected extractable nuclear antigen autoantibodies, mainly associated with primary Sjögren's syndrome (SS), systemic lupus erythematosus (SLE), and undifferentiated connective tissue disease (UCTD). This study was undertaken to determine if there is a common signature for all patients expressing anti-Ro 60 autoantibodies regardless of their disease phenotype. Using high-throughput multiomics data collected from the cross-sectional cohort in the PRECISE Systemic Autoimmune Diseases (PRECISESADS) study Innovative Medicines Initiative (IMI) project (genetic, epigenomic, and transcriptomic data, combined with flow cytometry data, multiplexed cytokines, classic serology, and clinical data), we used machine learning to assess the integrated molecular profiling of 520 anti-Ro 60+ patients compared to 511 anti-Ro 60- patients with primary SS, patients with SLE, and patients with UCTD, and 279 healthy controls. The selected clinical features for RNA-Seq, DNA methylation, and genome-wide association study data allowed for a clear distinction between anti-Ro 60+ and anti-Ro 60- patients. The different features selected using machine learning from the anti-Ro 60+ patients constituted specific signatures when compared to anti-Ro 60- patients and healthy controls. Remarkably, the transcript Z score of 3 genes (ATP10A, MX1, and PARP14), presenting with overexpression associated with hypomethylation and genetic variation and independently identified using the Boruta algorithm, was clearly higher in anti-Ro 60+ patients compared to anti-Ro 60- patients regardless of disease type. Our findings demonstrated that these signatures, enriched in interferon-stimulated genes, were also found in anti-Ro 60+ patients with rheumatoid arthritis and those with systemic sclerosis and remained stable over time and were not affected by treatment. Anti-Ro 60+ patients present with a specific inflammatory signature regardless of their disease type, suggesting that a dual therapeutic approach targeting both Ro-associated RNAs and anti-Ro 60 autoantibodies should be considered.

    Machine learning identifies a common signature for anti-SSA/Ro60 antibody expression across autoimmune diseases.

    Anti-Ro autoantibodies are among the most frequently detected extractable nuclear antigen autoantibodies, mainly associated with primary Sjögren's syndrome (pSS), systemic lupus erythematosus (SLE) and undifferentiated connective tissue disease (UCTD). Is there a common signature to all patients expressing anti-Ro60 autoantibodies regardless of their disease phenotype?