81 research outputs found

    Harnessing large language models (LLMs) for candidate gene prioritization and selection.

    BACKGROUND: Feature selection is a critical step for translating advances afforded by systems-scale molecular profiling into actionable clinical insights. While data-driven methods are commonly utilized for selecting candidate genes, knowledge-driven methods must contend with the challenge of efficiently sifting through extensive volumes of biomedical information. This work aimed to assess the utility of large language models (LLMs) for knowledge-driven gene prioritization and selection. METHODS: In this proof of concept, we focused on 11 blood transcriptional modules associated with an erythroid cell signature. We evaluated four leading LLMs across multiple tasks. Next, we established a workflow leveraging LLMs. The steps consisted of: (1) Selecting one of the 11 modules; (2) Identifying functional convergences among constituent genes using the LLMs; (3) Scoring candidate genes across six criteria capturing the gene's biological and clinical relevance; (4) Prioritizing candidate genes and summarizing justifications; (5) Fact-checking justifications and identifying supporting references; (6) Selecting a top candidate gene based on validated scoring justifications; and (7) Factoring in transcriptome profiling data to finalize the selection of the top candidate gene. RESULTS: Of the four LLMs evaluated, OpenAI's GPT-4 and Anthropic's Claude demonstrated the best performance and were chosen for the implementation of the candidate gene prioritization and selection workflow. This workflow was run in parallel for each of the 11 erythroid cell modules by participants in a data mining workshop. Module M9.2 served as an illustrative use case. The 30 candidate genes forming this module were assessed, and the top five scoring genes were identified as BCL2L1, ALAS2, SLC4A1, CA1, and FECH. Researchers carefully fact-checked the summarized scoring justifications, after which the LLMs were prompted to select a top candidate based on this information. GPT-4 initially chose BCL2L1, while Claude selected ALAS2. When transcriptional profiling data from three reference datasets were provided for additional context, GPT-4 revised its initial choice to ALAS2, whereas Claude reaffirmed its original selection for this module. CONCLUSIONS: Taken together, our findings highlight the ability of LLMs to prioritize candidate genes with minimal human intervention. This suggests the potential of this technology to boost productivity, especially for tasks that require leveraging extensive biomedical knowledge.
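
    Steps (3) and (4) of the workflow above, scoring each candidate gene on six criteria and then ranking the candidates, can be sketched roughly as follows. This is only an illustrative sketch and not the authors' implementation: the abstract does not name the six criteria, the scoring scale, or how scores are combined, so the unweighted sum, the function names, the placeholder gene labels and the scores below are all assumptions.

```python
from typing import Dict, List, Tuple

# The abstract mentions six scoring criteria but does not name them here,
# so they are treated as six anonymous positions in a list (assumption).
N_CRITERIA = 6


def total_score(scores: List[float]) -> float:
    """Combine per-criterion scores with an unweighted sum (an assumption)."""
    if len(scores) != N_CRITERIA:
        raise ValueError(f"expected {N_CRITERIA} scores, got {len(scores)}")
    return sum(scores)


def rank_candidates(
    score_table: Dict[str, List[float]], top_n: int = 5
) -> List[Tuple[str, float]]:
    """Return the top_n genes by total score, ties broken alphabetically."""
    totals = [(gene, total_score(scores)) for gene, scores in score_table.items()]
    totals.sort(key=lambda item: (-item[1], item[0]))
    return totals[:top_n]


# Hypothetical placeholder data: these labels and numbers are invented
# for illustration and are not results from the study.
example_scores = {
    "GENE_A": [7, 8, 6, 9, 5, 7],
    "GENE_B": [9, 9, 8, 7, 8, 9],
    "GENE_C": [3, 4, 2, 5, 3, 4],
}

if __name__ == "__main__":
    for gene, total in rank_candidates(example_scores, top_n=2):
        print(gene, total)
```

    In the workflow described, the scores would come from LLM responses that researchers fact-check before the final selection; the ranking step itself is the only part this sketch covers.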

    An interactive web application for the dissemination of human systems immunology data

    Background: Systems immunology approaches have proven invaluable in translational research settings. The current rate at which large-scale datasets are generated presents unique challenges and opportunities. Mining aggregates of these datasets could accelerate the pace of discovery, but new solutions are needed to integrate the heterogeneous data types with the contextual information that is necessary for interpretation. In addition, enabling tools and technologies facilitating investigators' interaction with large-scale datasets must be developed in order to promote insight and foster knowledge discovery. Methods: State-of-the-art application programming was employed to develop an interactive web application for browsing and visualizing large and complex datasets. A collection of human immune transcriptome datasets was loaded alongside contextual information about the samples. Results: We provide a resource enabling interactive query and navigation of transcriptome datasets relevant to human immunology research. Detailed information about studies and samples is displayed dynamically; if desired, the associated data can be downloaded. Custom interactive visualizations of the data can be shared via email or social media. This application can be used to browse context-rich systems-scale data within and across systems immunology studies. This resource is publicly available online at [Gene Expression Browser Landing Page (https://gxb.benaroyaresearch.org/dm3/landing.gsp)]. The source code is also available openly [Gene Expression Browser Source Code (https://github.com/BenaroyaResearch/gxbrowser)]. Conclusions: We have developed a data browsing and visualization application capable of navigating increasingly large and complex datasets generated in the context of immunological studies. This intuitive tool ensures that, whether taken individually or as a whole, such datasets generated at great effort and expense remain interpretable and a ready source of insight for years to come.

    Development of a fixed module repertoire for the analysis and interpretation of blood transcriptome data.

    As the capacity for generating large-scale molecular profiling data continues to grow, the ability to extract meaningful biological knowledge from it remains a limitation. Here, we describe the development of a new fixed repertoire of transcriptional modules, BloodGen3, that is designed to serve as a stable reusable framework for the analysis and interpretation of blood transcriptome data. The construction of this repertoire is based on co-clustering patterns observed across sixteen immunological and physiological states encompassing 985 blood transcriptome profiles. Interpretation is supported by customized resources, including module-level analysis workflows, fingerprint grid plot visualizations, interactive web applications and an extensive annotation framework comprising functional profiling reports and reference transcriptional profiles. Taken together, this well-characterized and well-supported transcriptional module repertoire can be employed for the interpretation and benchmarking of blood transcriptome profiles within and across patient cohorts. Blood transcriptome fingerprints for the 16 reference cohorts can be accessed interactively via: https://drinchai.shinyapps.io/BloodGen3Module/

    toxins Endothelial Toxicity of High Glucose and its by-Products in Diabetic Kidney Disease

    Alterations of renal endothelial cells play a crucial role in the initiation and progression of diabetic kidney disease. High glucose per se, as well as glucose by-products, induce endothelial dysfunction in both large vessels and the microvasculature. Toxic glucose by-products include advanced glycation end products (AGEs), a group of modified proteins and/or lipids that become glycated after exposure to sugars, and glucose metabolites produced via the polyol pathway. These glucose-related endothelio-toxins notably induce an alteration of the glomerular filtration barrier by increasing the permeability of glomerular endothelial cells, altering endothelial glycocalyx, and finally, inducing endothelial cell apoptosis. The glomerular endothelial dysfunction results in albuminuria. In addition, high glucose and by-products impair the endothelial repair capacities by reducing the number and function of endothelial progenitor cells. In this review, we summarize the mechanisms of renal endothelial toxicity of high glucose/glucose by-products, which encompass changes in synthesis of growth factors like TGF-β and VEGF, induction of oxidative stress and inflammation, and reduction of NO bioavailability. We finally present potential therapies to reduce endothelial dysfunction in diabetic kidney disease.

    Accumulation of protein-bound uremic toxins: the kidney remains the leading culprit in the gut-liver-kidney axis

    Protein-bound uremic toxins (PBUTs) accumulate in chronic kidney disease (CKD) and are poorly removed by dialysis. Gryp et al. demonstrated that fecal bacteria from patients with CKD do not produce more PBUTs than do those from healthy controls and that the accumulation of PBUTs, as CKD progresses, is mainly due to their reduced renal elimination by glomerular filtration and tubular secretion. This work underlines the importance of studying the metabolism of PBUTs along the diet-gut-liver-kidney axis. Copyright (C) 2020, International Society of Nephrology.

    Characteristics and information searched for by French patients with systemic lupus erythematosus: A web-community data-driven online survey

    Introduction: To provide information about the needs of patients with systemic lupus erythematosus (SLE) using Carenity, the first European online platform for patients with chronic diseases. Methods: At one year after its creation, all posts from the Carenity SLE community were collected and analysed. A focused cross-sectional online survey was performed. Results: The SLE community included 521 people (93% females; mean age: 39.8 years). Among a total of 6702 posts, 2232 were classified according to disease-related topics. The 10 most common topics were 'lupus and...' either 'treatment', 'fatigue', 'entourage', 'sun exposure', 'diagnosis', 'autoimmune diseases', 'pregnancy', 'contraception', 'symptoms' or 'sexuality'. 112 SLE patients participated in the online survey. At the time of diagnosis, only 17 (15%) patients had heard of SLE and 84 (75%) expressed a need for more information on outcomes (27%), treatments (27%), daily life (14%), patients' associations (11%), symptoms (8%), the disease (8%) and psychosocial aspects (7%). When treatment was initiated, 48 patients (43%) would have liked more information about side effects (46%), long-term effects (21%), treatment duration/cessation (12.5%) and type (10%) and mechanism of action (8%) of treatments. All participants except one had used the internet to find information about SLE. Sources of information included healthcare providers (51%/61%/67%), journals/magazines (7%/12%/6%), lupus Websites (51%/77%/40%), web forums/blogs (34%/53%/19%), patients' associations (11%/23%/9%) accessed at 'just before diagnosis', 'just after diagnosis' and 'before treatment initiation'. Conclusions: Online patient communities provide original unbiased information that can help improve provision of information to SLE patients.

    Serum GlycA Level Is Elevated in Active Systemic Lupus Erythematosus and Correlates to Disease Activity and Lupus Nephritis Severity

    OBJECTIVE: Reliable non-invasive biomarkers are needed to assess disease activity and prognosis in patients with systemic lupus erythematosus (SLE). Glycoprotein acetylation (GlycA), a novel biomarker for chronic inflammation, has been reported to be increased in several inflammatory diseases. We investigated the relevance of serum GlycA in SLE patients exhibiting various levels of activity and severity, especially with regard to renal involvement. METHODS: Serum GlycA was measured by nuclear magnetic resonance spectroscopy in samples from well-characterized SLE patients and from both healthy controls and patients with other kidney diseases (KD). Disease activity was evaluated using the Systemic Lupus Erythematosus Disease Activity Index 2000 (SLEDAI-2K). Renal severity was assessed by kidney biopsy. RESULTS: Serum GlycA was elevated in active (n = 105) compared to quiescent SLE patients (n = 39, p < 10⁻⁶), healthy controls (n = 20, p = 0.009) and KD controls (n = 21, p = 0.04), despite a more severely altered renal function in the latter. GlycA level was correlated to disease activity (SLEDAI-2K, ρ = 0.37, p < 10⁻⁴), C-reactive protein, neutrophil count, triglyceride levels, proteinuria and inversely to serum albumin. In patients with biopsy-proven lupus nephritis (LN), GlycA levels were higher in proliferative (n = 26) than non-proliferative LN (n = 10) in univariate analysis (p = 0.04), and were shown to predict proliferative LN independently of renal parameters, immunological activity, neutrophil count and daily corticosteroid dosage by multivariate analysis (p < 5 × 10⁻³ for all models). In LN patients with repeated longitudinal GlycA measurement (n = 11), GlycA varied over time and seemed to peak at the time of the flare. CONCLUSIONS: GlycA, as a summary measure for different inflammatory processes, could be a valuable biomarker of disease activity in patients with SLE, and a non-invasive biomarker of pathological severity in the context of LN.