9 research outputs found
Evaluation of large language models for discovery of gene set function
Gene set analysis is a mainstay of functional genomics, but it relies on
manually curated databases of gene functions that are incomplete and unaware of
biological context. Here we evaluate the ability of OpenAI's GPT-4, a Large
Language Model (LLM), to develop hypotheses about common gene functions from
its embedded biomedical knowledge. We created a GPT-4 pipeline to label gene
sets with names that summarize their consensus functions, substantiated by
analysis text and citations. Benchmarking against named gene sets in the Gene
Ontology, GPT-4 generated very similar names in 50% of cases, while in most
remaining cases it recovered the name of a more general concept. In gene sets
discovered in 'omics data, GPT-4 names were more informative than gene set
enrichment, with supporting statements and citations that were largely verified in
human review. The ability to rapidly synthesize common gene functions positions
LLMs as valuable functional genomics assistants.
Drugst.One -- A plug-and-play solution for online systems medicine and network-based drug repurposing
In recent decades, the development of new drugs has become increasingly
expensive and inefficient, and the molecular mechanisms of most pharmaceuticals
remain poorly understood. In response, computational systems and network
medicine tools have emerged to identify potential drug repurposing candidates.
However, these tools often require complex installation and lack intuitive
visual network mining capabilities. To tackle these challenges, we introduce
Drugst.One, a platform that assists specialized computational medicine tools in
becoming user-friendly, web-based utilities for drug repurposing. With just
three lines of code, Drugst.One turns any systems biology software into an
interactive web tool for modeling and analyzing complex protein-drug-disease
networks. Demonstrating its broad adaptability, Drugst.One has been
successfully integrated with 21 computational systems medicine tools. Available
at https://drugst.one, Drugst.One has significant potential for streamlining
the drug discovery process, allowing researchers to focus on essential aspects
of pharmaceutical treatment research.
Comment: 45 pages, 6 figures, 7 tables
Regeneration and DNA demethylation do not trigger PDX-1 expression in rat hepatocytes
AIM: To explore the possibility that the PDX-1 gene is reactivated as a consequence of molecular events that occur during liver regeneration.
Translating desktop success to the web in the Cytoscape project
Cytoscape is an open-source bioinformatics environment for the analysis, integration, visualization, and query of biological networks. In this perspective piece, we describe our project to bring the Cytoscape desktop application to the web while explaining our strategy in ways relevant to others in the bioinformatics community. We examine opportunities and challenges in developing bioinformatics software that spans both the desktop and web, and we describe our ongoing efforts to build a Cytoscape web application, highlighting the principles that guide our development.
Transcriptional regulatory networks of circulating immune cells in type 1 diabetes: A community knowledgebase
Investigator-generated transcriptomic datasets interrogating circulating immune cell (CIC) gene expression in clinical type 1 diabetes (T1D) have underappreciated re-use value. Here, we repurposed these datasets to create an open science environment for the generation of hypotheses around CIC signaling pathways whose gain or loss of function contributes to T1D pathogenesis. We first computed sets of genes that were preferentially induced or repressed in T1D CICs and validated these against community benchmarks. We then inferred and validated signaling node networks regulating expression of these gene sets, as well as differentially expressed genes in the original underlying T1D case:control datasets. In a set of three use cases, we demonstrated how informed integration of these networks with complementary digital resources supports substantive, actionable hypotheses around signaling pathway dysfunction in T1D CICs. Finally, we developed a federated, cloud-based web resource that exposes the entire data matrix for unrestricted access and re-use by the research community.
NDEx IQuery: a multi-method network gene set analysis leveraging the Network Data Exchange
Motivation: The investigation of sets of genes using biological pathways is a common task for researchers and is supported by a wide variety of software tools. This type of analysis generates hypotheses about the biological processes that are active or modulated in a specific experimental context.
Results: The Network Data Exchange Integrated Query (NDEx IQuery) is a new tool for network and pathway-based gene set interpretation that complements or extends existing resources. It combines novel sources of pathways, integration with Cytoscape, and the ability to store and share analysis results. The NDEx IQuery web application performs multiple gene set analyses based on diverse pathways and networks stored in NDEx. These include curated pathways from WikiPathways and SIGNOR, published pathway figures from the last 27 years, machine-assembled networks using the INDRA system, and the new NCI-PID v2.0, an updated version of the popular NCI Pathway Interaction Database. NDEx IQuery's integration with MSigDB and cBioPortal now provides pathway analysis in the context of these two resources.
Availability and implementation: NDEx IQuery is available at https://www.ndexbio.org/iquery and is implemented in JavaScript and Java.
Interpretation of cancer mutations using a multiscale map of protein systems
A major goal of cancer research is to understand how mutations distributed across diverse genes affect common cellular systems, including multiprotein complexes and assemblies. Two challenges, how to comprehensively map such systems and how to identify which are under mutational selection, have hindered this understanding. Accordingly, we created a comprehensive map of cancer protein systems integrating both new and published multi-omic interaction data at multiple scales of analysis. We then developed a unified statistical model that pinpoints 395 specific systems under mutational selection across 13 cancer types. This map, called NeST (Nested Systems in Tumors), incorporates canonical processes and notable discoveries, including a PIK3CA-actomyosin complex that inhibits phosphatidylinositol 3-kinase signaling and recurrent mutations in collagen complexes that promote tumor proliferation. These systems can be used as clinical biomarkers and implicate a total of 548 genes in cancer evolution and progression. This work shows how disparate tumor mutations converge on protein assemblies at different scales.