234,004 research outputs found

    CancerGPT: Few-shot Drug Pair Synergy Prediction using Large Pre-trained Language Models

    Full text link
    Large pre-trained language models (LLMs) have been shown to have significant potential in few-shot learning across various fields, even with minimal training data. However, their ability to generalize to unseen tasks in more complex fields, such as biology, has yet to be fully evaluated. LLMs can offer a promising alternative approach for biological inference, particularly in cases where structured data and sample size are limited, by extracting prior knowledge from text corpora. Our proposed few-shot learning approach uses LLMs to predict the synergy of drug pairs in rare tissues that lack structured data and features. Our experiments, which involved seven rare tissues from different cancer types, demonstrated that the LLM-based prediction model achieved significant accuracy with very few or zero samples. Our proposed model, the CancerGPT (with ∼\sim 124M parameters), was even comparable to the larger fine-tuned GPT-3 model (with ∼\sim 175B parameters). Our research is the first to tackle drug pair synergy prediction in rare tissues with limited data. We are also the first to utilize an LLM-based prediction model for biological reaction prediction tasks

    Machine learning in the real world with multiple objectives

    Full text link
    Machine learning (ML) is ubiquitous in many real-world applications. Existing ML systems are based on optimizing a single quality metric such as prediction accuracy. These metrics typically do not fully align with real-world design constraints such as computation, latency, fairness, and acquisition costs that we encounter in real-world applications. In this thesis, we develop ML methods for optimizing prediction accuracy while accounting for such real-world constraints. In particular, we introduce multi-objective learning in two different setups: resource-efficient prediction and algorithmic fairness in language models. First, we focus on decreasing the test-time computational costs of prediction systems. Budget constraints arise in many machine learning problems. Computational costs limit the usage of many models on small devices such as IoT or mobile phones and increase the energy consumption in cloud computing. We design systems that allow on-the-fly modification of the prediction model for each input sample. These sample-adaptive systems allow us to leverage wide variability in sample complexity where we learn policies for selecting cheap models for low complexity instances and using descriptive models only for complex ones. We utilize multiple--objective approach where one minimizes the system cost while preserving predictive accuracy. We demonstrate significant speed-ups in the fields of computer vision, structured prediction, natural language processing, and deep learning. In the context of fairness, we first demonstrate that a naive application of ML methods runs the risk of amplifying social biases present in data. This danger is particularly acute for methods based on word embeddings, which are increasingly gaining importance in many natural language processing applications of ML. We show that word embeddings trained on Google News articles exhibit female/male gender stereotypes. We demonstrate that geometrically, gender bias is captured by unique directions in the word embedding vector space. To remove bias we formulate a empirical risk objective with fairness constraints to remove stereotypes from embeddings while maintaining desired associations. Using crowd-worker evaluation as well as standard benchmarks, we empirically demonstrate that our algorithms significantly reduces gender bias in embeddings, while preserving its useful properties such as the ability to cluster related concepts

    GINNs:Graph-Informed Neural Networks for Multiscale Physics

    Get PDF
    We introduce the concept of a Graph-Informed Neural Network (GINN), a hybrid approach combining deep learning with probabilistic graphical models (PGMs) that acts as a surrogate for physics-based representations of multiscale and multiphysics systems. GINNs address the twin challenges of removing intrinsic computational bottlenecks in physics-based models and generating large data sets for estimating probability distributions of quantities of interest (QoIs) with a high degree of confidence. Both the selection of the complex physics learned by the NN and its supervised learning/prediction are informed by the PGM, which includes the formulation of structured priors for tunable control variables (CVs) to account for their mutual correlations and ensure physically sound CV and QoI distributions. GINNs accelerate the prediction of QoIs essential for simulation-based decision-making where generating sufficient sample data using physics-based models alone is often prohibitively expensive. Using a real-world application grounded in supercapacitor-based energy storage, we describe the construction of GINNs from a Bayesian network-embedded homogenized model for supercapacitor dynamics, and demonstrate their ability to produce kernel density estimates of relevant non-Gaussian, skewed QoIs with tight confidence intervals.Comment: 20 pages, 8 figure
    • …
    corecore