CancerGPT: Few-shot Drug Pair Synergy Prediction using Large Pre-trained Language Models

Author: Ding Ying
Jaiswal Ajay
Jiang Xianqian
Kamath Advaith
Kim Yejin
Li Tianhao
Shetty Sandesh
Publication date: 17/04/2023

Large pre-trained language models (LLMs) have been shown to have significant potential in few-shot learning across various fields, even with minimal training data. However, their ability to generalize to unseen tasks in more complex fields, such as biology, has yet to be fully evaluated. LLMs can offer a promising alternative approach for biological inference, particularly in cases where structured data and sample size are limited, by extracting prior knowledge from text corpora. Our proposed few-shot learning approach uses LLMs to predict the synergy of drug pairs in rare tissues that lack structured data and features. Our experiments, which involved seven rare tissues from different cancer types, demonstrated that the LLM-based prediction model achieved significant accuracy with very few or zero samples. Our proposed model, the CancerGPT (with ~124M parameters), was even comparable to the larger fine-tuned GPT-3 model (with ~175B parameters). Our research is the first to tackle drug pair synergy prediction in rare tissues with limited data. We are also the first to utilize an LLM-based prediction model for biological reaction prediction tasks.

arXiv.org e-Print Archive

Machine learning in the real world with multiple objectives

Author: Bolukbasi Tolga
Publication date: 03/07/2018

Machine learning (ML) is ubiquitous in many real-world applications. Existing ML systems are based on optimizing a single quality metric such as prediction accuracy. These metrics typically do not fully align with the design constraints encountered in real-world applications, such as computation, latency, fairness, and acquisition costs. In this thesis, we develop ML methods for optimizing prediction accuracy while accounting for such real-world constraints. In particular, we introduce multi-objective learning in two different setups: resource-efficient prediction and algorithmic fairness in language models. First, we focus on decreasing the test-time computational costs of prediction systems. Budget constraints arise in many machine learning problems. Computational costs limit the usage of many models on small devices such as IoT devices or mobile phones and increase the energy consumption in cloud computing. We design systems that allow on-the-fly modification of the prediction model for each input sample. These sample-adaptive systems allow us to leverage wide variability in sample complexity: we learn policies that select cheap models for low-complexity instances and use descriptive models only for complex ones. We use a multi-objective approach that minimizes the system cost while preserving predictive accuracy. We demonstrate significant speed-ups in the fields of computer vision, structured prediction, natural language processing, and deep learning. In the context of fairness, we first demonstrate that a naive application of ML methods runs the risk of amplifying social biases present in data. This danger is particularly acute for methods based on word embeddings, which are increasingly gaining importance in many natural language processing applications of ML. We show that word embeddings trained on Google News articles exhibit female/male gender stereotypes. We demonstrate that geometrically, gender bias is captured by unique directions in the word embedding vector space. To remove bias, we formulate an empirical risk objective with fairness constraints that removes stereotypes from embeddings while maintaining desired associations. Using crowd-worker evaluation as well as standard benchmarks, we empirically demonstrate that our algorithms significantly reduce gender bias in embeddings while preserving their useful properties, such as the ability to cluster related concepts.
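The geometric claim in this abstract (bias captured by a direction in the embedding space) can be illustrated with a minimal sketch. The code below is not the thesis's constrained empirical-risk objective; it is a simpler stand-in that estimates a direction from one definitional word pair and projects it out of another word's vector. The toy 3-dimensional embeddings and the helper names `bias_direction` and `neutralize` are hypothetical, chosen only for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def bias_direction(pairs, embeddings):
    """Sum the difference vectors of the given word pairs, then normalize."""
    dim = len(next(iter(embeddings.values())))
    total = [0.0] * dim
    for a, b in pairs:
        diff = [x - y for x, y in zip(embeddings[a], embeddings[b])]
        total = [t + d for t, d in zip(total, diff)]
    return normalize(total)


def neutralize(vec, direction):
    """Remove the component of vec that lies along a unit-length direction."""
    scale = dot(vec, direction)
    return [x - scale * d for x, d in zip(vec, direction)]


# Toy embeddings, made up for illustration only (assumption, not real data).
embeddings = {
    "he": [1.0, 0.2, 0.0],
    "she": [-1.0, 0.2, 0.0],
    "engineer": [0.4, 0.5, 0.7],
}

g = bias_direction([("he", "she")], embeddings)
debiased = neutralize(embeddings["engineer"], g)
```

After the projection, `debiased` has no component along `g`, while its remaining coordinates are unchanged. A real system would estimate the direction from many pairs and would also need to preserve associations that are legitimately gendered.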

Boston University Institutional Repository (OpenBU)

GINNs: Graph-Informed Neural Networks for Multiscale Physics

Author: Balokas
Berg
Botev
Béguin
Chen
Daniels
Davis
Eldred
Feng
Frangos
Goodfellow
Gu
Harmandaris
Hastie
Karumuri
Koller
Lagaris
Lau
Lee
Li
Ling
Liu
Mak
Meng
Motamed
Nagasawa
Nassar
Newman
Paszke
Pearl
Psichogios
Raissi
Scarselli
Sen
Sirignano
Smith
Sun
Taverniers
Torre
Tripathy
Um
Verbrugge
Wasserman
Wu
Yang
Ying
Zhang
Zhou
Zhu
Publication venue: 'Elsevier BV'
Publication date: 26/06/2020

We introduce the concept of a Graph-Informed Neural Network (GINN), a hybrid approach combining deep learning with probabilistic graphical models (PGMs) that acts as a surrogate for physics-based representations of multiscale and multiphysics systems. GINNs address the twin challenges of removing intrinsic computational bottlenecks in physics-based models and generating large data sets for estimating probability distributions of quantities of interest (QoIs) with a high degree of confidence. Both the selection of the complex physics learned by the NN and its supervised learning/prediction are informed by the PGM, which includes the formulation of structured priors for tunable control variables (CVs) to account for their mutual correlations and ensure physically sound CV and QoI distributions. GINNs accelerate the prediction of QoIs essential for simulation-based decision-making where generating sufficient sample data using physics-based models alone is often prohibitively expensive. Using a real-world application grounded in supercapacitor-based energy storage, we describe the construction of GINNs from a Bayesian network-embedded homogenized model for supercapacitor dynamics, and demonstrate their ability to produce kernel density estimates of relevant non-Gaussian, skewed QoIs with tight confidence intervals. Comment: 20 pages, 8 figures
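The kernel density estimates mentioned in this abstract can be sketched in a few lines. The code below is a generic Gaussian kernel density estimator, not the GINN pipeline: the samples come from a lognormal distribution that merely stands in for a skewed quantity of interest, and the bandwidth uses the common normal-reference rule of thumb (1.06 times the sample standard deviation times n to the power -1/5). The names `gaussian_kde` and `qoi_samples` are hypothetical.

```python
import math
import random
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a function that evaluates a Gaussian KDE built from samples."""
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption, not from the source).
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-0.2)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum one Gaussian bump per sample, centered on that sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Stand-in samples for a skewed, non-Gaussian quantity (illustrative only).
rng = random.Random(0)
qoi_samples = [rng.lognormvariate(0.0, 0.5) for _ in range(500)]
pdf = gaussian_kde(qoi_samples)

# Riemann-sum check that the estimated density integrates to about one.
step = 0.01
grid = [-2.0 + step * i for i in range(1201)]
area = sum(pdf(x) for x in grid) * step
```

In the setting the abstract describes, a cheap surrogate would supply many more samples than the physics-based model could, which is what tightens the confidence intervals around an estimate like this one.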

arXiv.org e-Print Archive

Crossref

University of Dundee Online Publications