LLM Performance Predictors are good initializers for Architecture Search
Large language models (LLMs) have become an integral component in solving a
wide range of NLP tasks. In this work, we explore a novel use case of using
LLMs to build performance predictors (PP): models that, given a specific deep
neural network architecture, predict its performance on a downstream task. We
design PP prompts for LLMs consisting of: (i) role: description of the role
assigned to the LLM, (ii) instructions: set of instructions to be followed by
the LLM to carry out performance prediction, (iii) hyperparameters: a
definition of each architecture-specific hyperparameter and (iv)
demonstrations: sample architectures along with their efficiency metrics and
'training from scratch' performance. For machine translation (MT) tasks, we
discover that GPT-4 with our PP prompts (LLM-PP) can predict the performance of
an architecture with a mean absolute error matching the SOTA and a marginal
degradation in rank correlation coefficient compared to SOTA performance
predictors. Further, we show that the predictions from LLM-PP can be distilled
to a small regression model (LLM-Distill-PP). Surprisingly, LLM-Distill-PP
models largely retain the performance of LLM-PP and can be a
cost-effective alternative for heavy use cases of performance estimation.
Specifically, for neural architecture search (NAS), we propose a Hybrid-Search
algorithm for NAS (HS-NAS), which uses LLM-Distill-PP for the initial part of
search, resorting to the baseline predictor for the rest of the search. We show
that HS-NAS performs very similarly to SOTA NAS across benchmarks, reduces search
hours by roughly 50%, and in some cases improves latency, GFLOPs, and model
size.
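As a rough illustration of two ideas in the abstract above, the sketch below assembles a prompt from the four named components (role, instructions, hyperparameters, demonstrations) and switches from one predictor to another partway through a search. It is not the authors' implementation: the function names, the prompt layout, the dictionary keys, and the `switch_fraction` parameter are all hypothetical, and the abstract does not say how long the "initial part" of the search is.

```python
def build_pp_prompt(role, instructions, hyperparameters, demonstrations, query_architecture):
    """Assemble a performance-predictor prompt from the four components.

    The section labels and line layout are assumptions made for illustration.
    """
    lines = [f"Role: {role}", "", "Instructions:"]
    lines += [f"{i}. {step}" for i, step in enumerate(instructions, start=1)]
    lines += ["", "Hyperparameters:"]
    lines += [f"- {name}: {meaning}" for name, meaning in hyperparameters.items()]
    lines += ["", "Demonstrations:"]
    for demo in demonstrations:
        # Each demonstration pairs an architecture with its efficiency
        # metrics and its 'training from scratch' performance.
        lines.append(
            f"architecture={demo['architecture']} | "
            f"efficiency={demo['efficiency']} | "
            f"performance={demo['performance']}"
        )
    lines += ["", f"Predict the performance of: architecture={query_architecture}"]
    return "\n".join(lines)


def make_hybrid_predictor(distilled_pp, baseline_pp, total_steps, switch_fraction=0.5):
    """Return a predictor that uses distilled_pp early and baseline_pp later.

    switch_fraction is a hypothetical knob; the source only says the distilled
    predictor covers the initial part of the search.
    """
    def predict(architecture, step):
        if step < switch_fraction * total_steps:
            return distilled_pp(architecture)
        return baseline_pp(architecture)

    return predict
```

A search loop would call `predict(architecture, step)` at each iteration, so the cheaper distilled predictor handles the early steps and the baseline predictor handles the rest.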
Uncovering the Subtype-Specific Temporal Order of Cancer Pathway Dysregulation
Cancer is driven by genetic mutations that dysregulate pathways important for proper cell function. Therefore, discovering these cancer pathways and the order in which they become dysregulated is key to understanding and treating cancer. However, the heterogeneity of mutations across individuals makes this challenging and requires that cancer progression be studied in a subtype-specific way. To address this challenge, we provide a mathematical model, called the Subtype-specific Pathway Linear Progression Model (SPM), that simultaneously captures cancer subtypes, pathways, and the order of pathway dysregulation within each subtype. Experiments with synthetic data indicate that SPM is more robust than an existing method to problem specifics, including noise. Moreover, experimental results on glioblastoma multiforme and colorectal adenocarcinoma show that SPM’s results are consistent with existing knowledge and, in certain cases, superior to those of an existing method. The implementation of our method is available at https://github.com/Dalton386/SPM
On Efficient Approximate Queries over Machine Learning Models
The question of answering queries over ML predictions has been gaining
attention in the database community. This question is challenging because the
cost of finding high quality answers corresponds to invoking an oracle such as
a human expert or an expensive deep neural network model on every single item
in the DB and then applying the query. We develop a novel unified framework for
approximate query answering by leveraging a proxy to minimize the oracle usage
of finding high quality answers for both Precision-Target (PT) and
Recall-Target (RT) queries. Our framework uses a judicious combination of
invoking the expensive oracle on data samples and applying the cheap proxy on
the objects in the DB. It relies on two assumptions. Under the Proxy Quality
assumption, proxy quality can be quantified in a probabilistic manner w.r.t.
the oracle. This allows us to develop two algorithms: PQA that efficiently
finds high quality answers with high probability and no oracle calls, and PQE,
a heuristic extension that achieves empirically good performance with a small
number of oracle calls. Alternatively, under the Core Set Closure assumption,
we develop two algorithms: CSC that efficiently returns high quality answers
with high probability and minimal oracle usage, and CSE, which extends it to
more general settings. Our extensive experiments on five real-world datasets on
both query types, PT and RT, demonstrate that our algorithms outperform the
state-of-the-art and achieve high result quality with provable statistical
guarantees.
Comment: Submitted to VLDB 2023, 16 pages, 10 figures; added formal claims for
section
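The general idea in the abstract above, calling the expensive oracle only on a sample and applying the cheap proxy to every object, can be illustrated with a simple threshold-calibration sketch for a precision-target query. This is not PQA, PQE, CSC, or CSE, and it carries no statistical guarantee: it only picks the lowest proxy-score threshold whose precision on the oracle-labeled sample meets the target. All names (`precision_target_query`, `proxy_score`, `oracle`, `sample_size`) are hypothetical.

```python
import random


def precision_target_query(items, proxy_score, oracle, target_precision,
                           sample_size, seed=0):
    """Select items by proxy score, calibrating the threshold on a sample.

    The oracle is invoked only on the sampled items; the proxy is applied to
    everything. Illustrative heuristic only, with no probabilistic guarantee.
    """
    rng = random.Random(seed)
    sample = rng.sample(items, min(sample_size, len(items)))

    # Label the sample with the oracle and sort by proxy score, highest first.
    labeled = sorted(((proxy_score(x), bool(oracle(x))) for x in sample),
                     reverse=True)

    best_threshold = None
    positives = 0
    for i, (score, label) in enumerate(labeled):
        positives += int(label)
        # Only evaluate a threshold once every item tied at this score is counted.
        last_of_tie = i + 1 == len(labeled) or labeled[i + 1][0] != score
        if last_of_tie and positives / (i + 1) >= target_precision:
            best_threshold = score  # keep lowering while the target still holds

    if best_threshold is None:
        return []
    return [x for x in items if proxy_score(x) >= best_threshold]
```

Lowering the threshold admits more items at the risk of lower precision, so the loop keeps the lowest threshold that still meets the target on the sample. A method with guarantees would also need to account for sampling error, which this sketch ignores.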