A Transformer-Based Approach for Gene Discovery in Radiation Response Under Data-Sparse Conditions

Kashyap, Sohum


oai:digitalcommons.imsa.edu:sir_presentations-2504


Authors: Sohum Kashyap
Publication date: 17 April 2025
Publisher: DigitalCommons@IMSA

Abstract

This paper investigates the application of Geneformer, a transformer-based model, to identifying the genes that drive transitions between cell states across radiation levels in data-sparse settings. Traditional differential gene expression (DGE) methods often face limitations when little data is available. High-throughput single-cell RNA sequencing data were preprocessed to support accurate analysis of the genes responsible for transitions in irradiated cell states. Statistical techniques, including t-tests, Wilcoxon rank-sum tests, and logistic regression, were employed to rank genes by expression across four radiation exposures (0, 10, 100, and 1000 mGy). Geneformer was fine-tuned on the tokenized data with hyperparameter optimization. This yielded significant improvements in classification accuracy, as validated by two-dimensional embedding representations and in-silico perturbation experiments. When both approaches were tested on data subsets of 1024, 256, and 128 cells, the fine-tuned Geneformer model consistently outperformed the traditional DGE method. Overall, the findings demonstrate that Geneformer detects subtle shifts in gene expression with high precision and reliably identifies key genetic drivers of radiation response, offering a viable alternative to conventional DGE approaches in low-data environments.
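As a rough illustration of the conventional DGE baseline described in the abstract, the sketch below ranks genes by a Welch t-statistic between two exposure groups. It is not the author's pipeline: the expression values are synthetic, the gene names (`GENE_A` to `GENE_D`) and the helpers `welch_t` and `rank_genes` are hypothetical, and the choice of 64 cells per group simply mirrors the smallest 128-cell subset mentioned above. Only the Python standard library is used.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch's t-statistic for two independent samples with unequal variances."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return (statistics.mean(a) - statistics.mean(b)) / se


def rank_genes(expr_a, expr_b):
    """Rank genes by |t| between two groups.

    Each argument maps a gene name to a list of per-cell expression values.
    Returns (gene, t) pairs sorted from most to least differentially expressed.
    """
    scores = {gene: welch_t(expr_a[gene], expr_b[gene]) for gene in expr_a}
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Synthetic data (assumption): 64 control cells and 64 exposed cells.
rng = random.Random(0)
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D"]  # hypothetical gene names
n_per_group = 64

control = {g: [rng.gauss(5.0, 1.0) for _ in range(n_per_group)] for g in genes}
exposed = {g: [rng.gauss(5.0, 1.0) for _ in range(n_per_group)] for g in genes}
# Simulate one radiation-responsive gene by raising its mean in exposed cells.
exposed["GENE_C"] = [rng.gauss(8.0, 1.0) for _ in range(n_per_group)]

ranking = rank_genes(exposed, control)
for gene, t in ranking:
    print(f"{gene}: t = {t:.2f}")
```

With a shift this large the simulated responsive gene ranks first; the point of the data-sparse comparison in the abstract is that real effects are subtler, and per-gene tests of this kind lose power as the number of cells shrinks.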



Last time updated on 22/06/2025

This paper was published in Illinois Mathematics and Science Academy: DigitalCommons@IMSA.
