CoViT: Real-time phylogenetics for the SARS-CoV-2 pandemic using Vision Transformers
Real-time viral genome detection, taxonomic classification and phylogenetic
analysis are critical for efficient tracking and control of viral pandemics
such as Covid-19. However, the unprecedented and still growing amounts of viral
genome data create a computational bottleneck, which effectively prevents
real-time pandemic tracking. For genomic tracing to work effectively, each new
viral genome sequence must be placed in its pangenomic context. Re-inferring
the full phylogeny of SARS-CoV-2, with datasets containing millions of samples,
is prohibitively slow even with powerful computational resources. We
attempt to alleviate this computational bottleneck by modifying and applying
the Vision Transformer, a recently developed neural network model for image
recognition, to the taxonomic classification and placement of viral genomes, such
as SARS-CoV-2. Our solution, CoViT, places SARS-CoV-2 genome accessions onto
the SARS-CoV-2 phylogenetic tree with an accuracy of 94.2%. Since CoViT is a
classification neural network, it provides more than one likely placement.
Specifically, one of the two most likely placements suggested by CoViT is
correct with a probability of 97.9%. The probability that the correct placement
is found among the five most likely placements generated by CoViT is 99.8%.
The placement time is 0.055s per individual genome running on an NVIDIA GeForce
RTX 2080 Ti GPU. We make CoViT available to the research community through GitHub:
https://github.com/zuherJahshan/covit.
Comment: 11 pages, 4 figures, 2 tables
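
The top-1, top-2 and top-5 figures quoted in the abstract are top-k accuracies: a placement counts as correct if the true class is among the k classes to which the classifier assigns the highest probability. The sketch below shows how such a metric can be computed from per-class probabilities. It is a minimal illustration and not CoViT's code; the function names `top_k_placements` and `top_k_accuracy` are hypothetical, and the probabilities are made-up toy values over four imaginary lineages.

```python
def top_k_placements(probs, k):
    """Return the indices of the k highest-probability classes, best first."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return ranked[:k]


def top_k_accuracy(prob_rows, true_labels, k):
    """Fraction of samples whose true label is among the top-k predictions."""
    hits = sum(
        1
        for probs, label in zip(prob_rows, true_labels)
        if label in top_k_placements(probs, k)
    )
    return hits / len(true_labels)


# Toy, made-up class probabilities (assumption: NOT CoViT output).
prob_rows = [
    [0.70, 0.20, 0.05, 0.05],  # true class 0: found at k = 1
    [0.30, 0.50, 0.15, 0.05],  # true class 0: found at k = 2
    [0.10, 0.20, 0.30, 0.40],  # true class 1: found at k = 3
    [0.40, 0.30, 0.20, 0.10],  # true class 3: found only at k = 4
]
true_labels = [0, 0, 1, 3]

for k in (1, 2, 3):
    print(f"top-{k} accuracy: {top_k_accuracy(prob_rows, true_labels, k):.2f}")
```

Top-k accuracy can only stay the same or rise as k grows, which is consistent with the reported progression from 94.2% to 97.9% to 99.8%.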