Phyloepidemiology and adaptive evolution of SARS-CoV2 during the first and second wave of COVID-19 in India

Abstract

Introduction. This study aimed to identify the circulating lineages of SARS-CoV-2, the prevalent mutations in these lineages, and the selective pressure acting on the SARS-CoV-2 genome during the first and second waves of the COVID-19 pandemic in India. Methods. We downloaded 1,451 sequences, dated June 2020 to June 2021, from the National Center for Biotechnology Information (NCBI) and the Global Initiative on Sharing All Influenza Data (GISAID). We identified the lineages using the Pangolin COVID-19 lineage assigner. Results. We found 41 lineages circulating in India during the year studied. Thirty-five lineages were circulating during the first wave and twenty during the second wave, including six new lineages. During the first wave in 2020, only one Variant of Concern (Alpha) was found, but during the second wave in 2021, three Variants of Concern (Alpha, Beta, and Delta) were in circulation, as well as one Variant Under Monitoring. The most frequent mutations observed were S: D614G, NSP3: F106F, NSP12b: P314L, ORF3a: Q57H, M: Y71Y, NSP14: C279C, S: D294D, and N: S194L. The ten most mutated samples all belonged to the Delta variant (lineage B.1.617.2) and were found in the second wave. Five mutations in the spike protein (L452R, T478K, E484Q, N501Y, and D614G), responsible for increased transmissibility and reduced neutralization by convalescent sera, were highly prevalent during the second wave. D614G, L452R, and T478K were present at prevalence rates of 88.25%, 21.04%, and 16.80%, respectively. Purifying selection predominated, but a few sites in NSP2, NSP3, NSP13, the S protein, ORF3a, and ORF9 evolved under positive selection. Conclusion. We report six novel mutations (three in NSP2 [P129A, V381A, V381F], one in NSP3 [P822S], and one in the S protein [Q23R]) that evolved under positive selection pressure.
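The prevalence rates quoted in the Results are the share of sequenced samples that carry a given mutation. The abstract does not describe how these rates were computed, so the following is only a minimal sketch of that arithmetic. The function name `mutation_prevalence`, the `GENE:CHANGE` label format, and the toy sample data are all hypothetical and are not taken from the study.

```python
from collections import Counter


def mutation_prevalence(samples):
    """Return {mutation: percentage of samples carrying it}.

    `samples` maps a sample ID to the list of mutations called in it.
    (Hypothetical input format, assumed for illustration only.)
    """
    total = len(samples)
    if total == 0:
        return {}
    counts = Counter()
    for mutations in samples.values():
        # Use a set so a mutation is counted at most once per sample.
        counts.update(set(mutations))
    return {m: 100.0 * n / total for m, n in counts.items()}


# Toy data, invented for this example; not the study's sequences.
toy_samples = {
    "sample_1": ["S:D614G", "S:L452R", "S:T478K"],
    "sample_2": ["S:D614G"],
    "sample_3": ["S:D614G", "S:L452R"],
    "sample_4": ["N:S194L"],
}

prevalence = mutation_prevalence(toy_samples)
for mutation, pct in sorted(prevalence.items(), key=lambda kv: -kv[1]):
    print(f"{mutation}: {pct:.2f}%")
```

Applied to real data, the denominator would be the number of sequences analysed (1,451 in this study), and the mutation calls would come from whatever variant-calling step preceded the analysis.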
