4 research outputs found

A USER-FRIENDLY TOOL FOR SIMPLIFIED GENOMICS DATA MINING FROM LARGE VCF FILES

Author: Daniyarov Asset
Kairov Ulykbek
Karabayev Daniyar
Molkenov Askhat
Seisenova Ainur
Sharip Aigul
Yerulanuly Kaiyrgali
Zhumadilov Zhaxybay
Publication venue: International conference "MODERN PERSPECTIVES FOR BIOMEDICAL SCIENCES: FROM BENCH TO BEDSIDE"; National Laboratory Astana
Publication date: 01/01/2020

Introduction: High-throughput sequencing platforms generate massive amounts of high-dimensional genomic data that are available for analysis. Modern, user-friendly bioinformatics tools for the analysis and interpretation of genomics data have become essential when working with sequencing data. Variant Call Format (VCF) is a standard format containing genomic information and variants of sequenced samples. Existing tools for processing VCF files usually lack an intuitive graphical interface and offer only a command-line interface, which may be challenging to use for the broader biomedical community interested in genomics data analysis. We present re-Searcher, a new bioinformatics application with a user-friendly GUI developed to simplify genomic data mining from VCF files. Methods: The re-Searcher application was written in Python 3. The Pandas library solves the problem of analyzing large VCF files by not loading the whole file directly into RAM, but instead pre-processing it in chunks. A simple and intuitive GUI was built using the Tkinter library. Results: The generalized workflow of re-Searcher consists of several steps: selecting an input file, setting up the necessary filtering parameters, data processing, and exporting a filtered output VCF file. re-Searcher browses and opens VCF files with the extensions .txt or .vcf and then offers the following filtering and extraction options: header extraction, keyword search, sample extraction, and genotype format conversion. Conclusion: Exploring and analyzing VCF files generated by the bioinformatics processing of sequencing data is one of the important steps performed by researchers during analysis and meta-analysis of genotype/phenotype associations. We have developed and introduced an easy-to-use bioinformatics tool, re-Searcher, with several unique features for mining big VCF files, implemented with a simple graphical user interface that makes it easily accessible to clinicians and researchers without any computational skills. The software is publicly available in the GitHub repository (https://github.com/LabBandSB/re-Searcher).
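The chunked reading and keyword search described in this abstract can be sketched roughly as follows. This is a minimal illustration and not the re-Searcher source code: the function names `read_vcf_header` and `keyword_search`, the chunk size, and the sample records (including the gene names in the INFO column) are assumptions made for the example.

```python
import io

import pandas as pd

# Made-up VCF content for illustration; spaces stand in for tabs.
SAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "##source=example\n"
    "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S1 S2\n"
    "1 100 rs1 A G 50 PASS GENE=BRCA1 GT 0/1 1/1\n"
    "1 200 rs2 C T 60 PASS GENE=TP53 GT 0/0 0/1\n"
    "2 300 rs3 G A 70 PASS GENE=BRCA1 GT 1/1 0/0\n"
).replace(" ", "\t")


def read_vcf_header(handle):
    """Consume the '##' meta lines and the '#CHROM' line from an open handle.

    Returns (meta_lines, column_names) and leaves the handle positioned
    at the first data record.
    """
    meta = []
    while True:
        line = handle.readline()
        if not line:
            raise ValueError("no #CHROM header line found")
        if line.startswith("##"):
            meta.append(line.rstrip("\n"))
        elif line.startswith("#"):
            return meta, line.lstrip("#").rstrip("\n").split("\t")
        else:
            raise ValueError("data record found before the #CHROM line")


def keyword_search(handle, keyword, chunksize=1000):
    """Return (meta_lines, DataFrame of rows containing `keyword`)."""
    meta, columns = read_vcf_header(handle)
    # With chunksize set, read_csv returns an iterator of DataFrames,
    # so only `chunksize` records are held in memory at a time.
    reader = pd.read_csv(
        handle, sep="\t", names=columns, dtype=str,
        keep_default_na=False, chunksize=chunksize,
    )
    hits = []
    for chunk in reader:
        # Plain substring match (no regex) in every column of the chunk.
        mask = chunk.apply(
            lambda col: col.str.contains(keyword, regex=False)
        ).any(axis=1)
        if mask.any():
            hits.append(chunk[mask])
    if not hits:
        return meta, pd.DataFrame(columns=columns)
    return meta, pd.concat(hits, ignore_index=True)


if __name__ == "__main__":
    meta, hits = keyword_search(io.StringIO(SAMPLE_VCF), "BRCA1", chunksize=2)
    print(hits[["CHROM", "POS", "ID", "INFO"]])
```

Only the matching rows are accumulated, so memory use depends on the chunk size and the number of hits rather than on the size of the input file.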

Nazarbayev University Repository

Universal whole-genome Oxford nanopore sequencing of SARS-CoV-2 using tiled amplicons

Author: Aitkulova Akbota
Akilzhanova Ainur
Daniyarov Asset
Kairov Ulykbek
Kalendar Ruslan
Karabayev Daniyar
Otarbay Zhenis
Rakhimova Saule
Sarbassov Dos
Tynyshtykbayeva Nuray
Publication date: 01/06/2023

There is a need to develop universally applicable end-to-end viral outbreak sample-handling platforms that generate real-time epidemiological information that can be interpreted and applied by public health authorities. Highly sensitive and efficient whole-genome sequencing of the SARS-CoV-2 virus is critical for understanding viral transmission dynamics. Here, we developed a comprehensive multiplexed set of primers adapted for the Oxford Nanopore Rapid Barcoding library kit that allows universal SARS-CoV-2 genome sequencing. This primer set is designed to set up any variant of the primer pool for whole-genome sequencing of SARS-CoV-2 using single- or double-tiled amplicons from 1.2 kb to 4.8 kb with the Oxford Nanopore. This multiplexed set of primers is also applicable to tasks such as targeted SARS-CoV-2 genome sequencing. We propose an optimized protocol to synthesize cDNA using Maxima H Minus Reverse Transcriptase with a set of SARS-CoV-2-specific primers, which gives high yields of cDNA template from RNA and is capable of long cDNA synthesis from a wide range of RNA amounts and quality. The proposed protocol allows whole-genome sequencing of the SARS-CoV-2 virus with tiled amplicons up to 4.8 kb on low-titer virus samples, even where RNA degradation has occurred. This protocol reduces the time and cost from RNA to genome sequence compared to the Midnight multiplex PCR method for SARS-CoV-2 genome sequencing using the Oxford Nanopore.

Directory of Open Access Journals

Helsingin yliopiston digitaalinen arkisto

RE-SEARCHER: GUI-BASED BIOINFORMATICS TOOL FOR SIMPLIFIED GENOMICS DATA MINING OF VCF FILES

Author: Daniyarov Asset
Kabimoldayev Ilyas
Kairov Ulykbek
Karabayev Daniyar
Molkenov Askhat
Seisenova Ainur
Sharip Aigul
Yerulanuly Kaiyrgali
Zhumadilov Zhaxybay
Publication venue: 'PeerJ'
Publication date: 01/05/2021

Background. High-throughput sequencing platforms generate massive amounts of high-dimensional genomic data that are available for analysis. Modern, user-friendly bioinformatics tools for the analysis and interpretation of genomics data have become essential when working with sequencing data. Different standard data types and file formats have been developed to store and analyze sequence and genomics data. Variant Call Format (VCF) is the most widespread genomics file type and the standard format containing genomic information and variants of sequenced samples. Results. Existing tools for processing VCF files usually lack an intuitive graphical interface and offer only a command-line interface, which may be challenging to use for the broader biomedical community interested in genomics data analysis. re-Searcher solves this problem, and it pre-processes VCF files in chunks to avoid overloading the computer's RAM. The tool can be used as a standalone, user-friendly, multiplatform GUI application as well as a web application (https://nla-lbsb.nu.edu.kz). The software, including source code, tested VCF files, and additional information, is publicly available in the GitHub repository (https://github.com/LabBandSB/re-Searcher).
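To show how chunk-wise pre-processing keeps memory use bounded when the output is itself a VCF file, the sketch below streams an input file to an output file while keeping only selected sample columns. It is an illustration under stated assumptions, not the tool's actual implementation: the function name `extract_samples`, the choice of sample extraction as the operation, the sample names, and the example records are all hypothetical.

```python
import io

import pandas as pd

# VCF fixes the first nine columns (CHROM ... FORMAT); samples follow.
N_FIXED = 9

# Made-up VCF content for illustration; spaces stand in for tabs.
SAMPLE_VCF = (
    "##fileformat=VCFv4.2\n"
    "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT S1 S2 S3\n"
    "1 100 rs1 A G 50 PASS . GT 0/1 1/1 0/0\n"
    "1 200 rs2 C T 60 PASS . GT 0/0 0/1 1/1\n"
    "2 300 rs3 G A 70 PASS . GT 1/1 0/0 0/1\n"
).replace(" ", "\t")


def extract_samples(src, dst, samples, chunksize=1000):
    """Stream `src` to `dst`, keeping the fixed columns plus `samples`.

    `src` and `dst` are open text handles. Returns the number of
    records written.
    """
    # Copy '##' meta lines unchanged; stop at the '#CHROM' column line.
    line = src.readline()
    while line.startswith("##"):
        dst.write(line)
        line = src.readline()
    if not line.startswith("#"):
        raise ValueError("missing #CHROM header line")
    columns = line.lstrip("#").rstrip("\n").split("\t")

    # Requested samples that are absent from the file are skipped.
    keep = columns[:N_FIXED] + [s for s in samples if s in columns[N_FIXED:]]
    dst.write("#" + "\t".join(keep) + "\n")

    reader = pd.read_csv(
        src, sep="\t", names=columns, dtype=str,
        keep_default_na=False, chunksize=chunksize,
    )
    written = 0
    for chunk in reader:
        # Each chunk is written out immediately and then discarded,
        # so at most `chunksize` records are in memory at once.
        chunk[keep].to_csv(dst, sep="\t", header=False, index=False)
        written += len(chunk)
    return written


if __name__ == "__main__":
    out = io.StringIO()
    extract_samples(io.StringIO(SAMPLE_VCF), out, ["S1", "S3"], chunksize=2)
    print(out.getvalue())
```

The sketch relies on the default CSV quoting rules, so records whose fields contain quote characters would need extra handling before this could be used on real data.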

Directory of Open Access Journals

Nazarbayev University Repository

WHOLE-GENOME SEQUENCING DATA OF KAZAKH INDIVIDUALS

Author: Akilzhanova Ainur
Terwilliger Joseph D.
Daniyarov Asset
Lee Joseph H.
Kairov Ulykbek
Karabayev Daniyar
Kozhamkulov Ulan
Molkenov Askhat
Rakhimova Saule
Sharip Aigul
Zhumadilov Zhaxybay
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2021

Kazakhstan is a Central Asian crossroads of European and Asian populations situated along the Great Silk Road. The territory of Kazakhstan has historically been inhabited by nomadic tribes, and today it is a multi-ethnic country in which Kazakhs are the dominant ethnic group. We sequenced and analyzed the whole genomes of five healthy ethnic Kazakh individuals at high coverage using a next-generation sequencing platform. These whole-genome sequence data of healthy Kazakh individuals can be a valuable reference for biomedical studies investigating disease associations and for population-wide genomic studies of the ethnically diverse Central Asian region...

Columbia University Academic Commons

Nazarbayev University Repository