7 research outputs found

    Sweet Sorghum Genotypes Testing in the High Latitude Rainfed Steppes of the Northern Kazakhstan (for Feed and Biofuel)

    Twenty-eight sweet sorghum (Sorghum bicolor (L.) Moench) genotypes of different ecological and geographic origins (Kazakhstan, Russia, India, Uzbekistan, and China) were tested under the high-latitude rainfed conditions of northern Kazakhstan. The genotypes demonstrated high biomass production (up to 100 t·ha⁻¹ and more). Genotypes that ripened to fully mature seed were selected for seed production and introduction in northern Kazakhstan. The lactic acid bacteria Lactobacillus plantarum S-1, Streptococcus thermophilus F-1 and Lactococcus lactis F-4 substantially enhance the fermentation process, suppressing undesirable microbiological processes, reducing the loss of nutrient compounds, accelerating silage maturation twofold and providing a higher-quality feed product.

    LAUNCH OF Q-SYMPHONY BIOINFORMATICS COMPUTING SYSTEM: A HIGH-PERFORMANCE CLUSTER FOR ANALYSIS OF LARGE-SCALE GENOMIC DATASETS

    Introduction: One whole human genome from a next-generation sequencing platform takes 20 to 50 GB in raw format. In the course of bioinformatics analysis, the data volume grows to 300-500 GB per genome, and the occupied volume increases further with the number of samples. Analyzing whole genomes at this scale demands substantial computing power in the form of servers and data warehouses combined into clusters. We at the Laboratory of Bioinformatics and Systems Biology have developed and launched the Q-Symphony bioinformatics computing system (“Qazaq Symphony of Bioinformatics”) for the analysis of large-scale genomic datasets. Materials and methods: The Q-Symphony bioinformatics computing system consists of 12 high-performance HPE servers: 1 control node, 8 compute nodes, 1 fat-memory compute node, and 2 storage nodes. The system runs on Red Hat Enterprise Linux. The management node controls access to user profiles, the data warehouse and Moab Workload Manager. The total number of processing cores is 172, the total amount of RAM is 3072 GB, the total storage capacity is 198 TB, and the peak performance of the system is 7.3 TFlops. All nodes use high-speed InfiniBand network connections, which allow data exchange between nodes at 100 Gbps. The Q-Symphony system makes it possible to distribute resources evenly across tasks, monitor processor and memory load in real time, and queue and sequentially execute large lists of tasks. Results: Benchmark measurements on the Q-Symphony system showed subtasks executing 15 to 54 times faster than on standard solutions built on similar processors.
Conclusion: Q-Symphony, together with well-established and proven bioinformatics methods, will make it possible to analyze large-scale human genomic data, determine structural genomic variants, and carry out complex comparative and population analyses.
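
As a rough illustration of the storage figures quoted in this abstract, the sketch below estimates how many fully analyzed genomes would fit in 198 TB at 300-500 GB per genome. It assumes 1 TB = 1000 GB and that all storage is usable for genome data (no redundancy, scratch space, or other users); neither assumption comes from the abstract, and the function name is hypothetical.

```python
def genomes_that_fit(storage_tb, per_genome_gb):
    """Whole genomes that fit in the given storage.

    Assumption (not from the abstract): 1 TB = 1000 GB and every byte
    of the storage is available for genome data.
    """
    return (storage_tb * 1000) // per_genome_gb


STORAGE_TB = 198  # total storage capacity stated in the abstract

worst_case = genomes_that_fit(STORAGE_TB, 500)  # 500 GB per genome
best_case = genomes_that_fit(STORAGE_TB, 300)   # 300 GB per genome
print(worst_case, best_case)  # prints: 396 660
```

Under these assumptions the cluster would hold on the order of a few hundred analyzed genomes at once, which is consistent with the abstract's emphasis on storage as a limiting resource.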

    META-ANALYSIS OF CANCER TRANSCRIPTOMES USING INDEPENDENT COMPONENT ANALYSIS

    Introduction: Independent Component Analysis (ICA) is a matrix factorization method for data dimension reduction. ICA has been widely applied to transcriptomic data for blind separation of the biological, environmental and technical factors affecting gene expression. This study aimed to analyze cancer data using ICA in order to identify and comprehensively analyze reproducible signaling pathways and molecular signatures in cancer. Materials and Methods: Four independent cancer transcriptomic datasets from the GEO database were used: GSE26886, GSE69925, GSE32701 and GSE21293 (Affymetrix). R Bioconductor and Matlab were used for normalization. The bioinformatics tool «BiODICA - Independent Component Analysis of Big Omics Data» was applied to compute independent components (ICs). Gene Set Enrichment Analysis (GSEA) and ToppGene uncovered the most significantly enriched pathways. Gene networks and graphs were constructed and visualized using the OFTEN method, Cytoscape and the HPRD database. Results: The correlation graph between decompositions into 30 ICs was built from absolute correlation values exceeding 0.15. Clusters of components (pseudocliques) were observed in the structure of the correlation graph. The top 500 most contributing genes of each IC in the pseudocliques were mapped to the PPI network to construct signaling pathways for gene interaction. Some cliques were composed of densely interconnected nodes and included components common to most cancer types, while others were common to only some of them. Conclusion: The results of this investigation may reveal potential biomarkers of carcinogenesis and functional subsystems in tumor cells, and may be helpful in predicting the early development of a tumor.
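
The correlation-graph step described in this abstract can be sketched as follows. This is a minimal illustration, not the BiODICA implementation: it assumes each independent component is a list of per-gene weights over the same ordered gene set, uses the Pearson correlation, and keeps an edge when the absolute correlation exceeds 0.15. The function names and toy values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_edges(components_a, components_b, threshold=0.15):
    """Edges (i, j, r) between components of two decompositions.

    An edge is kept when |r| exceeds the threshold (0.15 in the abstract).
    Assumption: both decompositions index genes in the same order.
    """
    edges = []
    for i, a in enumerate(components_a):
        for j, b in enumerate(components_b):
            r = pearson(a, b)
            if abs(r) > threshold:
                edges.append((i, j, r))
    return edges


# Toy decompositions with 2 components over 4 genes (made-up weights).
toy_a = [[1.0, 2.0, 3.0, 4.0], [1.0, -1.0, 1.0, -1.0]]
toy_b = [[4.0, 3.0, 2.0, 1.0], [1.0, -1.0, -1.0, 1.0]]

edges = correlation_edges(toy_a, toy_b)
print([(i, j) for i, j, _ in edges])  # prints: [(0, 0), (1, 0)]
```

Densely connected groups of nodes in a graph built this way are the kind of structure the abstract refers to as pseudocliques; finding them would need a separate graph-clustering step that is not shown here.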

    IDENTIFICATION OF KAZAKH SPECIFIC GENOMIC VARIANTS USING COMPARATIVE GENOMICS ANALYSIS

    Introduction: The development of high-performance genomic technologies opens up new possibilities for studying the human genome. Large-scale genomic research generates huge amounts of data, and the active development of bioinformatics, with its modern methods and approaches to analysis, makes it possible to build detailed databases and study genomic data comprehensively. One contemporary task is to identify population-specific genomic variants by comparing whole-genome and whole-exome data with open large-scale population datasets. Materials and methods: The study material comprised 14 whole genomes and 125 whole exomes of Kazakhstani individuals. Our dataset was supplemented with data from large whole-genome population datasets (SGDP, PRJEB26349, HGDP and 1000 Genomes) for comparative population genomics and for the identification of specific genomic variants. The raw data were mapped and aligned to a single reference genome, hg19; genomic variants were then called, and an individual map of the variants found was produced for each dataset in VCF format. For the supplementary datasets, a combined map of all variants was built; these variants were then excluded from the variants found in the Kazakh samples in order to isolate specific genomic variants. The filtered variants were then annotated and interpreted. Results: For the Kazakh whole exomes, 9 heterozygous or mutant variants were found that are unique among the compiled genomic databases: 7 variants are located in intronic regions, 1 in an upstream region, and the last is a frameshift deletion in an exonic region. For the Kazakh whole genomes, 4732 heterozygous or mutant variants were found; 517 variants were present in all Kazakh samples and 144 variants were completely mutant. Only 8 variants are located in exonic regions: 4 synonymous SNVs, 3 nonsynonymous SNVs, and 1 frameshift deletion.
Conclusion: We have discovered several unique genomic variants that are, for now, specific to Kazakh individuals. These results can serve as a basis for the creation of a Kazakh reference genome, for subsequent research, and for comparative analysis of Kazakh individuals with various populations of the world. Grant references: AP05135430; MES RK
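
The exclusion step described in the methods amounts to a set difference: pool the variants from the comparison datasets, then remove them from the variants found in the Kazakh samples. The sketch below shows only that logic on toy data. It assumes variants have already been parsed from VCF into `(chrom, pos, ref, alt)` tuples; the function name and all coordinates are hypothetical, and real pipelines would also need normalization of variant representation, which is omitted.

```python
def population_specific(sample_variants, reference_datasets):
    """Variants in the samples that appear in none of the reference datasets.

    sample_variants: set of (chrom, pos, ref, alt) tuples.
    reference_datasets: iterable of sets with the same tuple layout.
    """
    known = set().union(*reference_datasets)  # combined map of all known variants
    return sample_variants - known


# Made-up variants for illustration only.
kazakh = {
    ("chr1", 1000, "A", "G"),
    ("chr2", 2000, "C", "T"),
    ("chr3", 3000, "AT", "A"),
}
dataset_one = {("chr1", 1000, "A", "G")}
dataset_two = {("chr2", 2000, "C", "T"), ("chr9", 9000, "G", "C")}

specific = population_specific(kazakh, [dataset_one, dataset_two])
print(sorted(specific))  # prints: [('chr3', 3000, 'AT', 'A')]
```

Keying on all four fields means a variant at the same position with a different alternate allele is still treated as distinct, which is one reasonable design choice among several.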

    Can conservation agriculture increase soil carbon sequestration? A modelling approach

    Conservation agriculture (CA) involves complex and interactive processes that ultimately determine soil carbon (C) storage, making it difficult to identify clear patterns. To solve these problems, we used the ARMOSA process-based crop model to simulate the contribution of different CA components (minimum soil disturbance, permanent soil cover with crop residues and/or cover crops, and diversification of plant species) to soil organic carbon stock (SOC) sequestration at 0–30 cm soil depth and to compare it with SOC evolution under conventional agricultural practices. We simulated SOC changes in three sites located in Central Asia (Almalybak, Kazakhstan), Northern Europe (Jokioinen, Finland) and Southern Europe (Lombriasco, Italy), which have contrasting soils, organic carbon contents, climates, crops and management intensity. Simulations were carried out for the current climate conditions (1998–2017) and future climatic scenario (period 2020–2040, scenario Representative Concentration Pathway RCP 6.0). Five cropping systems were simulated: conventional systems under ploughing with monoculture and residues removed (Conv − R) or residues retained (Conv + R); no-tillage (NT); CA and CA with a cover crop, Italian ryegrass (CA + CC). In Conv − R, Conv + R and NT, the simulated monocultures were spring barley in Almalybak and Jokioinen, and maize in Lombriasco. In all sites, conventional systems led to SOC decline of 170–1000 kg ha⁻¹ yr⁻¹, whereas NT can slightly increase the SOC. CA and CA + CC have the potential for a C sequestration rate of 0.4% yr⁻¹ or higher in Almalybak and Jokioinen, and thus, the objective of the “4 per 1000” initiative can be achieved.
Cover crops (in CA + CC) have a potential for a C sequestration rate of 0.36–0.5% yr⁻¹ in Southern Finland and in Southern Kazakhstan under the current climate conditions, and their role will grow in importance in the future. Even if in Lombriasco it was not possible to meet the “4 per 1000”, there was a SOC increase under CA and CA + CC. In conclusion, the simultaneous adoption of all the three CA principles becomes more and more relevant in order to accomplish soil C sequestration as an urgent action to combat climate change and to ensure food security.
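
The “4 per 1000” target corresponds to an annual SOC gain of 0.4% of the existing stock, so an absolute change in kg per hectare per year can be checked against it once a baseline stock is known. The sketch below shows that arithmetic. The abstract does not report baseline stocks, so the 50,000 kg per hectare used here is a purely hypothetical value, as are the function names.

```python
def sequestration_rate_percent(annual_change_kg_ha, stock_kg_ha):
    """Annual SOC change as a percentage of the existing SOC stock."""
    return 100.0 * annual_change_kg_ha / stock_kg_ha


def meets_4_per_1000(annual_change_kg_ha, stock_kg_ha):
    """True when the annual gain is at least 4 parts per 1000 of the stock."""
    return annual_change_kg_ha * 1000 >= 4 * stock_kg_ha


HYPOTHETICAL_STOCK = 50_000  # kg C per hectare; illustrative value, not from the abstract

print(meets_4_per_1000(200, HYPOTHETICAL_STOCK))   # prints: True
print(meets_4_per_1000(-170, HYPOTHETICAL_STOCK))  # prints: False
print(sequestration_rate_percent(250, HYPOTHETICAL_STOCK))  # prints: 0.5
```

With this hypothetical stock, a gain of 200 kg per hectare per year sits exactly on the target, while the smallest decline reported for conventional systems (170 kg per hectare per year) falls well short of it.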