692 research outputs found

    Whole genome re-sequencing reveals genome-wide variations among parental lines of 16 mapping populations in chickpea (Cicer arietinum L.)

    Background: Chickpea (Cicer arietinum L.) is the second most important grain legume cultivated by resource-poor farmers in South Asia and Sub-Saharan Africa. To harness the untapped genetic potential available for chickpea improvement, we re-sequenced 35 chickpea genotypes representing parental lines of 16 mapping populations segregating for abiotic stresses (drought, heat, salinity), biotic stresses (Fusarium wilt, Ascochyta blight, Botrytis grey mould, Helicoverpa armigera) and nutritionally important traits (protein content) using a whole genome re-sequencing approach. Results: A total of 192.19 Gb of data, comprising 973.13 million reads, was generated on 35 genotypes of chickpea, with an average sequencing depth of ~10X for each line. On average, 92.18 % of reads from each genotype were aligned to the chickpea reference genome with 82.17 % coverage. A total of 2,058,566 unique single nucleotide polymorphisms (SNPs) and 292,588 Indels were detected in comparison with the reference chickpea genome. The highest number of SNPs was identified on the Ca4 pseudomolecule. In addition, copy number variations (CNVs) such as gene deletions and duplications were identified across the chickpea parental genotypes, which were minimum in PI 489777 (1 gene deletion) and maximum in JG 74 (1,497). A total of 164,856 line-specific variations (144,888 SNPs and 19,968 Indels) were identified, with the highest percentage in coding regions in ICC 1496 (21 %) followed by ICCV 97105 (12 %). Of 539 miscellaneous variations, 339, 138 and 62 were inter-chromosomal variations (CTX), intra-chromosomal variations (ITX) and inversions (INV), respectively. Conclusion: Genome-wide SNPs, Indels, CNVs, PAVs, and miscellaneous variations identified in different mapping populations are a valuable resource for genetic research and are helpful in locating genes/genomic segments responsible for economically important traits. Further, the genome-wide variations identified in the present study can be used for developing high-density SNP arrays for genetics and breeding applications.

    Soil Classification and Crop Prediction Using Machine

    Soil classification is a major problem and a heated topic in many countries. The world's population is increasing at an alarming rate, which in turn raises the demand for food crops. Farmers are forced to block soil cultivation since their conventional methods are insufficient to fulfil escalating needs. To optimize agricultural output, farmers must understand the best soil type for a certain crop, which has an impact on growing food demand. There are several methods for categorizing soil in a scientific way, but each has its own set of disadvantages, such as the time and effort required. Computer-based soil classification approaches are essential since they are quick and will aid farmers in the field. Advanced machine learning-based soil classification methodologies can be used to classify soil and extract various features from it.

    NGS-QCbox and Raspberry for Parallel, Automated and Rapid Quality Control Analysis of Large-Scale Next Generation Sequencing (Illumina) Data

    The rapid popularity and adoption of next generation sequencing (NGS) approaches have generated huge volumes of data. High-throughput platforms like Illumina HiSeq produce terabytes of raw data that require quick processing. Quality control of the data is an important step prior to downstream analyses. To address these issues, we have developed a quality control pipeline, NGS-QCbox, that scales up to process hundreds or thousands of samples. Raspberry is an in-house tool, developed in C utilizing HTSlib (v1.2.1) (http://htslib.org), for computing read/base-level statistics. It can be used as a stand-alone application and can process both compressed and uncompressed FASTQ format files. NGS-QCbox integrates Raspberry with other open-source tools for alignment (Bowtie2), SNP calling (SAMtools) and other utilities (bedtools) to analyze raw NGS data efficiently and in a high-throughput manner. The pipeline implements parallel batch processing of jobs using Bpipe (https://github.com/ssadedin/bpipe) and, internally, fine-grained task parallelization using OpenMP. It reports read and base statistics along with genome coverage and variants in a user-friendly format. The pipeline presents a simple menu-driven interface and can be used in either quick or complete mode. In addition, the pipeline in quick mode outperforms other similar existing QC pipelines/tools in speed. The NGS-QCbox pipeline, Raspberry tool and associated scripts are available at https://github.com/CEG-ICRISAT/NGS-QCbox and https://github.com/CEG-ICRISAT/Raspberry for rapid quality control analysis of large-scale next generation sequencing (Illumina) data.
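
As a rough illustration of what "read/base-level statistics" on compressed or uncompressed FASTQ input can mean, here is a minimal Python sketch. It is not the Raspberry implementation (which the abstract describes as a C tool built on HTSlib). The function names `open_fastq` and `fastq_stats`, the assumption of strict four-line FASTQ records, and the Phred+33 quality offset are all assumptions made for this example.

```python
import gzip


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode (hypothetical helper)."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def fastq_stats(path, phred_offset=33):
    """Compute simple read- and base-level statistics.

    Assumes four lines per record (header, sequence, '+', qualities)
    and Phred+33 quality encoding unless another offset is given.
    """
    reads = bases = gc = qual_sum = 0
    with open_fastq(path) as handle:
        while True:
            header = handle.readline()
            if not header:
                break  # end of file
            seq = handle.readline().strip().upper()
            handle.readline()  # '+' separator line, ignored
            qual = handle.readline().strip()
            reads += 1
            bases += len(seq)
            gc += seq.count("G") + seq.count("C")
            qual_sum += sum(ord(ch) - phred_offset for ch in qual)
    return {
        "reads": reads,
        "bases": bases,
        "mean_read_length": bases / reads if reads else 0.0,
        "gc_fraction": gc / bases if bases else 0.0,
        "mean_base_quality": qual_sum / bases if bases else 0.0,
    }
```

Calling `fastq_stats("sample.fastq.gz")` on a file of your own returns a dictionary of these counts. A production tool would also need to handle multi-line records and malformed input, which this sketch does not.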

    Prioritization of candidate genes in "QTL-hotspot" region for drought tolerance in chickpea (Cicer arietinum L.)

    A combination of two approaches, namely QTL analysis and gene enrichment analysis, was used to identify candidate genes in the "QTL-hotspot" region for drought tolerance present on the Ca4 pseudomolecule in chickpea. In the first approach, a high-density bin map was developed using 53,223 single nucleotide polymorphisms (SNPs) identified in the recombinant inbred line (RIL) population of the ICC 4958 (drought tolerant) and ICC 1882 (drought sensitive) cross. QTL analysis using recombination bins as markers, along with the phenotyping data for 17 drought tolerance related traits obtained over 1-5 seasons and 1-5 locations, split the "QTL-hotspot" region into two subregions, namely "QTL-hotspot_a" (15 genes) and "QTL-hotspot_b" (11 genes). In the second approach, gene enrichment analysis using significant marker trait associations based on SNPs from the Ca4 pseudomolecule with the above-mentioned phenotyping data, and the candidate genes from the refined "QTL-hotspot" region, showed enrichment for 23 genes. Twelve genes were found common to both approaches. Functional validation using quantitative real-time PCR (qRT-PCR) indicated four promising candidate genes having functional implications for the effect of the "QTL-hotspot" on drought tolerance in chickpea.
    Sandip M Kale, Deepa Jaganathan, Pradeep Ruperao, Charles Chen, Ramu Punna, Himabindu Kudapa, Mahendar Thudi, Manish Roorkiwal, Mohan AVSK Katta, Dadakhalandar Doddamani, Vanika Garg, P B Kavi Kishor, Pooran M Gaur, Henry T Nguyen, Jacqueline Batley, David Edwards, Tim Sutton and Rajeev K Varshne

    Association of nad7a Gene with Cytoplasmic Male Sterility in Pigeonpea

    Cytoplasmic male sterility (CMS) has been exploited in the commercial pigeonpea [Cajanus cajan (L.) Millsp.] hybrid breeding system; however, the molecular mechanism behind this system is unknown. To understand the underlying molecular mechanism involved in the A4 CMS system derived from C. cajanifolius (Haines) Maesen, 34 mitochondrial genes were analyzed for expression profiling and structural variation between the CMS line (ICRISAT Pigeonpea A line, ICPA 2039) and its cognate maintainer (ICPB 2039). Expression profiling of the 34 mitochondrial genes revealed nine genes with significant differential expression at P ≤ 0.01, including one gene, nad4L, with 1366-fold higher expression in the CMS line than in the maintainer. Structural variation analysis of these mitochondrial genes identified length variation between ICPA 2039 and ICPB 2039 for nad7a (a subunit of the nad7 gene). Sanger sequencing of the nad4L and nad7a genes in the CMS and maintainer lines identified two single nucleotide polymorphisms (SNPs) in the upstream region of nad4L and a deletion of 10 bp in nad7a in the CMS line. Protein structure evaluation showed conformational changes in the predicted protein structures for nad7a between the ICPA 2039 and ICPB 2039 lines. All of the above analyses indicate an association of the nad7a gene with CMS for the A4 cytoplasm in pigeonpea. Additionally, one polymerase chain reaction (PCR) based Indel marker (nad7a_del) has been developed and validated for testing the genetic purity of A4-derived CMS lines to strengthen the commercial hybrid breeding program in pigeonpea.

    Draft genome sequence of Sclerospora graminicola, the pearl millet downy mildew pathogen: Genome sequence of pearl millet downy mildew pathogen

    Sclerospora graminicola is one of the most important biotic production constraints of pearl millet worldwide. We report a de novo whole genome assembly and analysis of pathotype 1. The draft genome assembly contained 299,901,251 bp with 65,404 genes. Pearl millet [Pennisetum glaucum (L.) R. Br.] is an important crop of the semi-arid and arid regions of the world. It is capable of growing in harsh and marginal environments, with the highest degree of tolerance to drought and heat among cereals (1). Downy mildew, caused by Sclerospora graminicola (Sacc.) Schroet., is the most devastating disease of pearl millet, particularly on genetically uniform hybrids. Estimated annual grain yield loss due to downy mildew is approximately 10–80 % (2-7). Pathotype 1 has been reported to be the highly virulent pathotype of Sclerospora graminicola in India (8). We report a de novo whole genome assembly and analysis of Sclerospora graminicola pathotype 1 from India. A susceptible pearl millet genotype, Tift 23D2B1P1-P5, was used for obtaining single-zoospore isolates from the original oosporic sample. The library for whole genome sequencing was prepared according to the instructions of the NEB ultra DNA library kit for Illumina (New England Biolabs, USA). The libraries were normalised, pooled and sequenced on the Illumina HiSeq 2500 (Illumina Inc., San Diego, CA, USA) platform at 2 × 100 bp length. Mate pair (MP) libraries were prepared using the Nextera mate pair library preparation kit (Illumina Inc., USA). 1 µg of genomic DNA was subjected to tagmentation, followed by strand displacement. Size selection of the tagmented/strand-displaced DNA was carried out using AmpureXP beads. The libraries were validated on an Agilent Bioanalyser using a DNA HS chip. The libraries were normalised, pooled and sequenced on the Illumina MiSeq (Illumina Inc., USA) platform at 2 × 300 bp length.
    Whole genome sequencing yielded 7.38 Gb with 73,889,924 paired-end reads from the paired-end library and 1.15 Gb with 3,851,788 reads from the mate pair library, generated on the Illumina HiSeq 2500 and Illumina MiSeq, respectively. The sequences were assembled using various assemblers: ABySS, MaSuRCA, Velvet, SOAPdenovo2, and ALLPATHS-LG. The assembly generated by the MaSuRCA (9) algorithm was found to be superior to those from the other algorithms and was hence used for scaffolding with SSPACE. The assembled draft genome sequence of S. graminicola pathotype 1 was 299,901,251 bp long, with a GC content of 47.2 %, and consisted of 26,786 scaffolds with an N50 of 17,909 bp and a longest scaffold of 238,843 bp. The overall coverage was 40X. The draft genome sequence was used for gene prediction with AUGUSTUS. The completeness of the assembly was investigated using CEGMA, which revealed 92.74% of proteins completely present and 95.56% partially present, while the BUSCO fungal dataset indicated 64.9% complete, 12.4% fragmented and 22.7% missing out of 290 BUSCO groups. A total of 52,285 predicted genes were annotated using BLASTX, and 38,120 genes had a significant BLASTX match. Repetitive element analysis of the assembly revealed 8,196 simple repeats, 1,058 low complexity repeats and 5,562 dinucleotide to hexanucleotide microsatellite repeats.
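
The scaffold N50 quoted in this abstract follows a standard definition: sort the scaffold lengths from longest to shortest and report the length at which the running total first reaches half of the assembly size. The Python sketch below illustrates that definition only. The function name `n50` is hypothetical, and this is not the authors' tooling.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold or contig lengths.

    N50 is the length L such that scaffolds of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:  # reached half of the assembly
            return length
```

For example, for lengths 2, 2, 2, 3, 3, 4, 8 and 8 (total 32), the two longest scaffolds already sum to 16, so the N50 is 8.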

    CicArVarDB: SNP and InDel database for advancing genetics research and breeding applications in chickpea

    Molecular markers are valuable tools for breeders to help accelerate crop improvement. High-throughput sequencing technologies facilitate the discovery of large-scale variations such as single nucleotide polymorphisms (SNPs) and simple sequence repeats (SSRs). Sequencing of the chickpea genome along with re-sequencing of several chickpea lines has enabled the discovery of 4.4 million variations including SNPs and InDels. Here we report a repository of 1.9 million variations (SNPs and InDels) anchored on eight pseudomolecules in a custom database, referred to as CicArVarDB, that can be accessed at http://cicarvardb.icrisat.org/. It includes an easy interface for users to select variations around specific regions associated with quantitative trait loci, with embedded webBLAST search and JBrowse visualisation. We hope that this database will be immensely useful to the chickpea research community for advancing both genetics research and breeding applications for crop improvement.

    Matrix-assisted laser desorption ionization hydrogen/deuterium exchange studies to probe peptide conformational changes

    Hydrogen/deuterium (H/D) exchange chemistry monitored by matrix-assisted laser desorption ionization time-of-flight (MALDI-TOF) mass spectrometry is used to study solution phase conformational changes of bradykinin, α-melanocyte stimulating hormone, and melittin as water is added to methanol-d4, acetonitrile, and isopropanol-d8 solutions. The results are interpreted in terms of a preference for the peptides to acquire more compact conformations in organic solvents as compared to the random conformations. Our interpretation is supported by circular dichroism spectra of the peptides in the same solvent systems and by previously published structural data for the peptides. These results demonstrate the utility of MALDI-TOF as a method to monitor the H/D exchange chemistry of peptides and investigations of solution-phase conformations of biomolecules.