115 research outputs found

    Clustering Categorical Data: Soft Rounding k-modes

    Over the last three decades, researchers have intensively explored various clustering tools for categorical data analysis. Although many clustering algorithms have been proposed, the classical k-modes algorithm remains a popular choice for unsupervised learning of categorical data. Surprisingly, our first insight is that in a natural generative block model, the k-modes algorithm performs poorly for a large range of parameters. We remedy this issue by proposing a soft rounding variant of the k-modes algorithm (SoftModes) and theoretically prove that our variant addresses the drawbacks of the k-modes algorithm in the generative model. Finally, we empirically verify that SoftModes performs well on both synthetic and real-world datasets.

    Prioritization of candidate genes in "QTL-hotspot" region for drought tolerance in chickpea (Cicer arietinum L.)

    A combination of two approaches, namely QTL analysis and gene enrichment analysis, was used to identify candidate genes in the "QTL-hotspot" region for drought tolerance present on the Ca4 pseudomolecule in chickpea. In the first approach, a high-density bin map was developed using 53,223 single nucleotide polymorphisms (SNPs) identified in the recombinant inbred line (RIL) population of the ICC 4958 (drought tolerant) × ICC 1882 (drought sensitive) cross. QTL analysis using recombination bins as markers, along with the phenotyping data for 17 drought tolerance related traits obtained over 1-5 seasons and 1-5 locations, split the "QTL-hotspot" region into two subregions, namely "QTL-hotspot_a" (15 genes) and "QTL-hotspot_b" (11 genes). In the second approach, gene enrichment analysis using significant marker-trait associations based on SNPs from the Ca4 pseudomolecule with the above-mentioned phenotyping data, and the candidate genes from the refined "QTL-hotspot" region, showed enrichment for 23 genes. Twelve genes were common to both approaches. Functional validation using quantitative real-time PCR (qRT-PCR) indicated four promising candidate genes having functional implications on the effect of "QTL-hotspot" for drought tolerance in chickpea.
    Sandip M Kale, Deepa Jaganathan, Pradeep Ruperao, Charles Chen, Ramu Punna, Himabindu Kudapa, Mahendar Thudi, Manish Roorkiwal, Mohan AVSK Katta, Dadakhalandar Doddamani, Vanika Garg, P B Kavi Kishor, Pooran M Gaur, Henry T Nguyen, Jacqueline Batley, David Edwards, Tim Sutton and Rajeev K Varshne

    Population Genetics and Structure of a Global Foxtail Millet Germplasm Collection

    Foxtail millet is one of the most ancient crops of dryland agriculture. It is the second most important crop among millets, grown for grain or forage. Foxtail millet germplasm resources provide reservoirs of novel alleles and genes for crop improvement that have remained mostly unexplored. We genotyped a set of 190 foxtail millet germplasm accessions (including 155 accessions of the foxtail millet core collection) using genotyping-by-sequencing (GBS) for rapid single nucleotide polymorphism (SNP) characterization to study population genetics and structure, which enable allele mining through association mapping approaches. After filtering a total of 350,000 raw SNPs identified across 190 germplasm accessions for minor allele frequency (MAF), coverage for samples and coverage for sites, we retained 181 accessions with 17,714 high-quality SNPs with ≥ 5% MAF. Genetic structure analyses revealed that foxtail millet germplasm accessions are structured by both race and geographic origin, and that the largest proportion of variation was among individuals within populations. Accessions of race indica were less diverse and were highly differentiated from those of maxima and moharia. Genome-wide linkage disequilibrium (LD) analysis showed that, on average, LD extends up to ~150 kbp and varies among individual chromosomes. The utility of these data for performing genome-wide association studies was tested with plant pigmentation and days to flowering, and significant marker-trait associations were identified. These SNP data provide a foundation for exploration of foxtail millet diversity and for mining novel alleles and mapping genes for economically important traits.

    The Genetic Makeup of a Global Barnyard Millet Germplasm Collection

    Barnyard millet (Echinochloa spp.) is an important crop for many smallholder farmers in southern and eastern Asia. It is valued for its drought tolerance, rapid maturation, and superior nutritional qualities. Despite these characteristics, there are almost no genetic or genomic resources for this crop in either cultivated species [E. colona (L.) Link and E. crus-galli (L.) P. Beauv.]. Recently, a core collection of 89 barnyard millet accessions was developed at the genebank at the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT). To enhance the use of this germplasm and genomic research in barnyard millet improvement, we report the genetic characterization of this core collection using whole-genome genotyping-by-sequencing. We identified several thousand single-nucleotide polymorphisms segregating in the core collection, and we use them to show patterns of population structure and phylogenetic relationships among the accessions. We determine that there are probably four population clusters within the E. colona accessions and three such clusters within E. crus-galli. These clusters match phylogenetic relationships but by and large do not correspond to classification into individual races or clusters based on morphology. Geospatial data available for a subset of samples indicate that the clusters probably originate from geographic divisions. In all, these data will be useful to breeders working to improve this crop for smallholder farmers. This work also serves as a case study of how modern genomics can rapidly characterize crops, including ones with little to no prior genetic data.

    Cotranscriptional Set2 Methylation of Histone H3 Lysine 36 Recruits a Repressive Rpd3 Complex

    Summary: The yeast histone deacetylase Rpd3 can be recruited to promoters to repress transcription initiation. Biochemical, genetic, and gene-expression analyses show that Rpd3 exists in two distinct complexes. The smaller complex, Rpd3C(S), shares Sin3 and Ume1 with Rpd3C(L) but contains the unique subunits Rco1 and Eaf3. Rpd3C(S) mutants exhibit phenotypes remarkably similar to those of Set2, a histone methyltransferase associated with elongating RNA polymerase II. Chromatin immunoprecipitation and biochemical experiments indicate that the chromodomain of Eaf3 recruits Rpd3C(S) to nucleosomes methylated by Set2 on histone H3 lysine 36, leading to deacetylation of transcribed regions. This pathway apparently acts to negatively regulate transcription because deleting the genes for Set2 or Rpd3C(S) bypasses the requirement for the positive elongation factor Bur1/Bur2.
