2,002 research outputs found

    Data Mining for Simple Sequence Repeats in Oil Palm Expressed Sequence Tags

    Expressed Sequence Tags (ESTs) are short DNA sequences generated by sequencing one or both ends of an expressed gene. ESTs give researchers a quick and inexpensive route to discovering new genes, obtaining data on gene expression and regulation, and constructing genome maps. Oil palm EST sequences available in the public domain were downloaded, clustered, and assembled into contigs using CAP3 and Phrap. Microsatellite repeats were located using five software tools (MISA, TRA, TROLL, SSRIT, and SSR Primer). Of the five, MISA was found to perform best; it can also identify compound repeats. The frequency and total number (202) of SSRs were determined. Mononucleotide repeats, especially 'A/T' repeats, were the most abundant in oil palm. Flanking primers were designed using Primer3 and SSR Primer. The results of the study are provided as an online database, 'MEMCO', to help oil palm researchers
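The repeat search described in this abstract can be illustrated with a minimal Python sketch. It is not MISA or any of the other tools named above: it only finds perfect (uninterrupted) repeats with a regular expression and does not handle compound repeats. The function name `find_perfect_ssrs` and the minimum repeat counts in `MIN_REPEATS` are assumptions chosen for illustration, not values taken from the study.

```python
import re

# Hypothetical minimum repeat counts per motif length (1-6 bp).
# These thresholds are illustrative assumptions, not the study's settings.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least min_rep - 1 exact copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so a poly-A run is not also reported as a dinucleotide repeat.
            is_redundant = any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            )
            if is_redundant:
                continue
            hits.append((match.start(), motif, len(match.group(1)) // motif_len))
    return sorted(hits)


# Toy sequence: a 12-base poly-A run and seven copies of "AG".
toy = "GGC" + "A" * 12 + "CGT" + "AG" * 7 + "TTC"
ssrs = find_perfect_ssrs(toy)
print(ssrs)
```

Start positions are 0-based. A production tool would also report compound repeats and design flanking primers, which this sketch leaves out.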

    Discovery of a large set of SNP and SSR genetic markers by high-throughput sequencing of pepper (Capsicum annuum)

    Genetic markers based on single nucleotide polymorphisms (SNPs) are in increasing demand for genome mapping and fingerprinting of breeding populations in crop plants. Recent advances in high-throughput sequencing provide the opportunity for whole-genome resequencing and identification of allelic variants by mapping the reads to a reference genome. However, for many species, such as pepper (Capsicum annuum), a reference genome sequence is not yet available. We therefore sequenced the C. annuum cv. "Yolo Wonder" transcriptome using Roche 454 pyrosequencing and assembled de novo 23,748 isotigs and 60,370 singletons. Mapping 10,886,425 reads obtained by Illumina GA II sequencing of C. annuum cv. "Criollo de Morelos 334" to the "Yolo Wonder" transcriptome allowed for SNP identification. By setting a threshold value that selects reliable SNPs with minimal loss of information, 11,849 reliable SNPs spread across 5,919 isotigs were identified. In addition, 853 simple sequence repeats were obtained. This information has been made available online

    Species-independent detection of RNA virus by representational difference analysis using non-ribosomal hexanucleotides for reverse transcription

    A method for the isolation of genomic fragments of RNA viruses based on cDNA representational difference analysis (cDNA RDA) was developed. cDNA RDA has been applied for the subtraction of poly(A)(+) RNAs but not for poly(A)(−) RNAs, such as RNA virus genomes, owing to the vast quantity of ribosomal RNAs. We constructed primers for inefficient reverse transcription of ribosomal sequences based on the distribution analysis of hexanucleotide patterns in ribosomal RNA. The analysis revealed that the distributions of hexanucleotide patterns in ribosomal RNA and in virus genomes were different. We constructed 96 hexanucleotides (non-ribosomal hexanucleotides) and used them as mixed primers for reverse transcription of cDNA RDA. A synchronous analysis of hexanucleotide patterns in known viral sequences showed that all the known genomic-size viral sequences include non-ribosomal hexanucleotides. In a model experiment, when non-ribosomal hexanucleotides were used as primers, in vitro transcribed plasmid RNA was efficiently reverse transcribed when compared with ribosomal RNA of rat cells. Using non-ribosomal primers, the cDNA fragments of severe acute respiratory syndrome coronavirus and bovine parainfluenza virus 3 were efficiently amplified by subtracting the cDNA amplicons derived from uninfected cells from those that were derived from virus-infected cells. The results suggest that cDNA RDA with non-ribosomal primers can be used for species-independent detection of viruses, including new viruses
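The hexanucleotide distribution analysis mentioned in this abstract can be sketched in Python as a plain k-mer count. The abstract does not state the exact criterion used to choose the 96 non-ribosomal hexanucleotides, so the rule below (hexamers that never occur in the input sequence) is an assumption for illustration only, and it ignores primer/template complementarity. The names `kmer_counts` and `absent_kmers` and the toy sequence are hypothetical.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k=6):
    """Count every overlapping k-mer in seq (RNA 'U' is treated as 'T')."""
    seq = seq.upper().replace("U", "T")
    windows = (seq[i:i + k] for i in range(len(seq) - k + 1))
    # Ignore windows containing ambiguous bases such as 'N'.
    return Counter(w for w in windows if set(w) <= set("ACGT"))


def absent_kmers(seq, k=6):
    """List all k-mers over ACGT that never occur in seq (assumed criterion)."""
    counts = kmer_counts(seq, k)
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return [kmer for kmer in all_kmers if kmer not in counts]


# Toy stand-in for a ribosomal RNA sequence; a real analysis would load rRNA records.
toy_rrna = "ACGUACGUAC"
counts = kmer_counts(toy_rrna)
candidates = absent_kmers(toy_rrna)
print(counts.most_common(1), len(candidates))
```

With real ribosomal sequences, candidates could instead be ranked by how rarely they occur, since very few of the 4,096 possible hexamers may be entirely absent.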

    Mining microsatellite markers from public expressed sequence tags databases for the study of threatened plants

    Background: Simple Sequence Repeats (SSRs) are widely used in population genetic studies but their classical development is costly and time-consuming. The ever-increasing DNA datasets generated by high-throughput techniques offer an inexpensive alternative for SSR discovery. Expressed Sequence Tags (ESTs) have been widely used as an SSR source for plants of economic relevance but their application to non-model species is still modest. Methods: Here, we explored the use of publicly available ESTs (GenBank at the National Center for Biotechnology Information, NCBI) for SSR development in non-model plants, focusing on genera listed by the International Union for the Conservation of Nature (IUCN). We also searched two model genera with fully annotated genomes for EST-SSRs, Arabidopsis and Oryza, and used them as controls for genome distribution analyses. Overall, we downloaded 16 031 555 sequences for 258 plant genera, which were mined for SSRs and their primers with the help of QDD1. Genome distribution analyses in Oryza and Arabidopsis were done by blasting the SSR-containing sequences against the Oryza sativa and Arabidopsis thaliana reference genomes implemented in the Basic Local Alignment Search Tool (BLAST) of the NCBI website. Finally, we performed an empirical test to determine the performance of our EST-SSRs in a few individuals from four species of two eudicot genera, Trifolium and Centaurea. Results: We explored a total of 14 498 726 EST sequences from the dbEST database (NCBI) in 257 plant genera from the IUCN Red List. We identified a very large number (17 102) of ready-to-test EST-SSRs in most plant genera (193) at no cost. Overall, dinucleotide and trinucleotide repeats were the prevalent types but the abundance of the various types of repeat differed between taxonomic groups. Control genomes revealed that trinucleotide repeats were mostly located in coding regions while dinucleotide repeats were largely associated with untranslated regions. Our results from the empirical test revealed considerable amplification success and transferability between congenerics. Conclusions: The present work represents the first large-scale study developing SSRs by utilizing publicly accessible EST databases in threatened plants. Here we provide a very large number of ready-to-test EST-SSRs (17 102) for 193 genera. The cross-species transferability suggests that the number of possible target species would be large. Since trinucleotide repeats are abundant and mainly linked to exons they might be useful in evolutionary and conservation studies. Altogether, our study strongly supports the use of EST databases as an extremely affordable and fast alternative for SSR development in threatened plants

    Evaluation of Perfect Microsatellites in Nile Tilapia (Oreochromis niloticus) Genome

    Microsatellites or simple sequence repeats (SSRs) constitute a sizable part of genomes and play a crucial role in the function of genes and the organization of the genome. The availability of a complete genome sequence for Nile tilapia (Oreochromis niloticus) makes a genome-wide analysis of SSRs in this species possible. I analyzed the abundance and density of perfect SSRs in the Nile tilapia genome and observed a total of 252,047 microsatellites with 1–6 bp nucleotide motifs. This indicates that about 2.7 % of the Nile tilapia whole genome sequence (927.77 Mb) is made up of perfect SSRs, with an average length of 135.68 bp/Mb. The average frequency and density of perfect SSRs were 271.69 loci/Mb and 5834.46 bp/Mb, respectively. The six classes of perfect SSRs were not evenly distributed within the Nile tilapia genome. Dinucleotide repeats (40.13 %), with a total count of 101,145 and an average length of 26.11 bp, were the most abundant class of SSRs, while the percentages of mononucleotide, trinucleotide, tetranucleotide, pentanucleotide, and hexanucleotide repeats were 31.88 %, 11.98 %, 11.52 %, 4.22 %, and 0.26 %, respectively. The various classes of SSRs differ in their number of repeats, with the highest being 95. My results indicate that 21 motifs make up the prevalent categories, with a frequency above 1 locus/Mb: A, AAC, AAG, AATAG, AATTC, AC, AG, AGAT, AT, ATCT, ATG, ATGG, ATT, ATTT, C, CCT, CTG, CTTT, GT, GTTT. DOI: 10.47856/ijaast.2022.v09i10.00
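Two of the figures in this abstract follow from simple arithmetic on the reported counts, which the short Python check below works through. It uses only numbers quoted in the abstract; the variable names are illustrative, and the small gap between the computed and reported loci/Mb values presumably reflects rounding in the source.

```python
# Figures quoted in the abstract.
total_ssrs = 252_047          # perfect SSRs with 1-6 bp motifs
genome_size_mb = 927.77       # genome assembly size in Mb
dinucleotide_ssrs = 101_145   # dinucleotide repeat count

# Loci per megabase: number of SSR loci divided by genome size.
loci_per_mb = total_ssrs / genome_size_mb

# Share of all perfect SSRs that are dinucleotide repeats, in percent.
dinucleotide_share = 100 * dinucleotide_ssrs / total_ssrs

print(f"{loci_per_mb:.2f} loci/Mb (abstract reports 271.69)")
print(f"{dinucleotide_share:.2f} % dinucleotide repeats (abstract reports 40.13 %)")
```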

    CCNF mutations in amyotrophic lateral sclerosis and frontotemporal dementia
