317 research outputs found

    “One code to find them all”: a perl tool to conveniently parse RepeatMasker output files

    Background: Of the different bioinformatic methods used to recover transposable elements (TEs) in genome sequences, one of the most commonly used procedures is the homology-based method proposed by the RepeatMasker program. RepeatMasker generates several output files, including the .out file, which provides annotations for all detected repeats in a query sequence. However, a remaining challenge consists of identifying the different copies of TEs that correspond to the identified hits. This step is essential for any evolutionary/comparative analysis of the different copies within a family. Different possibilities can lead to multiple hits corresponding to a unique copy of an element, such as the presence of large deletions/insertions or undetermined bases, and distinct consensus corresponding to a single full-length sequence (like for long terminal repeat (LTR)-retrotransposons). These possibilities must be taken into account to determine the exact number of TE copies. Results: We have developed a Perl tool that parses the RepeatMasker .out file to better determine the number and positions of TE copies in the query sequence, in addition to computing quantitative information for the different families. To determine the accuracy of the program, we tested it on several RepeatMasker .out files corresponding to two organisms (Drosophila melanogaster and Homo sapiens) for which the TE content has already been largely described and which present great differences in genome size, TE content, and TE families. Conclusions: Our tool provides access to detailed information concerning the TE content in a genome at the family level from the .out file of RepeatMasker. This information includes the exact position and orientation of each copy, its proportion in the query sequence, and its quality compared to the reference element. In addition, our tool allows a user to directly retrieve the sequence of each copy and obtain the same detailed information at the family level when a local library with incomplete TE class/subclass information was used with RepeatMasker. We hope that this tool will be helpful for people working on the distribution and evolution of TEs within genomes.
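The hit-versus-copy problem described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' Perl tool or their merging rules. It assumes the standard whitespace-separated column layout of a RepeatMasker .out file, and the function names and the `max_gap` threshold are hypothetical choices made only for illustration.

```python
def parse_out_lines(lines):
    """Parse RepeatMasker .out lines into hit dictionaries (assumed column layout)."""
    hits = []
    for line in lines:
        fields = line.split()
        # Skip blank lines and header lines: data rows start with an integer score.
        if len(fields) < 15 or not fields[0].isdigit():
            continue
        hits.append({
            "query": fields[4],
            "begin": int(fields[5]),
            "end": int(fields[6]),
            "strand": "-" if fields[8] == "C" else "+",  # "C" marks the complement strand
            "repeat": fields[9],
            "class_family": fields[10],
        })
    return hits


def merge_hits(hits, max_gap=100):
    """Naively merge same-repeat, same-strand hits separated by at most max_gap bases.

    max_gap is a hypothetical threshold, not a value taken from the paper.
    """
    copies = []
    ordered = sorted(hits, key=lambda h: (h["query"], h["repeat"], h["strand"], h["begin"]))
    for hit in ordered:
        last = copies[-1] if copies else None
        if (last is not None
                and last["query"] == hit["query"]
                and last["repeat"] == hit["repeat"]
                and last["strand"] == hit["strand"]
                and hit["begin"] - last["end"] <= max_gap):
            last["end"] = max(last["end"], hit["end"])  # extend the current copy
        else:
            copies.append(dict(hit))  # start a new copy
    return copies


# Made-up example rows in the assumed .out layout.
SAMPLE = """\
   SW   perc perc perc  query     position in query    matching  repeat       position in repeat
score   div. del. ins.  sequence  begin end   (left)   repeat    class/family begin end (left) ID

 1000   10.0  1.0  0.5  chr1       100   500  (9500) +  ROO_I    LTR/Pao        1   400  (600)  1
  800   12.0  2.0  0.0  chr1       550   900  (9100) +  ROO_I    LTR/Pao      450   800  (200)  1
  500    5.0  0.0  0.0  chr1      5000  5300  (4700) C  DOC      LINE/Jockey  (10)  300     1   2
"""

if __name__ == "__main__":
    for copy in merge_hits(parse_out_lines(SAMPLE.splitlines())):
        print(copy["query"], copy["repeat"], copy["strand"], copy["begin"], copy["end"])
```

A real analysis would also need to handle the cases the abstract names, such as LTR and internal consensus sequences belonging to one element, which this distance-only rule ignores.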

    La traduction et l’intraduisible (résumé de la communication) [Translation and the untranslatable (summary of the presentation)]


    Analisis Produk Dan Inovasi Pangan: Bumbu Racik Nasi Goreng Kedelai Hitam (Buked Hitam) [Product Analysis and Food Innovation: Black Soybean Fried Rice Seasoning Mix (Buked Hitam)]

    Fried rice (nasi goreng) is widely eaten in Indonesia because its taste is familiar to Indonesians. It is usually made by frying rice with kitchen seasonings such as shallots and garlic. The nutritional content of commonly consumed fried rice is dominated by carbohydrates and fat, so it is often avoided by consumers who are dieting. Moreover, eating too much fried rice without side dishes that contain protein can lead to excess body weight, because high levels of carbohydrate and fat are not good for the body unless balanced by other nutrients. Black soybean, a soybean variety with a black seed coat and a range of nutrients, can provide many benefits to the human body. It is a food rich in protein, fibre, vitamins, minerals, and antioxidants, which are highly beneficial for the heart and bones and even for weight loss. In this experiment, therefore, a black soybean fried rice seasoning mix named “Buked Hitam” was developed. The resulting Buked Hitam product is a seasoning paste used to make fried rice. It is a product development based on black soybean that substitutes for the seasonings commonly used to make fried rice. Besides having a unique taste, aroma, and colour when applied to fried rice, this product has an advantage over other fried rice seasoning products, namely its high protein content.

    4DXpress: a database for cross-species expression pattern comparisons

    In the major animal model species like mouse, fish or fly, detailed spatial information on gene expression over time can be acquired through whole mount in situ hybridization experiments. In these species, expression patterns of many genes have been studied and data has been integrated into dedicated model organism databases like ZFIN for zebrafish, MEPD for medaka, BDGP for Drosophila or GXD for mouse. However, a central repository that allows users to query and compare gene expression patterns across different species has not yet been established. Therefore, we have integrated expression patterns for zebrafish, Drosophila, medaka and mouse into a central public repository called 4DXpress (expression database in four dimensions). Users can query anatomy ontology-based expression annotations across species and quickly jump from one gene to the orthologues in other species. Genes are linked to public microarray data in ArrayExpress. We have mapped developmental stages between the species to be able to compare developmental time phases. We store the largest collection of gene expression patterns available to date in an individual resource, reflecting 16 505 annotated genes. 4DXpress will be an invaluable tool for developmental as well as for computational biologists interested in gene regulation and evolution. 4DXpress is available at http://ani.embl.de/4DXpress

    An intuitionistic approach to scoring DNA sequences against transcription factor binding site motifs

    Background: Transcription factors (TFs) control transcription by binding to specific regions of DNA called transcription factor binding sites (TFBSs). The identification of TFBSs is a crucial problem in computational biology and includes the subtask of predicting the location of known TFBS motifs in a given DNA sequence. It has previously been shown that, when scoring matches to known TFBS motifs, interdependencies between positions within a motif should be taken into account. However, this remains a challenging task owing to the fact that sequences similar to those of known TFBSs can occur by chance with a relatively high frequency. Here we present a new method for matching sequences to TFBS motifs based on intuitionistic fuzzy sets (IFS) theory, an approach that has been shown to be particularly appropriate for tackling problems that embody a high degree of uncertainty. Results: We propose SCintuit, a new scoring method for measuring sequence-motif affinity based on IFS theory. Unlike existing methods that consider dependencies between positions, SCintuit is designed to prevent overestimation of less conserved positions of TFBSs. For a given pair of bases, SCintuit is computed not only as a function of their combined probability of occurrence, but also taking into account the individual importance of each single base at its corresponding position. We used SCintuit to identify known TFBSs in DNA sequences. Our method provides excellent results when dealing with both synthetic and real data, outperforming the sensitivity and the specificity of two existing methods in all the experiments we performed. Conclusions: The results show that SCintuit improves the prediction quality for TFs of the existing approaches without compromising sensitivity. In addition, we show how SCintuit can be successfully applied to real research problems. In this study the reliability of the IFS theory for motif discovery tasks is proven
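For context on the subtask this abstract mentions, locating a known motif in a DNA sequence, here is a minimal Python sketch of a conventional position weight matrix (log-odds) scan. This is explicitly not SCintuit and does not model dependencies between positions or intuitionistic fuzzy sets. It is only the kind of independent-position baseline such methods are usually compared against. The function names, the pseudocount, and the uniform background frequency are assumptions made for illustration.

```python
import math


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into log2 odds scores against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.get(base, 0) for base in "ACGT") + 4 * pseudocount
        matrix.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def scan(sequence, matrix):
    """Score every window of the sequence; returns a list of (start, score) pairs.

    Assumes the sequence contains only the upper-case bases A, C, G and T.
    """
    width = len(matrix)
    return [
        (start, sum(matrix[offset][sequence[start + offset]] for offset in range(width)))
        for start in range(len(sequence) - width + 1)
    ]


if __name__ == "__main__":
    # Made-up counts for a four-base motif that strongly prefers T, A, T, A.
    counts = [{"T": 10}, {"A": 10}, {"T": 10}, {"A": 10}]
    best_start, best_score = max(scan("GGTATAGG", log_odds_matrix(counts)), key=lambda pair: pair[1])
    print(best_start, round(best_score, 2))
```

Because each position is scored independently, a scan like this cannot capture the positional interdependencies that the abstract says should be taken into account.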

    High-density SNP genotyping array for hexaploid wheat and its secondary and tertiary gene pool

    In wheat, a lack of genetic diversity between breeding lines has been recognized as a significant block to future yield increases. Species belonging to bread wheat's secondary and tertiary gene pools harbour a much greater level of genetic variability, and are an important source of genes to broaden its genetic base. Introgression of novel genes from progenitors and related species has been widely employed to improve the agronomic characteristics of hexaploid wheat, but this approach has been hampered by a lack of markers that can be used to track introduced chromosome segments. Here, we describe the identification of a large number of single nucleotide polymorphisms that can be used to genotype hexaploid wheat and to identify and track introgressions from a variety of sources. We have validated these markers using an ultra-high-density Axiom® genotyping array to characterize a range of diploid, tetraploid and hexaploid wheat accessions and wheat relatives. To facilitate the use of these, both the markers and the associated sequence and genotype information have been made available through an interactive web site.

    FlyTED: the Drosophila Testis Gene Expression Database

    FlyTED, the Drosophila Testis Gene Expression Database, is a biological research database for gene expression images from the testis of the fruit fly Drosophila melanogaster. It currently contains 2762 mRNA in situ hybridization images and ancillary metadata revealing the patterns of gene expression of 817 Drosophila genes in testes of wild type flies and of seven meiotic arrest mutant strains in which spermatogenesis is defective. This database has been built by adapting a widely used digital library repository software system, EPrints (http://eprints.org/software/), and provides both web-based search and browse interfaces, and programmatic access via an SQL dump, OAI-PMH and SPARQL. FlyTED is available at http://www.fly-ted.org/
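The OAI-PMH access mentioned in this abstract works through plain HTTP requests whose query string carries a `verb` and its arguments. The Python sketch below only builds such a request URL and makes no network call. The base URL in the usage example is a placeholder, since the abstract does not give FlyTED's actual OAI-PMH endpoint path, and the helper name is hypothetical. `ListRecords` and the `oai_dc` metadata prefix come from the OAI-PMH protocol itself.

```python
from urllib.parse import urlencode


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL: the base URL plus verb and arguments as a query string."""
    query = urlencode({"verb": verb, **arguments})
    return f"{base_url}?{query}"


if __name__ == "__main__":
    # Placeholder endpoint: substitute the repository's real OAI-PMH base URL.
    print(oai_request_url("http://example.org/oai", "ListRecords", metadataPrefix="oai_dc"))
```

The response to such a request is an XML document, which a harvester would then parse record by record.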