358 research outputs found

    Accessing numeric data via flags and tags: A final report on a real world experiment

    An experiment is reported which: extended the concepts of data flagging and tagging to the aerospace scientific and technical literature; generated experience with the assignment of data summaries and data terms by documentation specialists; and obtained real world assessments of data summaries and data terms in information products and services. Inclusion of data summaries and data terms improved users' understanding of referenced documents from a subject perspective as well as from a data perspective; furthermore, a radical shift in document ordering behavior occurred during the experiment toward proportionately more requests for data-summarized items

    An accurate and interpretable model for siRNA efficacy prediction

    BACKGROUND: The use of exogenous small interfering RNAs (siRNAs) for gene silencing has quickly become a widespread molecular tool providing a powerful means for gene functional study and new drug target identification. Although considerable progress has been made recently in understanding how the RNAi pathway mediates gene silencing, the design of potent siRNAs remains challenging. RESULTS: We propose a simple linear model combining basic features of siRNA sequences for siRNA efficacy prediction. Trained and tested on a large dataset of siRNA sequences made recently available, it performs as well as more complex state-of-the-art models in terms of potency prediction accuracy, with the advantage of being directly interpretable. The analysis of this linear model allows us to detect and quantify the effect of nucleotide preferences at particular positions, including previously known and new observations. We also detect and quantify a strong propensity of potent siRNAs to contain short asymmetric motifs in their sequence, and show that, surprisingly, these motifs alone contain at least as much relevant information for potency prediction as the nucleotide preferences for particular positions. CONCLUSION: The model proposed for prediction of siRNA potency is as accurate as a state-of-the-art nonlinear model and is easily interpretable in terms of biological features. It is freely available on the web a

    HSP-Wrap: The Design and Evaluation of Reusable Parallelism for a Subclass of Data-Intensive Applications

    There is an increasing gap between the rate at which data is generated by scientific and non-scientific fields and the rate at which data can be processed by available computing resources. In this paper, we introduce the fields of Bioinformatics and Cheminformatics, two fields where big data has become a problem due to continuing advances in the technologies that drive them, such as gene sequencing and small ligand exploration. We introduce high performance computing as a means to process this growing base of data in order to facilitate knowledge discovery. We enumerate goals of the project including reusability, efficiency, reliability, and scalability. We then describe the implementation of a software scheduler which aims to improve input and output performance of a targeted collection of informatics tools, as well as the profiling and optimization needed to tune the software. We evaluate the performance of the software with a scalability study of the Bioinformatics tools BLAST, HMMER, and MUSCLE, as well as the Cheminformatics tool DOCK6

    Estudo de codões de iniciação alternativos em Candida cylindracea

    Master's in Molecular Biomedicine. The yeast Candida cylindracea is a peculiar case within the CTG clade: it fully converts the CUG codon from leucine (standard) to serine, whereas the other members of the group decode CUG ambiguously. Furthermore, after the sequencing and annotation of its complete genome and its mRNA, Candida cylindracea was found to have a considerably high frequency of genes starting with the alternative initiation codons CTG and TTG compared with other phylogenetically close species, in which the great majority of genes start with the standard ATG codon. In this work, MAKER was used as an annotation tool to validate the previous annotation of Candida cylindracea's genome and transcriptome in order to rule out possible artifacts. The annotated sequences were loaded into the ANACONDA platform to unveil some of the main features of the genome and transcriptome of this species. The analysis of these data focused on finding significant differences between the types of sequences, grouped by initiation codon, in both the genome and the transcriptome. The marked difference between the DNA and RNA sequences in initiation codon frequency, in turn, prompted speculation about the presence of RNA editing phenomena. Taken together, these singular findings are expected to yield a better understanding of genome function. It therefore remains to be understood how these differences may be connected and whether they influence the genome. Later studies using new techniques of the omics era may provide new insights on this matter.

    Erros na tradução do mRNA em Candida albicans

    Master's in Applied Biology. The genetic code establishes the rules that determine the transfer of genetic information from nucleic acids to proteins. The importance of the genetic code in genome decoding and its high conservation suggest that its evolution is highly restricted or even frozen. Despite this, various prokaryotic and eukaryotic genetic code alterations have been found, showing that the code is surprisingly flexible. For instance, the human pathogen Candida albicans contains an ambiguous tRNACAG that decodes the CUG codon as Ser (97%) and as Leu (3%). To further study ambiguity in other amino acid codons, we engineered 8 mutant tRNASer that misincorporate Ser at 8 different codons belonging to distinct amino acid families (Glu, Arg, Asn, Cys, Phe, Gln, His and Pro) in Candida albicans. The wild-type tRNA was subjected to site-directed mutagenesis in order to change its anticodon from UGA to CUC, CCU, GUU, GCA, GAA, CUG, GUG and GGG. The tRNA stability, the cellular changes and the stress response of the resulting mistranslating strains were evaluated through northern blot analysis, cell transformation efficiency, growth rate and expression of an HSP104-GFP reporter system. A phenotypic screen probing various environmental stress conditions was performed in order to further characterize these strains. Experimental data suggest that these genetic code ambiguities affect fitness negatively in standard growth conditions and introduce growth advantages in the presence of stress conditions. Thus, the stress response triggered by codon ambiguity may increase adaptation potential.

    Understanding the pathogenesis of myotonic dystrophy type 1

    To identify the full range of targets and the pathogenic consequences, we sought to mimic the pathogenesis of myotonic dystrophy type 1 with temporal and spatial control: temporal to reproduce the developmental pathogenesis of the congenital form, and spatial to isolate tissue-specific pathology. To do this, we attempted to use the Cre-lox system for the conditional expression of an EGFP reporter-linked expanded CUG repeat RNA in the mouse. Expression of the transgene was controlled by Cre excision of a transcriptional stop, placed upstream of the EGFP-expanded repeat open reading frame. The transgenes were constructed and tested successfully, and a normal-length repeat transgenic line was established. Unfortunately, generation of the expanded repeat line was not successful. The constructs were used to generate cell-culture models of DM1, in both human and murine cells, which mimicked the nuclear foci formation and MBNL1 co-localisation seen in patient cells. Expression of exogenous MBNL1/GFP fusion protein in this model resulted in an increase in the size of foci, indicating that MBNL1 protein is limiting within the cell and may play a protective role. The murine DM1 cell-culture model was used to investigate the effects of expanded CUG repeat expression on splicing within the transcriptome. The differential effect of 5-repeat versus 250-repeat RNA expression was compared using Affymetrix whole transcript and exon arrays. Using whole genome arrays, 6 genes were down-regulated and 128 up-regulated. With exon arrays, 58 genes showed alternative exon usage. Six genes were selected for further bioinformatics analysis: MtmR4, which has possible neuromuscular involvement; Kcnk4, Narg1, Ttyh1 and Bptf, potentially related to brain development; and Cacna1c, a promising candidate for heart conductance defects and sudden death