7 research outputs found

    Aplicação da inteligência artificial na anotação automática de genomas bacterianos [Application of artificial intelligence to the automatic annotation of bacterial genomes]

    Advisor: Prof. Dr. Fábio de Oliveira Pedrosa. Co-advisor: Prof. Dr. Roberto Tadeu Raittz. Dissertation (master's), Universidade Federal do Paraná, Setor de Educação Profissional e Tecnológica, Programa de Pós-Graduação em Bioinformática. Defense: Curitiba, 16/02/2012. Bibliography: leaves 81-86.

    Abstract: The purpose of annotation is to identify DNA sequences that code for RNAs or proteins; the process is important because it assigns molecular functions to gene products. Computational gene annotation tools do this by aligning protein or DNA sequences to identify homologous genes, then using information from public databases to infer gene function. Although these techniques are efficient, they can be error-prone when performed without expert curation, particularly when a sequence shows no significant similarity to other sequences or when the database consists of partial sequences. The annotation error rate can also rise significantly when the query protein sequence is new and shares no similarity with any sequence available in databases.

    For these reasons, this work developed a tool to verify gene annotation in complete bacterial genomes: the Bioinformatics Tool Based on Bacterial Genomes Comparison (BOBBLES). It verifies the gene predictions computationally proposed by the Hybrid-Gene Finder (HGF) program by comparing the annotation of a complete bacterial reference genome with the genes identified by HGF. BOBBLES uses two sequence comparison approaches, one based on sequence similarity searches with BlastP and the other on the SILA program; both serve to decide whether the sequences suggested by HGF were annotated correctly. BOBBLES was tested on a set of 14 complete bacterial genomes. Using SILA alignments, the tests found 365 new genes and 101 genes with a better or similar alignment in a reading frame different from that of the reference genome, for an accuracy of approximately 76% on this set of genomes. Using BlastP alignments, 529 new genes were obtained. However, the estimated average execution time of BOBBLES with SILA was at least five times shorter than with BlastP. This time difference is attributed to SILA performing the sequence alignments with recursive indexing into a local database, the NCBI non-redundant protein sequence (NR) database.

    Teoretický popis nerovnovážných procesů transformace energie na úrovni molekulárních struktur

    Title: Theoretical description of unequilibrium energy transformation processes on the level of molecular structures
    Author: Viktor Holubec
    Department: Department of Macromolecular Physics, Faculty of Mathematics and Physics
    Supervisor: prof. RNDr. Petr Chvosta, CSc., Department of Macromolecular Physics

    Abstract: The thesis is devoted to the thermodynamics of externally driven mesoscopic systems. These systems are so small that the thermodynamic limit ceases to hold and the probabilistic character of the second law cannot be ignored. Thermal forces become comparable to other forces acting on the system, and they have to be incorporated in the underlying dynamical law, i.e., in the master equation for discrete systems and in the Fokker-Planck equation for continuous ones. In the first part of the thesis we investigate the dynamics and energetics of mesoscopic systems during non-equilibrium isothermal processes. Due to the stochastic nature of the dynamics, the work done on the system by the external forces must be treated as a random variable. We derive an exact analytical form of the work probability density for several model systems. In particular, knowledge of the exact formula improves the analysis of experimental data using the recently discovered fluctuation theorems. In the second part of the thesis we study a non-equilibrium...

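    L'implantation de la terminologie française dans un domaine de pointe : cas de la génétique médicale au Québec, un comparatif avec la France [The implantation of French terminology in a cutting-edge field: the case of medical genetics in Quebec, a comparison with France]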

    The backdrop for this research is Quebec. Given its historical, social, demographic and political context, reinforced by American hegemony, this province of Canada is linguistically vulnerable. It is not, however, without recourse, thanks to Bill 101, enacted in 1977, and the Charter of the French Language. We explore the current terminolinguistic situation after nearly 38 years of linguistic and terminological planning efforts by the Office québécois de la langue française, the institution mandated to apply this policy.

    The main purpose of this study is to report on terminology practices in the field of medical genetics in Quebec, in comparison with those of France, the motherland. The specialized vocabulary of this field generally originates in the United States of America, as is the case for most cutting-edge fields. On the one hand, the study tests the hypothesis that officialisms are on a par with anglicisms, followed by variants. On the other, it assesses whether the terminolinguistic integrity of this field is threatened. To that end, a survey is conducted and protocols are established. The secondary purpose of the study is to formulate language-planning proposals, while the application of certain tools that we developed to encourage the use of French terminology is proposed as a direction for future work.