11 research outputs found

    PRODOC: a resource for the comparison of tethered protein domain architectures with in-built information on remotely related domain families

    PROtein Domain Organization and Comparison (PRODOC) comprises several programs that enable convenient comparison of proteins as a sequence of domains. The in-built dataset currently consists of ∼698 000 proteins from 192 organisms with complete genomic data, and all the SWISSPROT proteins obtained from the Pfam database. All the entries in PRODOC are represented as a sequence of functional domains, assigned using hidden Markov models, instead of as a sequence of amino acids. On average 69% of the proteins in the proteomes and 49% of the residues are covered by functional domain assignments. Software tools allow the user to query the dataset with a sequence of domains and identify proteins with the same or a jumbled or circularly permuted arrangement of domains. As it is proposed that proteins with jumbled or the same domain sequences have similar functions, this search tool is useful in assigning the overall function of a multi-domain protein. Unique features of PRODOC include the generation of alignments between multi-domain proteins on the basis of the sequence of domains and in-built information on distantly related domain families forming superfamilies. It is also possible using PRODOC to identify domain sharing and gene fusion events across organisms. An exhaustive genome–genome comparison tool in PRODOC also enables the detection of successive domain sharing and domain fusion events across two organisms. The tool permits the identification of gene clusters involved in similar biological processes in two closely related organisms. The URL for PRODOC is
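
    The kinds of match described above (same, circularly permuted, or jumbled arrangement of domains) can be illustrated with a minimal sketch. This is not PRODOC's own code: the function names are hypothetical, the domain labels are used only as illustrative strings, and "jumbled" is assumed here to mean the same multiset of domains in a different, non-rotated order.

```python
from collections import Counter
from typing import Sequence


def is_circular_permutation(a: Sequence[str], b: Sequence[str]) -> bool:
    """Return True if b is a rotation of a (same domains, cyclically shifted)."""
    a, b = list(a), list(b)
    if len(a) != len(b):
        return False
    if not a:
        return True
    return any(a[i:] + a[:i] == b for i in range(len(a)))


def compare_architectures(query: Sequence[str], target: Sequence[str]) -> str:
    """Classify how two domain architectures relate (hypothetical helper)."""
    query, target = list(query), list(target)
    if query == target:
        return "same"
    if is_circular_permutation(query, target):
        return "circularly permuted"
    # Assumption: "jumbled" = identical domain content, different order.
    if Counter(query) == Counter(target):
        return "jumbled"
    return "different"


# Illustrative query: a protein written as a sequence of domains.
query = ["SH3", "SH2", "Pkinase"]
print(compare_architectures(query, ["SH2", "Pkinase", "SH3"]))
print(compare_architectures(query, ["SH2", "SH3", "Pkinase"]))
```

    Checking identity first and rotation second keeps the three categories mutually exclusive, so each target protein receives exactly one label.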

    Protein family classification using multiple-class neural networks.

    The objective of genomic sequence analysis is to retrieve important information from the vast amount of genomic sequence data, such as DNA, RNA and protein sequences. The main tasks include interpreting the function of DNA sequence on a genomic scale, comparing genomes to gain insight into the universality of biological mechanisms and into the details of gene structure and function, determining the structure of all proteins, and classifying protein families. With its many features and capabilities for recognition, generalization and classification, artificial neural network technology is well suited for sequence analysis. Many methods have been devised to determine whether a given protein sequence is a member of a given protein superfamily. This is a binary classification problem, and efficient neural network techniques for solving it are described in the literature. In this Master's thesis, we consider the problem of classifying given protein sequences into one among at least three protein families using neural networks, and propose two methods for this problem: the Pair-wise Multiple Classification Approach and the Single Network Approach. In the Pair-wise Multiple Classification Approach, several sub-networks are employed to perform the task, whereas a compact network system is used in the Single Network Approach. We performed experiments using the SNNS and UOWNNS neural network simulators on our NNs with different input/output representations, and report accuracies as high as 95%. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .Z54. Source: Masters Abstracts International, Volume: 43-01, page: 0248. Adviser: Alioune Ngom. Thesis (M.Sc.)--University of Windsor (Canada), 2004
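
    A pair-wise scheme with several sub-classifiers can be sketched as follows. The abstract does not say how the sub-networks are combined, so this sketch assumes one binary classifier per pair of families and a majority vote. The neural networks are replaced here by simple nearest-centroid stand-ins, and all family names, feature vectors and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import Callable, Dict, Sequence, Tuple

Features = Sequence[float]
BinaryClassifier = Callable[[Features], str]


def make_centroid_classifier(label_a: str, centroid_a: Features,
                             label_b: str, centroid_b: Features) -> BinaryClassifier:
    """Stand-in for a trained binary sub-network: picks the nearer centroid."""
    def dist2(u: Features, v: Features) -> float:
        return sum((p - q) ** 2 for p, q in zip(u, v))

    def classify(x: Features) -> str:
        return label_a if dist2(x, centroid_a) <= dist2(x, centroid_b) else label_b

    return classify


def pairwise_predict(classifiers: Dict[Tuple[str, str], BinaryClassifier],
                     x: Features) -> str:
    """Each pair-wise classifier casts one vote; the most voted family wins.

    Ties are resolved arbitrarily (first family counted) in this sketch.
    """
    votes = Counter(clf(x) for clf in classifiers.values())
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature centroids for three made-up families.
centroids = {"family_A": (0.0, 0.0), "family_B": (1.0, 0.0), "family_C": (0.0, 1.0)}
classifiers = {
    (a, b): make_centroid_classifier(a, centroids[a], b, centroids[b])
    for a, b in combinations(centroids, 2)
}
print(pairwise_predict(classifiers, (0.9, 0.1)))
```

    With k families this design needs k(k-1)/2 binary classifiers, which is the trade-off against a single compact network with k outputs.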

    DAROGAN: Enzyme function prediction from multiple sequence alignments


    Use of gene targeting to study the mouse ERCC1 gene


    Integrative analysis of genomic data

    This thesis is composed of three projects. It aims to predict the substrates transported by transmembrane proteins and to understand the effects of copy number alterations (CNAs) on the target proteins of antineoplastic (AN) agents and on the genes in antineoplastic resistance pathways in cancer patients. In the first project, we propose a computational method to classify membrane transporters from three organisms (Escherichia coli, Saccharomyces cerevisiae and Homo sapiens) according to their transported substrates. Our method focuses on neighboring genes that show high co-expression with the query gene. We then identify frequent gene ontology (GO) terms among these co-expressed neighbors and use a support vector machine classifier to annotate the substrate specificity of the query gene. The second project analyses CNAs and clinical data of 31 tumor types from The Cancer Genome Atlas (TCGA). We found that the genome sequences of tumor patients generally contain more recurrently deleted CNAs than recurrently amplified CNAs. We also observed certain signs of apparently compensating effects of CNAs. The third project continues the idea of chemoresistance suggested in the second one. It uses TCGA CNA data from both normal and tumor tissues. We found that the genome sequences of tumor tissues contain more recurrently amplified CNAs of genes in cancer antineoplastic resistance pathways than those of normal tissues.
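
    The feature-extraction step of the first project (finding co-expressed neighbors of a query gene and counting their GO terms) might look roughly like the sketch below. It assumes that "neighbors" are genes whose Pearson correlation with the query exceeds a threshold; the threshold, gene names, expression values and GO labels are invented for illustration, and the support vector machine step is left out.

```python
from collections import Counter
from math import sqrt
from typing import Dict, List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def frequent_go_terms(query: str,
                      expression: Dict[str, Sequence[float]],
                      go_annotations: Dict[str, List[str]],
                      min_corr: float = 0.8,   # assumed threshold
                      top_n: int = 3) -> List[str]:
    """Return the most frequent GO terms among genes co-expressed with query."""
    neighbours = [
        gene for gene, profile in expression.items()
        if gene != query and pearson(expression[query], profile) >= min_corr
    ]
    counts = Counter(
        term for gene in neighbours for term in go_annotations.get(gene, [])
    )
    return [term for term, _ in counts.most_common(top_n)]


# Toy data (entirely made up).
expression = {
    "query": [1.0, 2.0, 3.0, 4.0],
    "gene1": [2.0, 4.0, 6.0, 8.0],   # strongly co-expressed
    "gene2": [1.0, 2.0, 3.0, 5.0],   # strongly co-expressed
    "gene3": [4.0, 3.0, 2.0, 1.0],   # anti-correlated, excluded
}
go_annotations = {
    "gene1": ["sugar transport", "membrane"],
    "gene2": ["sugar transport"],
    "gene3": ["ion transport"],
}
print(frequent_go_terms("query", expression, go_annotations, top_n=1))
```

    The resulting term list would then serve as input features for a classifier; how the thesis encodes these features is not stated in the abstract.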

    Insights into the dynamics of methicillin-resistant staphylococci in animals : a focus on Staphylococcus pseudintermedius in dogs

    Thesis specially prepared to obtain the degree of Doctor in Veterinary Sciences, specialty of Clinical Practice. Staphylococci are a group of bacteria with clinical, agricultural, and economic importance because of their wide range of virulence factors and ability to become resistant to antimicrobials. This thesis pursued three main objectives. I. Determine the frequency of methicillin-resistant S. aureus (MRSA) strains in several animal species, identify the characteristics of strains present in animals and compare them with human strains. MRSA nasal screening was performed in 71 horses and 307 calves, and the observed frequencies were 3% and 2%, respectively. Seventy-four MRSA isolated from 2001 to 2014 were characterized: fourteen spa types, three SCCmec types and three clonal complexes (CC5, CC22 and CC398) were found. Most isolates (74%) were multidrug-resistant. Fourteen MRSA CC398 strains had qac genes (13 qacG and 1 qacJ), while four isolates (three CC5 and one CC22) had insertions in the norA promoter gene. MRSA lineages from pets (CC5 and CC22) harboured specific sets of virulence genes and a lower number of resistance genes than CC398 from livestock animals. II. Reveal antimicrobial/biocide susceptibility patterns/trends and resistance genes in methicillin-resistant staphylococci (MRS). Several antimicrobial resistance patterns and genes were found in MRS from horses. Minimum bactericidal concentrations of the biocides chlorhexidine acetate, benzalkonium chloride, triclosan and glutaraldehyde were lower than the recommended in-use concentrations for veterinary medicine, although two MRS carried plasmid-borne qacA and sh-fabI or qacB and qacH-like genes. An investigation of the evolution of resistance to 38 antimicrobials, the corresponding mechanisms and the molecular characteristics of 644 clinical Staphylococcus spp. isolates obtained from companion animals between 1999 and 2014 revealed resistance to the majority of antimicrobials, and the number of mecA-positive strains increased significantly over time. Among S. pseudintermedius, the methicillin-susceptible (MSSP) isolates were genetically more diverse than the methicillin-resistant (MRSP) ones. All MRSP and two MSSP strains were multidrug-resistant, with several antimicrobial resistance genes identified. One MSSP isolate harbored a qacA gene and another a qacB gene. Three biocide products had high bactericidal activity (Otodine®, Clorexyderm Spot Gel®, Dermocanis Piocure-M®), while Skingel® failed to achieve a five-log reduction in the bacterial count. III. Study the pathogenesis of S. pseudintermedius in dogs. The agr type III predominated in MRSP. Five virulence genes were found in all strains, and only the spsO gene was significantly associated with MSSP. MSSP isolates produced more biofilm on BHIB and BHIB + 1% glucose than MRSP isolates. Several virulence genes encoding surface proteins and toxins were highly expressed in the MRSP strain (compared to MSSP). By whole-proteome characterization of S. pseudintermedius through a 2DE MALDI-TOF/TOF MS approach we were able to identify 367 unique proteins, of which 39 were surface proteins. By subsequent use of the serological proteome analysis (SERPA) approach we identified four antigenic proteins with promising features for vaccine development. These results indicate that MRS were widely disseminated in the studied animal population, the environment and people in contact with these animals. The resistance trends and mechanisms detected in MRS strains are worrying and make animals a reservoir of important MRS clones and genes. Biocides are still a good therapeutic choice, even in the presence of efflux genes. Higher expression of virulence genes may play a role in the rapid and wide spread of MRSP clones. Dogs are able to mount an IgG response against S. pseudintermedius, and the proteins identified by the immune system may in the future be used as vaccine candidates.

    Profiling patterns of interhelical associations in membrane proteins.

    A novel set of methods has been developed to characterize polytopic membrane proteins at the topological, organellar and functional level, in order to reduce the existing functional gap in the membrane proteome. Firstly, a novel clustering tool named PROCLASS was implemented to facilitate the manual curation of large sets of proteins, in readiness for feature extraction. TMLOOP and TMLOOP writer were implemented to refine current topological models by predicting membrane dipping loops. TMLOOP applies weighted predictive rules in a collective motif method, to overcome the inherent limitations of single motif methods. The approach achieved 92.4% sensitivity and 100% specificity, and 1,392 topological models described in the Swiss-Prot database were refined. The subcellular location (TMLOCATE) and molecular function (TMFUN) prediction methods rely on the TMDEPTH feature extraction method along with data mining techniques. TMDEPTH uses refined topological models and amino acid sequences to calculate pairs of residues located at a similar depth in the membrane. Evaluation of TMLOCATE showed a normalized accuracy of 75% in discriminating between proteins belonging to the main organelles. At a sequence similarity threshold of 40%, TMFUN predicted the main functional classes with a sensitivity of 64.1-71.4%, and 70% of the olfactory GPCRs were correctly predicted. At a sequence similarity threshold of 90%, the main functional classes were predicted with a sensitivity of 75.6-92.8%, and class A GPCRs were sub-classified with a sensitivity of 84.5-92.9%. These results reflect a direct association between the spatial arrangement of residues in the transmembrane regions and the capacity of polytopic membrane proteins to carry out their functions. The developed methods have for the first time categorically shown that the transmembrane regions hold essential information associated with a wide range of functional properties such as filtering and gating processes, subcellular location and molecular function

    PRINTS prepares for the new millennium

    PRINTS is a diagnostic collection of protein fingerprints. Fingerprints exploit groups of motifs to build characteristic family signatures, offering improved diagnostic reliability over single-motif approaches by virtue of the mutual context provided by motif neighbours. Around 1000 fingerprints have now been created and stored in PRINTS. The September 1998 release (version 20.0) encodes approximately 5700 motifs, covering a range of globular and membrane proteins, modular polypeptides and so on. The database is accessible via the DbBrowser Web Server at http://www.biochem.ucl.ac.uk/bsm/dbbrowser/. Alongside its continued growth, recent enhancements to the resource include a BLAST server and more efficient fingerprint search software, with improved statistics for estimating the reliability of retrieved matches. Current efforts are focused on the design of more automated methods for database maintenance; implementation of an object-relational schema for efficient data management; and integration with PROSITE, profiles, Pfam and ProDom, as part of the international InterPro project, which aims to unify protein pattern databases and offer improved tools for genome analysis
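
    The idea that a group of motifs is more diagnostic than a single motif can be illustrated with a toy sketch. It is not the PRINTS search software: motifs are reduced here to exact substrings that must occur in order, whereas real fingerprint searches score aligned motifs statistically. The sequence, motifs and function names are invented.

```python
from typing import List, Sequence


def match_fingerprint(sequence: str, motifs: Sequence[str]) -> List[int]:
    """Find each motif in order along the sequence.

    Returns the start position of every motif, or -1 where a motif is not
    found downstream of the previous hit.
    """
    positions = []
    cursor = 0
    for motif in motifs:
        hit = sequence.find(motif, cursor)
        positions.append(hit)
        if hit != -1:
            cursor = hit + len(motif)  # later motifs must lie after this one
    return positions


def matched_fraction(sequence: str, motifs: Sequence[str]) -> float:
    """Fraction of the fingerprint's motifs found in the expected order."""
    positions = match_fingerprint(sequence, motifs)
    return sum(p != -1 for p in positions) / len(motifs)


# Invented sequence and three-motif "fingerprint".
sequence = "MKTAYIAKQRGGSLLVNDPEWHH"
fingerprint = ["TAYI", "GGSL", "PEWH"]
print(match_fingerprint(sequence, fingerprint))
print(matched_fraction(sequence, ["TAYI", "CCCC", "PEWH"]))
```

    A partial match (two of three motifs) still carries information, which is one way the mutual context of neighbouring motifs can support a family assignment when a single motif is missing.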