7 research outputs found

    Search of Related Enzymes

    Millions of new proteins are discovered each year, and classical biochemical methods are too slow and too costly to characterize them all. The unexplored proteins may include enzymes useful in both industry and academia, mainly for the environmentally friendly production of chemical compounds. The result of the thesis is a web application that takes input proteins, searches a database for similar proteins, filters the hits using the essential residues of the input proteins, and marks them as putative biocatalysts. The hits are then annotated so that the user can make an informed decision about which proteins to select for experimental verification in the laboratory. The developed tool facilitates multi-step analysis and recommends proteins for experimental verification of their enzymatic function. The web interface is freely available at https://loschmidt.chemi.muni.cz/enzymeminer/. The tool was published in the international journal Nucleic Acids Research.
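The essential-residue filter described in this abstract can be illustrated with a minimal sketch. It is not EnzymeMiner's actual implementation: the function name, the pairwise-alignment input format, and the strict-identity rule (the hit must carry exactly the query's residue at each essential position) are all assumptions made for illustration.

```python
def essential_residues_conserved(aligned_query, aligned_hit, essential_positions):
    """Check whether a hit keeps the query's residue at every essential position.

    aligned_query, aligned_hit: two rows of a pairwise alignment, equal length,
        with '-' marking gaps (hypothetical input format).
    essential_positions: 1-based positions in the ungapped query sequence.
    """
    if len(aligned_query) != len(aligned_hit):
        raise ValueError("alignment rows must have equal length")

    remaining = set(essential_positions)
    query_pos = 0  # position in the ungapped query, 1-based once incremented
    for q, h in zip(aligned_query, aligned_hit):
        if q == "-":
            continue  # insertion in the hit; no query position is consumed
        query_pos += 1
        if query_pos in remaining:
            if h != q:  # assumption: strict identity is required
                return False
            remaining.discard(query_pos)
    # Any position never reached lies outside the aligned query, so reject.
    return not remaining


# Toy alignment: ungapped query is M1 D2 H3 K4 W5.
query_row = "MD-HKW"
hit_row = "MDAHRW"
print(essential_residues_conserved(query_row, hit_row, {2, 3}))
print(essential_residues_conserved(query_row, hit_row, {4}))
```

A real pipeline may accept several alternative residues per essential position instead of strict identity; the sketch only shows the filtering idea.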

    EnzymeMiner: automated mining of soluble enzymes with diverse structures, catalytic properties and stabilities

    Millions of protein sequences are being discovered at an incredible pace, representing an inexhaustible source of biocatalysts. Despite genomic databases growing exponentially, classical biochemical characterization techniques are time-demanding, cost-ineffective and low-throughput. Therefore, computational methods are being developed to explore the unmapped sequence space efficiently. Selection of putative enzymes for biochemical characterization based on rational and robust analysis of all available sequences remains an unsolved problem. To address this challenge, we have developed EnzymeMiner, a web server for automated screening and annotation of diverse family members that enables selection of hits for wet-lab experiments. EnzymeMiner prioritizes sequences that are more likely to preserve the catalytic activity and are heterologously expressible in a soluble form in Escherichia coli. The solubility prediction employs the in-house SoluProt predictor developed using machine learning. EnzymeMiner reduces the time devoted to data gathering, multi-step analysis, sequence prioritization and selection from days to hours. The successful use case for the haloalkane dehalogenase family is described in a comprehensive tutorial available on the EnzymeMiner web page.

    EnzymeMiner: Exploration of sequence space of enzymes


    Analysis of Cancer-Associated Protein Mutations

    PredictSNP ONCO is a web tool that analyses the impact of protein mutations on the development of oncological diseases. The tool integrates several bioinformatics tools and reports the result of its own predictor, which classifies mutations as oncogenic or benign. In addition, PredictSNP ONCO provides the results of a virtual screening of drugs that could inhibit the function of the affected proteins. The web interface is freely available at https://loschmidt.chemi.muni.cz/predictsnp-onco/.

    Search of Related Enzymes

    Millions of new proteins are discovered each year, and classical biochemical methods are too slow and too costly to characterize them all. The unexplored proteins may include enzymes useful in both industry and academia, mainly for the environmentally friendly production of chemical compounds. The result of the thesis is a web application that takes input proteins, searches a database for similar proteins, filters the hits using the essential residues of the input proteins, and marks them as putative biocatalysts. The hits are then annotated so that the user can make an informed decision about which proteins to select for experimental verification in the laboratory. The developed tool facilitates multi-step analysis and recommends proteins for experimental verification of their enzymatic function. The web interface is freely available at https://loschmidt.chemi.muni.cz/enzymeminer/. The tool was published in the international journal Nucleic Acids Research.

    Training and test datasets for the PredictONCO tool

    This dataset was used for training and validating the PredictONCO web tool (https://loschmidt.chemi.muni.cz/predictonco/), which supports decision-making in precision oncology by extending bioinformatics predictions with advanced computing and machine learning. The dataset consists of 1073 single-point mutants of 42 proteins, whose effects were classified as Oncogenic (509 data points) or Benign (564 data points). All mutations were annotated with a clinically verified effect and were compiled from the ClinVar and OncoKB databases. The dataset was manually curated based on the information available in other precision oncology databases (The Clinical Knowledgebase by The Jackson Laboratory, Personalized Cancer Therapy Knowledge Base by MD Anderson Cancer Center, cBioPortal, DoCM database) or in the primary literature. When creating the dataset, we also removed any possible overlaps with the data points used in the PredictSNP consensus predictor and its constituents. This was done to avoid test set data leakage, because the PredictSNP score is used as one of the features (see below).
    The entire dataset (SEQ) was further annotated by the PredictONCO pipeline. Briefly, the following six features were calculated regardless of the structural information available: essentiality of the mutated residue (yes/no), the conservation of the position (the conservation grade and score), the domain where the mutation is located (cytoplasmic, extracellular, transmembrane, other), the PredictSNP score, and the number of essential residues in the protein. For approximately half of the data (STR: 377 oncogenic and 76 benign data points), structural information was available, and six more features were calculated: FoldX and Rosetta ddg_monomer scores, whether the residue is in the catalytic pocket (residues forming the ligand-binding pocket were identified with P2Rank), and the pKa changes (the minimum and maximum changes as well as the number of essential residues whose pKa was changed; all values were obtained from PROPKA3). For both the STR and SEQ datasets, 20% of the data was held out for testing. The data split was implemented at the position level to ensure that no position from the test subset appears in the training subset.
    For more details about the tool, please visit the help page (https://loschmidt.chemi.muni.cz/predictonco/help) or get in touch with us (https://loschmidt.chemi.muni.cz/peg/contact/).
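The position-level hold-out described above can be sketched as a grouped split. This is an illustration under stated assumptions, not the authors' code: the record layout `(protein, position, substitution, label)`, the function name `split_by_position`, the greedy fill of the test subset, and the fixed seed are all hypothetical. Only the 20% target and the rule that no position is shared between subsets come from the description.

```python
import random
from collections import defaultdict


def split_by_position(mutations, test_fraction=0.2, seed=0):
    """Split mutation records so that no (protein, position) pair
    appears in both the training and the test subset.

    mutations: list of tuples (protein, position, substitution, label);
        this layout is an assumption made for the example.
    """
    # Group all mutants that hit the same position of the same protein.
    groups = defaultdict(list)
    for record in mutations:
        protein, position = record[0], record[1]
        groups[(protein, position)].append(record)

    # Shuffle whole groups, not individual records.
    keys = sorted(groups)
    random.Random(seed).shuffle(keys)

    # Greedily fill the test subset until it reaches roughly test_fraction.
    target = test_fraction * len(mutations)
    train, test = [], []
    for key in keys:
        if len(test) < target:
            test.extend(groups[key])
        else:
            train.extend(groups[key])
    return train, test


# Toy records, invented for illustration only.
toy = [
    ("P1", 10, "A", "Oncogenic"), ("P1", 10, "V", "Benign"),
    ("P1", 25, "K", "Benign"), ("P1", 31, "D", "Oncogenic"),
    ("P2", 10, "E", "Oncogenic"), ("P2", 77, "L", "Oncogenic"),
    ("P2", 77, "R", "Benign"), ("P2", 90, "G", "Benign"),
    ("P3", 5, "S", "Oncogenic"), ("P3", 48, "T", "Benign"),
]
train, test = split_by_position(toy)
train_positions = {(r[0], r[1]) for r in train}
test_positions = {(r[0], r[1]) for r in test}
print(len(train), len(test), train_positions & test_positions)
```

Because whole groups are assigned at once, the test subset may end up slightly larger than the requested fraction; that is the price of keeping every position on one side of the split.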