Computational Approaches to Generating Diverse Enzyme Panels

Abstract

Ph. D. Thesis

Motivation

Enzymes are complex macromolecules crucial to life on Earth. From bacteria to human beings, all organisms use enzymes to catalyse the many thousands of chemical reactions occurring in their cells. Enzyme functions are so diverse that enzymes have gained popularity in recent years as "biocatalysts" in industries such as pharmaceuticals and agriculture. Unfortunately, confident laboratory-based characterisation of enzyme function has lagged behind a massive increase in sequencing data, slowing down initiatives that seek to use biocatalysts in their chemical processes. Computational methods for identifying biocatalysts do exist, but they often falter because of the complexity of enzymes and sequence bias, leaving much of the catalytic space of enzymes and their families undiscovered. This thesis has two major themes: the development of in silico approaches for curating diverse panels of novel enzyme sequences for experimental characterisation, and the development of tooling that integrates in silico panel creation and in vitro enzyme characterisation into a unified, iterative framework.

Contributions of this thesis

The contributions of this thesis fall under the two themes. The first is the selection of diverse panels of sequences from an enzyme family:

• A novel type of protein network, based on patterns of coevolving residues, that can be used to identify functionally interesting groupings in enzyme families.
• The automatic sampling of functionally diverse subsets of enzyme sequences by solving the maximum diversity problem.
• A study into the viability of artificially increasing enzyme family diversity through neural network-based generation of synthetic sequences.
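To illustrate the maximum diversity problem named in the contributions above, the following is a minimal sketch of choosing m sequences out of n so that the sum of their pairwise distances is large. It is not the method used in the thesis: the greedy heuristic, the function names (`hamming_fraction`, `greedy_max_diversity`, `total_diversity`) and the use of a mismatch fraction over pre-aligned sequences as the distance are all assumptions made for illustration, and a greedy pass does not guarantee the optimal subset.

```python
from itertools import combinations


def hamming_fraction(a, b):
    """Fraction of mismatched positions between two aligned sequences.

    Assumption: sequences are already aligned to equal length. This toy
    distance stands in for whatever functional distance is actually used.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def greedy_max_diversity(seqs, m, dist=hamming_fraction):
    """Greedy heuristic for the maximum diversity problem (max-sum form).

    Returns the indices of m sequences chosen to give a large sum of
    pairwise distances. Heuristic only: the result may be suboptimal.
    """
    n = len(seqs)
    if not 2 <= m <= n:
        raise ValueError("m must satisfy 2 <= m <= len(seqs)")

    # Precompute the full pairwise distance matrix.
    d = [[dist(seqs[i], seqs[j]) for j in range(n)] for i in range(n)]

    # Seed the panel with the most distant pair.
    first, second = max(combinations(range(n), 2), key=lambda p: d[p[0]][p[1]])
    chosen = [first, second]

    # gain[k] = sum of distances from candidate k to the current panel.
    gain = [d[k][first] + d[k][second] for k in range(n)]

    while len(chosen) < m:
        # Add the candidate that increases total diversity the most.
        best = max((k for k in range(n) if k not in chosen), key=lambda k: gain[k])
        chosen.append(best)
        for k in range(n):
            gain[k] += d[k][best]
    return chosen


def total_diversity(seqs, indices, dist=hamming_fraction):
    """Sum of pairwise distances within the selected panel."""
    return sum(dist(seqs[i], seqs[j]) for i, j in combinations(indices, 2))


if __name__ == "__main__":
    # Hypothetical toy "alignment", not real enzyme data.
    toy = ["AAAA", "AAAT", "TTTT", "AATT", "CCCC"]
    panel = greedy_max_diversity(toy, 3)
    print(sorted(panel), total_diversity(toy, panel))
```

The exact problem is NP-hard, so a heuristic like this one or an integer-programming solver is the usual practical choice. Swapping `dist` for a distance derived from a protein network would leave the selection loop unchanged.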
The second theme concerns tools for bridging the gap between the in silico and in vitro sides of enzyme family exploration:

• A platform that integrates the panel selection process and the resulting characterisation data to promote an iterative approach to exploring enzyme families.
• A repository for storing the metadata generated by the major steps of characterisation assays in the lab.

EPSRC and Prozomix Limite

This paper was published in Newcastle University eTheses.
