Provenance, propagation and quality of biological annotation
PhD Thesis
Biological databases have become an integral part of the life sciences, being used
to store, organise and share ever-increasing quantities and types of data. Biological
databases are typically centred around raw data, with individual entries being
assigned to a single piece of biological data, such as a DNA sequence. Although essential,
a reader can obtain little information from the raw data alone. Therefore,
many databases aim to supplement their entries with annotation, allowing the current
knowledge about the underlying data to be conveyed to a reader. Although annotations
come in many different forms, most databases provide some form of free text
annotation.
Given that annotations can form the foundations of future work, it is important that a
user is able to evaluate the quality and correctness of an annotation. However, this is
rarely straightforward. The amount of annotation, and the way in which it is curated,
varies between databases. For example, the production of an annotation in some
databases is entirely automated, without any manual intervention. Further, sections
of annotations may be reused, being propagated between entries and, potentially,
external databases. This provenance and curation information is not always apparent
to a user.
The work described within this thesis explores issues relating to biological annotation
quality. While the most valuable annotation is often contained within free text, its lack
of structure makes it hard to assess. Initially, this work describes a generic approach
that allows textual annotations to be quantitatively measured. This approach is based
upon the application of Zipf's Law to words within textual annotation, resulting in a
single value, α. The relationship between the α value and Zipf's principle of least effort
provides an indication as to the annotation's quality, whilst also allowing annotations
to be quantitatively compared.
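As a rough illustration of how word frequencies in free-text annotation can be reduced to a single power-law value, the sketch below counts words and estimates an exponent. It is not taken from the thesis: the function names `word_counts` and `estimate_alpha`, the tokenisation rule and the choice of a closed-form approximate maximum-likelihood estimator are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Count lowercase word occurrences in a piece of annotation text.

    Tokenisation here is deliberately simple (runs of letters and digits);
    this is an assumption, not the thesis's method.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def estimate_alpha(counts, x_min=1):
    """Estimate a power-law exponent from word occurrence counts.

    Uses the common closed-form approximation for discrete data:
        alpha ~= 1 + n / sum(ln(x_i / (x_min - 0.5)))
    where x_i are the occurrence counts that are >= x_min.
    """
    if x_min < 1:
        raise ValueError("x_min must be at least 1")
    xs = [c for c in counts.values() if c >= x_min]
    if not xs:
        raise ValueError("no word occurs at least x_min times")
    log_sum = sum(math.log(x / (x_min - 0.5)) for x in xs)
    return 1.0 + len(xs) / log_sum


if __name__ == "__main__":
    # Hypothetical annotation text, for illustration only.
    annotation = "Binds DNA. Binds RNA. Involved in DNA repair."
    print(estimate_alpha(word_counts(annotation)))
```

Under this sketch, two bodies of annotation could be compared by computing the value for each, although a real analysis would need far more text than a single entry and a principled choice of `x_min`.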
Secondly, the thesis focuses on determining annotation provenance and tracking any
subsequent propagation. This is achieved through the development of a visualisation
framework, which exploits the reuse of sentences within annotations. Utilising this
framework, a number of propagation patterns were identified, which on analysis appear
to indicate low quality and erroneous annotation.
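The idea of exploiting sentence reuse to trace propagation can be sketched in a few lines. This is a hypothetical illustration rather than the framework built in the thesis: the `(release, entry_id, text)` input format, the function names and the naive sentence splitter are assumptions.

```python
import re
from collections import defaultdict


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sentence_reuse(entries):
    """Find sentences shared by more than one database entry.

    entries: iterable of (release, entry_id, annotation_text) tuples,
    where release is any sortable value such as a release number.
    Returns a dict mapping each reused sentence to its occurrences as
    (release, entry_id) pairs sorted by release, so the first pair is
    the earliest appearance seen in the input.
    """
    index = defaultdict(list)
    for release, entry_id, text in entries:
        for sentence in set(split_sentences(text)):
            index[sentence].append((release, entry_id))
    return {
        sentence: sorted(occurrences)
        for sentence, occurrences in index.items()
        if len({entry_id for _, entry_id in occurrences}) > 1
    }
```

In this sketch, the earliest occurrence of a reused sentence is a candidate for its origin and the later occurrences are candidate propagation events, which is the kind of information a visualisation could then display.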
Together, these approaches increase our understanding of the textual characteristics
of biological annotation, and suggest that this understanding can be used to increase
the overall quality of these resources.
Xenopus laevis as a chemical genetic screening tool for drug discovery and development.
In this thesis we explore the applicability of the X. laevis chemical genetic screening model to drug discovery and drug development. The NCI Diversity Set II compound library was screened to identify compounds that generate abnormal pigmentation phenotypes and that may have therapeutic application in the treatment of melanoma. The 13 hit compounds identified were shown to have significantly lower IC50 values in the A375 melanoma cell line than in two control cell lines. Using the structural data of the compounds screened, combined with the phenotypic data generated by the X. laevis screen, we describe an analysis in which targets were predicted for each phenotypic category. Of the 10 targets predicted to generate an abnormal melanophore migration phenotype, six produced abnormal pigmentation phenotypes when tested with compound antagonists. Two of these targets had no previously known link to melanoma. Many of the identified targets were also predicted to be targeted by nine of the 13 NCI compounds identified in the library screen. Thus, through a combination of forward chemical genetic screening, appropriate cell-based assays and chemoinformatic analysis, we have developed an efficient and effective screening strategy for the rapid identification of hit compounds that are likely to act through either well-known or novel targets and that may have implications for the treatment of melanoma.
To assess the applicability of the X. laevis model to drug development, we designed a renal function toxicity assay in collaboration with AstraZeneca. Renal toxicity is a serious concern in the pharmaceutical industry, being responsible for 7% of preclinical compound dropouts. I developed a biochemical assay in which renal function would be monitored by quantifying the concentration of ammonia excreted by embryos into the media. A decrease in ammonia detected in the presence of nephrotoxic compounds was hypothesised to represent a decrease in renal function, and therefore to indicate toxicity. Despite promising preliminary experiments, the original salicylic acid ammonia detection assay was inhibited by the presence of the compound solvent DMSO. A second assay, the glutamate dehydrogenase (GDH) assay, was trialled, but it could not detect a change in renal function in response to nephrotoxic compounds when compared to the vehicle control. In its current form, the X. laevis renal function assay is not capable of identifying nephrotoxic compounds, and so further work is required.