The genus Eimeria belongs to the phylum Apicomplexa, which includes many obligate intracellular protozoan parasites of man and livestock. E. tenella is one of seven species that infect
the domestic chicken and cause the intestinal disease coccidiosis, which is economically important
to the poultry industry. E. tenella is highly pathogenic and is often used as a model species for
studies of Eimeria biology. In this PhD thesis, a comprehensive annotation system named
"WAGA" (Workflow-based Automatically Genome Annotation) was built and applied to
the E. tenella genome. InforSense KDE and its BioSense plug-in (products of the InforSense
Company) were the core software used to build the workflows.
Workflows were made by integrating individual bioinformatics tools into a single platform.
Each workflow was designed to provide a standalone service for a particular task. Three major
workflows were developed based on the genomic resources currently available for E. tenella.
These were EST-based gene construction, HMM-based gene prediction and protein-based
annotation. Finally, a combining workflow was built to sit above the individual ones to generate
a set of automatic annotations using all of the available information. The overall system and
its three major components were deployed as web servers that are fully tuneable and reusable
for end users. WAGA does not require users to have programming skills or knowledge of the
underlying algorithms or mechanisms of its low-level components.
E. tenella was the target genome here and all the results obtained were displayed by GBrowse.
A sample of the results was selected for experimental validation. For evaluation purposes, WAGA
was also applied to another apicomplexan parasite, Plasmodium falciparum, the causative agent
of human malaria, which has been extensively annotated. The results obtained were compared
with gene predictions of PHAT, a gene finder designed for and used in the P. falciparum genome
project.