MiCroKit 3.0: an integrated database of midbody, centrosome and kinetochore
During cell division/mitosis, a specific subset of proteins is spatially and temporally assembled into protein super complexes in three distinct regions, i.e. centrosome/spindle pole, kinetochore/centromere and midbody/cleavage furrow/phragmoplast/bud neck, and faithfully modulates the cell division process. Although many experimental efforts have been made to characterise these proteins, no integrated database was available. Here, we present the MiCroKit database (http://microkit.biocuckoo.org) of proteins that localize in the midbody, centrosome and/or kinetochore. From the scientific literature, we collected experimentally verified microkit proteins that have unambiguous supporting evidence for subcellular localization under the fluorescence microscope. The current version, MiCroKit 3.0, provides detailed information for 1489 microkit proteins from seven model organisms: Saccharomyces cerevisiae, Schizosaccharomyces pombe, Caenorhabditis elegans, Drosophila melanogaster, Xenopus laevis, Mus musculus and Homo sapiens. Moreover, orthology information is provided for these microkit proteins and could be a useful resource for further experimental identification. The online service of the MiCroKit database was implemented in PHP + MySQL + JavaScript, while the local packages were developed in JAVA 1.5 (J2SE 5.0).
GPS-ARM: Computational Analysis of the APC/C Recognition Motif by Predicting D-Boxes and KEN-Boxes
The anaphase-promoting complex/cyclosome (APC/C), an E3 ubiquitin ligase that associates with Cdh1 and/or Cdc20, recognizes and interacts with specific substrates and faithfully orchestrates cell cycle events by targeting proteins for proteasomal degradation. Experimental identification of APC/C substrates is largely dependent on the discovery of APC/C recognition motifs, e.g., the D-box and KEN-box. Although a number of stringent or loosely defined motifs have been proposed, these motif patterns are of limited use because of their insufficient predictive power. We report the development of a novel software package, GPS-ARM, for the prediction of D-boxes and KEN-boxes in proteins. Using experimentally identified D-boxes and KEN-boxes as the training data sets, a previously developed GPS (Group-based Prediction System) algorithm was adopted. By extensive evaluation and comparison, the performance of GPS-ARM was found to be much better than that of simple motif matching. With this tool, we predicted 4,841 potential D-boxes in 3,832 proteins and 1,632 potential KEN-boxes in 1,403 proteins from H. sapiens, while further statistical analysis suggested that both the D-box and KEN-box proteins are involved in a broad spectrum of biological processes beyond the cell cycle. In addition, using co-localization information, we predicted hundreds of mitosis-specific APC/C substrates with high confidence. As the first computational tool for the prediction of APC/C-mediated degradation, GPS-ARM provides candidate sites for further experimental investigation. GPS-ARM is freely accessible to academic researchers at: http://arm.biocuckoo.org
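To make the "simple motifs" baseline mentioned in the abstract above concrete, the following is a minimal Python sketch of plain pattern matching for D-boxes and KEN-boxes. It is not the GPS algorithm and not part of GPS-ARM. The patterns used (R-x-x-L for the D-box, K-E-N for the KEN-box) are the commonly cited minimal consensus forms and are an assumption here, since the abstract does not state which patterns were compared. The function name and the example sequence are hypothetical.

```python
import re

# Assumed minimal consensus patterns (not taken from the GPS-ARM abstract):
#   D-box   : R-x-x-L
#   KEN-box : K-E-N
# Lookaheads are used so that overlapping matches are also reported.
MOTIFS = {
    "D-box": re.compile(r"(?=(R..L))"),
    "KEN-box": re.compile(r"(?=(KEN))"),
}


def scan_motifs(sequence):
    """Return (motif name, 1-based start position, matched residues) tuples."""
    seq = sequence.upper()
    hits = []
    for name, pattern in MOTIFS.items():
        for match in pattern.finditer(seq):
            hits.append((name, match.start() + 1, match.group(1)))
    # Order hits by position along the protein sequence.
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Hypothetical toy sequence, not a real protein.
    for hit in scan_motifs("MARTALGDKENQ"):
        print(hit)
```

A scan of this kind reports every occurrence of the pattern, which illustrates why such motifs have limited predictive power: short patterns like R-x-x-L occur frequently by chance.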
Systems analysis of the human cell cycle transcription network
Cell division is one of the most fundamental processes of life whereby one cell replicates
itself to produce two. The molecular machinery that drives and regulates this fundamental
process has been much studied but much remains unknown. This work describes the use of
transcriptomics analyses to identify putative new proteins involved with this process and
subsequent attempts to prove their association with this pathway. Using the latest array
technology, in Chapter 2 I describe studies that examine the expression of genes regulated
during different stages of the human cell cycle. Synchronous populations of neonatal human
dermal fibroblasts (NHDFs) were generated by serum starvation and analysed in two
separate microarray experiments. For the first set of array experiments, samples were taken
every 6 hours for 48 hours after serum refeeding; for the second, samples were taken every
2 hours for 24 hours. Using BioLayout Express3D, network structure analyses identified four
major clusters of gene expression patterns associated with different stages of the cell cycle:
G0-, early G1-, late G1-, and S/G2/M-phase. By comparison with datasets of other human
cells and tissues, the list of genes in the S/G2/M cluster was refined; genes were only kept in
the list if they were found to be co-expressed in cells and tissues with high levels of cell
proliferation. 706 genes that were co-expressed during S/G2/M-phase were selected for
further analyses. Manual curation showed that 484 are known cell cycle-associated genes, 78
are genes with putative association to the cell cycle, and 75 have known roles in other
biological processes, whilst 69 were entirely uncharacterised genes. In order to investigate
the 69 genes with unknown function, in Chapter 3 I describe how RNAi was used to screen
42 of these genes to see if their knockdown resulted in an effect on cell proliferation. After
extensive assay optimisation, endoribonuclease-prepared siRNA (esiRNA) was delivered to
NHDF cells and the effect of knockdown determined using a real time cell analysis (RTCA)
system. This system monitors the change in electrical resistance induced by growing cell
populations, reported as the cell impedance index (CI). Hits in the RNAi screen were called
using a Z-score cut-off applied to the average cell impedance growth rate (CIGR, i.e. a
value derived from the transformed CI); 19 of 42 genes were found to significantly affect
the dynamics of cell proliferation, supporting a potential role in cell division. In order to
verify that the unknown proteins localise to structures compatible with a role in the cell
cycle, in Chapter 4 I describe protein localisation studies on 11 of the 19 ‘hit’ genes from
Chapter 3 (we were unable to obtain clones for the other 8 genes) and other genes of
interest. HEK293T cells were transfected with expression clones containing more than
11 ORFs with GFP fused to either the N- or C-terminus. FAM111B and
KIAA1549L appeared to be localised to the centrosome. In order to better understand the
context in which novel centrosomal proteins such as FAM111B might operate, in Chapter 5 I
describe the construction of a large-scale pathway model of the centrosome life cycle based on
an extensive literature review. The model is composed of 117 of the most important
centrosome-associated proteins and has been constructed using the modified Edinburgh
Notation (mEPN) scheme. This model was used to better annotate the genes in the original
S/G2/M list and understand which of the genes in the model are regulated during cell
division. This regulatory network model of the centrosome life cycle represents an important
summary of current knowledge and provides a useful resource for further analyses of the
novel centrosomal proteins.
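The Z-score hit calling described for the RNAi screen can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used in the thesis: the cut-off of 2.0, the use of the whole screen as the reference distribution, the two-sided test, and all gene names and CIGR values are hypothetical.

```python
from statistics import mean, stdev


def call_hits(cigr_replicates, cutoff=2.0):
    """Z-score per-gene average CIGR values and flag genes beyond the cut-off.

    cigr_replicates maps a gene name to a list of replicate CIGR values.
    The cut-off value is an assumption; the abstract does not state it.
    """
    # Average the replicate growth-rate values for each knockdown.
    averages = {gene: mean(values) for gene, values in cigr_replicates.items()}
    # Use the distribution of all knockdowns in the screen as the reference.
    mu = mean(averages.values())
    sigma = stdev(averages.values())
    z_scores = {gene: (avg - mu) / sigma for gene, avg in averages.items()}
    # A gene is a hit if its Z-score deviates in either direction.
    hits = sorted(gene for gene, z in z_scores.items() if abs(z) >= cutoff)
    return z_scores, hits


if __name__ == "__main__":
    # Hypothetical data: seven knockdowns with near-normal growth, one slow.
    screen = {
        "GENE_A": [1.00, 1.00],
        "GENE_B": [1.10, 1.10],
        "GENE_C": [0.90, 0.90],
        "GENE_D": [1.00, 1.00],
        "GENE_E": [1.05, 1.05],
        "GENE_F": [0.95, 0.95],
        "GENE_G": [1.00, 1.00],
        "GENE_H": [0.25, 0.15],
    }
    z_scores, hits = call_hits(screen)
    print(hits)
```

In practice, a screen of this kind would usually score knockdowns against negative-control wells rather than against the screen itself; the whole-screen reference is used here only to keep the sketch short.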
In summary, a list of cell cycle genes was derived from microarray experiments by using
network structure analyses. Subsequent analyses filtered the genes that were co-expressed
during S/G2/M-phase, narrowing the list down to 706 genes. Of these, 69 genes had not
previously been associated with the cell cycle. 42 of these unknown genes were analysed by
real time RNAi screening; 19 of them were indeed associated with cell proliferation, and 2
genes with unknown function appear to localise to the centrosome. To predict their
involvement in the centrosome life cycle, a pathway map composed of 117 centrosome-associated
proteins was constructed. Although further research is needed to determine their
position in the centrosome life cycle, the pathway can be used for computational modelling
to test their putative function in the system.