22 research outputs found

    The Development of a Functional Annotation Pipeline to Characterise Metagenome-Assembled Genomes of Microorganisms Found in Anaerobic Digestion

    Anaerobic digestion (AD) involves the conversion of organic waste into biogas and biofertilisers. Anaerobic digesters are commonly found within the wastewater treatment process in the UK, converting waste sludge into methane. Higher yields of methane are required for AD to become a favourable renewable energy source. The AD process consists of four steps (hydrolysis, acidogenesis, acetogenesis, and methanogenesis) that are driven by complex microbial communities. Hydrogenotrophs and methanogens are rate-determining factors, highlighting the significance of these microbial communities within these dynamic AD environments. Research into these microbial communities will ultimately result in greater yields of methane in AD. A greater understanding of the microbial communities can be achieved via metagenomics, the study of genomes recovered from environmental samples. Metagenomics relies on shotgun sequencing: environmental DNA is sequenced, and the reads are then assembled and binned into metagenome-assembled genomes (MAGs). Functional annotation is carried out to predict gene function within the MAGs. However, the quality and completeness of MAGs vary greatly due to the nature of shotgun sequencing. Large metagenomic datasets require large-scale data manipulation and bioinformatic analysis. Genome annotation pipelines (built with workflow management tools, e.g. Snakemake) allow automation and ensure reproducibility of the genome annotation. A genome annotation pipeline was developed, using Snakemake, to predict the gene function of MAGs recovered from AD. This pipeline was developed to provide an automated tool to functionally annotate MAGs, in order to discover more about the metabolic processes and relationships between microbes that drive the AD process. A confidence system was devised to indicate the quality of annotations provided by the orthology-based tools EggNOG and KofamScan, allowing further analysis of low-quality open reading frames (ORFs). 
Reproducibility and reference databases continue to be limitations of bioinformatic pipelines. However, approximately 50% of ORFs are annotated with high confidence.

    Discovery and effects of pharmacological inhibition of the E3 ligase Skp2 by small molecule protein-protein interaction disruptors

    Skp2 (S-phase kinase-associated protein 2), one component of the SCF E3 ubiquitin ligase complex, directly interacts with Skp1 and indirectly associates with Cullin1 and Rbx1 to bridge the E2 conjugating enzyme with its protein substrate to execute its E3 ligase activity. Skp2 is an F-box protein (it contains an F-box domain) and it is the rate-limiting component of the SCF complex. Skp2 targets several cell-cycle regulatory proteins for ubiquitination and degradation; the most notable and significant for cancer is the cyclin-dependent kinase inhibitor p27. Skp2 is an oncogene and studies have shown that over-expression of Skp2 leads to increased degradation of p27 and increased proliferation in several tumor types. Additionally, Skp2 is over-expressed in multiple human cancers. Skp2 therefore represents an attractive target for attenuating p27 ubiquitination and subsequent cell cycle progression. However, Skp2 does not have an easily identifiable and druggable “pocket” to which small molecules can bind; it interacts with Skp1 through the F-box domain and binds to an accessory protein called Cks1 to bind to p27. Despite this hurdle, in this study, two selective small molecule inhibitors of the Skp2 SCF complex were discovered via an in silico screen. By targeting hot-spot residues, they disrupt two sites: the Skp1/Skp2 interaction site and the p27 binding site. Disruption by the Skp1/Skp2 inhibitor restored p27 levels in the nucleus and blocked cancer progression and cancer stem cell traits. Additionally, the inhibitors phenocopy the effects of genetic Skp2 deficiency. Two specific residues on Skp2 were predicted to bind to this Skp1/Skp2 inhibitor: Trp97 and Asp98. When these residues were mutated to alanine, the inhibitor lost its ability to bind to Skp2. 
To investigate the flexibility and dynamics of the SCF complex and to understand the conformational change upon inhibitor binding, molecular dynamics simulations, homology modelling, and structural analysis were carried out on the complex with and without the inhibitors. These simulations showed that the N-terminal tail region of Skp2 does not contribute directly to the binding of these inhibitors, but its conformation is important in the context of the other members of the SCF complex. Further dynamics analysis validated the mutagenesis results, showing that the two Skp2 mutants (Trp97Ala, Asp98Ala) that retained Skp1 binding but blocked inhibitor binding were stable, whereas the mutant that was unable to retain Skp1 binding (Trp127Ala) showed destabilization in the F-box domain. Finally, active recruitment events after post-translational modifications are shown to be possible through the interaction of phosphorylated Ser256 on Skp2 with the Lys104 loop region on Cul1. The model shows that this is due to the significant flexibility in the F-box domain of Skp2, making this interaction very likely. These results show that Skp2 is a promising target for which protein-protein interaction disruptors can be designed, and that consideration of the dynamics of protein complexes is required to understand ligand binding.

    ISCR Annual Report: Fiscal Year 2004


    Developing a bioinformatics framework for proteogenomics

    In the last 15 years, since the human genome was first sequenced, genome sequencing and annotation have continued to improve. However, genome annotation has not kept up with the accelerating rate of genome sequencing and as a result there is now a large backlog of genomic data waiting to be interpreted both quickly and accurately. Through advances in proteomics a new field has emerged to help improve genome annotation, termed proteogenomics, which uses peptide mass spectrometry data, enabling the discovery of novel protein coding genes, as well as the refinement and validation of known and putative protein-coding genes. The annotation of genomes relies heavily on ab initio gene prediction programs and/or mapping of a range of RNA transcripts. Although this method provides insights into the gene content of genomes it is unable to distinguish protein-coding genes from putative non-coding RNA genes. This problem is further confounded by the fact that only 5% of the public protein sequence repository at UniProt/SwissProt has been curated and derived from actual protein evidence. This thesis contends that it is critically important to incorporate proteomics data into genome annotation pipelines to provide experimental protein-coding evidence. Although there have been major improvements in proteogenomics over the last decade there are still numerous challenges to overcome. These key challenges include the loss of sensitivity when using inflated search spaces of putative sequences, how best to interpret novel identifications and how best to control for false discoveries. This thesis addresses the existing gap between the use of genomic and proteomic sources for accurate genome annotation by applying a proteogenomics approach with a customised methodology. This new approach was applied within four case studies: a prokaryote bacterium; a monocotyledonous wheat plant; a dicotyledonous grape plant; and human. 
The key contributions of this thesis are: a new methodology for proteogenomics analysis; 145 suggested gene refinements in Bradyrhizobium diazoefficiens (a nitrogen-fixing bacterium); 55 new gene predictions (57 protein isoforms) in Vitis vinifera (grape); 49 new gene predictions (52 protein isoforms) in Homo sapiens (human); and 67 new gene predictions (70 protein isoforms) in Triticum aestivum (bread wheat). Lastly, a number of possible improvements for the studies conducted in this thesis, and for proteogenomics as a whole, have been identified and discussed.

    Characterising Load-Induced Changes in 3D Cultured Mesenchymal Stem Cells Through Collagen Isoform Composition and Arrangement

    Tissue engineering has been highlighted as a potential regenerative medicine therapy for the regeneration of musculoskeletal tissues, many of which have poor healing capacity. Currently there is no ‘gold standard’ approach to tissue engineering, with many researchers investigating the effects of different stimuli on different cells in different culture environments. One of these stimuli is mechanical stimulation, a variety of which naturally occur in the body. Mechanical stimulation is often used in tissue engineering to recapitulate the structure, extracellular matrix (ECM) composition and biomechanics of tissues such as tendon, bone and cartilage. The extracellular matrix of these tissues is primarily composed of one fibril-forming collagen: for tendon and bone this is collagen type I, whilst for cartilage it is collagen type II. However, an array of additional collagen isoforms play important roles in ECM architecture and maturation. The aim of this thesis was to investigate whether collagen synthesis can be used to assess human mesenchymal stem cell (hMSC) differentiation in response to different mechanical stimulation. Typically, tissue engineering studies use the most abundant ECM components to highlight the success of the engineered tissues; whilst this makes sense, it neglects the minor ECM components. For musculoskeletal tissues, fibril-forming collagens are routinely the dominant component of the ECM; however, without the minor collagens these structures would not function appropriately. The composition of collagens varies across all musculoskeletal tissues; therefore, by investigating the complete collagen composition, the differentiation of the cells can be identified and the quality of the tissue being engineered can be established. Tensile stimulation, hydrostatic pressure and microgravity were applied to hMSCs seeded within fibrin hydrogels, chosen because fibrin acts as a blank slate material for collagen investigation. 
These mechanical stimuli were selected as they have all routinely been used to show enhanced or inhibited MSC differentiation, offering a well-established set of mechanical stimulations with which to investigate their role in the differentiation of hMSCs and how the subsequent collagen production can be used to identify it. Molecular (qPCR and western blot), imaging (histology, TEM and fluorescence) and structural (mechanical testing and μCT) analytical techniques were used to assess which collagens were produced and how this relates to the structural development of the engineered tissue. The cell-embedded hydrogels had varying responses to the different percentages of cyclic tensile stimulation (0%, 3%, 5% and 10%). These specific strains were selected to assess how hMSCs would respond to static culture (0%), low physiological dynamic strain (1-4%), high physiological dynamic strain (5-6%) and degenerative dynamic strain (>6%). The 0% and 10% strain groups indicated some osteogenic differentiation through Alizarin red staining and ALP analysis of the culture media, suggesting that physiologically relevant dynamic strain inhibited osteogenic differentiation. The 3% cyclic strain group showed a two-fold increase in maximum stress and a slight decrease in fibril diameter compared to the control. The 5% strain group showed increases in the tendon collagens COL3A1 and COL11A1 as well as the tenogenic markers SCXA and TNMD, though expression of the negative tendon marker COL2A1 was also increased. At the protein level, collagen II was downregulated whilst collagen III was upregulated compared to the control. The fibril diameter and fibre alignment were found to be highest in the 10% strain group, typically a marker of increased mechanical properties; however, with 10% strain only the rate of stress relaxation was increased compared to other groups, with a decrease in maximum stress compared to the 3% strain group. 
The microtissues used for hydrostatic pressure were cultured in one of three culture media (basic, chondrogenic or osteogenic) and under one of four hydrostatic pressure conditions (control, 100 kPa, 200 kPa or 300 kPa). The effects of hydrostatic pressure were largely overridden by the differentiation media supplements, with the basic media group showing the biggest changes in response to different levels of hydrostatic pressure. The chondrogenic media group displayed the highest levels of COL1A1, COL2A1 and COL10A1, suggesting that the hMSCs within this media group were undergoing hypertrophy. At the protein level, no significant differences were seen between microtissues within a media group, suggesting that hydrostatic pressure was not influencing the collagen synthesis of the hMSCs as much as the media types. The μCT analysis showed that, within the media groups, the density of the mineralised particles was largely unchanged for the basic media, chondrogenic media, and control and 100 kPa osteogenic media samples, with the osteogenic 200 and 300 kPa samples being nearly two-fold greater than all other conditions, suggesting that with appropriate media the higher loading regimens generated a more developed mineralised structure than the lower hydrostatic pressure. It appeared that the microgravity microtissues were all pre-disposed to spontaneously differentiate towards the osteogenic lineage, as seen through the collagen gene expression. Further, ALP concentration in the media increased in all culture conditions across the three-week culture period. PCA showed evidence that the static culture was acting separately from the dynamic and microgravity cultures, suggesting that the increased nutrient diffusion within the RCCS 4H bioreactor was having a significant effect on the culture. The ratio of COL14A1 to COL12A1 was used to demonstrate which culture was the most mature, with COL14A1 indicating immaturity and COL12A1 maturity. 
The microgravity group had the least developed ECM, as shown by the highest ratio, whilst the static group had the most developed ECM, as shown by the lowest ratio. This was further supported by the PCA, which highlighted COL12A1 as one of the largest contributing variables to the static group's separation from the other two. This indicated that increased nutrient diffusion was inhibiting the maturation of the MSCs compared to static culture, and that microgravity was amplifying this effect.

    The evolution of shell form in tropical terrestrial microsnails

    Mollusca form an important animal phylum that first appeared in the Cambrian and today is, after Arthropoda, the second largest animal phylum, with more than 100,000 extant species (Bieler, 1992; Brusca and Brusca, 2003), with the class Gastropoda accounting for 80% of the extant species in the Mollusca. Despite its species-richness, a generalised gastropod shell architecture is maintained because of conserved developmental processes. All of the shelled gastropods grow by adding shell material, in a unidirectional accretionary way, with the mantle edge organ, usually at different deposition rates around the existing aperture. This shell ontogeny, or to be more specific aperture ontogeny, gives the general spiral form for the shells. However, spiral forms can vary when there are changes in any one of the aspects of the aperture ontogeny profile, namely, the rate and direction of shell deposition around the aperture, the size and shape of the aperture (i.e. mantle edge), and the total length of the shell ontogeny process. The interplay between these developmental parameters has generated a great diversity in shell form, which taxonomists and evolutionary biologists are now trying to characterise accurately and to understand with regard to its evolution. This thesis reveals several hitherto unknown aspects of Plectostoma shell forms, in terms of the developmental homology, the aperture ontogeny profile, anti-predation functionality, and evolutionary pattern in shell characters and ontogenetic morphospace evolution. In fact, these are the issues that have been targeted by biologists for centuries in order to improve the way shell shape is characterised and to improve understanding of shell form evolution. The research presented in this thesis was supported by the Netherlands Organization for Scientific Research (NWO, grant no. 819.01.012).

    A Bioinformatics Approach to Synthetic Lethal Interactions in Cancer with Gene Expression Data

    Introduction: Synthetic lethal genetic interactions are re-emerging as an important concept in the post-genomics era due to their potential for use in precision medicine against cancers. Synthetic lethal drug design exploits the functional redundancy of genes disrupted in cancers (including tumour suppressors) to develop specific treatments against them. CDH1, which encodes E-cadherin, is a tumour suppressor gene with loss of function in breast and stomach cancers. Experimental screens have identified candidate synthetic lethal interactions with CDH1, which can be further supported with bioinformatics analysis. Furthermore, gene expression data enables investigation of synthetic lethal pathways and the structure of synthetic lethal genes. Methods: A computational methodology, the Synthetic Lethal Prediction Tool (SLIPT), was developed to detect synthetic lethal interactions in gene expression data. The application of this methodology is demonstrated on interactions with CDH1 in breast and stomach cancer data from The Cancer Genome Atlas (TCGA) project. Synthetic lethal genes and pathways were further investigated with unsupervised clustering, gene set over-representation analysis, metagenes, and permutation resampling. In particular, analyses focused on comparing SLIPT gene candidates to an experimental short interfering RNA (siRNA) screen. Network analysis methods were applied to the most supported pathways to test for pathway structure between synthetic lethal candidates. Simulation and modelling were used to assess the statistical performance of SLIPT, including simulated data with correlation structures derived from graph structures. Results: Many candidate synthetic lethal partners of CDH1 were detected in TCGA breast cancer data. These genes clustered into several distinct groups, with distinct biological functions and elevated expression in different clinical subtypes. 
While the number of genes detected by both SLIPT and siRNA was not significant, these contained significantly enriched pathways. In particular, Gαi signalling, cytoplasmic microfibres, and extracellular fibrin clotting were robustly supported by both approaches, which is consistent with the known cytoskeletal and cell signalling roles of E-cadherin. Many of these pathways were replicated in stomach cancer data. The pathways supported only by SLIPT included regulation of immune signalling and translation, which were not expected to be detected in an isogenic cell line model but are still candidates for further investigation. Synthetic lethal candidates detected by SLIPT and siRNA were compared within the graph structures of the candidate synthetic lethal pathways. SLIPT genes had lower centrality and were consistently upstream of siRNA candidates, specifically in the Gαi signalling pathway. A statistical model of synthetic lethality was used to simulate gene expression data with known synthetic lethal partners for a gene. The SLIPT methodology had high statistical performance when detecting few synthetic lethal partners, which diminished with more synthetic lethal partners or lower sample size. The SLIPT methodology performed better than Pearson correlation or the χ² test. In particular, it performed well with high specificity for datasets containing thousands of genes, or genes positively correlated with the query gene (as expected to occur in gene expression data). SLIPT was robust across correlation structures, including those derived from complex pathway structures, and often distinguished synthetic lethal genes from those positively or negatively correlated with them. Thus this thesis has developed, evaluated, and applied a bioinformatics approach for the discovery of synthetic lethal genes from gene expression data. 
This approach has been demonstrated to detect biologically informative and clinically relevant candidate synthetic lethal partners for CDH1 in breast and stomach cancers.