
    Integrated multiple sequence alignment

    Sammeth M. Integrated multiple sequence alignment. Bielefeld (Germany): Bielefeld University; 2005. The thesis presents enhancements for automated and manual multiple sequence alignment: existing alignment algorithms are made more easily accessible, and new algorithms are designed for difficult cases. First, we introduce the QAlign framework, a graphical user interface for multiple sequence alignment. It comprises several state-of-the-art algorithms and exposes their parameters through convenient dialogs. An alignment viewer with guided editing functionality can also highlight or print regions of the alignment. Phylogenetic features are also provided, e.g., distance-based tree reconstruction methods, corrections for multiple substitutions, and a tree viewer. The modular concept and the platform-independent implementation guarantee easy extensibility. Further, we develop a constrained version of the divide-and-conquer alignment that can be restricted by anchors found earlier with local alignments. It can be shown that this method shares attributes of both local and global aligners, in the quality of results as well as in computation time. We further modify the local alignment step to work on bipartite (or even multipartite) sets for sequences in which repeats overshadow valuable sequence information. In the end, a technique is established that can accurately align sequences containing possibly repeated motifs. Finally, another algorithm is presented that compares tandem repeat sequences by aligning them with respect to their possible repeat histories. We describe an evolutionary model including tandem duplications and excisions, and give an exact algorithm to compare two sequences under this model.
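The abstract above mentions "corrections for multiple substitutions" as an input to distance-based tree reconstruction. As a rough illustration only, the sketch below applies the Jukes-Cantor correction, d = -3/4 ln(1 - 4p/3), to the observed mismatch fraction p of two aligned nucleotide sequences. The abstract does not say which corrections QAlign implements, so the choice of Jukes-Cantor and the function names `p_distance` and `jukes_cantor` are assumptions made for this example.

```python
import math


def p_distance(a: str, b: str) -> float:
    """Observed fraction of differing sites over gap-free aligned columns."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    # Ignore any column in which either sequence has a gap character.
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no gap-free columns to compare")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor corrected distance for nucleotides (assumed model)."""
    if p >= 0.75:
        # The correction is undefined at or beyond saturation.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    p = p_distance("ACGTACGT", "ACGTACGA")
    print(p, jukes_cantor(p))
```

The corrected distance is always at least as large as the observed one, because repeated substitutions at the same site hide part of the true divergence.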

    Phylogenetic-based propagation of functional annotations within the Gene Ontology consortium

    The goal of the Gene Ontology (GO) project is to provide a uniform way to describe the functions of gene products from organisms across all kingdoms of life and thereby enable analysis of genomic data. Protein annotations are either based on experiments or predicted from protein sequences. Since most sequences have not been experimentally characterized, most available annotations need to be based on predictions. To make inferences as accurate as possible, the GO Consortium's Reference Genome Project is using an explicit evolutionary framework to infer annotations of proteins from a broad set of genomes from experimental annotations in a semi-automated manner. Most components in the pipeline, such as selection of sequences, building multiple sequence alignments and phylogenetic trees, retrieving experimental annotations and depositing inferred annotations, are fully automated. However, the most crucial step in our pipeline relies on software-assisted curation by an expert biologist. This curation tool, the Phylogenetic Annotation and INference Tool (PAINT), helps curators to infer annotations among members of a protein family. PAINT allows curators to make precise assertions as to when functions were gained and lost during evolution and to record the evidence (e.g. experimentally supported GO annotations and phylogenetic information including orthology) for those assertions. In this article, we describe how we use PAINT to infer protein function in a phylogenetic context, with emphasis on its strengths, limitations and guidelines. We also discuss specific examples showing how PAINT annotations compare with those generated by other widely used homology-based methods.

    Targeting PH domain proteins for cancer therapy

    Targeted therapy has been one of the most promising treatment options for cancer during the past decade. Discoveries of potent and selective small molecule inhibitors are critical to new and promising targeted therapy. Pleckstrin Homology (PH) domain proteins are one of the biggest protein families in the human proteome. However, no drugs targeting them have reached the late development stages, let alone the market. Thus, a deeper understanding of this protein family is required, and there is an urgent need to develop novel small molecule compounds targeting these proteins. Studies of PH domains began around two decades ago, and much effort has been focused on their structures and functions. However, not much is known about their role in cancers, except for a few proteins such as AKT. In order to delineate the roles of PH domain proteins in cancers, we performed a comprehensive analysis of 313 PH domain proteins across 13 of the most common cancer types in TCGA. From this analysis, we identified the most frequently upregulated and mutated PH domain proteins. Interestingly, we found that Tiam1, a guanine nucleotide exchange factor (GEF) specific for Rac1 activation, was overexpressed in several cancers, particularly neuroendocrine prostate cancer. Targeting PH domain proteins remains a significant challenge for multiple reasons. First, the binding pockets of most PH domain proteins are unknown because crystal structures of PH-PIP complexes are lacking. Second, these binding pockets are positively charged, which makes it difficult to design small molecule inhibitors targeting these sites. In order to address these issues, we performed structural sequence alignment of available PH domain structures to identify conserved residues. Ensemble docking was also performed to account for the flexibility of the proteins. Through these efforts, we identified two scaffolds as Tiam1 small molecule inhibitors. These inhibitors showed binding affinity to the PH domain in a surface plasmon resonance (SPR) assay and inhibited Rac1 activation in prostate cancer cells. These compounds also inhibited prostate cancer cell proliferation and migration in vitro.

    Coevolution underlies GPCR-G protein selectivity and functionality

    G protein-coupled receptors (GPCRs) regulate diverse physiological events, which makes them major targets for many approved drugs. G proteins are downstream molecules that receive signals from GPCRs and trigger cell responses. The selectivity mechanism by which GPCRs and G proteins interact properly and in a timely manner is still unclear. Here, we analyzed model GPCRs (i.e. HTR, DAR) and Gα proteins with a coevolutionary tool, statistical coupling analysis. The results suggested that 5-hydroxytryptamine receptors and dopamine receptors have common conserved and coevolved residues. The Gα proteins also have conserved and coevolved residues. These coevolved residues were implicated in the molecular functions of the analyzed proteins. We also identified specific coevolving pairs related to the selectivity between GPCRs and G proteins. We propose that these results would contribute to a better understanding not only of the functional residues of GPCRs and Gα proteins but also of GPCR-G protein selectivity mechanisms. © 2021, The Author(s).
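To make the idea of "coevolved residues" more concrete, the sketch below scores a pair of alignment columns by their mutual information. This is a deliberately simpler stand-in and is not statistical coupling analysis, the method named in the abstract. The toy alignments and the function names `column` and `mutual_information` are assumptions made for illustration.

```python
import math
from collections import Counter


def column(msa, i):
    """Return the residues found at column i of an aligned set of sequences."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j."""
    n = len(msa)
    ci, cj = column(msa, i), column(msa, j)
    count_i, count_j = Counter(ci), Counter(cj)
    count_ij = Counter(zip(ci, cj))
    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts.
        mi += (n_ab / n) * math.log2(n_ab * n / (count_i[a] * count_j[b]))
    return mi


if __name__ == "__main__":
    coupled = ["AC", "AC", "GT", "GT"]      # column 1 always tracks column 0
    independent = ["AC", "AT", "GC", "GT"]  # every combination occurs once
    print(mutual_information(coupled, 0, 1))
    print(mutual_information(independent, 0, 1))
```

Columns that always vary together score high, while columns that vary independently score zero. Real analyses also correct for phylogenetic bias and small sample sizes, which this sketch ignores.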

    Skipping of Exons by Premature Termination of Transcription and Alternative Splicing within Intron-5 of the Sheep SCF Gene: A Novel Splice Variant

    Stem cell factor (SCF) is a growth factor essential for haemopoiesis, mast cell development and melanogenesis. In the hematopoietic microenvironment (HM), SCF is produced in either a membrane-bound (−) or a soluble (+) form. Skin expression of SCF stimulates melanocyte migration, proliferation, differentiation, and survival. We report, for the first time, a novel mRNA splice variant of SCF from the skin of white Merino sheep, identified via cloning and sequencing. Reverse transcriptase (RT)-PCR and molecular prediction revealed two different cDNA products of SCF. Full-length cDNA libraries were enriched by rapid amplification of cDNA ends (RACE-PCR). Nucleotide sequencing and molecular prediction revealed that the primary 1519 base pair (bp) cDNA encodes a precursor protein of 274 amino acids (aa), commonly known as the ‘soluble’ isoform. In contrast, the shorter (835 and/or 725 bp) cDNA was found to be a ‘novel’ mRNA splice variant. It contains an open reading frame (ORF) corresponding to a truncated protein of 181 aa (vs 245 aa) with a unique C-terminus lacking the primary proteolytic segment (28 aa) immediately after the D175G site, which is necessary to produce the ‘soluble’ form of SCF. This alternative splice (AS) variant was explained by the complete nucleotide sequencing of the splice junction covering exon 5-intron (5)-exon 6 (948 bp), with a premature termination codon (PTC) whereby exons 6 to 9/10 are skipped (Cassette Exon, CE 6–9/10). We also demonstrated by Northern blot analysis that this is mediated at the transcript level via an intron-5 splicing event. Our data refine the structure of the SCF gene and clarify the presence (+) and/or absence (−) of primary proteolytic cleavage site-specific SCF splice variants. This work provides a basis for understanding the functional role and regulation of SCF in hair follicle melanogenesis in sheep beyond what was known in mice, humans and other mammals.