111 research outputs found

    Mutation@A Glance: An Integrative Web Application for Analysing Mutations from Human Genetic Diseases

    Although mutation analysis is a key part of making a definitive diagnosis of a genetic disease, interpreting the biological implications of mutations by integrating various lines of archived information about the genes in question remains a time-consuming step. To expedite this evaluation step of disease-causing genetic variations, here we developed Mutation@A Glance (http://rapid.rcai.riken.jp/mutation/), a highly integrated web-based tool for analysing human disease mutations; it implements a user-friendly graphical interface to visualize about 40 000 known disease-associated mutations and genetic polymorphisms from more than 2600 protein-coding human disease-causing genes. Mutation@A Glance locates known genetic variation data individually on the nucleotide and amino acid sequences and makes it possible to cross-reference them with tertiary and/or quaternary protein structures and with various functional features associated with specific amino acid residues in the proteins. We showed that disease-associated missense mutations had a stronger tendency than neutral genetic variations to reside in positions relevant to the structure/function of proteins. From a practical viewpoint, Mutation@A Glance could function as a ‘one-stop’ analysis platform for newly determined DNA sequences, enabling users to readily identify and evaluate new genetic variations by integrating multiple lines of information about the disease-causing candidate genes.

    Single cell RNA-seq reveals profound transcriptional similarity between Barrett's oesophagus and oesophageal submucosal glands

    Barrett’s oesophagus is a precursor of oesophageal adenocarcinoma. In this common condition, squamous epithelium in the oesophagus is replaced by columnar epithelium in response to acid reflux. Barrett’s oesophagus is highly heterogeneous and its relationships to normal tissues are unclear. Here we investigate the cellular complexity of Barrett’s oesophagus and the upper gastrointestinal tract using RNA-sequencing of single cells from multiple biopsies from six patients with Barrett’s oesophagus and two patients without oesophageal pathology. We find that cell populations in Barrett’s oesophagus, marked by LEFTY1 and OLFM4, exhibit a profound transcriptional overlap with oesophageal submucosal gland cells, but not with gastric or duodenal cells. Additionally, SPINK4 and ITLN1 mark cells that precede morphologically identifiable goblet cells in colon and Barrett’s oesophagus, potentially aiding the identification of metaplasia. Our findings reveal striking transcriptional relationships between normal tissue populations and cells in a premalignant condition, with implications for clinical practice.

    A Score of the Ability of a Three-Dimensional Protein Model to Retrieve Its Own Sequence as a Quantitative Measure of Its Quality and Appropriateness

    BACKGROUND: Despite the remarkable progress of bioinformatics, how the primary structure of a protein leads to a three-dimensional fold, and in turn determines its function, remains an elusive question. Alignments of sequences with known function can be used to identify proteins with the same or similar function with high success. However, identification of function-related and structure-related amino acid positions is only possible after a detailed study of every protein. Folding pattern diversity seems to be much narrower than sequence diversity, and the amino acid sequences of natural proteins have evolved under a selective pressure comprising structural and functional requirements acting in parallel. PRINCIPAL FINDINGS: The approach described in this work begins by generating a large number of amino acid sequences using ROSETTA [Dantas G et al. (2003) J Mol Biol 332:449-460], a program with notable robustness in the assignment of amino acids to a known three-dimensional structure. The resulting sequence sets showed no conservation of amino acids at active sites or protein-protein interfaces. Hidden Markov models built from the resulting sequence sets were used to search sequence databases. Surprisingly, the sequences that the models retrieved from the database belonged to proteins with the same or a very similar function. Given an appropriate cutoff, the rate of false positives was zero. According to our results, this protocol, here referred to as Rd.HMM, detects fine structural details in the folding patterns that seem to be tightly linked to the fitness of a structural framework for a specific biological function. CONCLUSION: Because the sequence of the native protein used to create the Rd.HMM model was always amongst the top hits, the procedure is a reliable tool to score, very accurately, the quality and appropriateness of computer-modeled 3D structures, without the need for spectroscopy data. However, Rd.HMM is very sensitive to the conformational features of the models' backbone.

    Incorporating background frequency improves entropy-based residue conservation measures

    BACKGROUND: Several entropy-based methods have been developed for scoring sequence conservation in protein multiple sequence alignments. High-scoring amino acid positions may correlate with structurally or functionally important residues. However, amino acid background frequencies are usually not taken into account in these entropy-based scoring schemes. RESULTS: We demonstrate that using a relative entropy measure that incorporates amino acid background frequency results in improved performance in identifying functional sites from protein multiple sequence alignments. CONCLUSION: Our results suggest that the application of appropriate background frequency information may lead to more biologically relevant results in many areas of bioinformatics.
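
    The difference between a plain entropy score and a background-aware relative entropy can be illustrated with a small sketch. This is not the authors' implementation: the function names, the pseudocount, and the toy background distribution are assumptions made for illustration, and a real analysis would use background frequencies estimated from a large sequence database. The relative entropy of a column with residue frequencies p against background q is taken here as the sum over residues of p(a) * log2(p(a) / q(a)).

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_frequencies(column, pseudocount=1e-6):
    """Residue frequencies for one alignment column (gaps and unknowns ignored).

    The pseudocount is an assumption; it keeps every frequency above zero so
    that the logarithms below are always defined.
    """
    counts = Counter(c for c in column if c in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return {a: (counts.get(a, 0) + pseudocount) / total for a in AMINO_ACIDS}


def shannon_conservation(column):
    """Entropy-based score in [0, 1]; 1 means a fully conserved column."""
    p = column_frequencies(column)
    entropy = -sum(v * math.log2(v) for v in p.values())
    return 1.0 - entropy / math.log2(len(AMINO_ACIDS))


def relative_entropy(column, background):
    """Relative entropy (in bits) of the column against background frequencies."""
    p = column_frequencies(column)
    return sum(p[a] * math.log2(p[a] / background[a]) for a in AMINO_ACIDS)


def toy_background():
    """Hypothetical background: leucine common, tryptophan rare, rest equal."""
    weights = {a: 1.0 for a in AMINO_ACIDS}
    weights["L"] = 2.0
    weights["W"] = 0.25
    total = sum(weights.values())
    return {a: w / total for a, w in weights.items()}
```

    Under these assumptions, a column of conserved tryptophans and a column of conserved leucines receive the same entropy-based score, whereas the relative entropy ranks the tryptophan column higher because conservation of a rare residue is less likely to arise by chance.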

    Depletion of somatic mutations in splicing-associated sequences in cancer genomes

    Background: An important goal of cancer genomics is to systematically identify cancer-causing mutations. A common approach is to identify sites with high ratios of non-synonymous to synonymous mutations; however, if synonymous mutations are under purifying selection, this methodology leads to identification of false-positive mutations. Here, using synonymous somatic mutations (SSMs) identified in over 4000 tumours across 15 different cancer types, we sought to test this assumption by focusing on coding regions required for splicing. Results: Exon flanks, which are enriched for sequences required for splicing fidelity, have ~17% lower SSM density compared to exonic cores, even after excluding canonical splice sites. While it is impossible to eliminate a mutation bias of unknown cause, multiple lines of evidence support a purifying selection model above a mutational bias explanation. The flank/core difference is not explained by skewed nucleotide content, replication timing, nucleosome occupancy or deficiency in mismatch repair. The depletion is not seen in tumour suppressors, consistent with their role in positive tumour selection, but is otherwise observed in cancer-associated and non-cancer genes, both essential and non-essential. Consistent with a role in splicing modulation, exonic splice enhancers have a lower SSM density before and after controlling for nucleotide composition; moreover, flanks at the 5’ end of the exons have significantly lower SSM density than at the 3’ end. Conclusions: These results suggest that the observable mutational spectrum of cancer genomes is not simply a product of various mutational processes and positive selection, but might also be shaped by negative selection.
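
    The flank/core comparison reported above can be sketched as a density calculation. This is a hedged illustration only: the 20 bp flank width, the half-open exon coordinates, and the function names are assumptions, not details taken from the study, and the sketch omits the controls the abstract mentions (nucleotide content, replication timing, and so on).

```python
def ssm_densities(exons, mutation_positions, flank=20):
    """Return (flank_density, core_density) in mutations per base pair.

    exons: iterable of half-open (start, end) intervals on one sequence.
    mutation_positions: iterable of integer positions of synonymous mutations.
    flank: assumed flank width at each exon end (hypothetical default).
    """
    positions = list(mutation_positions)
    flank_bp = core_bp = flank_hits = core_hits = 0
    for start, end in exons:
        if end - start <= 2 * flank:
            continue  # exon too short to have a distinct core; skip it
        flank_bp += 2 * flank
        core_bp += (end - start) - 2 * flank
        for pos in positions:
            if start <= pos < end:
                if pos < start + flank or pos >= end - flank:
                    flank_hits += 1
                else:
                    core_hits += 1
    if flank_bp == 0:
        raise ValueError("no exon is long enough to define flanks and a core")
    return flank_hits / flank_bp, core_hits / core_bp


def relative_depletion(flank_density, core_density):
    """Fractional reduction of flank density relative to core density."""
    if core_density == 0:
        raise ValueError("core density is zero; depletion is undefined")
    return 1.0 - flank_density / core_density
```

    With this definition, a relative depletion of about 0.17 would correspond to the ~17% lower flank density quoted in the abstract.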

    Results of the ANCHOR prospective, multicenter registry of EndoAnchors for type Ia endoleaks and endograft migration in patients with challenging anatomy

    Objective: Proximal attachment site complications continue to occur after endovascular repair of abdominal aortic aneurysms (EVAR), specifically type Ia endoleak and endograft migration. EndoAnchors (Aptus Endosystems, Sunnyvale, Calif) were designed to enhance endograft proximal fixation and sealing, and the current study was undertaken to evaluate the potential benefit of this treatment. Methods: During the 23-month period ending in December 2013, 319 subjects were enrolled at 43 sites in the United States and Europe. EndoAnchors were implanted in 242 patients (75.9%) at the time of an initial EVAR procedure (primary arm) and in 77 patients with an existing endograft and proximal aortic neck complications (revision arm). Technical success was defined as deployment of the desired number of EndoAnchors, adequate penetration of the vessel wall, and absence of EndoAnchor fracture. Procedural success was defined as technical success without a type Ia endoleak at completion angiography. Values are expressed as mean ± standard deviation and interquartile range. Results: The 238 male (74.6%) and 81 female (25.4%) subjects had a mean age of 74.1 ± 8.2 years. Aneurysms averaged 58 ± 13 (51-63) mm in diameter at the time of EndoAnchor implantation (core laboratory measurements). The proximal aortic neck averaged 16 ± 13 (7-23) mm in length (42.7% <10 mm and 42.7% conical) and 27 ± 4 (25-30) mm in diameter; infrarenal neck angulation was 24 ± 15 (13-34) degrees. The number of EndoAnchors deployed was 5.8 ± 2.1 (4-7). Technical success was achieved in 303 patients (95.0%) and procedural success in 279 patients (87.5%), 217 of 240 (89.7%) and 62 of 77 (80.5%) in the primary and revision arms, respectively. There were 29 residual type Ia endoleaks (9.1%) at the end of the procedure. During mean follow-up of 9.3 ± 4.7 months, 301 patients (94.4%) were free from secondary procedures. Among the 18 secondary procedures, eight were performed for residual type Ia endoleaks and the others were unrelated to EndoAnchors. There were no open surgical conversions, there were no aneurysm-related deaths, and no aneurysm ruptured during follow-up. Conclusions: Use of EndoAnchors to treat existing and acute type Ia endoleaks and endograft migration was successful in most cases. Prophylactic use of EndoAnchors in patients with hostile aortic neck anatomy appears promising, but definitive conclusions must await longer term follow-up data.

    Large scale variation in the rate of germ-line de novo mutation, base composition, divergence and diversity in humans

    It has long been suspected that the rate of mutation varies across the human genome at a large scale based on the divergence between humans and other species. However, it is now possible to directly investigate this question using the large number of de novo mutations (DNMs) that have been discovered in humans through the sequencing of trios. We investigate a number of questions pertaining to the distribution of mutations using more than 130,000 DNMs from three large datasets. We demonstrate that the amount and pattern of variation differs between datasets at the 1MB and 100KB scales, probably as a consequence of differences in sequencing technology and processing. In particular, datasets show different patterns of correlation to genomic variables such as replication time. Nevertheless, there are many commonalities between datasets, which likely represent true patterns. We show that there is variation in the mutation rate at the 100KB, 1MB and 10MB scales that cannot be explained by variation at smaller scales; however, the level of this variation is modest at large scales: at the 1MB scale we infer that ~90% of regions have a mutation rate within 50% of the mean. Different types of mutation show similar levels of variation and appear to vary in concert, which suggests the pattern of mutation is relatively constant across the genome. We demonstrate that variation in the mutation rate does not generate large-scale variation in GC-content, and hence that mutation bias does not maintain the isochore structure of the human genome. We find that genomic features explain less than 40% of the explainable variance in the rate of DNM. As expected, the rate of divergence between species is correlated to the rate of DNM. However, the correlations are weaker than expected if all the variation in divergence was due to variation in the mutation rate. We provide evidence that this is due to the effect of biased gene conversion on the probability that a mutation will become fixed. In contrast to divergence, we find that most of the variation in diversity can be explained by variation in the mutation rate. Finally, we show that the correlation between divergence and DNM density declines as increasingly divergent species are considered.
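
    The kind of summary quoted above ("~90% of regions have a mutation rate within 50% of the mean") can be approximated with a simple windowed count. The sketch below is an assumption-laden illustration, not the authors' method: it works on raw DNM counts per fixed-size window and does not separate true rate variation from sampling noise, which the study's inference would need to do. Function names and 0-based coordinates are hypothetical choices.

```python
def window_counts(positions, chrom_length, window=1_000_000):
    """Count mutations in consecutive fixed-size windows along one chromosome.

    positions are assumed to be 0-based; positions outside the chromosome
    are ignored. The final window may be shorter than the others.
    """
    n_windows = -(-chrom_length // window)  # ceiling division
    counts = [0] * n_windows
    for pos in positions:
        if 0 <= pos < chrom_length:
            counts[pos // window] += 1
    return counts


def fraction_within(counts, tolerance=0.5):
    """Fraction of windows whose count lies within tolerance * mean of the mean."""
    if not counts:
        raise ValueError("no windows to summarise")
    mean = sum(counts) / len(counts)
    inside = sum(1 for c in counts if abs(c - mean) <= tolerance * mean)
    return inside / len(counts)
```

    Changing the `window` argument to 100_000 or 10_000_000 gives the same summary at the other scales mentioned in the abstract.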

    Chromatin loop anchors are associated with genome instability in cancer and recombination hotspots in the germline

    Background: Chromatin loops form a basic unit of interphase nuclear organization, with chromatin loop anchor points providing contacts between regulatory regions and promoters. However, the mutational landscape at these anchor points remains under-studied. Here, we describe the unusual patterns of somatic mutations and germline variation associated with loop anchor points and explore the underlying features influencing these patterns. Results: Analyses of whole genome sequencing datasets reveal that anchor points are strongly depleted for single nucleotide variants (SNVs) in tumours. Despite low SNV rates in their genomic neighbourhood, anchor points emerge as sites of evolutionary innovation, showing enrichment for structural variant (SV) breakpoints and a peak of SNVs at focal CTCF sites within the anchor points. Both CTCF-bound and non-CTCF anchor points harbour an excess of SV breakpoints in multiple tumour types and are prone to double-strand breaks in cell lines. Common fragile sites, which are hotspots for genome instability, also show elevated numbers of intersecting loop anchor points. Recurrently disrupted anchor points are enriched for genes with functions in cell cycle transitions and regions associated with predisposition to cancer. We also discover a novel class of CTCF-bound anchor points which overlap meiotic recombination hotspots and are enriched for the core PRDM9 binding motif, suggesting that the anchor points have been foci for diversity generated during recent human evolution. Conclusions: We suggest that the unusual chromatin environment at loop anchor points underlies the elevated rates of variation observed, marking them as sites of regulatory importance but also genomic fragility.

    Triangle network motifs predict complexes by complementing high-error interactomes with structural information

    Background: Many high-throughput studies produce protein-protein interaction networks (PPINs) with many errors and missing information. Even for genome-wide approaches, there is often a low overlap between PPINs produced by different studies. Second-level neighbors separated by two protein-protein interactions (PPIs) were previously used for predicting protein function and finding complexes in high-error PPINs. We retrieve second-level neighbors in PPINs, and complement these with structural domain-domain interactions (SDDIs) representing binding evidence on proteins, forming PPI-SDDI-PPI triangles. Results: We find low overlap between PPINs, SDDIs and known complexes, all well below 10%. We evaluate the overlap of PPI-SDDI-PPI triangles with known complexes from the Munich Information Center for Protein Sequences (MIPS). PPI-SDDI-PPI triangles have ~20 times higher overlap with MIPS complexes than second-level neighbors in PPINs without SDDIs. The biological interpretation for triangles is that an SDDI causes two proteins to be observed with common interaction partners in high-throughput experiments. The relatively few SDDIs overlapping with PPINs are part of highly connected SDDI components, and are more likely to be detected in experimental studies. We demonstrate the utility of PPI-SDDI-PPI triangles by reconstructing myosin-actin processes in the nucleus, cytoplasm, and cytoskeleton, which were not obvious in the original PPIN. Using other complementary datatypes in place of SDDIs to form triangles, such as PubMed co-occurrences or threading information, results in a similar ability to find protein complexes. Conclusion: Given high-error PPINs with missing information, triangles of mixed datatypes are a promising direction for finding protein complexes. Integrating PPINs with SDDIs improves finding complexes. Structural SDDIs partially explain the high functional similarity of second-level neighbors in PPINs. We estimate that relatively little structural information would be sufficient for finding complexes involving most of the proteins and interactions in a typical PPIN.
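
    The PPI-SDDI-PPI triangle described above can be sketched as a search over second-level neighbors. This is a minimal illustration under stated assumptions, not the authors' pipeline: both networks are taken to be undirected edge lists over the same protein identifiers, and the function name and output format are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def find_mixed_triangles(ppi_edges, sddi_edges):
    """Find PPI-SDDI-PPI triangles.

    A triangle is reported as (hub, a, c) with a < c, where hub-a and hub-c
    are PPIs (so a and c are second-level neighbors through hub) and a-c is
    supported by a structural domain-domain interaction.
    """
    neighbours = defaultdict(set)
    for a, b in ppi_edges:
        if a != b:  # ignore self-interactions
            neighbours[a].add(b)
            neighbours[b].add(a)

    # Undirected SDDI pairs, stored so that (a, c) and (c, a) are equivalent.
    sddi_pairs = {frozenset((a, b)) for a, b in sddi_edges if a != b}

    triangles = set()
    for hub, partners in neighbours.items():
        for a, c in combinations(sorted(partners), 2):
            if frozenset((a, c)) in sddi_pairs:
                triangles.add((hub, a, c))
    return triangles
```

    For example, with PPIs P1-P2 and P2-P3 and an SDDI between P1 and P3, the function reports one triangle with P2 as the shared interaction partner. Swapping the `sddi_edges` argument for another pairwise datatype (such as co-occurrence pairs) gives the mixed-datatype variant mentioned in the abstract.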