
    Optimal Computation of Avoided Words

    The deviation of the observed frequency of a word $w$ from its expected frequency in a given sequence $x$ is used to determine whether or not the word is avoided. This concept is particularly useful in DNA linguistic analysis. The value of the standard deviation of $w$, denoted by $std(w)$, effectively characterises the extent of a word by its edge contrast in the context in which it occurs. A word $w$ of length $k > 2$ is a $\rho$-avoided word in $x$ if $std(w) \leq \rho$, for a given threshold $\rho < 0$. Notice that such a word may be completely absent from $x$. Hence computing all such words naïvely can be a very time-consuming procedure, in particular for large $k$. In this article, we propose an $O(n)$-time and $O(n)$-space algorithm to compute all $\rho$-avoided words of length $k$ in a given sequence $x$ of length $n$ over a fixed-sized alphabet. We also present a time-optimal $O(\sigma n)$-time and $O(\sigma n)$-space algorithm to compute all $\rho$-avoided words (of any length) in a sequence of length $n$ over an alphabet of size $\sigma$. Furthermore, we provide a tight asymptotic upper bound for the number of $\rho$-avoided words and the expected length of the longest one. We make available an open-source implementation of our algorithm. Experimental results, using both real and synthetic data, show the efficiency of our implementation.
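
The definition of a $\rho$-avoided word above can be illustrated with a brute-force sketch. The abstract does not state the formula for $std(w)$, so the code assumes a commonly used form: the expected frequency is $E(w) = f(w_p) f(w_s) / f(w_i)$, where $w_p$, $w_s$ and $w_i$ are the longest proper prefix, the longest proper suffix and the infix of $w$ (with first and last letters removed), and $std(w) = (f(w) - E(w)) / \max(\sqrt{E(w)}, 1)$. The function names `count_factors`, `std` and `avoided_words` are hypothetical. This is the naïve enumeration over all $\sigma^k$ candidate words that the abstract calls time-consuming, not the linear-time algorithm it proposes.

```python
from collections import Counter
from itertools import product
from math import sqrt


def count_factors(x: str, length: int) -> Counter:
    """Count every (overlapping) factor of the given length in x."""
    return Counter(x[i:i + length] for i in range(len(x) - length + 1))


def std(w: str, counts: dict) -> float:
    """Assumed deviation measure: (f(w) - E(w)) / max(sqrt(E(w)), 1)."""
    k = len(w)
    f_infix = counts[k - 2][w[1:-1]]
    if f_infix == 0:
        return 0.0  # the infix never occurs, so E(w) is taken to be 0
    # Assumed expected frequency: f(prefix) * f(suffix) / f(infix).
    expected = counts[k - 1][w[:-1]] * counts[k - 1][w[1:]] / f_infix
    return (counts[k][w] - expected) / max(sqrt(expected), 1.0)


def avoided_words(x: str, k: int, rho: float, alphabet: str = "ACGT") -> list:
    """Brute force: test every word of length k > 2 over the alphabet."""
    counts = {m: count_factors(x, m) for m in (k - 2, k - 1, k)}
    words = ("".join(t) for t in product(alphabet, repeat=k))
    return sorted(w for w in words if std(w, counts) <= rho)
```

For example, `avoided_words("ABDEBC" * 4, 3, -1.0, "ABCDE")` reports words such as `ABC`, which never occurs in the sequence although its prefix `AB` and suffix `BC` are both frequent. This matches the remark in the abstract that an avoided word may be completely absent from $x$.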

    Alternative 3' UTRs direct localization of functionally diverse protein isoforms in neuronal compartments

    The proper subcellular localization of RNAs and local translational regulation are crucial in highly compartmentalized cells, such as neurons. RNA localization is mediated by specific cis-regulatory elements usually found in mRNA 3'UTRs. Therefore, processes that generate alternative 3'UTRs (alternative splicing and polyadenylation) have the potential to diversify mRNA localization patterns in neurons. Here, we performed mapping of alternative 3'UTRs in neurites and soma isolated from mESC-derived neurons. Our analysis identified 593 genes with differentially localized 3'UTR isoforms. In particular, we have shown that two isoforms of the Cdc42 gene with distinct functions in neuronal polarity are differentially localized between neurites and soma of mESC-derived and mouse primary cortical neurons, at both the mRNA and protein levels. Using reporter assays and 3'UTR swapping experiments, we have identified the role of alternative 3'UTRs and mRNA transport in differential localization of alternative CDC42 protein isoforms. Moreover, we used SILAC to identify the isoform-specific Cdc42 3'UTR-bound proteome with a potential role in Cdc42 localization and translation. Our analysis points to usage of alternative 3'UTR isoforms as a novel mechanism to provide for differential localization of functionally diverse alternative protein isoforms.

    Close binary stars in the solar-age Galactic open cluster M67

    We present multi-colour time-series CCD photometry of the solar-age Galactic open cluster M67 (NGC 2682). About 3600 frames spread over 28 nights were obtained with the 1.5 m Russian-Turkish and 1.2 m Mercator telescopes. High-precision observations of the close binary stars AH Cnc, EV Cnc, ES Cnc, the δ Scuti type systems EX Cnc and EW Cnc, and some long-period variables belonging to M67 are presented. Three full multi-colour light curves of the overcontact binary AH Cnc were obtained during three observing seasons. Likewise, we gathered three light curves of EV Cnc, an EB-type binary, and two light curves of ES Cnc, a blue straggler binary. Parts of the light variation of the long-term variables S1024, S1040, S1045, S1063, S1242, and S1264 were also obtained. Period variation analyses of AH Cnc, EV Cnc, and ES Cnc were carried out using all times of mid-eclipse available in the literature and those obtained in this study. In addition, we analyzed multi-colour light curves of the close binaries and also determined new frequencies for the δ Scuti systems. The physical parameters of the close binary stars were determined with simultaneous solutions of multi-colour light and radial velocity curves. Finally, we determined the distance of M67 as 857(33) pc via binary star parameters, which is consistent with an independent method from earlier studies. Comment: 12 pages, 9 figures, 13 tables

    An optimized algorithm for detecting and annotating regional differential methylation

    Background: DNA methylation profiling reveals important differentially methylated regions (DMRs) of the genome that are altered during development or that are perturbed by disease. To date, few programs exist for regional analysis of enriched or whole-genome bisulfite conversion sequencing data, even though such data are increasingly common. Here, we describe an open-source, optimized method for determining empirically based DMRs (eDMR) from high-throughput sequence data that is applicable to enriched whole-genome methylation profiling datasets, as well as other globally enriched epigenetic modification data. Results: Here we show that our bimodal distribution model and weighted cost function for optimized regional methylation analysis provide accurate boundaries of regions harboring significant epigenetic modifications. Our algorithm takes the spatial distribution of CpGs into account for the enrichment assay, allowing for optimization of the definition of empirical regions for differential methylation. Combined with the dependent adjustment for regional p-value combination and DMR annotation, we provide a method that may be applied to a variety of datasets for rapid DMR analysis. Our method classifies both the directionality of DMRs and their genome-wide distribution, and we have observed that it shows clinical relevance through correct stratification of two Acute Myeloid Leukemia (AML) tumor sub-types. Conclusions: Our weighted optimization algorithm eDMR for calling DMRs extends an established DMR R pipeline (methylKit) and provides a needed resource in epigenomics. Our method enables an accurate and scalable way of finding DMRs in high-throughput methylation sequencing experiments. eDMR is available for download at http://code.google.com/p/edmr/. Sheng Li, Francine E Garrett-Bakelman, Altuna Akalin, Paul Zumbo, Ross Levine, Bik L To, Ian D Lewis, Anna L Brown, Richard J D’Andrea, Ari Melnick, Christopher E Maso

    Exaggerated CpH methylation in the autism-affected brain

    BACKGROUND: The etiology of autism, a complex, heritable, neurodevelopmental disorder, remains largely unexplained. Given the unexplained risk and recent evidence supporting a role for epigenetic mechanisms in the development of autism, we explored the role of CpG and CpH (H = A, C, or T) methylation within autism-affected cortical brain tissue. METHODS: Reduced representation bisulfite sequencing (RRBS) was completed, and analysis was carried out in 63 post-mortem cortical brain samples (Brodmann area 19) from 29 autism-affected and 34 control individuals. Analyses to identify single sites that were differentially methylated and to identify any global methylation alterations at either CpG or CpH sites throughout the genome were carried out. RESULTS: We report that while no individual site or region of methylation was significantly associated with autism after multi-test correction, methylated CpH dinucleotides were markedly enriched in autism-affected brains (~2-fold enrichment at p < 0.05 cutoff, p = 0.002). CONCLUSIONS: These results further implicate epigenetic alterations in pathobiological mechanisms that underlie autism. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13229-017-0119-y) contains supplementary material, which is available to authorized users.