Genome-wide comparative analysis reveals human-mouse regulatory landscape and evolution.
BACKGROUND: Because species-specific gene expression is driven by species-specific regulation, understanding the relationship between sequence and function of the regulatory regions in different species will help elucidate how differences among species arise. Despite active experimental and computational research, relationships among sequence, conservation, and function are still poorly understood. RESULTS: We compared transcription factor occupied segments (TFos) for 116 human and 35 mouse TFs in 546 human and 125 mouse cell types and tissues from the Human and the Mouse ENCODE projects. We based the map between human and mouse TFos on a one-to-one nucleotide cross-species mapper, bnMapper, that utilizes whole genome alignments (WGA). Our analysis shows that TFos are under evolutionary constraint, but a substantial portion (25.1% of mouse and 25.85% of human on average) of the TFos does not have a homologous sequence in the other species; this portion varies among cell types and TFs. Furthermore, 47.67% and 57.01% of the homologous TFos sequence shows binding activity in the other species for human and mouse respectively. However, 79.87% and 69.22% is repurposed such that it binds the same TF in different cells or different TFs in the same cells. Remarkably, within the set of repurposed TFos, the corresponding genome regions in the other species are preferred locations of novel TFos. These events suggest exaptation of some functional regulatory sequences into new functions. Despite TFos repurposing, we did not find substantial changes in their predicted target genes, suggesting that CRMs buffer evolutionary events, allowing little or no change in the TFos-target gene associations. Thus, the small portion of TFos with strictly conserved occupancy underestimates the degree of conservation of regulatory interactions. CONCLUSION: We mapped regulatory sequences from an extensive number of TFs and cell types between human and mouse using WGA.
A comparative analysis of this correspondence unveiled the extent of the shared regulatory sequence across the TFs and cell types under study. Importantly, a large part of the shared regulatory sequence is repurposed in the other species. This sequence, fueled by turnover events, provides a strong case for exaptation in regulatory elements.
Comparison of anticoagulation quality between acenocoumarol and warfarin in patients with mechanical prosthetic heart valves: Insights from the nationwide PLECTRUM study
Vitamin K antagonists are indicated for thromboprophylaxis in patients with mechanical prosthetic heart valves (MPHV). However, it is unclear whether acenocoumarol and warfarin differ in anticoagulation quality. We included 2111 MPHV patients from the nationwide PLECTRUM registry. We evaluated anticoagulation quality by the time in therapeutic range (TiTR). Factors associated with acenocoumarol use and with low TiTR were investigated by multivariable logistic regression analysis. Mean age was 56.8 ± 12.3 years; 44.6% of patients were women and 395 patients were on acenocoumarol. A multivariable logistic regression analysis showed that patients on acenocoumarol had more comorbidities (i.e., ≥3, odds ratio (OR) 1.443, 95% confidence interval (CI) 1.081-1.927, p = 0.013). The mean TiTR was lower in the acenocoumarol group than in the warfarin group (56.1 ± 19.2% vs. 61.6 ± 19.4%, p < 0.001). A higher prevalence of low TiTR (<60%, <65%, or <70%) was found in acenocoumarol users than in warfarin users (p < 0.001 for all comparisons). In multivariable analysis, acenocoumarol use was associated with low TiTR regardless of the cutoff used. A lower TiTR on acenocoumarol was found in all subgroups of patients analyzed according to sex, hypertension, diabetes, age, valve site, atrial fibrillation, and INR range. In conclusion, anticoagulation quality was consistently lower in MPHV patients on acenocoumarol than in those on warfarin.
Occupancy by key transcription factors is a more accurate predictor of enhancer activity than histone modifications or chromatin accessibility
BACKGROUND: Regulated gene expression controls organismal development, and variation in regulatory patterns has been implicated in complex traits. Thus accurate prediction of enhancers is important for further understanding of these processes. Genome-wide measurement of epigenetic features, such as histone modifications and occupancy by transcription factors, is improving enhancer predictions, but the contribution of these features to prediction accuracy is not known. Given the importance of the hematopoietic transcription factor TAL1 for erythroid gene activation, we predicted candidate enhancers based on genomic occupancy by TAL1 and measured their activity. Contributions of multiple features to enhancer prediction were evaluated based on the results of these and other studies. RESULTS: TAL1-bound DNA segments were active enhancers at a high rate both in transient transfections of cultured cells (39 of 79, or 56%) and transgenic mice (43 of 66, or 65%). The level of binding signal for TAL1 or GATA1 did not help distinguish TAL1-bound DNA segments as active versus inactive enhancers, nor did the density of regulation-related histone modifications. A meta-analysis of results from this and other studies (273 tested predicted enhancers) showed that the presence of TAL1, GATA1, EP300, SMAD1, H3K4 methylation, H3K27ac, and CAGE tags at DNase hypersensitive sites gave the most accurate predictors of enhancer activity, with a success rate over 80% and a median threefold increase in activity. Chromatin accessibility assays and the histone modifications H3K4me1 and H3K27ac were sensitive for finding enhancers, but they have high false positive rates unless transcription factor occupancy is also included. CONCLUSIONS: Occupancy by key transcription factors such as TAL1, GATA1, SMAD1, and EP300, along with evidence of transcription, improves the accuracy of enhancer predictions based on epigenetic features. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s13072-015-0009-5) contains supplementary material, which is available to authorized users
Fast and compact matching statistics analytics.
Fast, lightweight methods for comparing the sequences of ever larger assembled genomes from ever growing databases are increasingly needed in the era of accurate long reads and pan-genome initiatives. Matching statistics is a popular method for computing whole-genome phylogenies and for detecting structural rearrangements between two genomes, since it is amenable to fast implementations that require a minimal setup of data structures. However, current implementations use a single core, take too much memory to represent the result, and do not provide efficient ways to analyze the output in order to explore local similarities between the sequences.
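As an illustration of the quantity discussed in this abstract, the following is a minimal sketch of matching statistics under the usual definition: for each position i of a query string, the length of the longest prefix of the query suffix starting at i that occurs somewhere in a reference text. The function name `matching_statistics` is hypothetical, and the code uses naive substring search rather than the compact index structures or parallelism the abstract refers to.

```python
def matching_statistics(query: str, text: str) -> list[int]:
    """Naive matching statistics of `query` against `text` (illustrative only).

    ms[i] is the length of the longest prefix of query[i:] that occurs
    as a substring of text. Real tools use suffix-tree-like indexes;
    this sketch relies on Python's substring search instead.
    """
    ms = []
    length = 0
    for i in range(len(query)):
        # Known property: ms[i] >= ms[i - 1] - 1, so the previous match
        # minus its first character is still a valid starting point.
        length = max(length - 1, 0)
        # Extend the match one character at a time while it still occurs.
        while i + length < len(query) and query[i:i + length + 1] in text:
            length += 1
        ms.append(length)
    return ms


if __name__ == "__main__":
    print(matching_statistics("abcab", "xabcx"))
```

Long runs of large values in the resulting array indicate regions of the query that are locally similar to the text, which is the kind of signal the abstract describes exploring.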
Efficient Tools for Comparative Subword Analysis
This paper introduces an efficient implementation of approaches to alignment-free comparative genome analysis and genome-based phylogeny relying on substring composition. Distances derived from substring statistics have been proposed recently as a meaningful alternative to distances derived from sequence alignment. In particular, procaryote phylogenies based on comparative 5- and 6-mer analysis of whole proteomes have successfully been worked out. The present implementation extends the computation of composition-based distances so as to involve all k-mers for any k up to any preset maximum length K (including K = ∞). Remarkably, although there may be Θ(L²) distinct strings that occur in a given sequence of length L (and Θ(KL) of length k ≤ K), it is shown that composition-based distances as well as many other details of interest in comparative genome analysis can be computed in O(L) time and space (with a constant that is independent of the size of K, that is, the same constant works for all K).
A typical run with 2 sequences of altogether 1.5 million characters computes their composition-based distance in about 2 s, a performance to be contrasted with the several hours needed, even when restricting attention to substrings of length at most 6, by the direct method in use. This paper
• describes the details of this implementation—an implementation that allows the user to compute composition-based distances for a wide range of instances on data sets of unprecedented size which may be useful in assessing the validity of the approach and to fine-tune the identification of those values of k (or K) yielding the best separators and descriptors in correspondence with different inputs,
• indicates how the proposed algorithm can also be used for other tasks related to the identification and comparative analysis of highly over- or under-represented (sub)strings in given genomes, meta-genomes, or any other sequence families of interest (e.g., all proteins encoded by a given genome, all strings of non-coding or regulatory RNA, all introns, etc.),
• and thus conforms with the increasing need for radically new, fast, and massive techniques for comparative genome analysis.
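To make the notion of a composition-based distance concrete, here is a minimal sketch of the direct method that the abstract contrasts with its linear-time implementation. It enumerates every k-mer for k ≤ K explicitly, so it takes Θ(KL) work and is not the O(L) algorithm of the paper. The choice of distance (Euclidean distance between relative k-mer frequencies, summed over all k) and the names `kmer_frequencies` and `composition_distance` are assumptions made for illustration; the abstract does not specify the exact distance formula.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(seq: str, k: int) -> dict[str, float]:
    """Relative frequency of every k-mer occurring in seq."""
    total = len(seq) - k + 1
    if total <= 0:
        return {}
    counts = Counter(seq[i:i + k] for i in range(total))
    return {kmer: c / total for kmer, c in counts.items()}


def composition_distance(a: str, b: str, max_k: int) -> float:
    """Euclidean distance between k-mer frequency profiles for k = 1..max_k.

    Assumed distance for illustration only; direct enumeration, not the
    linear-time method described in the abstract.
    """
    squared = 0.0
    for k in range(1, max_k + 1):
        fa = kmer_frequencies(a, k)
        fb = kmer_frequencies(b, k)
        # Compare frequencies over every k-mer seen in either sequence.
        for kmer in fa.keys() | fb.keys():
            diff = fa.get(kmer, 0.0) - fb.get(kmer, 0.0)
            squared += diff * diff
    return sqrt(squared)


if __name__ == "__main__":
    print(composition_distance("ACGTACGT", "ACGTTGCA", max_k=3))
```

Because each value of k is handled separately, the same loop can be used to see which k contributes most to separating two inputs, in the spirit of the fine-tuning of k (or K) mentioned in the list above.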
Patients with antiphospholipid syndrome and a first venous or arterial thrombotic event: clinical characteristics, antibody profiles and estimate of the risk of recurrence
Objectives: Thrombosis in antiphospholipid syndrome (APS) involves in most cases the venous circulation. Why thrombotic APS affects the arterial circulation, and the cerebral circulation in particular, in some patients is unknown. In previous studies, both patient characteristics and antiphospholipid antibody types and titers have been associated with arterial thrombosis. The aim of this study was to compare the clinical characteristics and laboratory findings of venous and arterial thrombotic APS in a large series of patients. Methods: Data were retrieved from the Start 2 antiphospholipid, a multicenter prospective register of long-term collected data from Thrombosis Centers in Italy. Results: Of 167 patients with thrombotic APS, 114 (68%) had a venous and 53 (32%) had an arterial event as first clinical manifestation. Several clinical characteristics and risk factors differed between groups in univariate analysis. In logistic regression analysis, reduced creatinine clearance and hyperlipidemia were independent variables for the occurrence of arterial APS. Notably, no differences in antiphospholipid antibody profiles or aβ2-Glycoprotein I levels were found between groups. A higher adjusted global antiphospholipid syndrome score (aGAPSS) was found in the arterial group, indicating a possible high recurrence rate in arterial APS. Conclusions: These data have pathophysiological and clinical implications, since associated conditions might predispose patients to arterial rather than venous events, and they call for close monitoring and treatment of arterial APS because of its increased tendency to recur.