29 research outputs found

    Computational prediction of functional similarity of CRMs

    EThOS - Electronic Theses Online Service; Warwick Systems Biology Centre; Human Frontier Science Program; GB; United Kingdo

    A comparison of clustering models for inference of T cell receptor antigen specificity

    The vast potential sequence diversity of TCRs and their ligands has presented a historic barrier to computational prediction of TCR epitope specificity, a holy grail of quantitative immunology. One common approach is to cluster sequences together, on the assumption that similar receptors bind similar epitopes. Here, we provide an independent evaluation of widely used clustering algorithms for TCR specificity inference, observing some variability in predictive performance between models, and marked differences in scalability. Despite these differences, we find that different algorithms produce clusters with high degrees of similarity for receptors recognising the same epitope. Our analysis highlights an unmet need for improvement of complex models over a simple Hamming distance comparator, and strengthens the case for use of clustering models in TCR specificity inference.

    Group A streptococcus induces CD1a-autoreactive T cells and promotes psoriatic inflammation

    Group A Streptococcus (GAS) infection is associated with multiple clinical sequelae, including different subtypes of psoriasis. Such post-streptococcal disorders have been long known but are largely unexplained. CD1a is expressed at constitutively high levels by Langerhans cells and presents lipid antigens to T cells, but the potential relevance to GAS infection has not been studied. Here, we investigated whether GAS-responsive CD1a-restricted T cells contribute to the pathogenesis of psoriasis. Healthy individuals had high frequencies of circulating and cutaneous GAS-responsive CD4+ and CD8+ T cells with rapid effector functions, including the production of interleukin-22 (IL-22). Human skin and blood single-cell CITE-seq analyses of IL-22-producing T cells showed a type 17 signature with proliferative potential, whereas IFN-γ-producing T cells displayed cytotoxic T lymphocyte characteristics. Furthermore, individuals with psoriasis had significantly higher frequencies of circulating GAS-reactive T cells, enriched for markers of activation, cytolytic potential, and tissue association. In addition to responding to GAS, subsets of expanded GAS-reactive T cell clones/lines were found to be autoreactive, which included the recognition of the self-lipid antigen lysophosphatidylcholine. CD8+ T cell clones/lines produced cytolytic mediators and lysed infected CD1a-expressing cells. Furthermore, we established cutaneous models of GAS infection in a humanized CD1a transgenic mouse model and identified enhanced and prolonged local and systemic inflammation, with resolution through a psoriasis-like phenotype. Together, these findings link GAS infection to the CD1a pathway and show that GAS infection promotes the proliferation and activation of CD1a-autoreactive T cells, with relevance to post-streptococcal disease, including the pathogenesis and treatment of psoriasis.

    Genome organization and chromatin analysis identify transcriptional downregulation of insulin-like growth factor signaling as a hallmark of aging in developing B cells.

    BACKGROUND: Aging is characterized by loss of function of the adaptive immune system, but the underlying causes are poorly understood. To assess the molecular effects of aging on B cell development, we profiled gene expression and chromatin features genome-wide, including histone modifications and chromosome conformation, in bone marrow pro-B and pre-B cells from young and aged mice. RESULTS: Our analysis reveals that the expression levels of most genes are generally preserved in B cell precursors isolated from aged compared with young mice. Nonetheless, age-specific expression changes are observed at numerous genes, including microRNA encoding genes. Importantly, these changes are underpinned by multi-layered alterations in chromatin structure, including chromatin accessibility, histone modifications, long-range promoter interactions, and nuclear compartmentalization. Previous work has shown that differentiation is linked to changes in promoter-regulatory element interactions. We find that aging in B cell precursors is accompanied by rewiring of such interactions. We identify transcriptional downregulation of components of the insulin-like growth factor signaling pathway, in particular downregulation of Irs1 and upregulation of Let-7 microRNA expression, as a signature of the aged phenotype. These changes in expression are associated with specific alterations in H3K27me3 occupancy, suggesting that Polycomb-mediated repression plays a role in precursor B cell aging. CONCLUSIONS: Changes in chromatin and 3D genome organization play an important role in shaping the altered gene expression profile of aged precursor B cells. Components of the insulin-like growth factor signaling pathways are key targets of epigenetic regulation in aging in bone marrow B cell precursors

    Computational prediction of functional similarity of CRMs

    Transcriptional regulation of genes is fundamental to all living organisms. The spatial, temporal and condition-specific expression levels of genes are in part determined by inherited regulatory codes in non-coding regions of the DNA. A large set of methods has been proposed to detect conserved regions of regulatory DNA by means of sequence alignments. However, it has become clear that some regulatory regions do not show statistically significant alignments even in the presence of functional conservation. Therefore, detecting and characterising elusive regulatory codes remains a challenging problem. In this thesis we develop and validate a novel computational alignment-free model for detection of functional similarity of regulatory sequences. We show that our model can detect functional links between pairs of sequences that do not align with a significant score. We apply the model a) to detect enhancers within the same genome that are likely to have similar functions and b) to detect functionally conserved enhancer regions in orthologous genomes. Our method finds regulatory codes that are common to groups of similar enhancers and consistent with previous biological knowledge. The inputs for our model are two sequences that we wish to compare in terms of their functional similarity, as well as a set of transcription factor motifs. The mathematical framework of our model is built on two main components. In the first model component, each sequence is mapped to a vector of estimated occupancy levels for all motifs. These vectors represent which motifs are present in each sequence, and at what multiplicity and specificity. In the second model component, a statistical approach is established where we first estimate a probability distribution of motif occupancy levels for sequences that function similarly to the template sequence. We then compute a statistical similarity score to evaluate if the sequences are more similar to each other than to random background sequences. Two applications of this model are presented. First, it is applied to a set of experimentally validated non-alignable enhancers from D. melanogaster. We show that:
    • Our model can detect statistical links between these enhancers.
    • Weak binding sites can make a strong contribution to sequence similarity.
    • Our model treats statistically significant presence and absence of motifs symmetrically. Similarity of sequences, therefore, can be based on a combination of the two. We show examples of motifs making contributions to sequence similarity through their absence.
    • Using our model, we can create a network of similarities among the fly enhancers. Groups of enhancers in this network show common regulatory codes. One of these regulatory codes is strongly supported by existing experimental data.
    In the second application of our model we predict functional subregions of a known D. melanogaster enhancer. To achieve this, we first show that the model can detect the orthology of this enhancer between 10 Drosophila species. We then demonstrate how this statistical link can be used to predict functional subregions within this enhancer.

    The rise and fall of machine learning methods in biomedical research [version 2; referees: 2 approved]

    In the era of explosion in biological data, machine learning techniques are becoming more popular in life sciences, including biology and medicine. This research note examines the rise and fall of the most commonly used machine learning techniques in life sciences over the past three decades

    On finiteness of multiplication modules

    Our main aim in this note is a further generalization of a result due to D. D. Anderson: it is shown that if R is a commutative ring and M is a multiplication R-module such that every prime ideal minimal over Ann(M) is finitely generated, then M contains only a finite number of minimal prime submodules. This immediately yields that if P is a projective ideal of R such that every prime ideal minimal over Ann(P) is finitely generated, then P is finitely generated. Furthermore, it is established that if M is a multiplication R-module in which every minimal prime submodule is finitely generated, then R contains only a finite number of prime ideals minimal over Ann(M).

    In silico identification of vaccine targets for 2019-nCoV

    BACKGROUND: The newly identified coronavirus known as 2019-nCoV has posed a serious global health threat. According to the latest report (18-February-2020), it has infected more than 72,000 people globally and led to the deaths of more than 1,016 people in China. METHODS: The 2019 novel coronavirus proteome was aligned to a curated database of viral immunogenic peptides. The immunogenicity of detected peptides and their binding potential to HLA alleles were predicted by immunogenicity predictive models and NetMHCpan 4.0. RESULTS: We report in silico identification of a comprehensive list of immunogenic peptides that can be used as potential targets for 2019 novel coronavirus (2019-nCoV) vaccine development. First, we found 28 nCoV peptides identical to peptides of severe acute respiratory syndrome-related coronavirus (SARS CoV) that have previously been characterized as immunogenic by T cell assays. Second, we identified 48 nCoV peptides having a high degree of similarity with immunogenic peptides deposited in the Immune Epitope Database (IEDB). Lastly, we conducted a de novo search of 2019-nCoV 9-mer peptides that i) bind to common HLA alleles in Chinese and European populations and ii) have T cell receptor (TCR) recognition potential, using positional weight matrices and a recently developed immunogenicity algorithm, iPred, and identified in total 63 peptides with a high immunogenicity potential. CONCLUSIONS: Given the limited time and resources to develop vaccines and treatments for 2019-nCoV, our work provides a shortlist of candidates for experimental validation and thus can accelerate the development pipeline.

    A Comparison of Peak Callers Used for DNase-Seq Data

    Genome-wide profiling of open chromatin regions using DNase I and high-throughput sequencing (DNase-seq) is an increasingly popular approach for finding and studying regulatory elements. A variety of algorithms have been developed to identify regions of open chromatin from raw sequence-tag data, which has motivated us to assess and compare their performance. In this study, four published, publicly available peak calling algorithms used for DNase-seq data analysis (F-seq, Hotspot, MACS and ZINBA) are assessed at a range of signal thresholds on two published DNase-seq datasets for three cell types. The results were benchmarked against an independent dataset of regulatory regions derived from ENCODE in vivo transcription factor binding data for each particular cell type. The level of overlap between peak regions reported by each algorithm and this ENCODE-derived reference set was used to assess sensitivity and specificity of the algorithms. Our study suggests that F-seq has a slightly higher sensitivity than the next best algorithms. Hotspot and the ChIP-seq oriented method, MACS, both perform competitively when used with their default parameters. However, the generic peak finder ZINBA appears to be less sensitive than the other three. We also assess accuracy of each algorithm over a range of signal thresholds. In particular, we show that the accuracy of F-seq can be considerably improved by using a threshold setting that is different from the default value.

    Chromatin accessibility data sets show bias due to sequence specificity of the DNase I enzyme

    DNase I is an enzyme which cuts duplex DNA at a rate that depends strongly upon its chromatin environment. In combination with high-throughput sequencing (HTS) technology, it can be used to infer genome-wide landscapes of open chromatin regions. Using this technology, systematic identification of hundreds of thousands of DNase I hypersensitive sites (DHS) per cell type has been possible, and this in turn has helped to precisely delineate genomic regulatory compartments. However, to date there has been relatively little investigation into possible biases affecting this data. We report a significant degree of sequence preference spanning sites cut by DNase I in a number of published data sets. The two major protocols in current use each show a different pattern, but for a given protocol the pattern of sequence specificity seems to be quite consistent. The patterns are substantially different from biases seen in other types of HTS data sets, and in some cases the most constrained position lies outside the sequenced fragment, implying that this constraint must relate to the digestion process rather than events occurring during library preparation or sequencing. DNase I is a sequence-specific enzyme, with a specificity that may depend on experimental conditions. This sequence specificity is not taken into account by existing pipelines for identifying open chromatin regions. Care must be taken when interpreting DNase I results, especially when looking at the precise locations of the reads. Future studies may be able to improve the sensitivity and precision of chromatin state measurement by compensating for sequence bias.