
    TransFind—predicting transcriptional regulators for gene sets

    The analysis of putative transcription factor binding sites in promoter regions of coregulated genes makes it possible to infer the transcription factors that underlie observed changes in gene expression. While such analyses constitute a central component of the in-silico characterization of transcriptional regulatory networks, there is still a lack of simple-to-use web servers that combine state-of-the-art prediction methods with phylogenetic analysis and appropriate multiple-testing-corrected statistics and that return results within a short time. With these aims in mind, we developed TransFind, which is freely available at http://transfind.sys-bio.net/

    Integrating omics datasets with the OmicsPLS package

    Background: With the exponential growth in available biomedical data, there is a need for data integration methods that can extract information about relationships between the data sets. However, these data sets might have very different characteristics. For interpretable results, data-specific variation needs to be quantified. For this task, Two-way Orthogonal Partial Least Squares (O2PLS) has been proposed. To facilitate application and development of the methodology, free and open-source software is required. However, this is not the case with O2PLS. Results: We introduce OmicsPLS, an open-source implementation of the O2PLS method in R. It can handle both low- and high-dimensional datasets efficiently. Generic methods for inspecting and visualizing results are implemented. Both a standard and faster alternative cross-validation methods are available to determine the number of components. A simulation study shows good performance of OmicsPLS compared to alternatives, in terms of accuracy and CPU runtime. We demonstrate OmicsPLS by integrating genetic and glycomic data. Conclusions: We propose the OmicsPLS R package: a free and open-source implementation of O2PLS for statistical data integration. OmicsPLS is available at https://cran.r-project.org/package=OmicsPLS and can be installed in R via install.packages("OmicsPLS")

    Targetfinder.org: a resource for systematic discovery of transcription factor target genes

    Targetfinder.org (http://targetfinder.org/) provides a web-based resource for finding genes that show a similar expression pattern to a group of user-selected genes. It is based on a large-scale gene expression compendium (>1200 experiments, >13 000 genes). The primary application of Targetfinder.org is to expand a list of known transcription factor targets with new candidate target genes. The user submits a group of genes (the ‘seed’), and the web site returns a list of other genes ranked by the similarity of their expression to that of the seed genes. Additionally, the web site provides information on a recovery/cross-validation test to check the consistency of the provided seed and the quality of the ranking. Furthermore, the web site allows users to analyse the affinities of a selected transcription factor to the promoter regions of the top-ranked genes in order to select the best new candidate target genes for further experimental analysis

    Gentle Masking of Low-Complexity Sequences Improves Homology Search

    Detection of sequences that are homologous, i.e. descended from a common ancestor, is a fundamental task in computational biology. This task is confounded by low-complexity tracts (such as atatatatatat), which arise frequently and independently, causing strong similarities that are not homologies. There has been much research on identifying low-complexity tracts, but little research on how to treat them during homology search. We propose to find homologies by aligning sequences with “gentle” masking of low-complexity tracts. Gentle masking means that the match score involving a masked letter is min(0, S), where S is the unmasked score. Gentle masking slightly but noticeably improves the sensitivity of homology search (compared to “harsh” masking), without harming specificity. We show examples in three useful homology search problems: detection of NUMTs (nuclear copies of mitochondrial DNA), recruitment of metagenomic DNA reads to reference genomes, and pseudogene detection. Gentle masking is currently the best way to treat low-complexity tracts during homology search
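
The gentle masking rule in the abstract above can be illustrated with a small sketch. It assumes the gentle score for a pair involving a masked letter is min(0, S), with S the unmasked score, and that masked letters are written in lowercase. The toy match/mismatch scores, the fixed `HARSH_SCORE`, and the function names are hypothetical choices for illustration, not the authors' implementation.

```python
# Hypothetical toy scores; real aligners use a full substitution matrix.
MATCH, MISMATCH = 1, -1
# Assumed fixed score for any pair involving a masked letter under "harsh" masking.
HARSH_SCORE = -1


def match_score(a, b, gentle=True):
    """Score one aligned letter pair; a lowercase letter is treated as masked."""
    unmasked = MATCH if a.upper() == b.upper() else MISMATCH
    if a.islower() or b.islower():
        # Gentle masking: a masked pair can never add to the score,
        # but a mismatch is still penalised as usual.
        return min(0, unmasked) if gentle else HARSH_SCORE
    return unmasked


def ungapped_score(x, y, gentle=True):
    """Sum pair scores over two equal-length, already aligned sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    return sum(match_score(a, b, gentle) for a, b in zip(x, y))


if __name__ == "__main__":
    seq = "ACGTatatat"  # four unmasked letters followed by a masked low-complexity tract
    print(ungapped_score(seq, seq, gentle=True))
    print(ungapped_score(seq, seq, gentle=False))
```

Under these toy scores the masked tract contributes nothing to the gentle score, so it cannot create a spurious high-scoring match on its own, while an alignment that passes through the tract is not penalised for doing so.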

    Chromosomal-level assembly of the Asian Seabass genome using long sequence reads and multi-layered scaffolding

    We report here the ~670 Mb genome assembly of the Asian seabass (Lates calcarifer), a tropical marine teleost. We used long-read sequencing augmented by transcriptomics, optical and genetic mapping along with shared synteny from closely related fish species to derive a chromosome-level assembly with a contig N50 size over 1 Mb and scaffold N50 size over 25 Mb that spans ~90% of the genome. The population structure of the L. calcarifer species complex was analyzed by re-sequencing 61 individuals representing various regions across the species' native range. SNP analyses identified high levels of genetic diversity and confirmed earlier indications of a population stratification comprising three clades with signs of admixture apparent in the South-East Asian population. The quality of the Asian seabass genome assembly far exceeds that of any other fish species, and will serve as a new standard for fish genomics
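
For readers unfamiliar with the N50 statistic quoted in the abstract above, here is a minimal sketch using the common definition: the length of the shortest contig (or scaffold) in the smallest set of longest sequences that together cover at least half of the total assembled length. The function name and the example lengths are hypothetical and are not taken from the assembly itself.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards until half the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    # Hypothetical contig lengths, in arbitrary units.
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

With the hypothetical lengths above, the three longest contigs (10, 9, 8) already cover half of the total of 54, so the N50 is 8.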

    Discourse Semantics for the Analysis of Change in Language

    This paper purports to elaborate and address several issues which lie at the intersection of computational linguistics and psychology. The first issue addressed is that of the interaction between discourse and semantics by virtue of empirical linguistic and psychotherapeutic evidence. This paper then gives a formal account of the knowledge representation and reasoning processes involved in the construction of an XML knowledge base for use in the semantic analysis of psychotherapeutic transcripts. Computational methods for the automatic mark-up and inference of the psychotherapeutic phenomena under investigation are detailed in order to further develop intuitions behind a particular pragmatic theory of language known as the Metamodel. The work presented here ultimately aims to produce a sustainable system for the evaluation of the effectiveness of any given psychotherapeutic technique. The possibility exists for such a system to recognise successful therapeutic mechanisms and further still, to infer new ones, or suggest improvements, or offer novel explanations as to the success or failure of the therapy itself. The work discussed here stems from research in computational linguistics, psychotherapy, and philosophy. The corpus used is a culmination of client transcripts taken before, during, and after therapy. The particular therapeutic technique used here is known as the Metamodel (Bandler and Grinder, 1975). The Metamodel was originally proffered as a method of language analysis suitable for use by practitioners of any psychotherapeutic technique. It theorises that speech utterances are related to a client's deep structure through three primary mechanisms, namely generalisation, deletion, and distortion. Previous hand tagging of our data has provided support for such claims. It is our aim to automate the identification and reasoning process. The issues and processes involved in the automation of such tagging are discussed here.
Architectural and philosophical issues relating syntax (or grammar), semantics (Larson and Segal, 1995), and pragmatics (Grice, 1989; Searle, 1969) are raised. Discourse Representation Theory (Kamp, 1981; Asher and Lascarides, 1995) is discussed and used here in order to infer discourse relations. Hosted by the Scholarly Text and Imaging Service (SETIS), the University of Sydney Library, and the Research Institute for Humanities and Social Sciences (RIHSS), the University of Sydney

    Stochastic signalling rewires the interaction map of a multiple feedback network during yeast evolution

    During evolution, genetic networks are rewired through strengthening or weakening their interactions to develop new regulatory schemes. In the galactose network, the GAL1/GAL3 paralogues and the GAL2 gene enhance their own expression mediated by the Gal4p transcriptional activator. The wiring strength in these feedback loops is set by the number of Gal4p binding sites. Here we show using synthetic circuits that multiplying the binding sites increases the expression of a gene under the direct control of an activator, but this enhancement is not fed back in the circuit. The feedback loops are rather activated by genes that have frequent stochastic bursts and fast RNA decay rates. In this way, rapid adaptation to galactose can be triggered even by weakly expressed genes. Our results indicate that nonlinear stochastic transcriptional responses enable feedback loops to function autonomously, or contrary to what is dictated by the strength of interactions enclosing the circuit

    Occupational exposure to gases/fumes and mineral dust affect DNA methylation levels of genes regulating expression

    Many workers are exposed daily to occupational agents like gases/fumes, mineral dust or biological dust, which could induce adverse health effects. Epigenetic mechanisms, such as DNA methylation, have been suggested to play a role. We therefore aimed to identify differentially methylated regions (DMRs) upon occupational exposures in never-smokers and investigated whether these DMRs were associated with gene expression levels. To determine the effects of occupational exposures independent of smoking, 903 never-smokers of the LifeLines cohort study were included. We performed three genome-wide methylation analyses (Illumina 450 K), one per occupational exposure (gases/fumes, mineral dust and biological dust), using robust linear regression adjusted for appropriate confounders. DMRs were identified using comb-p in Python. Results were validated in the Rotterdam Study (233 never-smokers) and methylation-expression associations were assessed using Biobank-based Integrative Omics Study data (n = 2802). Of the total 21 significant DMRs, 14 DMRs were associated with gases/fumes and 7 with mineral dust. Three of these DMRs were associated with both exposures (RPLP1 and LINC02169 (2x)) and 11 DMRs were located within transcript start sites of gene expression regulating genes. We replicated two DMRs with gases/fumes (VTRNA2-1 and GNAS) and one with mineral dust (CCDC144NL). In addition, nine gases/fumes DMRs and six mineral dust DMRs were significantly associated with gene expression levels. Our data suggest that occupational exposures may induce differential methylation of gene expression regulating genes and thereby may induce adverse health effects. Given the millions of workers that are exposed daily to occupational agents, further studies on this epigenetic mechanism and health outcomes are warranted