660 research outputs found

    Exploiting Arabic Diacritization for High Quality Automatic Annotation

    We present a novel technique for Arabic morphological annotation. The technique uses diacritization to produce morphological annotations of a quality comparable to that of human annotators. Although Arabic text is generally written without diacritics, diacritization is already available for large corpora of Arabic text in several genres. Furthermore, diacritization can be generated at low cost for new text, as it requires no specialized training beyond what educated Arabic typists know. The basic approach is to enrich the input to a state-of-the-art Arabic morphological analyzer with word diacritics (full or partial) to enhance its performance. When applied to fully diacritized text, our approach produces annotations with an accuracy of over 97% on lemma, part-of-speech, and tokenization combined.

    Gene set internal coherence in the context of functional profiling

    Background: Functional profiling methods have been extensively used in the context of high-throughput experiments and, in particular, in microarray data analysis. Such methods use available biological information to define different types of functional gene modules (e.g. Gene Ontology (GO) terms, KEGG pathways, etc.) whose representation in a pre-defined list of genes is then studied. In the most popular types of microarray experimental design (e.g. up- or down-regulated genes, clusters of co-expressing genes, etc.) and in other genomic experiments (e.g. ChIP-on-chip, epigenomics, etc.) these lists are composed of genes with a high degree of co-expression. An implicit assumption when applying functional profiling methods in this context is therefore that the genes belonging to the tested modules effectively define sets of co-expressing genes. Nevertheless, not all functional modules are biologically coherent entities in terms of co-expression, which hinders their detection with conventional methods of functional enrichment. Results: Using a large collection of microarray data, we have carried out a detailed survey of internal correlation in GO terms and KEGG pathways, providing a coherence index for measuring functional module co-regulation. An unexpectedly low level of internal correlation was found among the modules studied. Only around 30% of the modules defined by GO terms and 57% of the modules defined by KEGG pathways display an internal correlation higher than expected by chance. This information on the internal correlation of the genes within the functional modules can be used in a logistic regression model in a simple way to improve their detection in gene expression experiments. Conclusion: For the first time, an exhaustive study of the internal co-expression of the most popular functional categories has been carried out. Interestingly, the real level of co-expression within many of them is lower than expected (or even nonexistent), which precludes their detection by most conventional functional profiling methods. When the gene-to-function correlation information is used in functional profiling methods, the results improve on those obtained by conventional enrichment methods.
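The idea of testing whether a module's internal correlation is "higher than expected by chance" can be illustrated with a small sketch. The abstract does not define the coherence index, so the code below assumes one plausible reading: the mean pairwise Pearson correlation of a module's genes, compared against random gene sets of the same size. The function names, the toy expression data, and the permutation scheme are all hypothetical and are not taken from the paper.

```python
import random
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def mean_pairwise_correlation(expr, genes):
    """Assumed coherence index: average correlation over all gene pairs."""
    pairs = list(combinations(sorted(genes), 2))
    return sum(pearson(expr[a], expr[b]) for a, b in pairs) / len(pairs)


def coherence_p_value(expr, module, n_random=1000, seed=0):
    """Empirical p-value against random gene sets of the same size."""
    rng = random.Random(seed)
    observed = mean_pairwise_correlation(expr, module)
    all_genes = sorted(expr)
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(all_genes, len(module))
        if mean_pairwise_correlation(expr, sample) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_random + 1)


# Toy data (hypothetical): three co-expressed genes plus random background.
toy_rng = random.Random(1)
expr = {
    "m1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "m2": [2.0, 4.0, 6.0, 8.0, 10.0, 12.0],
    "m3": [1.5, 2.5, 3.5, 4.5, 5.5, 6.5],
}
for i in range(7):
    expr[f"bg{i}"] = [toy_rng.gauss(0.0, 1.0) for _ in range(6)]

module = ["m1", "m2", "m3"]
observed, p_value = coherence_p_value(expr, module, n_random=200)
print(f"mean pairwise correlation = {observed:.3f}, empirical p = {p_value:.3f}")
```

A module whose genes are not co-expressed would score close to the random sets and receive a large empirical p-value, which is the situation the abstract reports for many GO terms.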

    ProfCom: a web tool for profiling the complex functionality of gene groups identified from high-throughput data

    ProfCom is a web-based tool for the functional interpretation of a list of genes identified as related by experiments. What makes ProfCom unique is its ability to profile enrichment not only of available Gene Ontology (GO) terms but also of ‘complex functions’. A ‘complex function’ is constructed as a Boolean combination of available GO terms. The complex functions inferred by ProfCom are more specific than single terms and describe the functional role of genes more accurately. ProfCom provides a user-friendly, dialog-driven web submission page available for several model organisms and supports most available gene identifiers. In addition, the web service interface allows the submission of any kind of annotation data. ProfCom is freely available at http://webclu.bio.wzw.tum.de/profcom/
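To make the notion of a ‘complex function’ concrete, here is a minimal sketch of how a Boolean combination of GO terms could be scored for enrichment. It is not ProfCom's implementation: the abstract does not say which statistic or search strategy the tool uses, so the one-sided hypergeometric test, the placeholder term names (`GO:A`, `GO:B`, `GO:C`), and the toy gene sets are all assumptions made for illustration.

```python
from math import comb


def hypergeom_sf(k, M, n, N):
    """P(X >= k) when N genes are drawn from M, of which n are annotated."""
    upper = min(n, N)
    total = sum(comb(n, i) * comb(M - n, N - i) for i in range(k, upper + 1))
    return total / comb(M, N)


def enrichment_p(term_genes, gene_list, background):
    """One-sided enrichment p-value of a gene set within a gene list."""
    term_genes = term_genes & background
    hits = len(term_genes & gene_list)
    return hypergeom_sf(hits, len(background), len(term_genes), len(gene_list))


# Hypothetical annotation: placeholder GO term -> annotated genes.
annotations = {
    "GO:A": {"g1", "g2", "g3", "g4", "g5", "g6"},
    "GO:B": {"g1", "g2", "g3", "g7", "g8"},
    "GO:C": {"g3", "g9"},
}
background = {f"g{i}" for i in range(1, 21)}  # 20 genes in total
gene_list = {"g1", "g2", "g10"}               # genes from an experiment

# Complex function "A AND B AND NOT C", expressed with set operations.
complex_function = (annotations["GO:A"] & annotations["GO:B"]) - annotations["GO:C"]

p_single = enrichment_p(annotations["GO:A"], gene_list, background)
p_complex = enrichment_p(complex_function, gene_list, background)
print(f"GO:A alone: p = {p_single:.4f}")
print(f"A AND B AND NOT C: p = {p_complex:.4f}")
```

In this toy case the combined function covers fewer genes than `GO:A` alone but captures the same hits, so it is both more specific and more strongly enriched, which is the behaviour the abstract attributes to complex functions.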

    Gene set-based analysis of polymorphisms: finding pathways or biological processes associated to traits in genome-wide association studies

    Genome-wide association studies have become a popular strategy for finding associations between genes and traits of interest. Despite the high resolution available today for genotyping studies, their success in real studies has been limited by the testing strategy used. As an alternative to brute-force solutions involving very large cohorts, we propose Gene Set Analysis (GSA), a different analysis strategy based on testing the association of modules of functionally related genes. We show here how the Gene Set-based Analysis of Polymorphisms (GeSBAP), a simple implementation of the GSA strategy for genome-wide association studies, provides a significant increase in testing power for this type of study. GeSBAP is freely available at http://bioinfo.cipf.es/gesbap

    Prise en compte de l'interaction sol-atmosphère-structure dans l'analyse des désordres liés à la sécheresse [Accounting for soil-atmosphere-structure interaction in the analysis of drought-related damage]

    We present a numerical study of drought-induced damage that accounts for soil-atmosphere-structure interaction. It is based on a model of mass and heat transfer in unsaturated soils that includes soil-atmosphere exchange. The influence of drought on structures is studied with a decoupled approach: the suction induced by drought is determined first, and a finite element model is then used to analyse the mechanical response of the soil-structure system.

    BABELOMICS: a suite of web tools for functional annotation and analysis of groups of genes in high-throughput experiments

    We present Babelomics, a complete suite of web tools for the functional analysis of groups of genes in high-throughput experiments. It uses information on Gene Ontology terms, InterPro motifs, KEGG pathways, Swiss-Prot keywords, predicted transcription factor binding sites, chromosomal positions and presence in tissues with determined histological characteristics, through five integrated modules: FatiGO (fast assignment and transference of information), FatiWise, transcription factor association test, GenomeGO and tissues mining tool, respectively. Additionally, another module, FatiScan, provides a new procedure that integrates biological information with experimental results in order to find groups of genes with modest but coordinated significant differential behaviour. FatiScan is highly sensitive and is capable of finding significant asymmetries in the distribution of genes of common function across a list of ordered genes even if these asymmetries are not extreme. The strong multiple-testing nature of the contrasts made by the tools is taken into account. All the tools are integrated in the gene expression analysis package GEPAS. Babelomics is the natural evolution of our tool FatiGO (which analysed almost 22 000 experiments during the last year) to include more sources of information and new modes of using it. Babelomics can be found at

    Influence of cracks on the soil-atmosphere interaction: numerical coupled model of thermo- atmosphereporous media

    Soil shrinks as it desiccates, and the magnitude of shrinkage can be large for clayey soils. The drying of soil leads to crack formation, causing high suctions to develop within the soil. Cracks expose the deep soil, so more evaporation can be expected in dry periods. To illustrate the effect of cracking, a numerical model of soil-atmosphere interaction has been developed that takes into account the thermo-fluid coupling of an unsaturated clay soil. The model is used to simulate the evolution of evaporation during the drying process. The main results show that the presence of cracks has a significant influence on evaporation. This study also offers a simple method for taking the presence of cracks into account in the soil-atmosphere exchange.

    An in silico analysis identifies drugs potentially modulating the cytokine storm triggered by SARS-CoV-2 infection

    The ongoing COVID-19 pandemic is one of the biggest health challenges of recent decades. Among the causes of mortality triggered by SARS-CoV-2 infection, the development of an inflammatory “cytokine storm” (CS) plays a determinant role. Here, we used transcriptomic data from the bronchoalveolar lavage fluid (BALF) of COVID-19 patients undergoing a CS to obtain gene signatures associated with this pathology. Using these signatures, we interrogated the Connectivity Map (CMap) dataset, which contains the effects of over 5000 small molecules on the transcriptome of human cell lines, and looked for molecules whose effects on transcription mimic or oppose those of the CS. As expected, molecules that potentiate immune responses, such as PKC activators, are predicted to worsen the CS. In addition, we identified the negative regulation of female hormones among pathways potentially aggravating the CS, which helps to explain the gender-related differences in COVID-19 mortality. Regarding drugs potentially counteracting the CS, we identified glucocorticoids as a top hit, which validates our approach, as this is the primary treatment for this pathology. Interestingly, our analysis also reveals a potential effect of MEK inhibitors in reverting the COVID-19 CS, which is supported by in vitro data confirming the anti-inflammatory properties of these compounds. Open access funding provided by Karolinska Institute.

    GEPAS, a web-based tool for microarray data analysis and interpretation

    Gene Expression Profile Analysis Suite (GEPAS) is one of the most complete and extensively used web-based packages for microarray data analysis. During its more than 5 years of activity it has been continuously updated to keep pace with the state of the art in the changing microarray data analysis arena. GEPAS offers diverse analysis options that include well-established as well as novel algorithms for normalization, gene selection, class prediction, clustering and functional profiling of the experiment. New options for time-course (or dose-response) experiments, microarray-based class prediction, new clustering methods and new tests for differential expression have been included. The new pipeliner module automates the execution of sequential analysis steps by means of a simple but powerful graphic interface. An extensive re-engineering of GEPAS has been carried out, which includes the use of web services and Web 2.0 technology features, a new user interface with persistent sessions and a new extended database of gene identifiers. GEPAS is currently the most quoted web tool in its field, is extensively used by researchers in many countries, and its records indicate an average usage rate of 500 experiments per day. GEPAS is available at http://www.gepas.org

    In silico drug prescription for targeting cancer patient heterogeneity and prediction of clinical outcome

    In silico drug prescription tools for precision cancer medicine can match molecular alterations with tailored candidate treatments. These methodologies require large and well-annotated datasets to systematically evaluate their performance, but this is currently constrained by the lack of complete patient clinicopathological data. Moreover, in silico drug prescription performance could be improved by integrating additional tumour information layers such as intra-tumour heterogeneity (ITH), which has been related to drug response and tumour progression. PanDrugs is an in silico drug prescription method that prioritizes anticancer drugs by combining biological and clinical evidence. We have systematically evaluated PanDrugs in the Genomic Data Commons repository (GDC). Our results showed that PanDrugs is able to establish an a priori stratification of cancer patients treated with Epidermal Growth Factor Receptor (EGFR) inhibitors. Patients labelled as responders according to PanDrugs predictions showed a significantly increased overall survival (OS) compared to non-responders. PanDrugs was also able to suggest alternative tailored treatments for non-responder patients. Additionally, the usefulness of PanDrugs was assessed considering spatial and temporal ITH in cancer patients, showing that ITH can be approached therapeutically by proposing drugs or combinations potentially capable of targeting the clonal diversity. In summary, this study is a proof of concept in which PanDrugs predictions have been correlated with OS and can be useful to manage ITH in patients while increasing therapeutic options, demonstrating its clinical utility. This work was supported by the Instituto de Salud Carlos III (ISCIII); Marie-Curie Career Integration Grant (CIG334361); and Paradifference Foundation.