214 research outputs found

    Multiple Linear Regression and Machine Learning for Predicting the Drinking Water Quality Index in Al-Seine Lake

    This is the final version. Available on open access from MDPI via the DOI in this record. Data Availability Statement: The data sets are available from the corresponding author on reasonable request. Ensuring safe and clean drinking water for communities is crucial, and necessitates effective tools to monitor and predict water quality due to challenges from population growth, industrial activities, and environmental pollution. This paper evaluates the performance of multiple linear regression (MLR) and nineteen machine learning (ML) models, including algorithms based on regression, decision tree, and boosting. Models include linear regression (LR), least angle regression (LAR), Bayesian ridge chain (BR), ridge regression (Ridge), k-nearest neighbor regression (K-NN), extra tree regression (ET), and extreme gradient boosting (XGBoost). The objective of the research is to estimate the surface water quality of Al-Seine Lake in Lattakia governorate using the MLR and ML models. We used water quality data from the drinking water lake of Lattakia City, Syria, collected during 2021–2022 to determine the water quality index (WQI). The predictive performance of both the MLR and ML models was evaluated using the coefficient of determination (R²) and the root mean square error (RMSE). The results indicated that the MLR model and three of the ML models, namely linear regression (LR), least angle regression (LAR), and Bayesian ridge chain (BR), performed well in predicting the WQI. The MLR model had an R² of 0.999 and an RMSE of 0.149, while the three ML models had an R² of 1.0 and an RMSE of approximately 0.0. These results support using both MLR and ML models for predicting the WQI with very high accuracy, which will contribute to improving water quality management.
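
The two evaluation metrics named in the abstract above can be illustrated with a minimal sketch. It assumes the common definitions R² = 1 − SS_res/SS_tot and RMSE = sqrt(mean squared residual); the abstract does not state which exact formulas the authors used, and the function names and WQI values below are hypothetical, invented only for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_observed = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical WQI values (not taken from the paper).
observed_wqi = [52.0, 61.5, 48.3, 70.2]
predicted_wqi = [52.1, 61.3, 48.5, 70.0]

print("RMSE:", rmse(observed_wqi, predicted_wqi))
print("R^2: ", r_squared(observed_wqi, predicted_wqi))
```

Under these definitions a model that reproduces every observation exactly yields an RMSE of 0 and an R² of 1, which is the pattern the abstract reports for the LR, LAR, and BR models.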

    Selection upon Genome Architecture: Conservation of Functional Neighborhoods with Changing Genes

    A growing body of evidence shows that genes are not distributed randomly across eukaryotic chromosomes, but rather in functional neighborhoods. Nevertheless, the driving force that originated and maintains such neighborhoods is still a matter of controversy. We present the first detailed multispecies cartography of genome regions enriched in genes with related functions and study the evolutionary implications of such clustering. Our results indicate that the chromosomes of higher eukaryotic genomes contain up to 12% of genes arranged in functional neighborhoods, with a high level of gene co-expression, which are consistently distributed in phylogenies. Unexpectedly, neighborhoods with homologous functions are formed by different (non-orthologous) genes in different species. In fact, rather than being conserved, functional neighborhoods present a higher degree of synteny breaks than the genome average. This scenario is compatible with the existence of selective pressures optimizing the coordinated transcription of blocks of functionally related genes. If these neighborhoods were broken by chromosomal rearrangements, selection would favor further rearrangements reconstructing other neighborhoods of similar function. The picture arising from this study is a dynamic genomic landscape with a high level of functional organization.

    Core Circadian Clock Genes Regulate Leukemia Stem Cells in AML

    Leukemia stem cells (LSCs) have the capacity to self-renew and propagate disease upon serial transplantation in animal models, and elimination of this cell population is required for curative therapies. Here, we describe a series of pooled, in vivo RNAi screens to identify essential transcription factors (TFs) in a murine model of acute myeloid leukemia (AML) with genetically and phenotypically defined LSCs. These screens reveal the heterodimeric, circadian rhythm TFs Clock and Bmal1 as genes required for the growth of AML cells in vitro and in vivo. Disruption of canonical circadian pathway components produces anti-leukemic effects, including impaired proliferation, enhanced myeloid differentiation, and depletion of LSCs. We find that both normal and malignant hematopoietic cells harbor an intact clock with robust circadian oscillations, and genetic knockout models reveal a leukemia-specific dependence on the pathway. Our findings establish a role for the core circadian clock genes in AML. National Institutes of Health (U.S.) (Grant P01 CA066996); National Institutes of Health (U.S.) (Grant R01 HL082945); National Cancer Institute (U.S.) (Grant P30-CA14051)

    PhyloPat: phylogenetic pattern analysis of eukaryotic genes

    BACKGROUND: Phylogenetic patterns show the presence or absence of certain genes or proteins in a set of species. They can also be used to determine sets of genes or proteins that occur only in certain evolutionary branches. Phylogenetic patterns analysis has routinely been applied to protein databases such as COG and OrthoMCL, but not upon gene databases. Here we present a tool named PhyloPat which allows the complete Ensembl gene database to be queried using phylogenetic patterns. DESCRIPTION: PhyloPat is an easy-to-use webserver, which can be used to query the orthologies of all complete genomes within the EnsMart database using phylogenetic patterns. This enables the determination of sets of genes that occur only in certain evolutionary branches or even single species. We found in total 446,825 genes and 3,164,088 orthologous relationships within the EnsMart v40 database. We used a single linkage clustering algorithm to create 147,922 phylogenetic lineages, using every one of the orthologies provided by Ensembl. PhyloPat provides the possibility of querying with either binary phylogenetic patterns (created by checkboxes) or regular expressions. Specific branches of a phylogenetic tree of the 21 included species can be selected to create a branch-specific phylogenetic pattern. Users can also input a list of Ensembl or EMBL IDs to check which phylogenetic lineage any gene belongs to. The output can be saved in HTML, Excel or plain text format for further analysis. A link to the FatiGO web interface has been incorporated in the HTML output, creating easy access to functional information. Finally, lists of omnipresent, polypresent and oligopresent genes have been included. CONCLUSION: PhyloPat is the first tool to combine complete genome information with phylogenetic pattern querying. Since we used the orthologies generated by the accurate pipeline of Ensembl, the obtained phylogenetic lineages are reliable. The completeness and reliability of these phylogenetic lineages will further increase with the addition of newly found orthologous relationships within each new Ensembl release.
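
The approach described in the PhyloPat abstract above (single linkage clustering of orthologies into lineages, then querying presence/absence patterns with regular expressions) can be sketched as follows. This is not PhyloPat's code: it assumes that single linkage over pairwise orthology relations amounts to finding connected components, and the species order, gene IDs, and function names are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical species order; each pattern has one character per species.
SPECIES = ["human", "mouse", "fly"]

# Hypothetical orthology pairs between (species, gene_id) tuples.
ORTHOLOGIES = [
    (("human", "hG1"), ("mouse", "mG1")),
    (("mouse", "mG1"), ("fly", "fG1")),
    (("human", "hG2"), ("mouse", "mG2")),
]


def cluster(pairs):
    """Group genes into lineages: genes linked by any chain of pairs share a lineage."""
    parent = {}

    def find(gene):
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]  # path halving
            gene = parent[gene]
        return gene

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for gene in list(parent):
        groups[find(gene)].add(gene)
    return list(groups.values())


def pattern(lineage):
    """Binary presence/absence string, one character per entry in SPECIES."""
    present = {species for species, _gene_id in lineage}
    return "".join("1" if s in present else "0" for s in SPECIES)


def query(regex, lineages):
    """Return lineages whose whole pattern matches the regular expression."""
    compiled = re.compile(regex)
    return [lin for lin in lineages if compiled.fullmatch(pattern(lin))]


lineages = cluster(ORTHOLOGIES)
human_mouse_only = query("110", lineages)       # present in human and mouse, absent in fly
human_mouse_any_fly = query("11[01]", lineages)  # fly may be present or absent
```

A plain binary string such as `110` plays the role of the checkbox-built pattern, while a regular expression such as `11[01]` leaves some species unconstrained.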

    GeneSigDB: a manually curated database and resource for analysis of gene expression signatures

    GeneSigDB (http://www.genesigdb.org or http://compbio.dfci.harvard.edu/genesigdb/) is a database of gene signatures that have been extracted and manually curated from the published literature. It provides a standardized resource of published prognostic, diagnostic and other gene signatures of cancer and related disease to the community so they can compare the predictive power of gene signatures or use these in gene set enrichment analysis. Since GeneSigDB release 1.0, we have expanded from 575 to 3515 gene signatures, which were collected and transcribed from 1604 published articles largely focused on gene expression in cancer, stem cells, immune cells, development and lung disease. We have made substantial upgrades to the GeneSigDB website to improve accessibility and usability, including adding a tag cloud browse function, faceted navigation and a ‘basket’ feature to store genes or gene signatures of interest. Users can analyze GeneSigDB gene signatures, or upload their own gene list, to identify gene signatures with significant gene overlap, and results can be viewed on a dynamic editable heatmap that can be downloaded as a publication-quality image. All data in GeneSigDB can be downloaded in numerous formats, including the .gmt file format for gene set enrichment analysis or as an R/Bioconductor data file. GeneSigDB is available from http://www.genesigdb.org
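
As a rough illustration of how a downloaded .gmt file might be compared against a user's gene list, the sketch below parses the tab-delimited GMT layout (set name, description, then gene symbols) and ranks signatures by the number of shared genes. It only counts overlaps; the abstract does not say which significance test GeneSigDB applies, so none is assumed here. The signature names and gene lists are hypothetical.

```python
def parse_gmt(text):
    """Parse GMT text: each line is name<TAB>description<TAB>gene1<TAB>gene2..."""
    signatures = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        signatures[name] = {gene for gene in genes if gene}
    return signatures


def rank_by_overlap(query_genes, signatures):
    """Return (signature name, shared gene count) pairs, largest overlap first."""
    query = set(query_genes)
    hits = [(name, len(query & genes)) for name, genes in signatures.items()]
    hits = [hit for hit in hits if hit[1] > 0]
    return sorted(hits, key=lambda hit: (-hit[1], hit[0]))


# Hypothetical GMT content, invented for illustration.
GMT_TEXT = (
    "sigA\tdemo\tTP53\tBRCA1\tMYC\n"
    "sigB\tdemo\tEGFR\tMYC\n"
    "sigC\tdemo\tKRAS\n"
)

ranked = rank_by_overlap(["MYC", "TP53"], parse_gmt(GMT_TEXT))
print(ranked)
```

A raw overlap count favors large signatures, so a real analysis would normally normalize for signature size or apply a statistical test.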

    The efficacy of chemotherapy is limited by intratumoral senescent cells expressing PD-L2

    Chemotherapy often generates intratumoral senescent cancer cells that strongly modify the tumor microenvironment, favoring immunosuppression and tumor growth. We discovered, through an unbiased proteomics screen, that the immune checkpoint inhibitor programmed cell death 1 ligand 2 (PD-L2) is highly upregulated upon induction of senescence in different types of cancer cells. PD-L2 is not required for cells to undergo senescence, but it is critical for senescent cells to evade the immune system and persist intratumorally. Indeed, after chemotherapy, PD-L2-deficient senescent cancer cells are rapidly eliminated and tumors do not produce the senescence-associated chemokines CXCL1 and CXCL2. Accordingly, PD-L2-deficient pancreatic tumors fail to recruit myeloid-derived suppressor cells and undergo regression driven by CD8 T cells after chemotherapy. Finally, antibody-mediated blockade of PD-L2 strongly synergizes with chemotherapy causing remission of mammary tumors in mice. The combination of chemotherapy with anti-PD-L2 provides a therapeutic strategy that exploits vulnerabilities arising from therapy-induced senescence. © 2024, The Author(s)