35 research outputs found

    DDRprot: a database of DNA damage response-related proteins

    The DNA Damage Response (DDR) signalling network is an essential system that protects the genome’s integrity. The DDRprot database presented here is a resource that integrates manually curated information on the human DDR network and its sub-pathways. For each DDR protein, we present detailed information about its function. Where DDR proteins are involved in post-translational modifications (PTMs) with each other, we depict the position of the modified residue(s) in the three-dimensional structures, when resolved structures are available for the proteins. All this information is linked to the original publication from which it was obtained. Phylogenetic information is also shown, including time of emergence and conservation across 47 selected species, family trees and sequence alignments of homologues. The DDRprot database can be queried by different criteria: pathways, species, evolutionary age or involvement in PTMs. Sequence searches using hidden Markov models can also be used. E.A.-L. was supported by the European Commission grant [FP7-REGPOT-2012-2013-1]; A.A. was partially supported by the Spanish Ministry of Science and Innovation grant [PS09/02111]. Peer reviewed

    Climate reference dataset 1961-2015: analysis and evaluation of the measured meteorological data basis in the Free State of Saxony and generation of a climate reference dataset

    To maintain quality-assured monitoring of climate and climate impacts in Saxony, the available climatological data basis was analysed and evaluated. The resulting “Klima-Referenzdatensatz Sachsen” (Saxony climate reference dataset) is the basis for regional climate and climate-impact analysis and for generating new climate projections for Saxony. It consists of station-based time series with daily and monthly values for the most important climate elements and derived climate variables for the period 1961 to 2015. The “Klima-Referenzdatensatz Sachsen” is freely accessible via ReKIS (www.rekis.org). Editorial deadline: 04.07.201

    LocTree3 prediction of localization

    The prediction of protein sub-cellular localization is an important step toward elucidating protein function. For each query protein sequence, LocTree2 applies machine learning (profile kernel SVM) to predict the native sub-cellular localization in 18 classes for eukaryotes, in six for bacteria and in three for archaea. The method outputs a score that reflects the reliability of each prediction. LocTree2 has performed on par with or better than any other state-of-the-art method. Here, we report the availability of LocTree3 as a public web server. The server includes the machine learning-based LocTree2 and improves over it through the addition of homology-based inference. Assessed on sequence-unique data, LocTree3 reached an 18-state accuracy Q18 = 80 ± 3% for eukaryotes and a six-state accuracy Q6 = 89 ± 4% for bacteria. The server accepts submissions ranging from single protein sequences to entire proteomes. Response time of the unloaded server is about 90 s for a 300-residue eukaryotic protein and a few hours for an entire eukaryotic proteome, not counting the generation of the alignments. For over 1000 entirely sequenced organisms, the predictions are directly available as downloads. The web server is available at http://www.rostlab.org/services/loctree3

    ECOSTRESS: NASA's next generation mission to measure evapotranspiration from the International Space Station

    The ECOsystem Spaceborne Thermal Radiometer Experiment on Space Station (ECOSTRESS) was launched to the International Space Station on June 29, 2018. The primary science focus of ECOSTRESS is evapotranspiration (ET), which is produced as level‐3 (L3) latent heat flux (LE) data products. These data are generated from the level‐2 land surface temperature and emissivity product (L2_LSTE), in conjunction with ancillary surface and atmospheric data. Here, we provide the first validation (Stage 1, preliminary) of the global ECOSTRESS clear‐sky ET product (L3_ET_PT‐JPL, version 6.0) against LE measurements at 82 eddy covariance sites around the world. Overall, the ECOSTRESS ET product performs well against the site measurements (clear‐sky instantaneous/time of overpass: r² = 0.88; overall bias = 8%; normalized RMSE = 6%). ET uncertainty was generally consistent across climate zones, biome types, and times of day (ECOSTRESS samples the diurnal cycle), though temperate sites are over‐represented. The 70 m high spatial resolution of ECOSTRESS improved correlations by 85%, and RMSE by 62%, relative to 1 km pixels. This paper serves as a reference for the ECOSTRESS L3 ET accuracy and Stage 1 validation status for subsequent science that uses these data.

    Author Correction: The FLUXNET2015 dataset and the ONEFlux processing pipeline for eddy covariance data

    The FLUXNET2015 dataset and the ONEFlux processing pipeline for eddy covariance data

    The FLUXNET2015 dataset provides ecosystem-scale data on CO2, water, and energy exchange between the biosphere and the atmosphere, and other meteorological and biological measurements, from 212 sites around the globe (over 1500 site-years, up to and including year 2014). These sites, independently managed and operated, voluntarily contributed their data to create global datasets. Data were quality controlled and processed using uniform methods, to improve consistency and intercomparability across sites. The dataset is already being used in a number of applications, including ecophysiology studies, remote sensing studies, and development of ecosystem and Earth system models. FLUXNET2015 includes derived-data products, such as gap-filled time series, ecosystem respiration and photosynthetic uptake estimates, estimation of uncertainties, and metadata about the measurements, presented for the first time in this paper. In addition, 206 of these sites are for the first time distributed under a Creative Commons (CC-BY 4.0) license. This paper details this enhanced dataset and the processing methods, now made available as open-source codes, making the dataset more accessible, transparent, and reproducible. Peer reviewed

    Embeddings from protein language models predict conservation and variant effects

    The emergence of SARS-CoV-2 variants stressed the demand for tools that help interpret the effect of single amino acid variants (SAVs) on protein function. While Deep Mutational Scanning (DMS) sets continue to expand our understanding of the mutational landscape of single proteins, the results continue to challenge analyses. Protein Language Models (pLMs) use the latest deep learning (DL) algorithms to leverage growing databases of protein sequences. These methods learn to predict missing or masked amino acids from the context of entire sequence regions. Here, we used pLM representations (embeddings) to predict sequence conservation and SAV effects without multiple sequence alignments (MSAs). Embeddings alone predicted residue conservation almost as accurately from single sequences as ConSeq using MSAs (two-state Matthews Correlation Coefficient, MCC, for ProtT5 embeddings of 0.596 ± 0.006 vs. 0.608 ± 0.006 for ConSeq). Inputting the conservation prediction along with BLOSUM62 substitution scores and pLM mask reconstruction probabilities into a simplistic logistic regression (LR) ensemble for Variant Effect Score Prediction without Alignments (VESPA) predicted SAV effect magnitude without any optimization on DMS data. Comparing predictions for a standard set of 39 DMS experiments to other methods (incl. ESM-1v, DeepSequence, and GEMME) revealed our approach as competitive with the state-of-the-art (SOTA) methods using MSA input. No method outperformed all others, neither consistently nor statistically significantly, independently of the performance measure applied (Spearman and Pearson correlation). Finally, we investigated binary effect predictions on DMS experiments for four human proteins. Overall, embedding-based methods have become competitive with methods relying on MSAs for SAV effect prediction at a fraction of the costs in computing/energy. Our method predicted SAV effects for the entire human proteome (~20k proteins) within 40 min on one Nvidia Quadro RTX 8000. All methods and data sets are freely available for local and online execution through bioembeddings.com, https://github.com/Rostlab/VESPA, and PredictProtein.

    NLSdb-major update for database of nuclear localization signals and nuclear export signals

    NLSdb is a database collecting nuclear export signals (NES) and nuclear localization signals (NLS) along with experimentally annotated nuclear and non-nuclear proteins. NES and NLS are short sequence motifs related to protein transport out of and into the nucleus. The updated NLSdb now contains 2253 NLS and introduces 398 NES. The potential sets of novel NES and NLS have been generated by a simple 'in silico mutagenesis' protocol. We started with motifs annotated by experiments. In step 1, we increased specificity such that no known non-nuclear protein matched the refined motif. In step 2, we increased the sensitivity trying to match several different families with a motif. We then iterated over steps 1 and 2. The final set of 2253 NLS motifs matched 35% of 8421 experimentally verified nuclear proteins (up from 21% for the previous version) and none of 18 278 non-nuclear proteins. We updated the web interface providing multiple options to search protein sequences for NES and NLS motifs, and to evaluate your own signal sequences. NLSdb can be accessed via Rostlab services at: https://rostlab.org/services/nlsdb/
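    The two criteria in the NLSdb abstract above (a motif may match no known non-nuclear protein, and should match as many nuclear proteins as possible) can be illustrated with a minimal sketch. This is not the NLSdb protocol itself: the regular-expression motifs, the toy sequences and the function names `evaluate_motif` and `pick_motif` are all assumptions made up for illustration.

```python
import re

def evaluate_motif(pattern, nuclear, non_nuclear):
    """Return (coverage of nuclear set, number of non-nuclear hits) for one motif."""
    rx = re.compile(pattern)
    hits = sum(1 for seq in nuclear if rx.search(seq))
    false_hits = sum(1 for seq in non_nuclear if rx.search(seq))
    return hits / len(nuclear), false_hits

def pick_motif(candidates, nuclear, non_nuclear):
    """Specificity first: drop motifs with any non-nuclear hit.
    Sensitivity second: among the rest, keep the one with the highest coverage."""
    best, best_cov = None, -1.0
    for pattern in candidates:
        coverage, false_hits = evaluate_motif(pattern, nuclear, non_nuclear)
        if false_hits == 0 and coverage > best_cov:
            best, best_cov = pattern, coverage
    return best, best_cov

# Toy data (hypothetical, not taken from NLSdb).
NUCLEAR = ["MAPKKKRKVEDP", "GSKRPAATKKAGQAKKKKLD", "MSKRVRSQ"]
NON_NUCLEAR = ["MKKLLPTA", "MALWMRLLPLLALLA"]
CANDIDATES = ["KK", "K[KR].[KR]", "PKKKRKV"]

if __name__ == "__main__":
    for pattern in CANDIDATES:
        print(pattern, evaluate_motif(pattern, NUCLEAR, NON_NUCLEAR))
    print("selected:", pick_motif(CANDIDATES, NUCLEAR, NON_NUCLEAR))
```

    In this toy setting the overly general motif is rejected because it hits a non-nuclear sequence, and the overly specific one loses on coverage, which mirrors the trade-off that steps 1 and 2 iterate over.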
