64 research outputs found

    Unraveling the functional dark matter through global metagenomics

    30 pages, 4 figures, 1 table, supplementary information. https://doi.org/10.1038/s41586-023-06583-7
    Data availability: All of the analysed datasets along with their corresponding sequences are available from the IMG system (http://img.jgi.doe.gov/). A list of the datasets used in this study is provided in Supplementary Data 8. All data from the protein clusters, including sequences, multiple alignments, HMM profiles, 3D structure models, and taxonomic and ecosystem annotation, are available through NMPFamsDB, publicly accessible at www.nmpfamsdb.org. The 3D models are also available at ModelArchive under accession code ma-nmpfamsdb.
    Code availability: Sequence analysis was performed using Tantan (https://gitlab.com/mcfrith/tantan), BLAST (https://blast.ncbi.nlm.nih.gov/Blast.cgi), LAST (https://gitlab.com/mcfrith/last), HMMER (http://hmmer.org/) and HH-suite3 (https://github.com/soedinglab/hh-suite). Clustering was performed using HipMCL (https://bitbucket.org/azadcse/hipmcl/src/master/). Additional taxonomic annotation was performed using Whokaryote (https://github.com/LottePronk/whokaryote), EukRep (https://github.com/patrickwest/EukRep), DeepVirFinder (https://github.com/jessieren/DeepVirFinder) and MMseqs2 (https://github.com/soedinglab/MMseqs2). 3D modelling was performed using AlphaFold2 (https://github.com/deepmind/alphafold) and TrRosetta2 (https://github.com/RosettaCommons/trRosetta2). Structural alignments were performed using TMalign (https://zhanggroup.org/TM-align/) and MMalign (https://zhanggroup.org/MM-align/). All custom scripts used for the generation and analysis of the data are available at Zenodo (https://doi.org/10.5281/zenodo.8097349).
    Metagenomes encode an enormous diversity of proteins, reflecting a multiplicity of functions and activities1,2. Exploration of this vast sequence space has been limited to a comparative analysis against reference microbial genomes and protein families derived from those genomes. Here, to examine the scale of yet untapped functional diversity beyond what is currently possible through the lens of reference genomes, we develop a computational approach to generate reference-free protein families from the sequence space in metagenomes. We analyse 26,931 metagenomes and identify 1.17 billion protein sequences longer than 35 amino acids with no similarity to any sequences from 102,491 reference genomes or the Pfam database3. Using massively parallel graph-based clustering, we group these proteins into 106,198 novel sequence clusters with more than 100 members, doubling the number of protein families obtained from the reference genomes clustered using the same approach. We annotate these families on the basis of their taxonomic, habitat, geographical and gene neighbourhood distributions and, where sufficient sequence diversity is available, predict protein three-dimensional models, revealing novel structures. Overall, our results uncover an enormously diverse functional space, highlighting the importance of further exploring the microbial functional dark matter.
    With the institutional support of the ‘Severo Ochoa Centre of Excellence’ accreditation (CEX2019-000928-S).
    Peer reviewed

    Evaluation of DNA ploidy in relation with established prognostic factors in patients with locally advanced (unresectable) or metastatic pancreatic adenocarcinoma: a retrospective analysis

    Background: Most patients with ductal pancreatic adenocarcinoma are diagnosed with locally advanced (unresectable) or metastatic disease. The aim of this study was to evaluate the prognostic significance of DNA ploidy in relation to established clinical and laboratory variables in such patients.
    Methods: Two hundred and twenty-six patients were studied retrospectively. Twenty-two potential prognostic variables (demographics, clinical parameters, biochemical markers, treatment modality) were examined.
    Results: Mean survival time was 38.41 weeks (95% CI: 33.17–43.65) and median survival was 27.00 weeks (95% CI: 23.18–30.82). On multivariate analysis, 10 factors had an independent effect on survival: performance status, local extension of tumor, distant metastases, ploidy score, anemia under epoetin therapy, weight loss, pain, steatorrhoea, CEA, and palliative surgery and chemotherapy. Patients managed with palliative surgery and chemotherapy had a 6.7 times lower probability of death than patients without any treatment. Patients with ploidy score > 3.6 had a 5.0 times higher probability of death than patients with ploidy score < 2.2, and those with ploidy score 2.2–3.6 had a 6.3 times higher probability of death than patients with ploidy score < 2.2.
    Conclusion: Judging by the significance of the examined factors, survival was improved mainly by the combination of surgery and chemotherapy and by a low DNA ploidy score.

    Option Pricing under the Variance Gamma Process

    A spatio-temporal hybrid neural network-Kriging model for groundwater level simulation

    Artificial Neural Networks (ANNs) and Kriging have both been used for hydraulic head simulation. In this study, the two methodologies were combined to simulate the spatial and temporal distribution of hydraulic head in a study area. A fuzzy logic inference system can optionally be added to the combination. Different ANN architectures and variogram models were tested, with and without the fuzzy logic system. The developed algorithm was implemented and applied to predict, spatially and temporally, the hydraulic head in an area located in Bavaria, Germany. The performance of the algorithm was evaluated using leave-one-out cross-validation, and various performance indicators were derived. The best results were achieved by using ANNs with two hidden layers, together with the fuzzy logic system and the power-law variogram. The results can be characterized as favorable, since the RMSE of the method is of the order of 10^-2 m. This method can therefore be used successfully in aquifers where geological characteristics are obscure but other, easily accessible data, such as meteorological data, are available.
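    The evaluation step named in this abstract, leave-one-out cross-validation scored by RMSE, can be illustrated with a minimal sketch. It is not the authors' model: a simple inverse-distance-weighted interpolator stands in for the ANN-Kriging hybrid, and the function names (`idw_predict`, `loo_rmse`) and the four well records are hypothetical values made up for illustration.

```python
import math

def idw_predict(train, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    Stand-in interpolator only (assumption); the abstract's method is an
    ANN-Kriging hybrid, which is not reproduced here.
    """
    num = 0.0
    den = 0.0
    for xi, yi, hi in train:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return hi  # exact hit on a training point
        w = 1.0 / d ** power
        num += w * hi
        den += w
    return num / den

def loo_rmse(points, predict=idw_predict):
    """Hold out each point in turn, predict it from the rest, return RMSE."""
    squared_error = 0.0
    for i, (x, y, h) in enumerate(points):
        train = points[:i] + points[i + 1:]
        squared_error += (predict(train, x, y) - h) ** 2
    return math.sqrt(squared_error / len(points))

# Hypothetical observations: (x, y, hydraulic head in metres).
wells = [(0.0, 0.0, 10.0), (1.0, 0.0, 10.2), (0.0, 1.0, 10.1), (1.0, 1.0, 10.3)]
error = loo_rmse(wells)
print(f"leave-one-out RMSE: {error:.3f} m")
```

    Any predictor with the same `(train, x, y)` signature can be passed as `predict`, so the scoring loop stays the same when the interpolator changes.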

    Darling: A Web Application for Detecting Disease-Related Biomedical Entity Associations with Literature Mining

    Finding, exploring and filtering frequent sentence-based associations between a disease and a biomedical entity co-mentioned in disease-related PubMed literature is a challenge as the volume of publications increases. Darling is a web application that uses Named Entity Recognition to identify human-related biomedical terms in PubMed articles mentioned in OMIM, DisGeNET and Human Phenotype Ontology (HPO) disease records, and generates an interactive biomedical entity association network. Nodes in this network represent genes, proteins, chemicals, functions, tissues, diseases, environments and phenotypes. Users can search by identifiers, terms/entities or free text and explore the relevant abstracts in an annotated format.

    Water diffusion in polymer coatings containing water-trapping particles : Part 2 : Experimental verification of the mathematical model

    The influence of water microtraps (cross-linked poly(methacrylic acid) sodium salt spherical particles) with high sorption capacity and low diffusion coefficient on water penetration through an epoxy coating has been investigated. Water diffusion coefficients for the pure epoxy coating and for composite coatings with 5% and 3.7% content of water traps have been estimated by attenuated total reflection Fourier transform infrared spectroscopy (ATR-FTIR) and microbalance measurements. Experimental results were compared with the mathematical model of diffusion in composite media. The presence of particles capable of binding water reversibly (water traps) significantly slows the diffusion rate. Composites with water traps dispersed throughout the whole volume of the coating, and sandwich-structured coatings composed of 3 layers with particles located only inside the middle layer, have been examined. The diffusion rate has been found to depend not only on the concentration of the water traps but also on the location of the particles inside the coating. Both kinds of composites exhibit a lower diffusion coefficient than the pure coating; however, in the sandwich-structured composites this effect is significantly stronger and much closer to that predicted by the model. The water diffusion coefficient for the sandwich-structured composite with 5% addition of water traps is ca. three times lower than for the pure epoxy coating.

    Diastolic stress echocardiography detects coronary artery disease in patients with asymptomatic type II diabetes

    Objectives: Diabetes mellitus is considered an equivalent of coronary artery disease (CAD). The aim of the study was to investigate whether, in asymptomatic patients with type II diabetes, diastolic stress echocardiography may represent an alternative tool for the detection of CAD.
    Methods: The study population consisted of 105 patients with diabetes mellitus (age 61 ± 9 years, 26% female, duration of diabetes 37 ± 14 months). We performed an exercise stress test, followed by an echo study and single-photon emission tomography. Coronary angiography was performed within 1 month.
    Results: Coronary angiography revealed a coronary artery stenosis of at least 70% in 72 patients (69%, CAD group), while the remaining patients formed the non-CAD group. Exercise induced an increase of both the lateral and septal E/E’ ratios, as well as their average, in the CAD group and, by contrast, a decrease of these ratios in the non-CAD group. Receiver operating curve analysis for discrimination between patients with and without obstructive CAD showed an optimal cut-off value of -0.0708 for the exercise-induced change of average E/E’ (area under curve 0.892, P < 0.001). Sensitivities of scintigraphy and of diastolic stress echocardiography for detection of CAD were 75.0 and 93.1%, respectively; specificity was 78.8% for both methods. In asymptomatic patients, sensitivities of scintigraphy and diastolic stress echocardiography were 76.9 and 92.3%; specificity of both was 80%.
    Conclusion: In patients with type II diabetes, diastolic stress echocardiography, by means of exercise-induced changes in the E/E’ ratio, can be used for the diagnosis and severity assessment of CAD and for the detection of occult myocardial ischemia.
    Coron Artery Dis 21:104-112 © 2010 Wolters Kluwer Health | Lippincott Williams & Wilkins
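    As a rough arithmetic check of the whole-cohort percentages reported in this abstract, the sketch below recomputes them from patient counts. Only the totals (105 patients, 72 with CAD) come from the abstract; the true-positive and true-negative counts (67, 54, 26) are assumptions chosen because they reproduce the reported rates, not figures stated by the authors.

```python
def rate(hits: int, total: int) -> float:
    """Percentage of hits in total, rounded to one decimal place."""
    return round(100.0 * hits / total, 1)

n_total = 105                 # from the abstract
n_cad = 72                    # from the abstract (stenosis of at least 70%)
n_non_cad = n_total - n_cad   # 33 patients without obstructive CAD

# Assumed counts (hypothetical, inferred to match the reported percentages).
true_pos_echo = 67            # CAD patients flagged by diastolic stress echo
true_pos_scinti = 54          # CAD patients flagged by scintigraphy
true_neg = 26                 # non-CAD patients correctly cleared (both methods)

prevalence = rate(n_cad, n_total)
sens_echo = rate(true_pos_echo, n_cad)
sens_scinti = rate(true_pos_scinti, n_cad)
spec = rate(true_neg, n_non_cad)

print(prevalence, sens_echo, sens_scinti, spec)
```

    With these assumed counts the sensitivities and the specificity match the reported 93.1%, 75.0% and 78.8%, and the prevalence of about 68.6% is consistent with the rounded 69%.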

    Biomolecule and Bioentity Interaction Databases in Systems Biology: A Comprehensive Review

    Technological advances in high-throughput techniques have resulted in tremendous growth of complex biological datasets providing evidence regarding various biomolecular interactions. To cope with this data flood, computational approaches, web services, and databases have been implemented to deal with issues such as data integration, visualization, exploration, organization, scalability, and complexity. Nevertheless, as the number of such sets increases, it is becoming more and more difficult for an end user to know what the scope and focus of each repository is and how redundant the information between them is. Several repositories have a more general scope, while others focus on specialized aspects, such as specific organisms or biological systems. Unfortunately, many of these databases are self-contained or poorly documented and maintained. For a clearer view, in this article we provide a comprehensive categorization, comparison and evaluation of such repositories for different bioentity interaction types. We discuss most of the publicly available services based on their content, sources of information, data representation methods, user-friendliness, scope and interconnectivity, and we comment on their strengths and weaknesses. We aim for this review to reach a broad readership varying from biomedical beginners to experts and serve as a reference article in the field of Network Biology