139 research outputs found
Deriving a mutation index of carcinogenicity using protein structure and protein interfaces
With the advent of Next Generation Sequencing the identification of mutations in the genomes of healthy and diseased tissues has become commonplace. While much progress has been made to elucidate the aetiology of disease processes in cancer, the contributions to disease that many individual mutations make remain to be characterised and their downstream consequences on cancer phenotypes remain to be understood. Missense mutations commonly occur in cancers and their consequences remain challenging to predict. However, this knowledge is becoming more vital, for both assessing disease progression and for stratifying drug treatment regimes. Coupled with structural data, comprehensive genomic databases of mutations such as the 1000 Genomes project and COSMIC give an opportunity to investigate general principles of how cancer mutations disrupt proteins and their interactions at the molecular and network level. We describe a comprehensive comparison of cancer and neutral missense mutations; by combining features derived from structural and interface properties we have developed a carcinogenicity predictor, InCa (Index of Carcinogenicity). Upon comparison with other methods, we observe that InCa can predict mutations that might not be detected by other methods. We also discuss general limitations shared by all predictors that attempt to predict driver mutations and discuss how this could impact high-throughput predictions. A web interface to a server implementation is publicly available at http://inca.icr.ac.uk/
Big-Data-Driven Materials Science and its FAIR Data Infrastructure
This chapter addresses the fourth paradigm of materials research -- big-data
driven materials science. Its concepts and state-of-the-art are described, and
its challenges and chances are discussed. For furthering the field, Open Data
and an all-embracing sharing, an efficient data infrastructure, and the rich
ecosystem of computer codes used in the community are of critical importance.
For shaping this fourth paradigm and contributing to the development or
discovery of improved and novel materials, data must be what is now called FAIR
-- Findable, Accessible, Interoperable and Re-purposable/Re-usable. This sets
the stage for advances of methods from artificial intelligence that operate on
large data sets to find trends and patterns that cannot be obtained from
individual calculations and not even directly from high-throughput studies.
Recent progress is reviewed and demonstrated, and the chapter is concluded by a
forward-looking perspective, addressing important not yet solved challenges. Comment: submitted to the Handbook of Materials Modeling (eds. S. Yip and W.
Andreoni), Springer 2018/201
A multifault earthquake threat for the Seattle metropolitan region revealed by mass tree mortality
This is the final version. Available on open access from the American Association for the Advancement of Science via the DOI in this record. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and Supplementary Materials. The MacBlo, Price Lake, Hamma Hamma, Lake Washington, West Point log, Lake Sammamish, and Dry Bed Lake tree-ring measurement data will be available upon publication through the U.S. National Oceanic and Atmospheric Administration World Data Service for Paleoclimatology International Tree-Ring Databank (www.ncei.noaa.gov/products/paleoclimatology/tree-ring) as datasets CAN382 and WA155 to WA160. All radiocarbon data are provided in table S2. Compound earthquakes involving simultaneous ruptures along multiple faults often define a region’s upper threshold of maximum magnitude. Yet, the potential for linked faulting remains poorly understood given the infrequency of these events in the historic era. Geological records provide longer perspectives, although temporal uncertainties are too broad to clearly pinpoint single multifault events. Here, we use dendrochronological dating and a cosmogenic radiation pulse to constrain the death dates of earthquake-killed trees along two adjacent fault zones near Seattle, Washington to within a 6-month period between the 923 and 924 CE growing seasons. Our narrow constraints conclusively show linked rupturing that occurred either as a single composite earthquake of estimated magnitude 7.8 or as a closely spaced double earthquake sequence with estimated magnitudes of 7.5 and 7.3. These scenarios, which are not recognized in current hazard models, increase the maximum earthquake size needed for seismic preparedness and engineering design within the Puget Sound region of >4 million residents. U.S. Geological Surve
Expanding the Understanding of Biases in Development of Clinical-Grade Molecular Signatures: A Case Study in Acute Respiratory Viral Infections
The promise of modern personalized medicine is to use molecular and clinical information to better diagnose, manage, and treat disease, on an individual patient basis. These functions are predominantly enabled by molecular signatures, which are computational models for predicting phenotypes and other responses of interest from high-throughput assay data. Data-analytics is a central component of molecular signature development and can jeopardize the entire process if conducted incorrectly. While exploratory data analysis may tolerate suboptimal protocols, clinical-grade molecular signatures are subject to vastly stricter requirements. Closing the gap between standards for exploratory versus clinically successful molecular signatures entails a thorough understanding of possible biases in the data analysis phase and developing strategies to avoid them. Using a recently introduced data-analytic protocol as a case study, we provide an in-depth examination of the poorly studied biases of the data-analytic protocols related to signature multiplicity, biomarker redundancy, data preprocessing, and validation of signature reproducibility. The methodology and results presented in this work are aimed at expanding the understanding of these data-analytic biases that affect development of clinically robust molecular signatures. Several recommendations follow from the current study. First, all molecular signatures of a phenotype should be extracted to the extent possible, in order to provide comprehensive and accurate grounds for understanding disease pathogenesis. Second, redundant genes should generally be removed from final signatures to facilitate reproducibility and decrease manufacturing costs. Third, data preprocessing procedures should be designed so as not to bias biomarker selection. Finally, molecular signatures developed and applied on different phenotypes and populations of patients should be treated with great caution.
H2r: Identification of evolutionary important residues by means of an entropy based analysis of multiple sequence alignments
BACKGROUND: A multiple sequence alignment (MSA) generated for a protein can be used to characterise residues by means of a statistical analysis of single columns. In addition to the examination of individual positions, the investigation of co-variation of amino acid frequencies offers insights into function and evolution of the protein and residues. RESULTS: We introduce conn(k), a novel parameter for the characterisation of individual residues. For each residue k, conn(k) is the number of most extreme signals of co-evolution. These signals were deduced from a normalised mutual information (MI) value U(k, l) computed for all pairs of residues k, l. We demonstrate that conn(k) is a more robust indicator than an individual MI-value for the prediction of residues most plausibly important for the evolution of a protein. This proposition was inferred by means of statistical methods. It was further confirmed by the analysis of several proteins. A server that computes conn(k)-values is available at http://www-bioinf.uni-regensburg.de. CONCLUSION: The algorithm H2r, which analyses MSAs and computes conn(k)-values, characterises a specific class of residues. In contrast to strictly conserved ones, these residues possess some flexibility in the composition of side chains. However, their allocation is sensibly balanced with several other positions, as indicated by conn(k).
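The conn(k) idea in the abstract above can be illustrated with a minimal sketch. The abstract does not give the formula for U(k, l) or the cut-off for "most extreme" signals, so both are assumptions here: U(k, l) is taken to be the symmetric uncertainty 2(H(k) + H(l) - H(k, l)) / (H(k) + H(l)), and the cut-off is a hypothetical `top_n` parameter. The function names are illustrative and do not come from the H2r server.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(symbols):
    """Shannon entropy (in bits) of a sequence of hashable symbols."""
    total = len(symbols)
    return -sum((n / total) * log2(n / total) for n in Counter(symbols).values())


def normalised_mi(col_k, col_l):
    """Assumed normalisation: 2 * (H(k) + H(l) - H(k, l)) / (H(k) + H(l))."""
    h_k, h_l = entropy(col_k), entropy(col_l)
    if h_k + h_l == 0:  # both columns strictly conserved: no co-variation signal
        return 0.0
    h_kl = entropy(list(zip(col_k, col_l)))
    return 2 * (h_k + h_l - h_kl) / (h_k + h_l)


def conn(msa, top_n):
    """For each column, count how many of the top_n highest-scoring pairs include it.

    msa is a list of equal-length aligned sequences; top_n is a hypothetical
    cut-off standing in for the "most extreme signals" of the abstract.
    """
    columns = list(zip(*msa))
    scores = {
        (k, l): normalised_mi(columns[k], columns[l])
        for k, l in combinations(range(len(columns)), 2)
    }
    top_pairs = sorted(scores, key=scores.get, reverse=True)[:top_n]
    counts = [0] * len(columns)
    for k, l in top_pairs:
        counts[k] += 1
        counts[l] += 1
    return counts


# Toy alignment: columns 0 and 1 co-vary perfectly, column 2 varies independently.
toy_msa = ["AAC", "AAD", "GGC", "GGD"]
toy_conn = conn(toy_msa, top_n=1)
```

In this toy alignment only the pair of columns 0 and 1 carries a co-variation signal, so those two columns each receive a count of one. A real analysis would need many more sequences and a statistically justified cut-off.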
DNA repair, genome stability and cancer: a historical perspective
The multistep process of cancer progresses over many years. The prevention of mutations by DNA repair pathways led to an early appreciation of a role for repair in cancer avoidance. However, the broader role of the DNA damage response (DDR) emerged more slowly. In this Timeline article, we reflect on how our understanding of the steps leading to cancer developed, focusing on the role of the DDR. We also consider how our current knowledge can be exploited for cancer therapy
Multiple M. tuberculosis Phenotypes in Mouse and Guinea Pig Lung Tissue Revealed by a Dual-Staining Approach
A unique hallmark of tuberculosis is the granulomatous lesions formed in the lung. Granulomas can be heterogeneous in nature and can develop a necrotic, hypoxic core which is surrounded by an acellular, fibrotic rim. Studying bacilli in this in vivo microenvironment is problematic as Mycobacterium tuberculosis can change its phenotype and also become acid-fast negative. Under in vitro models of differing environments, M. tuberculosis alters its metabolism, transcriptional profile and rate of replication. In this study, we investigated whether these phenotypic adaptations of M. tuberculosis are unique for certain environmental conditions and if they could therefore be used as differential markers. Bacilli were studied using fluorescent acid-fast auramine-rhodamine targeting the mycolic acid-containing cell wall, and immunofluorescence targeting bacterial proteins using an anti-M. tuberculosis whole cell lysate polyclonal antibody. These techniques were combined and simultaneously applied to M. tuberculosis in vitro culture samples and to lung sections of M. tuberculosis infected mice and guinea pigs. Two phenotypically different subpopulations of M. tuberculosis were found in stationary culture whilst three subpopulations were found in hypoxic culture and in lung sections. Bacilli were either exclusively acid-fast positive, exclusively immunofluorescent positive or acid-fast and immunofluorescent positive. These results suggest that M. tuberculosis exists as multiple populations in most conditions, even within a seemingly single microenvironment. This is relevant information for approaches that study bacillary characteristics in pooled samples (using lipidomics and proteomics) as well as in M. tuberculosis drug development.
Reversal to air-driven sound production revealed by a molecular phylogeny of tongueless frogs, family Pipidae
BACKGROUND: Evolutionary novelties often appear by conferring completely new functions to pre-existing structures or by innovating the mechanism through which a particular function is performed. Sound production plays a central role in the behavior of frogs, which use their calls to delimit territories and attract mates. Therefore, frogs have evolved complex vocal structures capable of producing a wide variety of advertising sounds. It is generally acknowledged that most frogs call by moving an air column from the lungs through the glottis with the remarkable exception of the family Pipidae, whose members share a highly specialized sound production mechanism independent of air movement. RESULTS: Here, we performed behavioral observations in the poorly known African pipid genus Pseudhymenochirus and document that the sound production in this aquatic frog is almost certainly air-driven. However, morphological comparisons revealed an indisputable pipid nature of Pseudhymenochirus larynx. To place this paradoxical pattern into an evolutionary framework, we reconstructed robust molecular phylogenies of pipids based on complete mitochondrial genomes and nine nuclear protein-coding genes that coincided in placing Pseudhymenochirus nested among other pipids. CONCLUSIONS: We conclude that although Pseudhymenochirus probably has evolved a reversal to the ancestral non-pipid condition of air-driven sound production, the mechanism through which it occurs is an evolutionary innovation based on the derived larynx of pipids. This strengthens the idea that evolutionary solutions to functional problems often emerge based on previous structures, and for this reason, innovations largely depend on possibilities and constraints predefined by the particular history of each lineage.
The Long Life of Birds: The Rat-Pigeon Comparison Revisited
The most studied comparison of aging and maximum lifespan potential (MLSP) among endotherms involves the 7-fold longevity difference between rats (MLSP 5y) and pigeons (MLSP 35y). A widely accepted theory explaining MLSP differences between species is the oxidative stress theory, which purports that reactive oxygen species (ROS) produced during mitochondrial respiration damage bio-molecules and eventually lead to the breakdown of regulatory systems and consequent death. Previous rat-pigeon studies compared only aspects of the oxidative stress theory and most concluded that the lower mitochondrial superoxide production of pigeons compared to rats was responsible for their much greater longevity. This conclusion is based mainly on data from one tissue (the heart) using one mitochondrial substrate (succinate). Studies on heart mitochondria using pyruvate as a mitochondrial substrate gave contradictory results. We believe the conclusion that birds produce less mitochondrial superoxide than mammals is unwarranted
Seasonality in Human Zoonotic Enteric Diseases: A Systematic Review
BACKGROUND: Although seasonality is a defining characteristic of many infectious diseases, few studies have described and compared seasonal patterns across diseases globally, impeding our understanding of putative mechanisms. Here, we review seasonal patterns across five enteric zoonotic diseases: campylobacteriosis, salmonellosis, vero-cytotoxigenic Escherichia coli (VTEC), cryptosporidiosis and giardiasis in the context of two primary drivers of seasonality: (i) environmental effects on pathogen occurrence and pathogen-host associations and (ii) population characteristics/behaviour. METHODOLOGY/PRINCIPAL FINDINGS: We systematically reviewed published literature from 1960 to 2010, resulting in the review of 86 studies across the five diseases. The Gini coefficient compared temporal variations in incidence across diseases and the monthly seasonality index characterised timing of seasonal peaks. Consistent seasonal patterns across transnational boundaries, albeit with regional variations, were observed. The bacterial diseases all had a distinct summer peak, with identical Gini values for campylobacteriosis and salmonellosis (0.22) and a higher index for VTEC (Gini 0.36). Cryptosporidiosis displayed a bi-modal peak with spring and summer highs and the most marked temporal variation (Gini = 0.39). Giardiasis showed a relatively small summer increase and was the least variable (Gini = 0.18). CONCLUSIONS/SIGNIFICANCE: Seasonal variation in enteric zoonotic diseases is ubiquitous, with regional variations highlighting complex environment-pathogen-host interactions. Results suggest that proximal environmental influences and host population dynamics, together with distal, longer-term climatic variability could have important direct and indirect consequences for future enteric disease risk. Additional understanding of the concerted influence of these factors on disease patterns may improve assessment and prediction of enteric disease burden in temperate, developed countries.
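To show how a Gini coefficient can summarise temporal variation in incidence, as in the review above, here is a minimal sketch. It uses the standard mean-absolute-difference definition of the Gini coefficient applied to monthly case counts. The abstract does not state how the review computed its values, so this formula, the function name `gini` and the example counts are all assumptions made for illustration.

```python
def gini(counts):
    """Gini coefficient of non-negative counts (e.g. cases per calendar month).

    Uses G = sum_i sum_j |x_i - x_j| / (2 * n^2 * mean), which simplifies to
    sum_i sum_j |x_i - x_j| / (2 * n * total) because mean = total / n.
    0 means cases are spread evenly; values near 1 mean strong concentration.
    """
    n = len(counts)
    total = sum(counts)
    if n == 0 or total == 0:
        raise ValueError("need at least one month and a non-zero case total")
    abs_diff_sum = sum(abs(a - b) for a in counts for b in counts)
    return abs_diff_sum / (2 * n * total)


# Hypothetical monthly case counts, not data from the review.
no_seasonality = [10] * 12             # identical incidence every month
single_month_peak = [0] * 11 + [120]   # all cases fall in one month

gini_flat = gini(no_seasonality)
gini_peak = gini(single_month_peak)
```

An evenly spread year gives a coefficient of 0, and concentrating every case in one of twelve months gives 11/12. Under this definition the values reported in the abstract (0.18 to 0.39) would sit toward the low end of that range.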