124 research outputs found

    Replica analysis of overfitting in regression models for time-to-event data

    Overfitting, which happens when the number of parameters in a model is too large compared to the number of data points available for determining these parameters, is a serious and growing problem in survival analysis. While modern medicine presents us with data of unprecedented dimensionality, these data cannot yet be used effectively for clinical outcome prediction. Standard error measures in maximum likelihood regression, such as p-values and z-scores, are blind to overfitting, and even for Cox's proportional hazards model (the main tool of medical statisticians), one finds in the literature only rules of thumb on the number of samples required to avoid overfitting. In this paper we present a mathematical theory of overfitting in regression models for time-to-event data, which aims to increase our quantitative understanding of the problem and provide practical tools with which to correct regression outcomes for the impact of overfitting. It is based on the replica method, a statistical mechanical technique for the analysis of heterogeneous many-variable systems that has been used successfully for several decades in physics, biology, and computer science, but not yet in medical statistics. We develop the theory initially for arbitrary regression models for time-to-event data, and verify its predictions in detail for the popular Cox model. Comment: 37 pages, 9 figures

    Protein Networks Reveal Detection Bias and Species Consistency When Analysed by Information-Theoretic Methods

    We apply our recently developed information-theoretic measures for the characterisation and comparison of protein–protein interaction networks. These measures are used to quantify topological network features via macroscopic statistical properties. Network differences are assessed based on these macroscopic properties as opposed to microscopic overlap, homology information or motif occurrences. We present the results of a large-scale analysis of protein–protein interaction networks. Precise null models are used in our analyses, allowing for reliable interpretation of the results. By quantifying the methodological biases of the experimental data, we can define an information threshold above which networks may be deemed to comprise consistent macroscopic topological properties, despite their small microscopic overlaps. Based on this rationale, data from yeast two-hybrid methods are sufficiently consistent to allow for intra-species comparisons (between different experiments) and inter-species comparisons, while data from affinity-purification mass-spectrometry methods show large differences even within intra-species comparisons.

    Perspectives on utilization of edible coatings and nano-laminate coatings for extension of postharvest storage of fruits and vegetables

    It is known that in developing countries, a large quantity of fruit and vegetable losses results at postharvest and processing stages due to poor or scarce storage technology and mishandling during harvest. The use of new and innovative technologies for reducing postharvest losses is a requirement that has not been fully covered. The use of edible coatings (mainly based on biopolymers) as a postharvest technique for agricultural commodities has offered biodegradable alternatives in order to solve problems (e.g., microbiological growth) during produce storage. However, biopolymer-based coatings can present some disadvantages such as: poor mechanical properties (e.g., lipids) or poor water vapor barrier properties (e.g., polysaccharides), thus requiring the development of new alternatives to solve these drawbacks. Recently, nanotechnology has emerged as a promising tool in the food processing industry, providing new insights about postharvest technologies on produce storage. Nanotechnological approaches can contribute through the design of functional packing materials with lower amounts of bioactive ingredients, better gas and mechanical properties and with reduced impact on the sensorial qualities of the fruits and vegetables. This work reviews some of the main factors involved in postharvest losses and new technologies for extension of postharvest storage of fruits and vegetables, focused on perspective uses of edible coatings and nano-laminate coatings.
    María L. Flores-López thanks Mexican Science and Technology Council (CONACYT, Mexico) for PhD fellowship support (CONACYT Grant Number: 215499/310847). Miguel A. Cerqueira (SFRH/BPD/72753/2010) is recipient of a fellowship from the Fundação para a Ciência e Tecnologia (FCT, POPH-QREN and FSE Portugal). The authors also thank the FCT Strategic Project of UID/BIO/04469/2013 unit, the project RECI/BBB-EBI/0179/2012 (FCOMP-01-0124-FEDER-027462) and the project “BioInd Biotechnology and Bioengineering for improved Industrial and AgroFood processes,” REF. NORTE-07-0124-FEDER-000028 Co-funded by the Programa Operacional Regional do Norte (ON.2 – O Novo Norte), QREN, FEDER. Fundação Cearense de Apoio ao Desenvolvimento Científico e Tecnológico – FUNCAP, CE Brazil (CI10080-00055.01.00/13).

    International incidence of childhood cancer, 2001-10: A population-based registry study

    Identification of regulatory variants associated with genetic susceptibility to meningococcal disease.

    Non-coding genetic variants play an important role in driving susceptibility to complex diseases but their characterization remains challenging. Here, we employed a novel approach to interrogate the genetic risk of such polymorphisms in a more systematic way by targeting specific regulatory regions relevant for the phenotype studied. We applied this method to meningococcal disease susceptibility, using the DNA binding pattern of RELA (an NF-κB subunit and master regulator of the response to infection) under bacterial stimuli in nasopharyngeal epithelial cells. We designed a custom panel to cover these RELA binding sites and used it for targeted sequencing in cases and controls. Variant calling and association analysis were performed, followed by validation of candidate polymorphisms by genotyping in three independent cohorts. We identified two new polymorphisms, rs4823231 and rs11913168, showing signs of association with meningococcal disease susceptibility. In addition, using our genomic data as well as publicly available resources, we found evidence that these SNPs have potential regulatory effects on the ATXN10 and LIF genes, respectively. The variants and related candidate genes are relevant for infectious diseases and may make an important contribution to meningococcal disease pathology. Finally, we described a novel genetic association approach that could be applied to other phenotypes.

    Antiinflammatory Therapy with Canakinumab for Atherosclerotic Disease

    Background: Experimental and clinical data suggest that reducing inflammation without affecting lipid levels may reduce the risk of cardiovascular disease. Yet, the inflammatory hypothesis of atherothrombosis has remained unproved. Methods: We conducted a randomized, double-blind trial of canakinumab, a therapeutic monoclonal antibody targeting interleukin-1β, involving 10,061 patients with previous myocardial infarction and a high-sensitivity C-reactive protein level of 2 mg or more per liter. The trial compared three doses of canakinumab (50 mg, 150 mg, and 300 mg, administered subcutaneously every 3 months) with placebo. The primary efficacy end point was nonfatal myocardial infarction, nonfatal stroke, or cardiovascular death. Results: At 48 months, the median reduction from baseline in the high-sensitivity C-reactive protein level was 26 percentage points greater in the group that received the 50-mg dose of canakinumab, 37 percentage points greater in the 150-mg group, and 41 percentage points greater in the 300-mg group than in the placebo group. Canakinumab did not reduce lipid levels from baseline. At a median follow-up of 3.7 years, the incidence rate for the primary end point was 4.50 events per 100 person-years in the placebo group, 4.11 events per 100 person-years in the 50-mg group, 3.86 events per 100 person-years in the 150-mg group, and 3.90 events per 100 person-years in the 300-mg group. The hazard ratios as compared with placebo were as follows: in the 50-mg group, 0.93 (95% confidence interval [CI], 0.80 to 1.07; P = 0.30); in the 150-mg group, 0.85 (95% CI, 0.74 to 0.98; P = 0.021); and in the 300-mg group, 0.86 (95% CI, 0.75 to 0.99; P = 0.031). The 150-mg dose, but not the other doses, met the prespecified multiplicity-adjusted threshold for statistical significance for the primary end point and the secondary end point that additionally included hospitalization for unstable angina that led to urgent revascularization (hazard ratio vs. placebo, 0.83; 95% CI, 0.73 to 0.95; P = 0.005). Canakinumab was associated with a higher incidence of fatal infection than was placebo. There was no significant difference in all-cause mortality (hazard ratio for all canakinumab doses vs. placebo, 0.94; 95% CI, 0.83 to 1.06; P = 0.31). Conclusions: Antiinflammatory therapy targeting the interleukin-1β innate immunity pathway with canakinumab at a dose of 150 mg every 3 months led to a significantly lower rate of recurrent cardiovascular events than placebo, independent of lipid-level lowering. (Funded by Novartis; CANTOS ClinicalTrials.gov number, NCT01327846.)
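
As a rough reading aid for the figures in this abstract, the sketch below divides each dose group's incidence rate by the placebo rate and checks whether each reported 95% confidence interval excludes 1. The crude rate ratio is an illustration only and is not the trial's hazard-ratio analysis; the variable names are hypothetical, and the numbers are copied from the abstract.

```python
# Incidence rates for the primary end point (events per 100 person-years),
# copied from the abstract above.
incidence = {"placebo": 4.50, "50 mg": 4.11, "150 mg": 3.86, "300 mg": 3.90}

# Hazard ratios and 95% confidence intervals as reported in the abstract.
reported_hr = {"50 mg": 0.93, "150 mg": 0.85, "300 mg": 0.86}
reported_ci = {"50 mg": (0.80, 1.07), "150 mg": (0.74, 0.98), "300 mg": (0.75, 0.99)}

# Crude rate ratio: dose-group rate divided by placebo rate.
# Assumption: this is only a sanity check, not a model-based hazard ratio.
crude_ratio = {
    dose: incidence[dose] / incidence["placebo"] for dose in reported_hr
}

# Does the reported interval exclude a ratio of 1 (no difference)?
ci_excludes_one = {
    dose: not (low <= 1.0 <= high) for dose, (low, high) in reported_ci.items()
}

for dose in reported_hr:
    print(
        f"{dose}: crude ratio {crude_ratio[dose]:.2f}, "
        f"reported HR {reported_hr[dose]:.2f}, "
        f"CI excludes 1: {ci_excludes_one[dose]}"
    )
```

The crude ratios land close to the reported hazard ratios. An interval that excludes 1 is not the same as meeting the trial's multiplicity-adjusted threshold, which the abstract says only the 150-mg dose did.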

    Global patient outcomes after elective surgery: prospective cohort study in 27 low-, middle- and high-income countries.

    BACKGROUND: As global initiatives increase patient access to surgical treatments, there remains a need to understand the adverse effects of surgery and define appropriate levels of perioperative care. METHODS: We designed a prospective international 7-day cohort study of outcomes following elective adult inpatient surgery in 27 countries. The primary outcome was in-hospital complications. Secondary outcomes were death following a complication (failure to rescue) and death in hospital. Process measures were admission to critical care immediately after surgery or to treat a complication and duration of hospital stay. A single definition of critical care was used for all countries. RESULTS: A total of 474 hospitals in 19 high-, 7 middle- and 1 low-income country were included in the primary analysis. Data included 44 814 patients with a median hospital stay of 4 (range 2-7) days. A total of 7508 patients (16.8%) developed one or more postoperative complications and 207 died (0.5%). The overall mortality among patients who developed complications was 2.8%. Mortality following complications ranged from 2.4% for pulmonary embolism to 43.9% for cardiac arrest. A total of 4360 (9.7%) patients were admitted to a critical care unit as routine immediately after surgery, of whom 2198 (50.4%) developed a complication, with 105 (2.4%) deaths. A total of 1233 patients (16.4%) were admitted to a critical care unit to treat complications, with 119 (9.7%) deaths. Despite lower baseline risk, outcomes were similar in low- and middle-income compared with high-income countries. CONCLUSIONS: Poor patient outcomes are common after inpatient surgery. Global initiatives to increase access to surgical treatments should also address the need for safe perioperative care. STUDY REGISTRATION: ISRCTN5181700
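
The percentages in this abstract can be recomputed from the counts it quotes. The sketch below is a minimal arithmetic check, with hypothetical variable names; the choice of denominator for each percentage is an assumption inferred from the wording of the abstract, not something the source states explicitly.

```python
# Counts copied from the abstract above.
total_patients = 44814
with_complications = 7508
deaths = 207
routine_critical_care = 4360
routine_cc_complications = 2198
routine_cc_deaths = 105
cc_for_complications = 1233
cc_for_complications_deaths = 119


def pct(numerator, denominator):
    """Return numerator/denominator as a percentage, rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


# Denominators below are assumptions inferred from the abstract's wording.
rates = {
    "complications / all patients": pct(with_complications, total_patients),
    "deaths / all patients": pct(deaths, total_patients),
    "deaths / patients with complications": pct(deaths, with_complications),
    "routine critical care / all patients": pct(routine_critical_care, total_patients),
    "complications / routine critical care": pct(routine_cc_complications, routine_critical_care),
    "deaths / routine critical care": pct(routine_cc_deaths, routine_critical_care),
    "critical care for complications / patients with complications": pct(
        cc_for_complications, with_complications
    ),
    "deaths / critical care for complications": pct(
        cc_for_complications_deaths, cc_for_complications
    ),
}

for label, value in rates.items():
    print(f"{label}: {value}%")
```

Under these assumed denominators, each recomputed value matches the percentage quoted in the abstract.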

    Performance of missing transverse momentum reconstruction with the ATLAS detector using proton–proton collisions at √s = 13 TeV

    The performance of the missing transverse momentum (E_T^miss) reconstruction with the ATLAS detector is evaluated using data collected in proton–proton collisions at the LHC at a centre-of-mass energy of 13 TeV in 2015. To reconstruct E_T^miss, fully calibrated electrons, muons, photons, hadronically decaying τ-leptons, and jets reconstructed from calorimeter energy deposits and charged-particle tracks are used. These are combined with the soft hadronic activity measured by reconstructed charged-particle tracks not associated with the hard objects. Possible double counting of contributions from reconstructed charged-particle tracks from the inner detector, energy deposits in the calorimeter, and reconstructed muons from the muon spectrometer is avoided by applying a signal ambiguity resolution procedure which rejects already used signals when combining the various E_T^miss contributions. The individual terms as well as the overall reconstructed E_T^miss are evaluated with various performance metrics for scale (linearity), resolution, and sensitivity to the data-taking conditions. The method developed to determine the systematic uncertainties of the E_T^miss scale and resolution is discussed. Results are shown based on the full 2015 data sample corresponding to an integrated luminosity of 3.2 fb⁻¹.

    Pan-cancer analysis of whole genomes

    Cancer is driven by genetic change, and the advent of massively parallel sequencing has enabled systematic documentation of this variation at the whole-genome scale (1-3). Here we report the integrative analysis of 2,658 whole-cancer genomes and their matching normal tissues across 38 tumour types from the Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium of the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). We describe the generation of the PCAWG resource, facilitated by international data sharing using compute clouds. On average, cancer genomes contained 4-5 driver mutations when combining coding and non-coding genomic elements; however, in around 5% of cases no drivers were identified, suggesting that cancer driver discovery is not yet complete. Chromothripsis, in which many clustered structural variants arise in a single catastrophic event, is frequently an early event in tumour evolution; in acral melanoma, for example, these events precede most somatic point mutations and affect several cancer-associated genes simultaneously. Cancers with abnormal telomere maintenance often originate from tissues with low replicative activity and show several mechanisms of preventing telomere attrition to critical levels. Common and rare germline variants affect patterns of somatic mutation, including point mutations, structural variants and somatic retrotransposition. A collection of papers from the PCAWG Consortium describes non-coding mutations that drive cancer beyond those in the TERT promoter (4); identifies new signatures of mutational processes that cause base substitutions, small insertions and deletions and structural variation (5,6); analyses timings and patterns of tumour evolution (7); describes the diverse transcriptional consequences of somatic mutation on splicing, expression levels, fusion genes and promoter activity (8,9); and evaluates a range of more-specialized features of cancer genomes (8,10-18). Peer reviewed

    Measurement of the tt̄Z and tt̄W cross sections in proton-proton collisions at √s = 13 TeV with the ATLAS detector

    A measurement of the associated production of a top-quark pair (tt̄) with a vector boson (W, Z) in proton-proton collisions at a center-of-mass energy of 13 TeV is presented, using 36.1 fb⁻¹ of integrated luminosity collected by the ATLAS detector at the Large Hadron Collider. Events are selected in channels with two same- or opposite-sign leptons (electrons or muons), three leptons or four leptons, and each channel is further divided into multiple regions to maximize the sensitivity of the measurement. The tt̄Z and tt̄W production cross sections are simultaneously measured using a combined fit to all regions. The best-fit values of the production cross sections are σ(tt̄Z) = 0.95 ± 0.08 (stat) ± 0.10 (syst) pb and σ(tt̄W) = 0.87 ± 0.13 (stat) ± 0.14 (syst) pb, in agreement with the Standard Model predictions. The measurement of the tt̄Z cross section is used to set constraints on effective field theory operators which modify the tt̄Z vertex.
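
To get a feel for the size of the quoted uncertainties, the sketch below combines the statistical and systematic components in quadrature and expresses the result relative to the central value. Treating the two components as independent is an assumption made here for illustration; the abstract does not quote a combined uncertainty, and the variable names are hypothetical.

```python
import math

# Best-fit cross sections in pb, copied from the abstract above:
# (central value, statistical uncertainty, systematic uncertainty).
measurements = {
    "ttZ": (0.95, 0.08, 0.10),
    "ttW": (0.87, 0.13, 0.14),
}

total_uncertainty = {}
relative_uncertainty = {}
for process, (central, stat, syst) in measurements.items():
    # Assumption: stat and syst are independent, so they add in quadrature.
    total = math.hypot(stat, syst)
    total_uncertainty[process] = total
    relative_uncertainty[process] = total / central

for process in measurements:
    print(
        f"{process}: total uncertainty {total_uncertainty[process]:.3f} pb, "
        f"relative {100 * relative_uncertainty[process]:.1f}%"
    )
```

Under this assumption the tt̄Z measurement carries a combined uncertainty of roughly 0.13 pb and the tt̄W measurement roughly 0.19 pb.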