
    PAC-Bayesian Contrastive Unsupervised Representation Learning

    Contrastive unsupervised representation learning (CURL) is the state-of-the-art technique to learn representations (as a set of features) from unlabelled data. While CURL has collected several empirical successes recently, theoretical understanding of its performance was still missing. In a recent work, Arora et al. (2019) provide the first generalisation bounds for CURL, relying on Rademacher complexity. We extend their framework to the flexible PAC-Bayes setting, which allows us to handle the non-iid setting. We present PAC-Bayesian generalisation bounds for CURL, which are then used to derive a new representation learning algorithm. Numerical experiments on real-life datasets illustrate that our algorithm achieves competitive accuracy and yields generalisation bounds with non-vacuous values.

    Clinical trial simulation to evaluate power to compare the antiviral effectiveness of two hepatitis C protease inhibitors using nonlinear mixed effect models: a viral kinetic approach.

    BACKGROUND: Models of hepatitis C virus (HCV) kinetics are increasingly used to estimate and compare the in vivo antiviral effectiveness of new potent anti-HCV agents. Viral kinetic parameters can be estimated using non-linear mixed effect models (NLMEM). Here we aimed to evaluate how precisely this approach estimates the parameters, and to evaluate the type I error and the power of the Wald test for comparing antiviral effectiveness between two treatment groups when data are sparse and/or a large proportion of viral load (VL) measurements are below the limit of detection (BLD). METHODS: We performed a clinical trial simulation assuming two treatment groups with different levels of antiviral effectiveness. We evaluated the precision and accuracy of parameter estimates obtained on 500 replications of this trial using the stochastic approximation expectation-maximization (SAEM) algorithm, which appropriately handles BLD data. Next we evaluated the type I error and the power of the Wald test to assess a difference in antiviral effectiveness between the two groups. Standard errors of the parameters and Wald test properties were evaluated according to the number of patients, the number of samples per patient and the expected difference in antiviral effectiveness. RESULTS: NLMEM provided precise and accurate estimates for both the fixed effects and the inter-individual variance parameters, even with sparse data and a large proportion of BLD data. However, the Wald test with a small number of patients and a lack of information due to BLD resulted in an inflation of the type I error compared with the results obtained when no limit of detection of VL was considered. The corrected power of the test was very high and largely outperformed what can be obtained with an empirical comparison of the mean VL decline using the Wilcoxon test. CONCLUSION: This simulation study shows the benefit of viral kinetic models analyzed with NLMEM over the empirical approaches used in most clinical studies. When designing a viral kinetic study, our results indicate that enrolling a large number of patients is preferable to a small population sample with frequent assessments of VL.
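To make the Wald test and the type I error evaluation mentioned in this abstract concrete, here is a minimal sketch. It assumes a scalar estimated difference in effectiveness and its standard error are already available. The function names `wald_test` and `rejection_rate` are hypothetical, and the sketch does not reproduce the NLMEM estimation or the BLD handling used in the study.

```python
import math


def wald_test(estimate, std_error):
    """Two-sided Wald test of H0: the true parameter equals 0.

    Returns the Wald statistic z = estimate / std_error and the p-value
    under a standard normal reference distribution.
    """
    if std_error <= 0:
        raise ValueError("std_error must be positive")
    z = estimate / std_error
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def rejection_rate(p_values, alpha=0.05):
    """Fraction of simulated trials in which H0 is rejected at level alpha.

    Computed on replicates simulated under H0, this is the empirical
    type I error; under an alternative, it is the empirical power.
    """
    if not p_values:
        raise ValueError("p_values must be non-empty")
    return sum(p < alpha for p in p_values) / len(p_values)
```

In a simulation study of this kind, `wald_test` would be applied to each simulated replicate and `rejection_rate` to the resulting list of p-values.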

    Focused Bayesian Prediction

    We propose a new method for conducting Bayesian prediction that delivers accurate predictions without correctly specifying the unknown true data generating process. A prior is defined over a class of plausible predictive models. After observing data, we update the prior to a posterior over these models, via a criterion that captures a user-specified measure of predictive accuracy. Under regularity conditions, this update yields posterior concentration onto the element of the predictive class that maximizes the expectation of the accuracy measure. In a series of simulation experiments and empirical examples we find notable gains in predictive accuracy relative to conventional likelihood-based prediction.
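As an illustration of updating a prior over predictive models with an accuracy criterion rather than a likelihood, the sketch below uses exponential weighting over a finite class of models. This form is an assumption made here for illustration, since the abstract does not state the exact update. The function name `score_based_posterior` and the `weight` parameter are hypothetical.

```python
import math


def score_based_posterior(prior, cumulative_scores, weight=1.0):
    """Posterior over a finite class of predictive models.

    Assumed update (illustrative only):
        posterior_k is proportional to prior_k * exp(weight * score_k)
    where score_k is the cumulative predictive accuracy (higher is better)
    of model k on the observed data. All prior entries must be positive.
    """
    if len(prior) != len(cumulative_scores):
        raise ValueError("prior and cumulative_scores must have equal length")
    log_post = [math.log(p) + weight * s
                for p, s in zip(prior, cumulative_scores)]
    # Subtract the maximum before exponentiating for numerical stability.
    m = max(log_post)
    unnorm = [math.exp(lp - m) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]
```

With the log score as the accuracy measure and `weight=1.0`, this reduces to the usual likelihood-based Bayesian update over the model class. Other scoring rules shift posterior mass toward the models that perform best on the chosen measure.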

    ChEBI: a database and ontology for chemical entities of biological interest

    Chemical Entities of Biological Interest (ChEBI) is a freely available dictionary of molecular entities focused on ‘small’ chemical compounds. The molecular entities in question are either natural products or synthetic products used to intervene in the processes of living organisms. Genome-encoded macromolecules (nucleic acids, proteins and peptides derived from proteins by cleavage) are not as a rule included in ChEBI. In addition to molecular entities, ChEBI contains groups (parts of molecular entities) and classes of entities. ChEBI includes an ontological classification, whereby the relationships between molecular entities or classes of entities and their parents and/or children are specified. ChEBI is available online at http://www.ebi.ac.uk/chebi

    A Reliable Method for the Selection of Exploitable Melanoma Archival Paraffin Embedded Tissues for Transcript Biomarker Profiling

    The source tissue for biomarker mRNA expression profiling of tumors has traditionally been fresh-frozen tissue. The adaptation of formalin-fixed, paraffin-embedded (FFPE) tissues for routine mRNA profiling would be invaluable in view of their abundance and the clinical information related to them. However, their use in the clinic remains a challenge due to the poor quality of RNA extracted from such tissues. Here, we developed a method for the selection of melanoma archival paraffin-embedded tissues that can be reliably used for transcript biomarker profiling. To this end, we used qRT-PCR to compare, in matched pairs of frozen and FFPE melanoma tissues, the expression of 25 genes involved in angiogenesis/tumor invasion and 15 housekeeping genes. A classification method was developed that can select the samples with a good frozen/FFPE correlation and identify those that should be discarded, on the basis of paraffin data for four reference genes only. We therefore propose a simple and inexpensive assay that improves the reliability of mRNA profiling in FFPE samples by allowing the identification and analysis of “good” samples only. This assay, which can be extended to other genes, would however need validation at the clinical level and on independent tumor series.

    “Excellence R Us”: university research and the fetishisation of excellence

    The rhetoric of “excellence” is pervasive across the academy. It is used to refer to research outputs as well as researchers, theory and education, individuals and organisations, from art history to zoology. But does “excellence” actually mean anything? Does this pervasive narrative of “excellence” do any good? Drawing on a range of sources we interrogate “excellence” as a concept and find that it has no intrinsic meaning in academia. Rather it functions as a linguistic interchange mechanism. To investigate whether this linguistic function is useful we examine how the rhetoric of excellence combines with narratives of scarcity and competition to show that the hypercompetition that arises from the performance of “excellence” is completely at odds with the qualities of good research. We trace the roots of issues in reproducibility, fraud, and homophily to this rhetoric. But we also show that this rhetoric is an internal, and not primarily an external, imposition. We conclude by proposing an alternative rhetoric based on soundness and capacity-building. In the final analysis, it turns out that “excellence” is not excellent. Used in its current unqualified form it is a pernicious and dangerous rhetoric that undermines the very foundations of good research and scholarship.

    Crises and collective socio-economic phenomena: simple models and challenges

    Financial and economic history is strewn with bubbles and crashes, booms and busts, crises and upheavals of all sorts. Understanding the origin of these events is arguably one of the most important problems in economic theory. In this paper, we review recent efforts to include heterogeneities and interactions in models of decision. We argue that the Random Field Ising model (RFIM) indeed provides a unifying framework to account for many collective socio-economic phenomena that lead to sudden ruptures and crises. We discuss different models that can capture potentially destabilising self-referential feedback loops, induced either by herding, i.e. reference to peers, or trending, i.e. reference to the past, and account for some of the phenomenology missing in the standard models. We discuss some empirically testable predictions of these models, for example robust signatures of RFIM-like herding effects, or the logarithmic decay of spatial correlations of voting patterns. One of the most striking results, inspired by statistical physics methods, is that Adam Smith's invisible hand can badly fail at solving simple coordination problems. We also insist on the issue of time-scales, which can be extremely long in some cases and prevent socially optimal equilibria from being reached. As a theoretical challenge, the study of so-called "detailed-balance" violating decision rules is needed to decide whether conclusions based on current models (that all assume detailed-balance) are indeed robust and generic.

    Clinically Relevant Characterization of Lung Adenocarcinoma Subtypes Based on Cellular Pathways: An International Validation Study

    Lung adenocarcinoma (AD) represents a predominant type of lung cancer demonstrating significant morphologic and molecular heterogeneity. We sought to understand this heterogeneity by utilizing gene expression analyses of 432 AD samples and examining associations between 27 known cancer-related pathways and the AD subtype, clinical characteristics and patient survival. Unsupervised clustering of AD and gene expression enrichment analysis reveal that cell proliferation is the most important pathway separating tumors into subgroups. Further, ADs with increased cell proliferation demonstrate significantly poorer outcome and an increased solid AD subtype component. Additionally, we find that tumors with any solid component have decreased survival as compared to tumors without a solid component. These results suggest that a relatively simple pathological examination of a tumor could be used to determine its aggressiveness and the patient's prognosis. Additional results suggest the ability to use a similar approach to determine a patient's sensitivity to targeted treatment. We then demonstrated the consistency of these findings in two independent AD cohorts from Asia (N = 87) and Europe (N = 89) using identical analytic procedures.

    Molecular apocrine differentiation is a common feature of breast cancer in patients with germline PTEN mutations

    INTRODUCTION: Breast carcinoma is the main malignant tumor occurring in patients with Cowden disease, a cancer-prone syndrome caused by germline mutation of the tumor suppressor gene PTEN characterized by the occurrence throughout life of hyperplastic, hamartomatous and malignant growths affecting various organs. The absence of known histological features for breast cancer arising in a PTEN-mutant background prompted us to explore them for potential new markers. METHODS: We first performed a microarray study of three tumors from patients with Cowden disease in the context of a transcriptomic study of 74 familial breast cancers. A subsequent histological and immunohistochemical study including 12 additional cases of Cowden disease breast carcinomas was performed to confirm the microarray data. RESULTS: Unsupervised clustering of the 74 familial tumors followed the intrinsic gene classification of breast cancer except for a group of five tumors that included the three Cowden tumors. The gene expression profile of the Cowden tumors shows considerable overlap with that of a breast cancer subgroup known as molecular apocrine breast carcinoma, which is suspected to have increased androgenic signaling and shows frequent ERBB2 amplification in sporadic tumors. The histological and immunohistochemical study showed that several cases had apocrine histological features and expressed GGT1, which is a potential new marker for apocrine breast carcinoma. CONCLUSIONS: These data suggest that activation of the ERBB2-PI3K-AKT pathway by loss of PTEN at early stages of tumorigenesis promotes the formation of breast tumors with apocrine features.