
    Predictor-Rejector Multi-Class Abstention: Theoretical Analysis and Algorithms

    We study the key framework of learning with abstention in the multi-class classification setting. In this setting, the learner can choose to abstain from making a prediction at some pre-defined cost. We present a series of new theoretical and algorithmic results for this learning problem in the predictor-rejector framework. We introduce several new families of surrogate losses for which we prove strong non-asymptotic and hypothesis set-specific consistency guarantees, thereby positively resolving two existing open questions. These guarantees provide upper bounds on the estimation error of the abstention loss function in terms of that of the surrogate loss. We analyze both a single-stage setting, where the predictor and rejector are learned simultaneously, and a two-stage setting, crucial in applications, where the predictor is learned in a first stage using a standard surrogate loss such as cross-entropy. These guarantees suggest new multi-class abstention algorithms based on minimizing these surrogate losses. We also report the results of extensive experiments comparing these algorithms to the current state-of-the-art algorithms on the CIFAR-10, CIFAR-100 and SVHN datasets. Our results demonstrate empirically the benefit of our new surrogate losses and show the remarkable performance of our broadly applicable two-stage abstention algorithm.
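
    To make the abstention loss concrete, the following is a minimal Python sketch of the predictor-rejector objective described above, assuming the convention that the model abstains when the rejector score is non-positive; the names and the cost value are illustrative rather than the paper's notation.

```python
import numpy as np

def abstention_loss(preds, reject_scores, labels, cost=0.3):
    """Predictor-rejector abstention loss: pay `cost` whenever the
    rejector abstains (reject_score <= 0); otherwise pay the zero-one
    loss of the predictor."""
    abstain = reject_scores <= 0
    misclassified = (preds != labels).astype(float)
    return np.where(abstain, cost, misclassified).mean()

# Toy usage: the rejector abstains on the second point only.
preds = np.array([0, 2, 1])
labels = np.array([0, 1, 1])
reject_scores = np.array([1.0, -0.5, 2.0])
print(abstention_loss(preds, reject_scores, labels))  # (0 + 0.3 + 0) / 3 = 0.1
```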

    Cross-Entropy Loss Functions: Theoretical Analysis and Applications

    Cross-entropy is a widely used loss function in applications. It coincides with the logistic loss applied to the outputs of a neural network when the softmax is used. But what guarantees can we rely on when using cross-entropy as a surrogate loss? We present a theoretical analysis of a broad family of losses, comp-sum losses, that includes cross-entropy (or logistic loss), generalized cross-entropy, the mean absolute error and other cross-entropy-like loss functions. We give the first H-consistency bounds for these loss functions. These are non-asymptotic guarantees that upper bound the zero-one loss estimation error in terms of the estimation error of a surrogate loss, for the specific hypothesis set H used. We further show that our bounds are tight. These bounds depend on quantities called minimizability gaps, which only depend on the loss function and the hypothesis set. To make them more explicit, we give a specific analysis of these gaps for comp-sum losses. We also introduce a new family of loss functions, smooth adversarial comp-sum losses, derived from their comp-sum counterparts by adding a related smooth term. We show that these loss functions are beneficial in the adversarial setting by proving that they admit H-consistency bounds. This leads to new adversarial robustness algorithms that consist of minimizing a regularized smooth adversarial comp-sum loss. While our main purpose is a theoretical analysis, we also present an extensive empirical analysis comparing comp-sum losses. We further report the results of a series of experiments demonstrating that our adversarial robustness algorithms outperform the current state-of-the-art, while also achieving superior non-adversarial accuracy.
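
    As a rough illustration of the comp-sum family, the sketch below parameterizes a loss that recovers cross-entropy at q = 0 and is proportional to the mean absolute error at q = 1, following the usual generalized cross-entropy convention; this parameterization is an assumption for illustration, not the paper's exact definition.

```python
import numpy as np

def softmax(scores):
    z = scores - scores.max(axis=-1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def comp_sum_loss(scores, label, q=0.0):
    """q = 0 recovers cross-entropy (logistic loss of the softmax scores);
    q = 1 gives 1 - p_y, proportional to the mean absolute error;
    intermediate q corresponds to generalized cross-entropy."""
    p = softmax(scores)[label]
    if q == 0.0:
        return -np.log(p)
    return (1.0 - p**q) / q

scores = np.array([2.0, 0.5, -1.0])
print(comp_sum_loss(scores, label=0))         # cross-entropy
print(comp_sum_loss(scores, label=0, q=1.0))  # 1 - p_0, MAE-like
```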

    Regression with Multi-Expert Deferral

    Learning to defer with multiple experts is a framework in which the learner can choose to defer the prediction to several experts. While this problem has received significant attention in classification contexts, it presents unique challenges in regression due to the infinite and continuous nature of the label space. In this work, we introduce a novel framework of regression with deferral, which involves deferring the prediction to multiple experts. We present a comprehensive analysis for both the single-stage scenario, where the predictor and deferral functions are learned simultaneously, and the two-stage scenario, which involves a pre-trained predictor with a learned deferral function. We introduce new surrogate loss functions for both scenarios and prove that they are supported by H-consistency bounds. These bounds provide consistency guarantees that are stronger than Bayes consistency, as they are non-asymptotic and hypothesis set-specific. Our framework is versatile: it applies to multiple experts, accommodates any bounded regression loss, addresses both instance-dependent and label-dependent costs, and supports both single-stage and two-stage methods. A by-product is that our single-stage formulation includes the recent regression with abstention framework (Cheng et al., 2023) as a special case, where only a single expert, the squared loss and a label-independent cost are considered. Minimizing our proposed loss functions directly leads to novel algorithms for regression with deferral. We report the results of extensive experiments showing the effectiveness of our proposed algorithms.
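
    For intuition, here is a minimal sketch of the deferral cost in the two-stage setting, assuming the squared loss and an optional per-expert consultation cost; the interface and cost structure are illustrative, since the framework itself accommodates any bounded regression loss and instance- or label-dependent costs.

```python
def deferral_loss(pred, expert_preds, defer_idx, y, expert_costs=None):
    """Two-stage deferral cost for one example: defer_idx == -1 keeps the
    pre-trained predictor's output; otherwise route to expert defer_idx
    and pay its squared loss plus an optional consultation cost."""
    if defer_idx < 0:
        return (pred - y) ** 2
    cost = 0.0 if expert_costs is None else expert_costs[defer_idx]
    return (expert_preds[defer_idx] - y) ** 2 + cost

# Deferring to the second expert: (0.9 - 1.0)^2 + 0.10 = 0.11
print(deferral_loss(pred=2.0, expert_preds=[1.2, 0.9], defer_idx=1,
                    y=1.0, expert_costs=[0.05, 0.10]))
```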

    Post-stroke cognitive impairment: exploring molecular mechanisms and omics biomarkers for early identification and intervention

    Post-stroke cognitive impairment (PSCI) is a major consequence of stroke that severely affects patients’ quality of life and survival. For this reason, it is especially crucial to identify high-risk groups and intervene early during the acute phase of stroke. Currently, there are no reliable and efficient techniques for the early diagnosis, appropriate evaluation, or prognostication of PSCI. However, a growing number of biomarkers in stroke patients have been linked to cognitive impairment in recent years. High-throughput omics techniques, which generate and process large amounts of high-quality data, have been used to screen and identify biomarkers of PSCI in order to investigate the molecular mechanisms of the disease. These techniques include metabolomics, which explores dynamic changes in the organism; gut microbiomics, which studies host–microbe interactions; genomics, which elucidates deeper disease mechanisms; and transcriptomics and proteomics, which describe gene expression and regulation. We searched electronic databases such as PubMed, the Cochrane Library, Embase and Web of Science, as well as databases specific to each omics field, for biomarkers that might be connected to the pathophysiology of PSCI. In all, we found 34 studies: 14 in the field of metabolomics, 5 in the field of gut microbiomics, 5 in the field of genomics, 4 in the field of transcriptomics, and 7 in the field of proteomics. We found that neuroinflammation, oxidative stress and atherosclerosis may be the primary causes of PSCI development, and that metabolomics may play a role in elucidating the molecular mechanisms of PSCI. In this study, we summarize the existing issues across omics technologies and discuss the latest discoveries of PSCI biomarkers in the context of omics, with the goal of investigating the molecular causes of post-stroke cognitive impairment. We also discuss the potential therapeutic utility of omics platforms for PSCI mechanisms, diagnosis, and intervention in order to advance the field towards precision PSCI treatment.

    No Banquet Can Do without Liquor: Alcohol counterfeiting in the People’s Republic of China

    The illegal trade in alcohol has been an empirical manifestation of organised crime with a very long history; yet, the nature of the illegal trade in alcohol has received relatively limited academic attention in recent years, despite the fact that it has been linked with significant tax evasion as well as serious health problems and even deaths. The current article focuses on a specific type of activity associated with the illegal trade in alcohol: the counterfeiting of alcohol in China. The article pays particular attention to the counterfeiting of baijiu, a Chinese liquor, in mainland China. The aim of the article is to offer an account of the social organisation of the alcohol counterfeiting business in China by illustrating the counterfeiting process, the actors in the business, and its possible embeddedness in legal practices and industries/trades. The alcohol counterfeiting business is highly responsive to market demand and consumer needs. Alcohol counterfeiting in China is characterised primarily by independent actors, many of whom are subcontracted to provide commodities and services at various stages of the counterfeiting process. The business relies on personal networks – family and extended family members, friends and acquaintances. Relationships between actors in the business are very often based on a customer-supplier relationship or a ‘business-to-business market’. The alcohol counterfeiting business in China highlights the symbiotic relationship between illegal and legal businesses.

    Non-Standard Errors

    In statistics, samples are drawn from a population in a data-generating process (DGP). Standard errors measure the uncertainty in estimates of population parameters. In science, evidence is generated to test hypotheses in an evidence-generating process (EGP). We claim that EGP variation across researchers adds uncertainty: non-standard errors (NSEs). We study NSEs by letting 164 teams test the same hypotheses on the same data. NSEs turn out to be sizable, but smaller for more reproducible or higher-rated research. Adding peer-review stages reduces NSEs. We further find that participants underestimate this type of uncertainty.
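
    A toy simulation can illustrate the distinction: on one shared dataset, hypothetical "teams" each make a slightly different but defensible analysis choice, and the dispersion of their point estimates (the NSE) is compared with the conventional standard error. The setup below is purely illustrative and is not the study's protocol.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=1.0, scale=2.0, size=5_000)  # one shared dataset

# Each hypothetical "team" picks a different defensible trimming threshold.
estimates = []
for trim in np.linspace(0.0, 0.05, 25):
    lo, hi = np.quantile(data, [trim, 1 - trim])
    sample = data[(data >= lo) & (data <= hi)]
    estimates.append(sample.mean())

standard_error = data.std(ddof=1) / np.sqrt(len(data))  # sampling uncertainty
non_standard_error = np.std(estimates, ddof=1)          # across-team dispersion
print(f"SE = {standard_error:.4f}, NSE = {non_standard_error:.4f}")
```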

    Limits-to-arbitrage in the Australian Equity Market

    Shleifer and Vishny (1997) and Pontiff (2006) contend that limits-to-arbitrage prevent investors from correcting mispricing, thereby allowing well-known stock market anomalies to persist. While there is a respectable body of literature on Australian stock market anomalies, the influence of limits-to-arbitrage remains largely unexplored. As such, this thesis conducts a comprehensive study of the role played by arbitrage costs in a variety of Australian stock market anomalies. In addition, this thesis examines how different types of arbitrage costs affect a selection of prominent anomalies (size, book-to-market, gross profitability, asset growth, momentum, MAX, beta and total volatility).

    The MAX effect: An exploration of risk and mispricing explanations

    This paper studies the role that risk and mispricing play in the negative relation between extreme positive returns and future returns. We document a strong ‘MAX effect’ in Australian equities over 1991–2013 that is robust to risk adjustment and to controlling for other influential stock characteristics and that, importantly, manifests even in a partition of the 500 largest stocks. While there is no evidence that MAX proxies for sensitivity to risk, the findings are highly consistent with a mispricing explanation. Adapting the recent methodological innovation of Stambaugh et al. (2015) to classify stocks by their degree of mispricing, we show that the MAX effect concentrates amongst the most-overpriced stocks but actually reverses amongst the most-underpriced stocks. Consistent with arbitrage asymmetry, the magnitude of the MAX effect amongst overpriced stocks exceeds that amongst underpriced stocks, leading to the overall negative relation that has been well documented.
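
    As a sketch of the portfolio sort underlying the MAX effect, the hypothetical helper below computes each stock's largest daily return within a month, bins stocks on that MAX value, and averages next-month returns per bin; the column names, quintile scheme and synthetic data are assumptions for illustration, not the paper's procedure.

```python
import numpy as np
import pandas as pd

def max_effect_sort(daily, n_bins=5):
    """Bin stocks each month on MAX (largest daily return in the month)
    and report the average return in the following month per bin."""
    monthly = daily.groupby(['stock', 'month'])['ret'].agg(
        MAX='max', mret=lambda r: (1 + r).prod() - 1).reset_index()
    # Rows are sorted by (stock, month), so shift(-1) is next month's return.
    monthly['next_ret'] = monthly.groupby('stock')['mret'].shift(-1)
    monthly['bin'] = monthly.groupby('month')['MAX'].transform(
        lambda x: pd.qcut(x, n_bins, labels=False, duplicates='drop'))
    return monthly.groupby('bin')['next_ret'].mean()

# Synthetic demo: 20 stocks, 2 months of 20 trading days each.
rng = np.random.default_rng(1)
demo = pd.DataFrame({
    'stock': np.repeat([f's{i}' for i in range(20)], 40),
    'month': np.tile(np.repeat([1, 2], 20), 20),
    'ret': rng.normal(0.0005, 0.02, size=20 * 40),
})
print(max_effect_sort(demo))
```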

    Anomalies, risk adjustment and seasonality: Australian evidence

    On the basis of raw return analysis, economically significant anomalies appear to exist in relation to the size, momentum, book-to-market and profitability of Australian firms. However, characteristic-sorted portfolios are shown to load in very particular ways on multiple risk factors. After adjusting for exposure to risk, convincing evidence remains only for the size premium. An analysis of seasonality shows that, rather than being spread evenly throughout the year, anomaly returns are concentrated in a handful of months. We provide and test preliminary explanations of the observed seasonality in these well-known anomalies.
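
    To illustrate what "adjusting for exposure to risk" involves, the sketch below estimates a portfolio's abnormal return as the intercept of a time-series regression on factor returns; the three simulated factors and all names are hypothetical, not the factor model used in the paper.

```python
import numpy as np

def risk_adjusted_alpha(port_ret, factors):
    """Regress portfolio returns on factor returns; the intercept is the
    risk-adjusted (abnormal) return, the remaining coefficients are the
    factor loadings."""
    X = np.column_stack([np.ones(len(port_ret)), factors])
    coefs, *_ = np.linalg.lstsq(X, port_ret, rcond=None)
    return coefs[0]

# Simulated example: 120 months, three factors, true alpha of 0.002.
rng = np.random.default_rng(2)
factors = rng.normal(0.005, 0.04, size=(120, 3))
port_ret = 0.002 + factors @ np.array([1.0, 0.5, 0.3]) + rng.normal(0, 0.01, 120)
print(risk_adjusted_alpha(port_ret, factors))  # close to 0.002
```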