Distributionally robust learning under the Wasserstein metric
Despite their satisfactory performance, most existing listwise Learning-To-Rank (LTR) models do not consider the crucial issue of robustness. A data set can be contaminated in various ways, including human error in labeling or annotation, distributional data shift, and malicious adversaries who wish to degrade the algorithm's performance. It has been shown that Distributionally Robust Optimization (DRO) is resilient against various types of noise and perturbations. To fill this gap, we introduce a new listwise LTR model called Distributionally Robust Multi-output Regression Ranking (DRMRR). Different from existing methods, the scoring function of DRMRR was designed as a multivariate mapping from a feature vector to a vector of deviation scores, which captures local context information and cross-document interactions. In this way, we are able to incorporate the LTR metrics into our model. DRMRR uses a Wasserstein DRO framework to minimize a multi-output loss function under the most adverse distributions in the neighborhood of the empirical data distribution defined by a Wasserstein ball. We present a compact and computationally solvable reformulation of the min-max formulation of DRMRR. Our experiments were conducted on two real-world applications: medical document retrieval and drug response prediction, showing that DRMRR notably outperforms state-of-the-art LTR models. We also conducted an extensive analysis to examine the resilience of DRMRR against various types of noise: Gaussian noise, adversarial perturbations, and label poisoning. 
Accordingly, DRMRR not only achieves significantly better performance than the baselines, but also maintains relatively stable performance as more noise is added to the data.
Center for Information and Systems Engineering; DMS-1664644 - National Science Foundation; IIS-1914792 - National Science Foundation; CCF-2200052 - National Science Foundation; N00014-19-1-2571 - Department of Defense/ONR; 000000000000000000000000000000000000000000000000000007726917 - Lawrence Berkeley National Laboratory
Published version
Outlier detection using distributionally robust optimization under the Wasserstein metric
We present a Distributionally Robust Optimization (DRO) approach to outlier detection in a linear regression setting, where the closeness of probability distributions is measured using the Wasserstein metric. Training samples contaminated with outliers skew the regression plane computed by least squares and thus impede outlier detection. Classical approaches, such as robust regression, remedy this problem by downweighting the contribution of atypical data points. In contrast, our Wasserstein DRO approach hedges against a family of distributions that are close to the empirical distribution. We show that the resulting formulation encompasses a class of models, which include the regularized Least Absolute Deviation (LAD) as a special case. We provide new insights into the regularization term and give guidance on the selection of the regularization coefficient from the standpoint of a confidence region. We establish two types of performance guarantees for the solution to our formulation under mild conditions. One is related to its out-of-sample behavior, and the other concerns the discrepancy between the estimated and true regression planes. Extensive numerical results demonstrate the superiority of our approach to both robust regression and the regularized LAD in terms of estimation accuracy and outlier detection rates.
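The regularized LAD special case mentioned above can be illustrated with a minimal sketch. It assumes an l1 penalty on the coefficients (chosen here only so that the problem becomes a linear program; the abstract does not fix the norm), no intercept term, and synthetic data. The function name `regularized_lad` and the parameter `eps` are hypothetical, and this is not the authors' implementation.

```python
import numpy as np
from scipy.optimize import linprog


def regularized_lad(X, y, eps):
    """Minimize (1/N) * sum_i |y_i - x_i' beta| + eps * ||beta||_1 as a linear program."""
    n, p = X.shape
    # Decision vector: [beta_plus (p), beta_minus (p), t (n)], all nonnegative,
    # with beta = beta_plus - beta_minus and t_i >= |y_i - x_i' beta|.
    cost = np.concatenate([eps * np.ones(2 * p), np.ones(n) / n])
    eye = np.eye(n)
    A_ub = np.block([[-X, X, -eye],   #   y_i - x_i' beta  <= t_i
                     [X, -X, -eye]])  # -(y_i - x_i' beta) <= t_i
    b_ub = np.concatenate([-y, y])
    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p] - res.x[p:2 * p]


# Synthetic data (hypothetical): a noiseless linear model with a few gross outliers.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 2))
beta_true = np.array([2.0, -1.0])
y = X @ beta_true
y[:5] += 50.0  # contaminate five responses

beta_lad = regularized_lad(X, y, eps=0.01)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]  # least squares, for comparison
print("regularized LAD:", beta_lad)
print("least squares:  ", beta_ols)
```

Points with large absolute residuals under the LAD fit would then be natural outlier candidates; the threshold for flagging them is left open here because the abstract does not specify one.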
Learning from past bids to participate strategically in day-ahead electricity markets
We consider the process of bidding by electricity suppliers in a day-ahead market context, where each supplier bids a linear non-decreasing function of her generating capacity with the goal of maximizing her individual profit given other competing suppliers' bids. Based on the submitted bids, the market operator schedules suppliers to meet demand during each hour and determines hourly market clearing prices. Eventually, this game-theoretic process reaches a Nash equilibrium when no supplier is motivated to modify her bid. However, solving the individual profit maximization problem requires information on rivals' bids, which is typically not available. To address this issue, we develop an inverse optimization approach for estimating rivals' production cost functions given historical market clearing prices and production levels. We then use these functions to bid strategically and compute Nash equilibrium bids. We present numerical experiments illustrating our methodology, showing good agreement between bids based on the estimated production cost functions and bids based on the true cost functions. We discuss an extension of our approach that takes into account network congestion resulting in location-dependent prices.
First author draft
Potential plasma biomarkers at low altitude for prediction of acute mountain sickness
Background: Ascending to high altitude can induce a range of physiological and molecular alterations, rendering a proportion of lowlanders unacclimatized. The prediction of acute mountain sickness (AMS) prior to ascent to high altitude remains elusive.
Methods: A total of 40 participants were enrolled in the discovery cohort, and plasma samples were collected from all individuals. The subjects were divided into a severe AMS-susceptible (sAMS) group, a moderate AMS-susceptible (mAMS) group and a non-AMS group based on the Lake Louise Score (LLS) at both 5000 m and 3700 m. Proteomic analysis was conducted on a cohort of 40 individuals to elucidate differentially expressed proteins (DEPs) and associated pathways between the AMS-susceptible and AMS-resistant groups at low altitude (1400 m) and middle high-altitude (3700 m). Subsequently, a validation cohort consisting of 118 individuals was enrolled. Plasma concentrations of selected DEPs were quantified using ELISA. Comparative analyses of DEPs among different groups in the validation cohort were performed, followed by Receiver Operating Characteristic (ROC) analysis to evaluate the predictive efficiency of DEPs for the occurrence of AMS.
Results: The occurrence of AMS symptoms and the LLS differed significantly among the three groups in the discovery cohort (p<0.05), as well as in the validation cohort. Comparison of plasma protein profiles using GO analysis revealed that DEPs were primarily enriched in granulocyte activation, neutrophil mediated immunity, and humoral immune response. The comparison of potential biomarkers between the sAMS group and the non-AMS group at low altitude revealed statistically higher levels of AAT, SAP and LTF in the sAMS group (p=0.01), with a combined area under the curve (AUC) of 0.965. Compared to the mAMS group at low altitude, both SAP and LTF were found to be significantly elevated in the sAMS group, with a combined AUC of 0.887. HSP90-α and SAP exhibited statistically higher levels in the mAMS group compared to the non-AMS group at low altitude, with a combined AUC of 0.874.
Conclusion: Inflammatory and immune related biological processes were significantly different between AMS-susceptible and AMS-resistant groups at low altitude and middle high-altitude. SAP, AAT, LTF and HSP90-α were considered as potential biomarkers at low altitude for the prediction of AMS.
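As a rough illustration of what an AUC value measures in the ROC analysis above, the sketch below computes it for a single score as the probability that a randomly chosen susceptible subject scores higher than a randomly chosen resistant one, counting ties as one half. The function name `roc_auc` and the concentration values are invented for illustration and are not data from the study. The abstract does not say how the markers were combined into one score, so that step is not modelled.

```python
def roc_auc(pos_scores, neg_scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    wins = 0.0
    for sp in pos_scores:
        for sn in neg_scores:
            if sp > sn:
                wins += 1.0   # positive subject ranked above negative subject
            elif sp == sn:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical marker concentrations (arbitrary units), invented for illustration only.
susceptible = [5.1, 6.3, 4.8, 7.0, 5.9]
resistant = [3.2, 4.9, 4.1, 3.8, 5.0]
print("AUC:", roc_auc(susceptible, resistant))
```

An AUC of 0.5 corresponds to a marker with no discriminative value, and 1.0 to perfect separation of the two groups.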
Distributionally Robust Learning under the Wasserstein Metric
This dissertation develops a comprehensive statistical learning framework that is robust to (distributional) perturbations in the data using Distributionally Robust Optimization (DRO) under the Wasserstein metric. The learning problems that are studied include: (i) Distributionally Robust Linear Regression (DRLR), which estimates a robustified linear regression plane by minimizing the worst-case expected absolute loss over a probabilistic ambiguity set characterized by the Wasserstein metric; (ii) Groupwise Wasserstein Grouped LASSO (GWGL), which aims at inducing sparsity at a group level when there exists a predefined grouping structure for the predictors, through defining a specially structured Wasserstein metric for DRO; (iii) Optimal decision making using DRLR informed K-Nearest Neighbors (K-NN) estimation, which selects among a set of actions the optimal one through predicting the outcome under each action using K-NN with a distance metric weighted by the DRLR solution; and (iv) Distributionally Robust Multivariate Learning, which solves a DRO problem with a multi-dimensional response/label vector, as in Multivariate Linear Regression (MLR) and Multiclass Logistic Regression (MLG), generalizing the univariate response model addressed in DRLR. A tractable DRO relaxation for each problem is derived, establishing a connection between robustness and regularization, and obtaining upper bounds on the prediction and estimation errors of the solution. The accuracy and robustness of the estimator are verified through a series of synthetic and real data experiments. The experiments with real data are all associated with various health informatics applications, an application area which motivated the work in this dissertation. In addition to estimation (regression and classification), this dissertation also considers outlier detection applications.
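For problem (i), the worst-case expected absolute loss can be written out as a sketch. The notation is assumed here and does not come from the abstract: \beta is the coefficient vector, \hat{\mathbb{P}}_N the empirical distribution of the N training samples (\mathbf{x}_i, y_i), \epsilon the radius of the ambiguity set \Omega, and W_1 the order-1 Wasserstein distance induced by a norm \|\cdot\| on (\mathbf{x}, y) with dual norm \|\cdot\|_*. Under these assumptions, the connection between robustness and regularization takes the form of a norm-penalized LAD problem that upper-bounds the min-max objective.

```latex
% DRLR: minimize the worst-case expected absolute loss over a Wasserstein ball
\inf_{\beta} \; \sup_{\mathbb{Q} \in \Omega} \;
  \mathbb{E}^{\mathbb{Q}}\bigl[\, |y - \mathbf{x}^{\top}\beta| \,\bigr],
\qquad
\Omega = \bigl\{ \mathbb{Q} : W_1(\mathbb{Q}, \hat{\mathbb{P}}_N) \le \epsilon \bigr\}

% Tractable relaxation: empirical absolute loss plus a dual-norm regularizer
\inf_{\beta} \; \frac{1}{N} \sum_{i=1}^{N} |y_i - \mathbf{x}_i^{\top}\beta|
  \; + \; \epsilon \, \bigl\| (-\beta, 1) \bigr\|_{*}
```

In this form the radius \epsilon of the ambiguity set plays the role of the regularization coefficient, which is one way to read the stated link between robustness and regularization.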
Distributionally Robust Multiclass Classification and Applications in Deep Image Classifiers
We develop a Distributionally Robust Optimization (DRO) formulation for
Multiclass Logistic Regression (MLR), which could tolerate data contaminated by
outliers. The DRO framework uses a probabilistic ambiguity set defined as a
ball of distributions that are close to the empirical distribution of the
training set in the sense of the Wasserstein metric. We relax the DRO
formulation into a regularized learning problem whose regularizer is a norm of
the coefficient matrix. We establish out-of-sample performance guarantees for
the solutions to our model, offering insights on the role of the regularizer in
controlling the prediction error. We apply the proposed method in rendering
deep Vision Transformer (ViT)-based image classifiers robust to random and
adversarial attacks. Specifically, using the MNIST and CIFAR-10 datasets, we
demonstrate reductions in test error rate by up to 83.5% and loss by up to
91.3% compared with baseline methods, by adopting a novel random training
method.
Comment: This work was intended as a replacement of arXiv:2109.12772 and any
subsequent updates will appear there.
Investigation of microstructures and strengthening mechanisms in an N-doped Co-Cr-Mo alloy fabricated by laser powder bed fusion
This study presents a comprehensive investigation into the microstructures, strengthening mechanisms and deformation behaviours of a novel N-doped Co-28Cr-6Mo (CCMN) alloy fabricated by laser powder bed fusion (LPBF). In addition to the well-known cellular structures and lattice defects, including dislocations and stacking faults (SFs), near-spherical Cr2N precipitates with two typical size distributions were detected. Tensile test results revealed that the LPBF fabricated CCMN alloy demonstrated superior yield strength of 845 ± 49 MPa and elongation to fracture of 12.7 ± 1.9%. The grain boundaries (∼277 MPa), high density of dislocations (∼176–193 MPa), Cr2N precipitates (∼243 MPa) and SFs (∼131 MPa) are regarded as the dominant strengthening contributors. On the other hand, the HCP phase triggered by strain-induced martensite transformation (SIMT) and the Lomer-Cottrell locks (L-C locks) associated with the numerous SFs significantly enhanced the alloy's strain hardening rate. More importantly, the formed Cr2N nanoprecipitates effectively suppressed the strain localisation and the premature failure along the HCP/FCC interfaces by deflecting the continuous growth of SFs, further contributing to the high ductility of the LPBF processed CCMN alloy. The present study is expected to shed light on the future development of N-doped high-performance cobalt-based alloys for the LPBF process.
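The reported strengthening contributions can be checked against the measured yield strength with simple arithmetic. The sketch below assumes that the four contributions add linearly and that nothing else contributes (for example lattice friction). The abstract states neither, so this is only a rough consistency check and not the authors' strengthening model.

```python
# Approximate contributions quoted in the abstract, as (low, high) bounds in MPa.
contributions_mpa = {
    "grain boundaries": (277, 277),
    "dislocations": (176, 193),
    "Cr2N precipitates": (243, 243),
    "stacking faults": (131, 131),
}

# Assumption: simple linear superposition of the four mechanisms.
total_low = sum(low for low, _ in contributions_mpa.values())
total_high = sum(high for _, high in contributions_mpa.values())

# Measured yield strength from the abstract: 845 +/- 49 MPa.
measured_low, measured_high = 845 - 49, 845 + 49

print(f"summed contributions: {total_low}-{total_high} MPa")
print(f"measured yield strength range: {measured_low}-{measured_high} MPa")
```

Under the linear-sum assumption the contributions total 827 to 844 MPa, which falls inside the measured range of 796 to 894 MPa.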