31 research outputs found

    A Note on "Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms"

    Data valuation is a growing research field that studies the influence of individual data points on machine learning (ML) models. Data Shapley, inspired by cooperative game theory and economics, is an effective method for data valuation. However, it is well known that the Shapley value (SV) can be computationally expensive. Fortunately, Jia et al. (2019) showed that for K-Nearest Neighbors (KNN) models, the computation of Data Shapley is surprisingly simple and efficient. In this note, we revisit the work of Jia et al. (2019) and propose a more natural and interpretable utility function that better reflects the performance of KNN models. We derive the corresponding calculation procedure for the Data Shapley of KNN classifiers/regressors with the new utility functions. Our new approach, dubbed soft-label KNN-SV, achieves the same time complexity as the original method. We further provide an efficient approximation algorithm for soft-label KNN-SV based on locality sensitive hashing (LSH). Our experimental results demonstrate that soft-label KNN-SV outperforms the original method on most datasets in the task of mislabeled data detection, making it a better baseline for future work on data valuation.

    Renyi Differential Privacy of Propose-Test-Release and Applications to Private and Robust Machine Learning

    Propose-Test-Release (PTR) is a differential privacy framework that works with the local sensitivity of functions instead of their global sensitivity. This framework is typically used for releasing robust statistics such as the median or trimmed mean in a differentially private manner. While PTR is a common framework introduced over a decade ago, using it in applications such as robust SGD, where many adaptive robust queries are needed, is challenging. This is mainly due to the lack of a Renyi Differential Privacy (RDP) analysis, an essential ingredient underlying the moments accountant approach for differentially private deep learning. In this work, we generalize the standard PTR and derive the first RDP bound for it when the target function has bounded global sensitivity. We show that our RDP bound for PTR yields tighter DP guarantees than the directly analyzed (ε, δ)-DP. We also derive the algorithm-specific privacy amplification bound of PTR under subsampling. We show that our bound is much tighter than the general upper bound and close to the lower bound. Our RDP bounds enable tighter privacy loss calculation for the composition of many adaptive runs of PTR. As an application of our analysis, we show that PTR and our theoretical results can be used to design differentially private variants of Byzantine-robust training algorithms that use robust statistics for gradient aggregation. We conduct experiments on the settings of label, feature, and gradient corruption across different datasets and architectures. We show that the PTR-based private and robust training algorithm significantly improves utility compared with the baseline. Comment: NeurIPS 202

    Comparative Study on the Early Stage of Skid Resistance Development between Polyurethane-Bound Porous Mixture and Asphalt Mixture

    Polyurethane-bound porous mixture (PPM) is a new type of pavement material that has shown some potential for overcoming the mechanical failures common in asphalt mixtures. However, little research has been done on its skid resistance performance. This work presents a comparative study of the skid resistance development of PPM and asphalt mixtures at their early stage. In this study, three mixtures were each bonded with a different binder: polyurethane, 70# virgin bitumen, and styrene-butadiene-styrene (SBS) modified asphalt. To distinguish the three mixtures, we named them PPM, BAM, and SAM, respectively. A Taber abraser was used to test the polishing property of the binders. A third-scale model mobile loading simulator (MMLS3) was used to simulate traffic loading on the mixtures, and a British pendulum tester was used to measure the skid resistance of the three mixtures during the loading process. The binder polishing test results show a good linear relationship between the binder's mass loss and the number of polishing cycles. The slope of the fitted line was defined as the binder coefficient (BC) to characterize the polishing property of the binder. The mixture test results show that the skid resistance development trend of the three mixtures is similar: it first increases, then decreases, and finally flattens. However, the peak and stable British pendulum numbers of PPM are lower than those of SAM. The order of the number of loading times at peak (NLTP) of the three mixtures is SAM > PPM > BAM. Another good linear relationship is found between BC and NLTP, and the R² of the fitted model is 0.85, which indicates that the polishing property of the binder is effective for predicting when the mixture's skid resistance peak occurs. Accepted Author Manuscript. Urban Studie

    Fast growth of inch-sized single-crystalline graphene from a controlled single nucleus on Cu-Ni alloys

    Wafer-scale single-crystalline graphene monolayers are highly sought after as an ideal platform for electronic and other applications (1-3). At present, state-of-the-art growth methods based on chemical vapour deposition allow the synthesis of one-centimetre-sized single-crystalline graphene domains in ~12 h, by suppressing nucleation events on the growth substrate (4). Here we demonstrate an efficient strategy for achieving large-area single-crystalline graphene by letting a single nucleus evolve into a monolayer at a fast rate. By locally feeding carbon precursors to a desired position of a substrate composed of an optimized Cu-Ni alloy, we synthesized a ~1.5-inch-large graphene monolayer in 2.5 h. Localized feeding induces the formation of a single nucleus on the entire substrate, and the optimized alloy activates an isothermal segregation mechanism that greatly expedites the growth rate (5,6). This approach may also prove effective for the synthesis of wafer-scale single-crystalline monolayers of other two-dimensional materials.