31 research outputs found

A Note on "Efficient Task-Specific Data Valuation for Nearest Neighbor Algorithms"

Author: Jia Ruoxi
Wang Jiachen T.
Publication date: 09/04/2023

Data valuation is a growing research field that studies the influence of individual data points on machine learning (ML) models. Data Shapley, inspired by cooperative game theory and economics, is an effective method for data valuation. However, it is well known that the Shapley value (SV) can be computationally expensive. Fortunately, Jia et al. (2019) showed that for K-Nearest Neighbors (KNN) models, the computation of Data Shapley is surprisingly simple and efficient. In this note, we revisit the work of Jia et al. (2019) and propose a more natural and interpretable utility function that better reflects the performance of KNN models. We derive the corresponding calculation procedure for the Data Shapley of KNN classifiers/regressors with the new utility functions. Our new approach, dubbed soft-label KNN-SV, achieves the same time complexity as the original method. We further provide an efficient approximation algorithm for soft-label KNN-SV based on locality sensitive hashing (LSH). Our experimental results demonstrate that soft-label KNN-SV outperforms the original method on most datasets in the task of mislabeled data detection, making it a better baseline for future work on data valuation.

arXiv.org e-Print Archive

Renyi Differential Privacy of Propose-Test-Release and Applications to Private and Robust Machine Learning

Author: Jia Ruoxi
Mahloujifar Saeed
Mittal Prateek
Wang Jiachen T.
Wang Shouda
Publication date: 16/09/2022

Propose-Test-Release (PTR) is a differential privacy framework that works with the local sensitivity of functions instead of their global sensitivity. This framework is typically used for releasing robust statistics such as the median or trimmed mean in a differentially private manner. While PTR is a common framework introduced over a decade ago, using it in applications such as robust SGD, where we need many adaptive robust queries, is challenging. This is mainly due to the lack of a Renyi Differential Privacy (RDP) analysis, an essential ingredient underlying the moments accountant approach for differentially private deep learning. In this work, we generalize the standard PTR and derive the first RDP bound for it when the target function has bounded global sensitivity. We show that our RDP bound for PTR yields tighter DP guarantees than the directly analyzed (ε, δ)-DP. We also derive the algorithm-specific privacy amplification bound of PTR under subsampling. We show that our bound is much tighter than the general upper bound and close to the lower bound. Our RDP bounds enable tighter privacy loss calculation for the composition of many adaptive runs of PTR. As an application of our analysis, we show that PTR and our theoretical results can be used to design differentially private variants of Byzantine-robust training algorithms that use robust statistics for gradient aggregation. We conduct experiments on the settings of label, feature, and gradient corruption across different datasets and architectures. We show that the PTR-based private and robust training algorithm significantly improves the utility compared with the baseline. Comment: NeurIPS 202

arXiv.org e-Print Archive

Optimizing active surveillance strategies to balance the competing goals of early detection of grade progression and minimizing harm from biopsies

Author: Auffenberg Gregory B.
Barnett Christine L.
Cheng Zian
Denton Brian T.
Mamawala Mufaddal
Miller David C.
Montie James E.
Wang Jiachen
Wei John T.
Yang Fan
Publication venue: 'Wiley'
Publication date: 13/11/2017

Peer Reviewed
https://deepblue.lib.umich.edu/bitstream/2027.42/142555/1/cncr31101.pdf
https://deepblue.lib.umich.edu/bitstream/2027.42/142555/2/cncr31101_am.pd

Crossref

Deep Blue Documents at the University of Michigan

Association of MMP-2 gene haplotypes with thoracic aortic dissection in Chinese Han population

Author: A Beeghly-Fadiel
AM Wendelboe
B Giusti
B Gong
CS Carlson
EM Isselbacher
F Porto Del
G Gavazzi
H Chintala
Haiyang Li
Hongjia Zhang
IJ Nunez-Gil
International HapMap C
JC Barrett
Jiachen Li
Jianrong Li
JM Chapman
JS Ikonomidis
JY Mak
K Kessler
KH Leung
L Chen
L Liu
M Fatar
M Proietta
Ming Gong
MJ Bown
N Cifani
N Sela-Passwell
O Liu
Ou Liu
R Wojciechowski
SB Gabriel
T Kurihara
W Xiong
Xiaolong Wang
XL Wang
Y Hua
Y Yuan
Yanwen Qin
Yi Xin
Yuyong Liu
Z Hu
Z Wang
Z Xu
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Comparative Study on the Early Stage of Skid Resistance Development between Polyurethane-Bound Porous Mixture and Asphalt Mixture

Author: Cong Lin
Shi Jiachen
Tan Le
Wang T.
Yang Fan
Yu Meng
Publication venue: 'American Society of Civil Engineers (ASCE)'
Publication date: 01/01/2020

Polyurethane-bound porous mixture (PPM) is a new type of pavement material that has shown some potential for overcoming the mechanical failures common to asphalt mixtures. However, little research has been done on its skid resistance performance. This work presents a comparative study of the skid resistance development of PPM and asphalt mixtures at their early stage. In this study, three mixtures were bound with three types of binder: polyurethane, 70# virgin bitumen, and styrene-butadiene-styrene (SBS) modified asphalt. To distinguish the three mixtures, we named them PPM, BAM, and SAM, respectively. A Taber abraser was used to test the polishing property of the binders. A third-scale model mobile loading simulator (MMLS3) was used to simulate traffic loading on the mixtures, and a British pendulum tester was used to measure the skid resistance of the three mixtures during the loading process. The binder polishing test results show a good linear relationship between the binder's mass loss and the polishing cycle. The slope of the fitted line relating the two parameters was defined as the binder coefficient (BC) to characterize the polishing property of the binder. The mixture test results show that the skid resistance of the three mixtures develops along a similar trend: it first increases, then decreases, and finally flattens. However, the peak and stable values of the British pendulum number for PPM are lower than those for SAM. The order of the number of loading times of peak (NLTP) of the three mixtures is SAM>PPM>BAM. Another good linear relationship is found between BC and NLTP, and the R² of the fitted model is 0.85, which indicates that the polishing property of the binder is effective for predicting when the mixture's skid resistance peak occurs. Accepted Author Manuscript. Urban Studie

TU Delft Repository

Research on Reliability Assessment of Thyristor in HVDC Converter Valve

Author: Cuicui Liu
Fang Zhuo
Feng Wang
Jiachen Tian
Liang N
Ning Liang
Xie T
Yang J
Yating Gou
Zhang J.B
Publication venue: 'IOP Publishing'

Crossref

Fast growth of inch-sized single-crystalline graphene from a controlled single nucleus on Cu-Ni alloys

Author: AW Tsen
CR Dean
D Waldmann
DC Geng
Feng Ding
Guangyuan Lu
H Kim
H Wang
H Zhou
Haomin Wang
Huishan Wang
I Vlassiouk
JH Lee
Jiachen Xue
KS Novoselov
L Gao
L Liao
L Wang
Mianheng Jiang
N Petrone
Qinghong Yuan
Qingkai Yu
QK Yu
S Bae
S Chen
T Ma
T Wu
Tianru Wu
VKS Shante
W Gannett
X Li
Xiaoming Xie
Xuefu Zhang
Y Hao
Y Liu
Y Zhang
YB Zhang
Z Yan
Zhihong Liu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

Wafer-scale single-crystalline graphene monolayers are highly sought after as an ideal platform for electronic and other applications (1-3). At present, state-of-the-art growth methods based on chemical vapour deposition allow the synthesis of one-centimetre-sized single-crystalline graphene domains in ~12 h, by suppressing nucleation events on the growth substrate (4). Here we demonstrate an efficient strategy for achieving large-area single-crystalline graphene by letting a single nucleus evolve into a monolayer at a fast rate. By locally feeding carbon precursors to a desired position of a substrate composed of an optimized Cu-Ni alloy, we synthesized a ~1.5-inch-large graphene monolayer in 2.5 h. Localized feeding induces the formation of a single nucleus on the entire substrate, and the optimized alloy activates an isothermal segregation mechanism that greatly expedites the growth rate (5,6). This approach may also prove effective for the synthesis of wafer-scale single-crystalline monolayers of other two-dimensional materials.

The Hong Kong Polytechnic University Pao Yue-kong Library

Crossref

PolyU Institutional Repository

ScholarWorks@UNIST

University of Queensland eSpace