89 research outputs found
The Girl with the Peanut Necklace: Experiences of Infertility and in vitro Fertilization in China
A 2014-2015 William Prize for best essay in East Asian Studies was awarded to Ruoxi Yu (Berkeley College) for her essay submitted to the Department of Anthropology, “The Girl with the Peanut Necklace: Experiences of Infertility and in vitro Fertilization in China.” (Marcia Inhorn, William K. Lanman Jr. Professor of Anthropology, advisor; Susan Brownell, Professor of Anthropology at UMSL, secondary reader.)
Ruoxi Yu’s essay, “The Girl with the Peanut Necklace: Experiences of Infertility and in vitro Fertilization in China,” situates original research within the history of the one-child birth control policy and the tension between the demands of the family and the state. The first thing that strikes one about this senior essay is that, at 130 pages, it is not far from being a dissertation. It is based on 10 weeks of ethnographic fieldwork in an infertility clinic in Tianjin combined with two semesters of library research and writing. The quality of the ethnographic research is remarkable for an undergraduate. The setting was very sensitive and required sitting around the clinic waiting for an opportunity to draw a patient into conversation, eventually asking for permission to conduct an interview. Ruoxi’s social skills and facility in Chinese enabled her to interview a number of women who divulged intimate details of their lives. Even many anthropology Ph.D. students find it difficult to pull meaningful information out of the messiness of real life and to organize it within academic frameworks. In the end, Ruoxi successfully draws on medical anthropology and feminist theory to link her research results with time-honored anthropological debates about the Chinese family, and also with recent thinking about medical technologies and their relationship with the state.
Does GNN Pretraining Help Molecular Representation?
Extracting informative representations of molecules using graph neural
networks (GNNs) is crucial in AI-driven drug discovery. Recently, the graph
research community has been trying to replicate the success of self-supervised
pretraining in natural language processing, with several successes claimed.
However, we find that the benefit brought by self-supervised pretraining on small
molecular data can be negligible in many cases. We conduct thorough ablation
studies on the key components of GNN pretraining, including pretraining
objectives, data splitting methods, input features, pretraining dataset scales,
and GNN architectures, to see how they affect the accuracy of the downstream
tasks. Our first important finding is that, in many settings, self-supervised
graph pretraining does not have statistically significant advantages over
non-pretraining methods. Secondly, although noticeable improvement can be
observed with additional supervised pretraining, the improvement may diminish
with richer features or more balanced data splits. Thirdly, hyperparameters can
have a larger impact on downstream-task accuracy than the choice of
pretraining tasks, especially when the downstream tasks are small in scale.
Finally, we conjecture that the complexity of some pretraining methods on small
molecules may be insufficient, and support this with empirical evidence on
different pretraining datasets.
Estimating the Distribution of Random Parameters in a Diffusion Equation Forward Model for a Transdermal Alcohol Biosensor
We estimate the distribution of random parameters in a distributed parameter
model with unbounded input and output for the transdermal transport of ethanol
in humans. The model takes the form of a diffusion equation with the input
being the blood alcohol concentration and the output being the transdermal
alcohol concentration. Our approach is based on the idea of reformulating the
underlying dynamical system in such a way that the random parameters are now
treated as additional space variables. When the distribution to be estimated is
assumed to be defined in terms of a joint density, estimating the distribution
is equivalent to estimating the diffusivity in a multi-dimensional diffusion
equation and thus well-established finite dimensional approximation schemes,
functional analytic based convergence arguments, optimization techniques, and
computational methods may all be employed. We use our technique to estimate a
bivariate normal distribution based on data for multiple drinking episodes from
a single subject.
Comment: 10 pages
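The reformulation described above can be illustrated with a minimal sketch: the random diffusivity q becomes an extra grid axis, the heat equation is solved for every q value simultaneously, and the output is weighted by an assumed density over q. This is only an illustration of the idea, not the paper's actual scheme; the grid sizes, boundary conditions, and the N(1.0, 0.15²) density are all hypothetical.

```python
import numpy as np

# Minimal sketch (hypothetical discretization, not the paper's scheme):
# treat the random diffusivity q as an extra grid axis, solve the heat
# equation u_t = q * u_xx for every q simultaneously, then weight the
# outer-boundary output by an assumed normal density over q.

nx, nq, nt = 21, 15, 2000
x = np.linspace(0.0, 1.0, nx)
q = np.linspace(0.5, 1.5, nq)          # grid over the random parameter
dx, dt = x[1] - x[0], 1e-4             # dt*q_max/dx^2 = 0.06 < 0.5: stable

u = np.zeros((nq, nx))
u[:, 0] = 1.0                          # unit input at the inner boundary

for _ in range(nt):                    # explicit Euler in time
    lap = (u[:, :-2] - 2 * u[:, 1:-1] + u[:, 2:]) / dx**2
    u[:, 1:-1] += dt * q[:, None] * lap
    u[:, 0] = 1.0                      # Dirichlet input
    u[:, -1] = u[:, -2]                # no-flux outer boundary

# Assumed N(1.0, 0.15^2) density over q; population-mean sensor output
w = np.exp(-0.5 * ((q - 1.0) / 0.15) ** 2)
w /= w.sum()
mean_output = float(w @ u[:, -1])
```

Estimating the distribution then amounts to adjusting the parameters of the density (here the mean and standard deviation) so that the weighted output matches observed data.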
Just Fine-tune Twice: Selective Differential Privacy for Large Language Models
With the increasing adoption of NLP models in real-world products, it becomes
more and more important to protect these models from privacy leakage. Because
private information in language data is sparse, previous research formalized a
Selective-Differential-Privacy (SDP) notion to provide protection for sensitive
tokens detected by policy functions, and proved its effectiveness on RNN-based
models. However, the previous mechanism requires separating the private and
public model parameters and thus cannot be applied to large attention-based models. In
this paper, we propose a simple yet effective just-fine-tune-twice privacy
mechanism to first fine-tune on in-domain redacted data and then on in-domain
private data, to achieve SDP for large Transformer-based language models. We
also design explicit and contextual policy functions to provide protections at
different levels. Experiments show that our models achieve strong performance
while staying robust to the canary insertion attack. We further show that even
under low-resource settings with a small amount of in-domain data, SDP can
still improve the model utility. We will release the code, data, and models to
facilitate future research.
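An "explicit" policy function of the kind the abstract mentions can be sketched as a set of pattern rules that flag sensitive spans, with a redaction pass replacing them before the first fine-tuning stage. The patterns and mask tokens below are illustrative assumptions, not the paper's actual policy functions.

```python
import re

# Hypothetical explicit policy function: regex rules flag sensitive
# tokens, and a redaction pass replaces them with mask tokens before
# fine-tuning on the redacted corpus. (Illustrative patterns only.)

PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}

def redact(text: str) -> str:
    """Replace every span matched by a policy rule with a mask token."""
    for label, pat in PATTERNS.items():
        text = pat.sub(f"<{label}>", text)
    return text

example = "Reach me at jane.doe@example.com or 555-123-4567."
redacted = redact(example)
# redacted == "Reach me at <EMAIL> or <PHONE>."
```

The second stage would then fine-tune on the unredacted private data with a differentially private optimizer, so that the redacted first pass carries no formal privacy cost.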
Selective Differential Privacy for Language Modeling
With the increasing applications of language models, it has become crucial to
protect these models from leaking private information. Previous work has
attempted to tackle this challenge by training RNN-based language models with
differential privacy guarantees. However, applying classical differential
privacy to language models leads to poor model performance as the underlying
privacy notion is over-pessimistic and provides undifferentiated protection for
all tokens in the data. Given that the private information in natural language
is sparse (for example, the bulk of an email might not carry personally
identifiable information), we propose a new privacy notion, selective
differential privacy, to provide rigorous privacy guarantees on the sensitive
portion of the data to improve model utility. To realize such a new notion, we
develop a corresponding privacy mechanism, Selective-DPSGD, for RNN-based
language models. Besides language modeling, we also apply the method to a more
concrete application--dialog systems. Experiments on both language modeling and
dialog system building show that the proposed privacy-preserving mechanism
achieves better utility while remaining safe under various privacy attacks
compared to the baselines. The data and code are released at
https://github.com/wyshi/lm_privacy to facilitate future research.
Comment: NAACL 2022
Revisiting Data-Free Knowledge Distillation with Poisoned Teachers
Data-free knowledge distillation (KD) helps transfer knowledge from a
pre-trained model (known as the teacher model) to a smaller model (known as the
student model) without access to the original training data used for training
the teacher model. However, the security of the synthetic or
out-of-distribution (OOD) data required in data-free KD is largely unknown and
under-explored. In this work, we make the first effort to uncover the security
risk of data-free KD w.r.t. untrusted pre-trained models. We then propose
Anti-Backdoor Data-Free KD (ABD), the first plug-in defensive method for
data-free KD methods to mitigate the chance of potential backdoors being
transferred. We empirically evaluate the effectiveness of the proposed ABD in
diminishing transferred backdoor knowledge while maintaining downstream
performance comparable to vanilla KD. We envision this work as a first step
toward raising awareness of and mitigating potential backdoors in data-free
KD. Code is released at https://github.com/illidanlab/ABD.
Comment: Accepted to ICML 2023
Convergence and Disparities in Higher Education Fiscal Expenditures in China: A Regional Perspective
This research investigates the disparities and convergence in higher education fiscal expenditures across different regions in China. The study utilises Gini coefficient analysis and σ-convergence/β-convergence tests to quantify the extent of disparities and explore convergence trends over a twelve-year investigation period (2007–2018). The results shed light on the imbalances in resource allocation and provide valuable insights into the efforts required to achieve a more equitable distribution of fiscal resources for higher education. The findings reveal significant disparities in higher education fiscal expenditures between the Eastern, Central, Western, and Northeastern regions, with the Eastern region exhibiting the largest gap compared to others. Remarkably, the disparity between the Eastern and Central regions is even greater than that between the Eastern and Western regions, emphasising the need for targeted interventions to address regional imbalances. Over the study period, the gap between the Eastern and Central regions remained consistently higher than other regional disparities. 
Moreover, the research shows a general trend towards narrowing regional fiscal expenditure disparities, with the most pronounced convergence observed between the Central and Northeastern regions. The Western region exhibits slightly larger disparities than the Central and Northeastern regions, possibly attributed to greater fiscal policy support and lower student enrollments. Nevertheless, the fiscal expenditure gap between the Western and Central regions has shown a trend towards reduction. The study also explores absolute and conditional β-convergence, revealing notable convergence patterns in the Eastern and Central regions. However, the Western and Northeastern regions exhibit varying degrees of convergence, indicating the necessity for region-specific convergence mechanisms. To achieve a balanced allocation of financial resources for higher education across regions, the study recommends targeted fiscal policies, additional funding, and improved transparency and accountability. Policymakers should focus on enhancing convergence mechanisms to ensure a more equitable distribution of resources and foster the sustainable development of higher education throughout the country. While this research provides valuable insights, it is essential to consider other potential factors influencing fiscal expenditure disparities, such as policy orientation, economic disparities, and demographic structures, for a more comprehensive understanding. Future research may benefit from qualitative investigations to further explore the complexities of higher education fiscal expenditure imbalances and identify effective policy interventions.
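The two disparity measures the study reports can be sketched briefly: a Gini coefficient over regional expenditures, and σ-convergence as a declining coefficient of variation across years. The per-student figures below are hypothetical placeholders, not the study's data.

```python
import numpy as np

# Sketch of the study's two measures (illustrative data, not the actual
# expenditure figures): a Gini coefficient across regions, and
# sigma-convergence as a falling coefficient of variation over time.

def gini(x):
    """Gini coefficient of a 1-D array of non-negative values."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    ranks = np.arange(1, n + 1)
    return (2 * (ranks * x).sum() / (n * x.sum())) - (n + 1) / n

# Hypothetical per-student expenditure by region, two years apart
y2007 = [9.0, 5.0, 4.0, 4.5]   # East, Central, West, Northeast
y2018 = [12.0, 9.5, 9.0, 8.5]

def cv(x):
    """Coefficient of variation, the usual sigma-convergence statistic."""
    return np.std(x) / np.mean(x)

# Disparities narrow if both the Gini and the CV fall over the period
narrowed = (gini(y2018) < gini(y2007)) and (cv(y2018) < cv(y2007))
```

β-convergence would additionally regress each region's expenditure growth rate on its initial level; a negative coefficient indicates poorer regions catching up.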
Excitement Surfeited Turns to Errors: Deep Learning Testing Framework Based on Excitable Neurons
Despite impressive capabilities and outstanding performance, deep neural
networks (DNNs) have drawn increasing public concern about their security,
due to frequently occurring erroneous behaviors. It is therefore necessary to
test DNNs systematically before they are deployed to real-world applications.
Existing testing methods have provided fine-grained metrics based on neuron
coverage and proposed various approaches to improve such metrics. However, it
has gradually been realized that higher neuron coverage does not necessarily
indicate a better ability to identify defects that lead to errors. Moreover,
coverage-guided methods cannot hunt errors caused by faulty training
procedures, so the robustness improvement of DNNs obtained by retraining on
these testing examples is unsatisfactory. To address this challenge, we
introduce the concept of excitable neurons based on the Shapley value and
design a novel white-box testing framework for DNNs, named DeepSensor. It is
motivated by our observation that neurons bearing larger responsibility for
model-loss changes under small perturbations are more likely related to
incorrect corner cases caused by potential defects. By maximizing the number
of excitable neurons with respect to various wrong model behaviors, DeepSensor
can generate testing examples that effectively trigger more errors due to
adversarial inputs, polluted data, and incomplete training. Extensive
experiments on both image classification models and speaker recognition
models demonstrate the superiority of DeepSensor.
Comment: 32 pages
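The Shapley-value idea behind neuron attribution can be sketched on a toy network: estimate each hidden unit's marginal contribution to the loss by enabling units in random orders and averaging the loss changes. This is an illustrative Monte Carlo Shapley estimator only, not the actual DeepSensor algorithm; the network, masking scheme, and loss are all assumptions.

```python
import numpy as np

# Illustrative-only sketch of Shapley-style neuron attribution (not the
# DeepSensor algorithm): estimate each hidden unit's marginal
# contribution to the loss by masking units in random orders.

rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 5))            # toy 3-input, 5-hidden network
W2 = rng.normal(size=(5, 1))
x, y = rng.normal(size=(1, 3)), np.array([[1.0]])

def loss(mask):
    h = np.maximum(x @ W1, 0.0) * mask  # ReLU hidden layer, units masked
    return float(((h @ W2 - y) ** 2).mean())

def shapley(n_units=5, n_perm=200):
    phi = np.zeros(n_units)
    for _ in range(n_perm):
        order = rng.permutation(n_units)
        mask = np.zeros(n_units)
        prev = loss(mask)               # start with all units off
        for j in order:
            mask[j] = 1.0
            cur = loss(mask)
            phi[j] += cur - prev        # marginal effect of enabling j
            prev = cur
    return phi / n_perm

phi = shapley()
# Efficiency property: phi sums to loss(all on) - loss(all off)
```

Units with the largest-magnitude attributions would be the "excitable" candidates; a testing framework could then search for inputs that maximize how many such units flip the loss.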