Differentially Private Data Releasing for Smooth Queries with Synthetic Database Output
We consider accurately answering smooth queries while preserving differential
privacy. A query is said to be $K$-smooth if it is specified by a function
defined on $[-1,1]^d$ whose partial derivatives up to order $K$ are all
bounded. We develop an $\epsilon$-differentially private mechanism for the
class of $K$-smooth queries. The major advantage of the algorithm is that it
outputs a synthetic database. In real applications, a synthetic database output
is appealing. Our mechanism achieves an accuracy of $O(n^{-\frac{K}{2d+K}}/\epsilon)$, and runs in polynomial time. We also
generalize the mechanism to preserve $(\epsilon, \delta)$-differential privacy
with slightly improved accuracy. Extensive experiments on benchmark datasets
demonstrate that the mechanisms have good accuracy and are efficient.
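As a point of reference for the privacy guarantee mentioned above, the following is a minimal sketch of the standard Laplace mechanism applied to a single averaged query. It is not the synthetic-database mechanism of the paper. The query form (the average of a bounded function `f` over the records), the replace-one-record neighbouring relation, and the names `laplace_noise` and `private_average` are assumptions made for illustration.

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_average(data, f, bound, epsilon, rng):
    """Release the average of f over data with epsilon-differential privacy.

    Assumes |f(x)| <= bound for every record, so replacing one record
    changes the average by at most 2 * bound / n (the sensitivity).
    """
    n = len(data)
    true_value = sum(f(x) for x in data) / n
    sensitivity = 2.0 * bound / n
    return true_value + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical one-dimensional records in [-1, 1]; cos is smooth and bounded by 1.
    data = [rng.uniform(-1.0, 1.0) for _ in range(10000)]
    print(private_average(data, math.cos, bound=1.0, epsilon=1.0, rng=rng))
```

The noise scale shrinks as 1/n, so a single query is answered accurately on large databases. Answering a whole class of queries at once, as the abstract describes, needs a different construction.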
Prototypical Fine-tuning: Towards Robust Performance Under Varying Data Sizes
In this paper, we move towards combining large parametric models with
non-parametric prototypical networks. We propose prototypical fine-tuning, a
novel prototypical framework for fine-tuning pretrained language models (LM),
which automatically learns a bias to improve predictive performance for varying
data sizes, especially low-resource settings. Our prototypical fine-tuning
approach can automatically adjust the model capacity according to the number of
data points and the model's inherent attributes. Moreover, we propose four
principles for effective prototype fine-tuning towards the optimal solution.
Experimental results across various datasets show that our work achieves
significant performance improvements under various low-resource settings, as
well as comparable and usually better performances in high-resource scenarios.
Comment: Published as a conference paper at AAAI 202
Better to Ask in English: Cross-Lingual Evaluation of Large Language Models for Healthcare Queries
Large language models (LLMs) are transforming the ways the general public
accesses and consumes information. Their influence is particularly pronounced
in pivotal sectors like healthcare, where lay individuals are increasingly
appropriating LLMs as conversational agents for everyday queries. While LLMs
demonstrate impressive language understanding and generation proficiencies,
concerns regarding their safety remain paramount in these high-stake domains.
Moreover, the development of LLMs is disproportionately focused on English. It
remains unclear how these LLMs perform in the context of non-English languages,
a gap that is critical for ensuring equity in the real-world use of these
systems. This paper provides a framework to investigate the effectiveness of
LLMs as multi-lingual dialogue systems for healthcare queries. Our
empirically-derived framework XlingEval focuses on three fundamental criteria
for evaluating LLM responses to naturalistic human-authored health-related
questions: correctness, consistency, and verifiability. Through extensive
experiments on four major global languages, including English, Spanish,
Chinese, and Hindi, spanning three expert-annotated large health Q&A datasets,
and through an amalgamation of algorithmic and human-evaluation strategies, we
found a pronounced disparity in LLM responses across these languages,
indicating a need for enhanced cross-lingual capabilities. We further propose
XlingHealth, a cross-lingual benchmark for examining the multilingual
capabilities of LLMs in the healthcare context. Our findings underscore the
pressing need to bolster the cross-lingual capacities of these models, and to
provide an equitable information ecosystem accessible to all.
Comment: 18 pages, 7 figures
CompeteAI: Understanding the Competition Behaviors in Large Language Model-based Agents
Large language models (LLMs) have been widely used as agents to complete
different tasks, such as personal assistance or event planning. While most work
has focused on cooperation and collaboration between agents, little work
explores competition, another important mechanism that fosters the development
of society and economy. In this paper, we seek to examine the competition
behaviors in LLM-based agents. We first propose a general framework to study
the competition between agents. Then, we implement a practical competitive
environment using GPT-4 to simulate a virtual town with two types of agents,
including restaurant agents and customer agents. Specifically, restaurant
agents compete with each other to attract more customers, and the competition
drives them to adapt, for example by developing new operating strategies. The
results of our experiments reveal several interesting findings ranging from
social learning to the Matthew effect, which align well with existing sociological
and economic theories. We believe that competition between agents deserves
further investigation to help us understand society better. The code will be
released soon.
Comment: Technical report; 21 pages
Predicting Information Pathways Across Online Communities
The problem of community-level information pathway prediction (CLIPP) aims at
predicting the transmission trajectory of content across online communities. A
successful solution to CLIPP holds significance as it facilitates the
distribution of valuable information to a larger audience and prevents the
proliferation of misinformation. Notably, solving CLIPP is non-trivial as
inter-community relationships and influence are unknown, information spread is
multi-modal, and new content and new communities appear over time. In this
work, we address CLIPP by collecting large-scale, multi-modal datasets to
examine the diffusion of online YouTube videos on Reddit. We analyze these
datasets to construct community influence graphs (CIGs) and develop a novel
dynamic graph framework, INPAC (Information Pathway Across Online Communities),
which incorporates CIGs to capture the temporal variability and multi-modal
nature of video propagation across communities. Experimental results in both
warm-start and cold-start scenarios show that INPAC outperforms seven baselines
in CLIPP.
Comment: In Proceedings of the 29th ACM SIGKDD Conference on Knowledge
Discovery and Data Mining (KDD'23)
Properties and Estimations of a Multivariate Folded Normal Distribution
A multivariate folded normal distribution is the distribution of the absolute value of a Gaussian random vector. In this paper, we provide the marginal and conditional distributions of the multivariate folded normal distribution, and also prove that independence and non-correlation are equivalent for it. In addition, we provide a numerical approach using the R language to fit a multivariate folded normal distribution. The accuracy of the estimated mean and variance parameters is then examined. Finally, a real data application to body mass index data is presented.
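To make the definition concrete, the sketch below draws samples from a bivariate folded normal by taking componentwise absolute values of a correlated Gaussian pair, then compares the empirical marginal means with the well-known closed-form mean of a univariate folded normal. It is written in Python rather than the R used in the paper, and the function names and parameter values are hypothetical choices for illustration, not the authors' fitting procedure.

```python
import math
import random


def folded_normal_mean(mu, sigma):
    """Closed-form mean of |X| for X ~ N(mu, sigma^2)."""
    density_term = sigma * math.sqrt(2.0 / math.pi) * math.exp(-mu * mu / (2.0 * sigma * sigma))
    # 1 - 2 * Phi(-mu / sigma) equals erf(mu / (sigma * sqrt(2))).
    return density_term + mu * math.erf(mu / (sigma * math.sqrt(2.0)))


def sample_folded_bivariate(mu, sigma, rho, n, seed=0):
    """Sample (|X1|, |X2|) where (X1, X2) is Gaussian with correlation rho."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        x1 = mu[0] + sigma[0] * z1
        x2 = mu[1] + sigma[1] * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        samples.append((abs(x1), abs(x2)))
    return samples


def empirical_means(samples):
    """Componentwise sample means of a list of pairs."""
    n = len(samples)
    return tuple(sum(s[i] for s in samples) / n for i in range(2))


if __name__ == "__main__":
    mu, sigma, rho = (1.0, -0.5), (1.0, 2.0), 0.6
    draws = sample_folded_bivariate(mu, sigma, rho, n=20000)
    print("empirical:", empirical_means(draws))
    print("closed form:", tuple(folded_normal_mean(m, s) for m, s in zip(mu, sigma)))
```

Each marginal depends only on its own mean and standard deviation, so the correlation `rho` affects the joint behaviour of the pair but not the marginal means checked here.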
Pressure Control for a Hydraulic Cylinder Based on a Self-Tuning PID Controller Optimized by a Hybrid Optimization Algorithm
To improve the performance of the test platform for the electro-hydraulic control system of a hydraulic support, a self-tuning proportional-integral-derivative (PID) controller is proposed to imitate the actual pressure of the hydraulic support. To avoid premature convergence and to improve the convergence velocity when tuning the PID parameters, the PID controller is optimized with a hybrid optimization algorithm that integrates the particle swarm optimization (PSO) algorithm and the genetic algorithm (GA). A selection probability and an adaptive cross probability are introduced into the PSO to enhance the diversity of particles. A proportional overflow valve is installed to control the pressure of the pillar cylinder. The control voltage of the proportional relief valve amplifier and the pillar pressure are recorded to identify the system transfer function. Several simulations with different methods are performed on the hydraulic cylinder pressure system. The results demonstrate that the hybrid algorithm for a PID controller has comparatively better global search ability and faster convergence velocity on the pressure control of the hydraulic cylinder. Finally, an experiment is conducted to verify the validity of the proposed method.
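For readers unfamiliar with the control law being tuned, the following is a minimal sketch of a discrete PID loop driving a pressure-like output to a setpoint. The gains are fixed by hand rather than tuned by the hybrid PSO-GA algorithm, and the first-order plant with its `tau` and `gain` values is a hypothetical stand-in for the transfer function identified in the paper.

```python
class PID:
    """Discrete PID controller with fixed, hand-picked gains."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measurement):
        """Return the control signal for the current measurement."""
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(pid, setpoint, steps, tau=0.5, gain=2.0):
    """Euler simulation of an assumed plant: tau * dy/dt = -y + gain * u."""
    y = 0.0
    history = []
    for _ in range(steps):
        u = pid.update(setpoint, y)
        y += pid.dt * (-y + gain * u) / tau
        history.append(y)
    return history


if __name__ == "__main__":
    controller = PID(kp=2.0, ki=5.0, kd=0.01, dt=0.01)
    trace = simulate(controller, setpoint=10.0, steps=1000)
    print("final output:", trace[-1])
```

An optimizer such as the one described above would search over `kp`, `ki`, and `kd`, scoring each candidate by a cost computed from a simulated trace like this one.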
Elevated Circulating Interleukin-27 in Patients with Coronary Artery Disease Is Associated with Dendritic Cells, Oxidized Low-Density Lipoprotein, and Severity of Coronary Artery Stenosis
Coronary artery disease (CAD) is an immune-mediated chronic inflammatory disease mainly caused by atherosclerosis. The aims of this study were to investigate the role of interleukin-27 (IL-27) in patients with CAD and the severity of coronary artery lesions, which was evaluated by the Gensini score, and to investigate the biosynthesis of IL-27 and oxidized low-density lipoprotein (ox-LDL) in vitro using monocyte-derived dendritic cells (DCs). To this aim, plasma levels of IL-27, ox-LDL, and Gensini score were analyzed in patients with CAD (n=136) and normal subjects (controls, n=29). IL-27 concentration of the supernatant and the mRNA expression levels of p28 and EBI3, subunits of IL-27, from cultured immature DCs incubated with different concentrations of ox-LDL for 24 h were also analyzed. We found that circulating IL-27 levels were significantly higher in patients with CAD than in controls (P<0.01), and positively correlated with ox-LDL and Gensini score. ox-LDL dose-dependently upregulated expression of both IL-27 protein and IL-27 (p28 and EBI3) mRNA in vitro, indicating that ox-LDL can stimulate DCs to produce IL-27. These results demonstrate that IL-27 might regulate the network of immunity and inflammation in the pathogenesis of atherosclerosis.