Gender-tuning: Empowering Fine-tuning for Debiasing Pre-trained Language Models
Recent studies have revealed that the widely-used Pre-trained Language Models
(PLMs) propagate societal biases from the large unmoderated pre-training
corpora. Existing solutions require separate debiasing training processes and
datasets, which are resource-intensive and costly. Furthermore, these
methods hurt the PLMs' performance on downstream tasks. In this study, we
propose Gender-tuning, which debiases the PLMs through fine-tuning on
downstream tasks' datasets. To this end, Gender-tuning integrates the Masked
Language Modeling (MLM) training objective into the fine-tuning training
process. Comprehensive experiments show that Gender-tuning outperforms the
state-of-the-art baselines in terms of average gender bias scores in PLMs while
improving PLMs' performance on downstream tasks using only the downstream
tasks' datasets. Gender-tuning is also a deployable debiasing tool for any PLM
that works with the original fine-tuning pipeline.
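The abstract describes integrating the MLM objective into ordinary fine-tuning on the downstream dataset. A minimal sketch of one way such a joint objective could look, using PyTorch and transformers; the model choice and all names here are illustrative assumptions, not the authors' implementation:

```python
# Hypothetical sketch of a joint MLM + classification fine-tuning step
# (illustrative, not the authors' code): an MLM loss on a randomly masked
# copy of each downstream batch is added to the ordinary fine-tuning loss.
import torch
from transformers import (AutoModelForMaskedLM,
                          AutoModelForSequenceClassification, AutoTokenizer,
                          DataCollatorForLanguageModeling)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
clf = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
mlm.bert = clf.bert  # share one encoder so both objectives update the same PLM
                     # (embedding/decoder weight tying is ignored for brevity)

collator = DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15)
# mlm.cls is BERT's MLM head; its parameters are optimized alongside clf's.
optimizer = torch.optim.AdamW(
    [*clf.parameters(), *mlm.cls.parameters()], lr=2e-5)

def training_step(texts, labels, mlm_weight=1.0):
    batch = tokenizer(texts, padding=True, truncation=True,
                      return_tensors="pt")
    # Ordinary fine-tuning loss on the downstream examples.
    clf_loss = clf(**batch, labels=torch.tensor(labels)).loss
    # MLM loss on a masked copy of the same batch.
    masked = collator([{"input_ids": ids} for ids in batch["input_ids"]])
    mlm_loss = mlm(input_ids=masked["input_ids"],
                   attention_mask=batch["attention_mask"],
                   labels=masked["labels"]).loss
    loss = clf_loss + mlm_weight * mlm_loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```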
Improving Pre-trained Language Models' Generalization
The reusability of state-of-the-art Pre-trained Language Models (PLMs) is
often limited by their generalization problem, where their performance
drastically decreases when evaluated on examples that differ from the training
dataset, known as Out-of-Distribution (OOD) or unseen examples. This limitation
arises from PLMs' reliance on spurious correlations, which hold for
frequent example types but break down on more general examples. To address this issue, we
propose a training approach called Mask-tuning, which integrates the Masked
Language Modeling (MLM) training objective into the fine-tuning process to
enhance PLMs' generalization. Comprehensive experiments demonstrate that
Mask-tuning surpasses current state-of-the-art techniques and enhances PLMs'
generalization on OOD datasets while improving their performance on
in-distribution datasets. The findings suggest that Mask-tuning improves the
reusability of PLMs on unseen data, making them more practical and effective
for real-world applications.
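Generalization here is measured by comparing accuracy on the training distribution against accuracy on OOD data. A minimal evaluation sketch, reusing the hypothetical clf and tokenizer from the sketch above; the dataset variables are placeholders, not the paper's benchmarks:

```python
# Illustrative in-distribution vs. OOD evaluation (dataset variables are
# placeholders, not the paper's benchmarks), reusing the hypothetical
# clf/tokenizer defined in the previous sketch.
import torch

@torch.no_grad()
def accuracy(model, tokenizer, texts, labels, batch_size=32):
    model.eval()
    correct = 0
    for i in range(0, len(texts), batch_size):
        batch = tokenizer(texts[i:i + batch_size], padding=True,
                          truncation=True, return_tensors="pt")
        preds = model(**batch).logits.argmax(dim=-1)
        correct += (preds == torch.tensor(labels[i:i + batch_size])).sum().item()
    return correct / len(texts)

# in_acc  = accuracy(clf, tokenizer, id_texts,  id_labels)   # in-distribution
# ood_acc = accuracy(clf, tokenizer, ood_texts, ood_labels)  # OOD / unseen
# The gap in_acc - ood_acc quantifies the generalization problem.
```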
OPTIKS: An Optimized Key Transparency System
Key Transparency (KT) refers to a public key distribution system with transparency mechanisms proving its correct operation, i.e., proving that it reports consistent values for each user's public key. While prior work on KT systems has offered new designs to tackle this problem, relatively little attention has been paid to the issue of scalability. Indeed, it is not straightforward to build a scalable and practical KT system from existing constructions, which may be too complex, inefficient, or non-resilient against machine failures.
In this paper, we present OPTIKS, a full-featured and optimized KT system that focuses on scalability. Our system is simpler and more performant than prior work, incurring smaller storage overhead while still meeting strong notions of security and privacy. Our design also incorporates a crash-tolerant and scalable server architecture, which we demonstrate through extensive benchmarks. Finally, we address several real-world problems in deploying KT systems that have received limited attention in prior work, including account decommissioning and user-to-device mapping.
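As rough intuition for the transparency mechanism KT systems build on (heavily simplified, and not OPTIKS's actual design): the server commits to the user-to-key directory, for example with a Merkle tree, and serves inclusion proofs that clients verify against a published root. A toy sketch:

```python
# Toy sketch of a verifiable key directory (greatly simplified; real KT
# systems such as OPTIKS add append-only consistency proofs, privacy
# protections, and a crash-tolerant server architecture).
import hashlib

def h(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

class KeyDirectory:
    def __init__(self):
        self.leaves = []   # hash of each (user, public key) entry
        self.index = {}    # user -> leaf position

    def publish(self, user: str, pubkey: bytes):
        self.index[user] = len(self.leaves)
        self.leaves.append(h(user.encode(), pubkey))

    def root(self) -> bytes:
        level = self.leaves[:]
        while len(level) > 1:
            if len(level) % 2:           # duplicate last node on odd levels
                level.append(level[-1])
            level = [h(level[i], level[i + 1])
                     for i in range(0, len(level), 2)]
        return level[0]

    def prove(self, user: str):
        """Merkle inclusion proof: sibling hashes from leaf to root."""
        i, level, path = self.index[user], self.leaves[:], []
        while len(level) > 1:
            if len(level) % 2:
                level.append(level[-1])
            sib = i ^ 1                  # sibling of node i at this level
            path.append((level[sib], sib < i))
            level = [h(level[j], level[j + 1])
                     for j in range(0, len(level), 2)]
            i //= 2
        return path

def verify(leaf: bytes, path, root: bytes) -> bool:
    node = leaf
    for sibling, sibling_is_left in path:
        node = h(sibling, node) if sibling_is_left else h(node, sibling)
    return node == root

# d = KeyDirectory(); d.publish("alice", b"pk_A"); d.publish("bob", b"pk_B")
# assert verify(h(b"alice", b"pk_A"), d.prove("alice"), d.root())
```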
Labeled PSI from Homomorphic Encryption with Reduced Computation and Communication
It is known that fully homomorphic encryption (FHE) can be used to build efficient (labeled) Private Set Intersection (PSI) protocols in the unbalanced setting, where one of the sets is much larger than the other (Chen et al., CCS'17, CCS'18). In this paper we demonstrate multiple algorithmic improvements upon these works. In particular, our protocol has an asymptotically better computation cost, requiring only homomorphic multiplications, and communication complexity sublinear in the larger set size.
We demonstrate that our protocol is significantly better than that of Chen et al. (CCS'18) for many practical parameters, especially in terms of online communication cost. For example, when intersecting and item sets, our protocol reduces the online computation time by more than 83% and communication by more than 32%. When intersecting and item sets, our protocol reduces the online computation time by 50% and communication by 52%. Our comparison to other state-of-the-art unbalanced PSI protocols shows that our protocol has the best total communication complexity when .
For labeled PSI, our protocol also outperforms Chen et al. (CCS'18).
When intersecting and item sets, with the larger set having associated -byte labels, our protocol reduces the online computation time by more than 85% and communication by 36%.
Finally, we demonstrate a modification that results in nearly constant communication cost in the larger set size, but impractically high computation complexity on today's CPUs. For example, to intersect a -item set with sets of size , , or , our proof-of-concept implementation requires only MB of online communication, which is more than a -fold improvement over Chen et al. (CCS'18).
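The algebraic core these FHE-based protocols evaluate under encryption can be sketched in plaintext: for each receiver item y, the sender returns a blinded product over its set that vanishes exactly when y is a member, and labeled PSI additionally evaluates an interpolated label polynomial masked by the same product. A toy sketch of that idea under stated simplifications (plaintext arithmetic only; batching, windowing, OPRF preprocessing, and the FHE layer itself are all omitted, and the blinding shown is one common simplification rather than the exact protocol):

```python
# Plaintext sketch of the polynomial trick behind FHE-based unbalanced PSI.
# In the real protocol the receiver's item y arrives encrypted and the
# sender evaluates these polynomials homomorphically.
import random

PRIME = 2**61 - 1  # toy prime modulus; FHE schemes fix their own plaintext modulus

def zero_poly(items, y):
    """prod(y - x) mod PRIME: zero exactly when y is in the sender's set."""
    acc = 1
    for x in items:
        acc = acc * (y - x) % PRIME
    return acc

def label_poly(items_with_labels, y):
    """Lagrange polynomial L with L(x_i) = label_i, evaluated at y."""
    total = 0
    for i, (xi, li) in enumerate(items_with_labels):
        num, den = 1, 1
        for j, (xj, _) in enumerate(items_with_labels):
            if i != j:
                num = num * (y - xj) % PRIME
                den = den * (xi - xj) % PRIME
        total = (total + li * num * pow(den, -1, PRIME)) % PRIME
    return total

def sender_response(items_with_labels, y):
    """What the sender would compute on an encryption of y."""
    items = [x for x, _ in items_with_labels]
    r = random.randrange(1, PRIME)
    s = random.randrange(1, PRIME)
    z = zero_poly(items, y)
    membership = r * z % PRIME                       # 0  <=>  y is a member
    blinded_label = (label_poly(items_with_labels, y) + s * z) % PRIME
    return membership, blinded_label                 # label is revealed only
                                                     # when membership == 0

# m, lbl = sender_response([(5, 111), (9, 222)], 9)
# m == 0 and lbl == 222   # y = 9 is a member with label 222
```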