On Efficient Training, Controllability and Compositional Generalization of Insertion-based Language Generators
Auto-regressive language models with the left-to-right generation order have
been a predominant paradigm for language generation. Recently, out-of-order
text generation beyond the traditional left-to-right paradigm has attracted
extensive attention, with a notable variation of insertion-based generation,
where a model is used to gradually extend the context into a complete sentence
purely with insertion operations. However, since insertion operations disturb
the position information of each token, it is often believed that each step of
the insertion-based likelihood estimation requires a bi-directional
re-encoding of the whole generated sequence. This computational
overhead prohibits the model from scaling up to generate long, diverse texts
such as stories, news articles, and reports. To address this issue, we propose
InsNet, an insertion-based sequence model that can be trained as efficiently as
traditional transformer decoders while matching the performance of a model
with a bi-directional context encoder. We evaluate InsNet on story generation
and CleVR-CoGENT captioning, showing the advantages of InsNet in several
dimensions, including computational costs, generation quality, the ability to
perfectly incorporate lexical controls, and better compositional
generalization.
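
The insertion process described in this abstract can be illustrated with a toy sketch. This is not InsNet itself: the function name `apply_insertions`, the (slot, token) step format, and the example sentence are assumptions made for illustration. It shows how a sequence grows from a set of required words purely through insertions, and how each insertion shifts the absolute positions of tokens already placed, which is why a naive model would need to re-encode the sequence at every step.

```python
from typing import List, Tuple


def apply_insertions(context: List[str], steps: List[Tuple[int, str]]) -> List[List[str]]:
    """Apply (slot, token) insertions one at a time.

    Returns every intermediate sequence, starting with the initial context.
    A slot value k means "insert so that the new token ends up at index k".
    """
    seq = list(context)
    history = [list(seq)]
    for slot, token in steps:
        if not 0 <= slot <= len(seq):
            raise ValueError(f"slot {slot} is out of range for length {len(seq)}")
        seq.insert(slot, token)
        history.append(list(seq))
    return history


# Hypothetical example: the lexical controls "cat" and "mat" are given up front
# and are never removed, since the only allowed operation is insertion.
history = apply_insertions(
    ["cat", "mat"],
    [(0, "the"), (2, "sat"), (3, "on"), (4, "the")],
)
final_sentence = " ".join(history[-1])

# Absolute position of "mat" after each step: it moves every time a token
# is inserted to its left.
mat_positions = [seq.index("mat") for seq in history]

print(final_sentence)   # the cat sat on the mat
print(mat_positions)    # [1, 2, 3, 4, 5]
```

Because the control words are part of the initial context, they appear in the output by construction, which is the sense in which insertion-only generation can incorporate lexical controls.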
Debiasing Community Detection: The Importance of Lowly-Connected Nodes
Community detection is an important task in social network analysis, allowing
us to identify and understand the communities within the social structures.
However, many community detection approaches either fail to assign low degree
(or lowly-connected) users to communities, or assign them to trivially small
communities that prevent them from being included in analysis. In this work, we
investigate how excluding these users can bias analysis results. We then
introduce an approach that is more inclusive for lowly-connected users by
incorporating them into larger groups. Experiments show that our approach
outperforms the existing state-of-the-art in terms of F1 and Jaccard similarity
scores while reducing the bias towards low-degree users.
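
For reference, the two scores named in this abstract can be computed between one detected community and one ground-truth community as follows. This is a minimal sketch using the standard set-based definitions; the abstract does not say how detected communities are matched to ground-truth ones or how scores are aggregated, so that part is left out, and the node labels are made up.

```python
from typing import Hashable, Set


def jaccard(detected: Set[Hashable], truth: Set[Hashable]) -> float:
    """Jaccard similarity: size of the intersection over size of the union."""
    union = detected | truth
    if not union:
        return 1.0  # convention chosen here: two empty sets are identical
    return len(detected & truth) / len(union)


def f1_score(detected: Set[Hashable], truth: Set[Hashable]) -> float:
    """F1: harmonic mean of precision and recall over community members."""
    overlap = len(detected & truth)
    if overlap == 0:
        return 0.0
    precision = overlap / len(detected)
    recall = overlap / len(truth)
    return 2 * precision * recall / (precision + recall)


# Hypothetical communities sharing three of their four members.
detected_community = {"a", "b", "c", "d"}
true_community = {"b", "c", "d", "e"}

print(jaccard(detected_community, true_community))   # 0.6
print(f1_score(detected_community, true_community))  # 0.75
```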
Men Are Elected, Women Are Married: Events Gender Bias on Wikipedia
Human activities can be seen as sequences of events, which are crucial to
understanding societies. Disproportional event distribution for different
demographic groups can manifest and amplify social stereotypes, and potentially
jeopardize the ability of members in some groups to pursue certain goals. In
this paper, we present the first event-centric study of gender biases in a
Wikipedia corpus. To facilitate the study, we curate a corpus of career and
personal life descriptions with demographic information consisting of 7,854
fragments from 10,412 celebrities. Then we detect events with a
state-of-the-art event detection model, calibrate the results using
strategically generated templates, and extract events that have asymmetric
associations with genders. Our study discovers that the Wikipedia pages tend to
intermingle personal life events with professional events for females but not
for males, which calls for the awareness of the Wikipedia community to
formalize guidelines and train the editors to mind the implicit biases that
contributors carry. Our work also lays the foundation for future work on
quantifying and discovering event biases at the corpus level.
Comment: ACL 202