Target Language-Aware Constrained Inference for Cross-lingual Dependency Parsing
Prior work on cross-lingual dependency parsing often focuses on capturing the
commonalities between source and target languages and overlooks the potential
of leveraging linguistic properties of the languages to facilitate the
transfer. In this paper, we show that weak supervision in the form of
linguistic knowledge about the target languages can substantially improve a
cross-lingual graph-based dependency parser. Specifically, we explore several
types of corpus-level linguistic statistics and compile them into corpus-wide
constraints to guide the inference process at test time. We adapt two
techniques, Lagrangian
relaxation and posterior regularization, to conduct inference with
corpus-statistics constraints. Experiments show that the Lagrangian relaxation
and posterior regularization inference improve performance on 15 and 17 of
the 19 target languages, respectively. The improvements are especially
significant for target languages that have different word order features from
the source language.
Comment: 15 pages, 3 figures, published in EMNLP 201
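To illustrate how a corpus-statistics constraint can steer test-time inference, here is a minimal Lagrangian-relaxation sketch on a toy problem. The scores, the left-attachment constraint, and the threshold are all hypothetical, invented for illustration; they are not taken from the paper:

```python
import random

random.seed(0)

# Toy setup: for each of N arcs, the "parser" scores two directions
# (0 = right-attachment, 1 = left-attachment). Scores are hypothetical.
N = 100
scores = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]

# Assumed corpus-wide constraint: at least K of the N arcs should be
# left-attachments, reflecting a known word-order statistic of the target
# language.
K = 60

def decode(lam):
    """Unconstrained decoding of the relaxed objective: each arc
    independently picks argmax of score + lam * [direction == left]."""
    return [1 if s1 + lam > s0 else 0 for s0, s1 in scores]

# Subgradient ascent on the single dual variable lam >= 0.
lam, step = 0.0, 0.1
for _ in range(200):
    y = decode(lam)
    violation = K - sum(y)  # > 0 means too few left-attachments
    if violation <= 0:
        break
    lam = max(0.0, lam + step * violation)

y = decode(lam)
print(sum(y))  # number of left-attachments; should now be >= K
```

The dual variable acts as a uniform bonus on the constrained direction, so per-arc decoding stays independent and fast; only the scalar multiplier is updated between decoding passes.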
Mitigating Gender Bias Amplification in Distribution by Posterior Regularization
Advanced machine learning techniques have boosted the performance of natural
language processing. Nevertheless, recent studies, e.g., Zhao et al. (2017),
show that these techniques inadvertently capture the societal bias hidden in
the corpus and further amplify it. However, their analysis is conducted only on
models' top predictions. In this paper, we investigate the gender bias
amplification issue from the distribution perspective and demonstrate that the
bias is amplified in the view of predicted probability distribution over
labels. We further propose a bias mitigation approach based on posterior
regularization. With little performance loss, our method can almost remove the
bias amplification in the distribution. Our study sheds light on understanding
bias amplification.
Comment: 7 pages, 3 figures, published in ACL 202
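To make the projection step concrete, here is a minimal posterior-regularization-style sketch: each predicted distribution is exponentially tilted so that the corpus-level expectation of one label matches a target ratio. The probabilities, labels, and target value are invented for illustration and are not the paper's exact constraint set:

```python
import math

# Toy predicted distributions over two labels for six instances (hypothetical).
P = [[0.9, 0.1], [0.8, 0.2], [0.85, 0.15],
     [0.7, 0.3], [0.95, 0.05], [0.75, 0.25]]

TARGET = 0.33  # desired average probability of label 1 (e.g., a corpus ratio)

def project(P, lam):
    """Tilt each row: q_i proportional to p_i * exp(lam * f(y)), f = [0, 1]."""
    Q = []
    for p0, p1 in P:
        w0, w1 = p0, p1 * math.exp(lam)
        z = w0 + w1
        Q.append([w0 / z, w1 / z])
    return Q

def mean_label1(Q):
    return sum(q[1] for q in Q) / len(Q)

# mean_label1 is monotonically increasing in lam, so bisection finds the
# multiplier whose projection satisfies the expectation constraint.
lo, hi = -20.0, 20.0
for _ in range(100):
    mid = (lo + hi) / 2
    if mean_label1(project(P, mid)) < TARGET:
        lo = mid
    else:
        hi = mid
Q = project(P, (lo + hi) / 2)
print(round(mean_label1(Q), 3))  # ~= 0.33
```

Because the tilt is applied per instance, each corrected distribution stays as close as possible (in KL divergence) to the model's original prediction while the aggregate statistic moves to the target.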
An Integer Linear Programming Framework for Mining Constraints from Data
Structured output prediction problems (e.g., sequential tagging, hierarchical
multi-class classification) often involve constraints over the output label
space. These constraints interact with the learned models to filter infeasible
solutions and facilitate building an accountable system. However, although
constraints are useful, they are often based on hand-crafted rules. This raises
a question -- \emph{can we mine constraints and rules from data based on a
learning algorithm?}
In this paper, we present a general framework for mining constraints from
data. In particular, we consider the inference in structured output prediction
as an integer linear programming (ILP) problem. Then, given the coefficients of
the objective function and the corresponding solution, we mine the underlying
constraints by estimating the outer and inner polytopes of the feasible set. We
verify the proposed constraint mining algorithm in various synthetic and
real-world applications and demonstrate that the proposed approach successfully
identifies the feasible set at scale.
In particular, we show that our approach can learn to solve 9x9 Sudoku
puzzles and minimum spanning tree problems from examples without providing the
underlying rules. Our algorithm can also integrate with a neural network model
to learn the hierarchical label structure of a multi-label classification task.
In addition, we provide a theoretical analysis of the tightness of the
polytopes and the reliability of the mined constraints.
Comment: 13 pages, published in ICML202
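The flavor of constraint mining can be conveyed with a much-simplified sketch: given solutions observed from a hidden structured problem, search for linear equalities that every solution satisfies. The data and the brute-force search over index sets are illustrative only; the paper's algorithm estimates inner and outer polytopes of the feasible set rather than enumerating candidates:

```python
from itertools import combinations

# Observed solutions of a hidden structured problem (hypothetical data):
# 6-dimensional binary vectors secretly obeying x0+x1+x2 = 1 and x3+x4+x5 = 1
# (one-hot groups, as in a multi-label structure).
solutions = [
    [1, 0, 0, 0, 1, 0],
    [0, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 0, 1],
    [1, 0, 0, 0, 0, 1],
    [0, 1, 0, 0, 1, 0],
]

def mine_equalities(solutions, max_arity=3):
    """Return index sets whose coordinates sum to the same constant in every
    observed solution -- candidate equality constraints of the feasible set."""
    n = len(solutions[0])
    mined = []
    for k in range(2, max_arity + 1):
        for idx in combinations(range(n), k):
            sums = {sum(x[i] for i in idx) for x in solutions}
            if len(sums) == 1:
                mined.append((idx, sums.pop()))
    return mined

constraints = mine_equalities(solutions)
print(constraints)  # recovers the two hidden one-hot constraints
```

With more observed solutions, spurious equalities that hold only by coincidence are eliminated, which mirrors the paper's point that reliability of the mined constraints grows with the number of (coefficients, solution) pairs.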
Domain Knowledge Empowered Structured Neural Net for End-to-End Event Temporal Relation Extraction
Extracting event temporal relations is a critical task for information
extraction and plays an important role in natural language understanding. Prior
systems leverage deep learning and pre-trained language models to improve the
performance of the task. However, these systems often suffer from two
shortcomings: 1) when performing maximum a posteriori (MAP) inference based on
neural models, previous systems used only structured knowledge that is assumed
to be absolutely correct, i.e., hard constraints; 2) biased predictions on
dominant temporal relations when training with a limited amount of data. To
address these issues, we propose a framework that enhances deep neural networks
with distributional constraints constructed from probabilistic domain knowledge.
We solve the constrained inference problem via Lagrangian relaxation and apply
it to end-to-end event temporal relation extraction tasks. Experimental results
show our framework is able to improve the baseline neural network models with
strong statistical significance on two widely used datasets in news and
clinical domains.
Comment: Appears in EMNLP'2
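The distributional-constraint idea can be sketched with a toy relaxation loop that caps the fraction of a dominant relation label. The scores, the label set, and the 50% cap are hypothetical; in the paper the constraints come from probabilistic domain knowledge rather than a hand-picked threshold:

```python
import random

random.seed(1)

LABELS = ["BEFORE", "AFTER", "VAGUE"]

# Hypothetical scores from a neural model biased toward BEFORE (index 0).
scores = [[random.gauss(1.0, 0.5), random.gauss(0.0, 0.5), random.gauss(0.0, 0.5)]
          for _ in range(200)]

CAP = 0.5  # assumed distributional constraint: at most 50% BEFORE predictions

def decode(lam):
    """Per-pair argmax after penalizing the dominant label's score by the
    dual variable lam."""
    return [max(range(3), key=lambda j: (s[j] - lam if j == 0 else s[j]))
            for s in scores]

# Fixed-step subgradient updates on the single dual variable.
lam, step = 0.0, 0.05
for _ in range(500):
    preds = decode(lam)
    if preds.count(0) / len(preds) <= CAP:
        break
    lam += step

preds = decode(lam)
print(preds.count(0) / len(preds))  # fraction of BEFORE predictions
```

Unlike a hard constraint on each individual prediction, the penalty only shifts borderline pairs away from the dominant relation, which is the sense in which the constraint is distributional rather than absolute.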
Improving cross-lingual model transfer by chunking
We present a shallow-parser-guided cross-lingual model transfer approach that
addresses the syntactic differences between source and target languages more
effectively. In this work, we treat the chunks or phrases of a sentence as the
transfer units, so as to separately handle the syntactic differences that arise
from the ordering of words within a phrase and the ordering of phrases within a
sentence.