Deep learning has emerged as a versatile tool for a wide range of NLP tasks,
owing to its superior capacity for representation learning. However, its
applicability is limited by its reliance on annotated examples, which are
difficult to produce at scale. Indirect supervision offers a promising way to
address this bottleneck, either by introducing labeling functions to
automatically generate noisy examples from unlabeled text, or by imposing
constraints over interdependent label decisions. A plethora of methods have
been proposed, each with its own strengths and limitations. Probabilistic
logic offers a unifying language to represent indirect supervision, but
end-to-end modeling with probabilistic logic is often infeasible due to
intractable inference and learning. In this paper, we propose deep
probabilistic logic (DPL) as a general framework for indirect supervision by
composing probabilistic logic with deep learning. DPL models label decisions as
latent variables, represents prior knowledge on their relations using weighted
first-order logical formulas, and alternates between learning a deep neural
network for the end task and refining uncertain formula weights for indirect
supervision, using variational EM. This framework subsumes prior indirect
supervision methods as special cases, and enables novel combinations through the
infusion of rich domain and linguistic knowledge. Experiments on biomedical
machine reading demonstrate the promise of this approach.
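
To make the alternating procedure concrete, below is a minimal sketch of a
DPL-style variational EM loop on a toy binary classification task. It is an
illustration under simplifying assumptions, not the paper's implementation:
logistic regression stands in for the deep network, the weighted formulas are
reduced to per-instance labeling functions, and the data, labeling functions,
and learning rates are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy unlabeled data: 2-D features; hidden gold labels are kept only for
# evaluation and are never used during training.
N, D = 200, 2
X = rng.normal(size=(N, D))
gold = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)

# Supervision module: labeling functions standing in for weighted logic
# formulas. Each votes +1 (label 1), -1 (label 0), or 0 (abstain).
def lf_feature0(x):
    return 1 if x[0] > 0.2 else (-1 if x[0] < -0.2 else 0)

def lf_feature1(x):
    return 1 if x[1] > 0.5 else 0

votes = np.array([[lf_feature0(x), lf_feature1(x)] for x in X])  # (N, K)
w = np.ones(votes.shape[1])  # uncertain formula weights, to be refined

# Prediction module: logistic regression stands in for the deep network.
theta = np.zeros(D)

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

for _ in range(20):
    # E-step: the variational posterior q(z_i = 1) combines the network's
    # belief with the logic factors exp(w_k * vote_ik * (2z_i - 1)).
    q = sigmoid(X @ theta + 2.0 * votes @ w)

    # M-step, prediction module: fit theta to the soft labels q by
    # gradient ascent on the expected log-likelihood under q.
    for _ in range(50):
        p = sigmoid(X @ theta)
        theta += 0.1 * X.T @ (q - p) / N

    # M-step, supervision module: refine w by gradient ascent on
    # E_q[log P_w(z)], where P_w is the normalized factor model; for these
    # per-instance factors the gradient has the closed form below.
    s = votes @ w
    w += 0.05 * votes.T @ ((2.0 * q - 1.0) - np.tanh(s)) / N

pred = (sigmoid(X @ theta) > 0.5).astype(int)
print(f"accuracy of the trained predictor: {(pred == gold).mean():.2f}")
```

In the full framework the formulas can couple label decisions across
instances, so the E-step generally requires approximate inference over a
factor graph; restricting the sketch to per-instance factors keeps the
posterior exact and the code short.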