In this paper, we focus on learning structure-aware document representations
from data without recourse to a discourse parser or additional annotations.
Drawing inspiration from recent efforts to empower neural networks with a
structural bias, we propose a model that can encode a document while
automatically inducing rich structural dependencies. Specifically, we embed a
differentiable non-projective parsing algorithm into a neural model and use
attention mechanisms to incorporate the structural biases. Experimental
evaluation across different tasks and datasets shows that the proposed model
achieves state-of-the-art results on document modeling tasks while inducing
intermediate structures which are both interpretable and meaningful.
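
The abstract does not spell out the parsing component; a standard way to make non-projective dependency parsing differentiable is to compute marginal head probabilities with the Matrix-Tree Theorem and use them as attention weights. The sketch below illustrates that idea under this assumption only; the function name matrix_tree_marginals and the tensors pair_scores and root_scores are hypothetical placeholders, not the paper's actual interface.

    import torch

    def matrix_tree_marginals(pair_scores, root_scores):
        # pair_scores: (n, n) unnormalized scores f_ij for unit i heading unit j
        # root_scores: (n,) unnormalized scores for unit i being the document root
        # Returns edge marginals P(i is head of j) and root marginals under the
        # distribution over non-projective dependency trees (Matrix-Tree Theorem).
        n = pair_scores.size(0)
        eye = torch.eye(n, dtype=pair_scores.dtype, device=pair_scores.device)
        A = torch.exp(pair_scores) * (1.0 - eye)   # potentials, self-loops removed
        r = torch.exp(root_scores)

        # Graph Laplacian: L_jj = sum_i A_ij, L_ij = -A_ij for i != j
        L = torch.diag(A.sum(dim=0)) - A

        # Variant Laplacian: overwrite the first row with the root potentials
        L_bar = L.clone()
        L_bar[0, :] = r
        L_inv = torch.inverse(L_bar)

        # Edge marginals (Koo et al., 2007 formulation):
        # P(i -> j) = A_ij * ([j != 0] * L_inv[j, j] - [i != 0] * L_inv[j, i])
        not_first = torch.ones(n, dtype=A.dtype, device=A.device)
        not_first[0] = 0.0
        marginals = A * (not_first.unsqueeze(0) * torch.diag(L_inv).unsqueeze(0)
                         - not_first.unsqueeze(1) * L_inv.t())
        root_marginals = r * L_inv[:, 0]
        return marginals, root_marginals

    # Example: structured attention weights over 5 sentence vectors
    scores = torch.randn(5, 5)
    roots = torch.randn(5)
    attn, root_attn = matrix_tree_marginals(scores, roots)

Because every operation above is differentiable, the induced head probabilities can be used directly as attention weights inside a neural document encoder and trained end-to-end without a discourse parser or tree annotations.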