Learning Latent Trees with Stochastic Perturbations and Differentiable Dynamic Programming
We treat projective dependency trees as latent variables in our probabilistic
model and induce them in such a way as to be beneficial for a downstream task,
without relying on any direct tree supervision. Our approach relies on Gumbel
perturbations and differentiable dynamic programming. Unlike previous
approaches to latent tree learning, we stochastically sample global structures
and our parser is fully differentiable. We illustrate its effectiveness on
sentiment analysis and natural language inference tasks. We also study its
properties on a synthetic structure induction task. Ablation studies emphasize
the importance of both stochasticity and constraining latent structures to be
projective trees.

Comment: Accepted at ACL 2019
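The mechanism the abstract describes can be sketched compactly: add Gumbel noise to arc scores, run Eisner's projective parsing algorithm with max replaced by logsumexp so the whole chart is differentiable, and take the gradient of the log-partition with respect to the perturbed scores as a relaxed sampled tree. The sketch below is illustrative, not the authors' code; all names, the temperature scheme, and the lack of a single-root constraint are assumptions.

```python
# A minimal sketch of perturb-and-parse with a differentiable dynamic
# program. NOT the authors' implementation: names, shapes, and the
# simplified root handling are assumptions.
import torch

def soft_eisner_log_partition(arc_scores: torch.Tensor) -> torch.Tensor:
    """Eisner's projective parsing DP with max replaced by logsumexp.

    arc_scores[h, m] scores the arc from head h to modifier m; index 0
    is the artificial root. Returns the log-partition over projective trees.
    """
    n = arc_scores.size(0) - 1
    neg = torch.finfo(arc_scores.dtype).min / 2
    # C: complete spans, I: incomplete spans; the leading dim is the head
    # direction (0 = head at the left endpoint, 1 = head at the right).
    C = arc_scores.new_full((2, n + 1, n + 1), neg)
    I = arc_scores.new_full((2, n + 1, n + 1), neg)
    idx = torch.arange(n + 1, device=arc_scores.device)
    C[:, idx, idx] = 0.0  # single-word spans
    for width in range(1, n + 1):
        for i in range(n + 1 - width):
            j = i + width
            # Incomplete spans: sum over split points k, then add the arc.
            inner = torch.logsumexp(C[0, i, i:j] + C[1, i + 1:j + 1, j], dim=0)
            I[0, i, j] = inner + arc_scores[i, j]  # arc i -> j
            I[1, i, j] = inner + arc_scores[j, i]  # arc j -> i
            # Complete spans: an incomplete span plus a complete one.
            C[0, i, j] = torch.logsumexp(I[0, i, i + 1:j + 1] + C[0, i + 1:j + 1, j], dim=0)
            C[1, i, j] = torch.logsumexp(C[1, i, i:j] + I[1, i:j, j], dim=0)
    return C[0, 0, n]

def sample_relaxed_tree(arc_scores: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Gumbel-perturb the scores, then differentiate through the DP.

    The gradient of the log-partition w.r.t. the perturbed scores is a
    matrix of relaxed arc indicators: a stochastic, fully differentiable
    stand-in for a sampled projective tree.
    """
    gumbel = -torch.log(-torch.log(torch.rand_like(arc_scores)))
    perturbed = (arc_scores + gumbel) / temperature
    if not perturbed.requires_grad:
        perturbed.requires_grad_(True)
    log_z = soft_eisner_log_partition(perturbed)
    soft_arcs, = torch.autograd.grad(log_z, perturbed, create_graph=True)
    return soft_arcs
```

A downstream network can consume the returned soft arc matrix wherever it would use a hard adjacency matrix; lowering the temperature pushes the relaxation toward discrete sampled trees.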
Learning Binary Decision Trees by Argmin Differentiation
We address the problem of learning binary decision trees that partition data
for some downstream task. We propose to learn discrete parameters (i.e., for
tree traversals and node pruning) and continuous parameters (i.e., for tree
split functions and prediction functions) simultaneously using argmin
differentiation. We do so by sparsely relaxing a mixed-integer program for the
discrete parameters, to allow gradients to pass through the program to
continuous parameters. We derive customized algorithms to efficiently compute
the forward and backward passes. This means that our tree learning procedure
can be used as an (implicit) layer in arbitrary deep networks, and can be
optimized with arbitrary loss functions. We demonstrate that our approach
produces binary trees that are competitive with existing single tree and
ensemble approaches, in both supervised and unsupervised settings. Further,
apart from greedy approaches (which do not have competitive accuracies), our
method is faster to train than all other tree-learning baselines we compare
with. The code for reproducing the results is available at
https://github.com/vzantedeschi/LatentTrees
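The core trick can be illustrated with the simplest sparse argmin relaxation: replace each binary routing decision q in {0, 1} with the argmin over 0 <= q <= 1 of (q - s)^2, whose solution clip(s, 0, 1) is exactly 0 or 1 outside the unit interval (so most decisions stay discrete) while still passing gradients inside it. The sketch below is a hypothetical, node-wise simplification; the paper's actual relaxation is over whole-tree traversals and pruning, solved with the customized algorithms mentioned above.

```python
# A minimal sketch of argmin differentiation for tree routing. NOT the
# paper's algorithm: the node-wise quadratic relaxation and all names
# here are illustrative assumptions.
import torch
import torch.nn as nn

class SoftBinaryTree(nn.Module):
    """A depth-d binary tree with learned split hyperplanes and leaf values."""

    def __init__(self, in_dim: int, depth: int, out_dim: int = 1):
        super().__init__()
        self.depth = depth
        n_inner, n_leaves = 2 ** depth - 1, 2 ** depth
        self.splits = nn.Linear(in_dim, n_inner)  # one hyperplane per inner node
        self.leaves = nn.Parameter(torch.zeros(n_leaves, out_dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        s = self.splits(x)  # (batch, n_inner) raw split scores
        # argmin_{0 <= q <= 1} (q - s)^2 has the closed form clip(s, 0, 1):
        # a sparse relaxation that is exactly 0 or 1 outside (0, 1) yet
        # lets gradients pass through where 0 < s < 1.
        q = s.clamp(0.0, 1.0)
        # Probability of reaching each leaf: the product of routing
        # decisions along its root-to-leaf path, computed level by level.
        reach = x.new_ones(x.size(0), 1)
        for d in range(self.depth):
            lo, hi = 2 ** d - 1, 2 ** (d + 1) - 1  # inner-node indices at level d
            right = q[:, lo:hi]                    # (batch, 2**d)
            reach = torch.stack((reach * (1 - right), reach * right), dim=-1).flatten(1)
        return reach @ self.leaves                 # (batch, out_dim) predictions
```

Because the module is an ordinary differentiable layer, it can be dropped into any network and trained with any loss, which is the implicit-layer property the abstract refers to.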