Parser Training with Heterogeneous Treebanks
How to make the most of multiple heterogeneous treebanks when training a monolingual dependency parser is an open question. We start by investigating previously suggested, but little evaluated, strategies for exploiting multiple treebanks based on concatenating training sets, with or without fine-tuning. We go on to propose a new method based on treebank embeddings. We perform experiments for several languages and show that in many cases fine-tuning and treebank embeddings lead to substantial improvements over single treebanks or concatenation, with average gains of 2.0–3.5 LAS points. We argue that treebank embeddings should be preferred due to their conceptual simplicity, flexibility and extensibility.
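The core idea of treebank embeddings is to append a learned vector identifying the source treebank to each token's input representation, so one parser can train on all treebanks jointly while conditioning on treebank identity. A minimal sketch of that input construction, using NumPy with made-up toy vocabularies and treebank names (the real embeddings would be learned jointly with the parser, not random):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical tiny lookup tables; a real parser learns these during training.
WORD_DIM, TREEBANK_DIM = 8, 4
word_emb = {w: rng.normal(size=WORD_DIM) for w in ["the", "cat", "sleeps"]}
# One vector per source treebank (e.g. two UD treebanks of the same language).
treebank_emb = {tb: rng.normal(size=TREEBANK_DIM) for tb in ["tb_A", "tb_B"]}

def encode_sentence(tokens, treebank_id):
    """Concatenate each word vector with the embedding of the sentence's
    source treebank, so all other parser parameters are shared across
    treebanks while the model can still adapt to each one."""
    tb_vec = treebank_emb[treebank_id]
    return np.stack([np.concatenate([word_emb[t], tb_vec]) for t in tokens])

X = encode_sentence(["the", "cat", "sleeps"], "tb_A")
print(X.shape)  # (3, 12): word dim + treebank dim per token
```

This is what makes the approach flexible and extensible: adding a new treebank only adds one new embedding row, with no change to the rest of the architecture.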