Fast and Accurate Neural Word Segmentation for Chinese
Neural models with minimal feature engineering have achieved competitive
performance against traditional methods for the task of Chinese word
segmentation. However, both training and working procedures of the current
neural models are computationally inefficient. This paper presents a greedy
neural word segmenter with balanced word and character embedding inputs to
alleviate the existing drawbacks. Our segmenter is truly end-to-end, capable of
performing segmentation much faster and even more accurately than
state-of-the-art neural models on Chinese benchmark datasets.
Comment: To appear in ACL201
Efficient Multi-Template Learning for Structured Prediction
Conditional random field (CRF) and Structural Support Vector Machine
(Structural SVM) are two state-of-the-art methods for structured prediction
which captures the interdependencies among output variables. The success of
these methods is attributed to the fact that their discriminative models are
able to account for overlapping features on the whole input observations. These
features are usually generated by applying a given set of templates on labeled
data, but improper templates may lead to degraded performance. To alleviate
this issue, in this paper, we propose a novel multiple template learning
paradigm to learn structured prediction and the importance of each template
simultaneously, so that hundreds of arbitrary templates could be added into the
learning model without caution. This paradigm can be formulated as a special
multiple kernel learning problem with an exponential number of constraints. Then
we introduce an efficient cutting plane algorithm to solve this problem in the
primal, and its convergence is presented. We also evaluate the proposed
learning paradigm on two widely-studied structured prediction tasks,
i.e., sequence labeling and dependency parsing. Extensive experimental
results show that the proposed method outperforms CRFs and Structural SVMs due
to exploiting the importance of each template. Our complexity analysis and
empirical results also show that our proposed method is more efficient than
OnlineMKL on very sparse and high-dimensional data. We further extend this
paradigm for structured prediction using generalized -block norm
regularization with , and experiments show competitive performances when
- …