Confidence penalty, annealing Gaussian noise and zoneout for biLSTM-CRF networks for named entity recognition
Named entity recognition (NER) is used to identify relevant entities in text.
A bidirectional LSTM (long short-term memory) encoder with a neural conditional
random field (CRF) decoder (biLSTM-CRF) is the state-of-the-art methodology.
In this work, we analyze several methods intended to optimize the performance
of networks based on this architecture, some of which are designed to help
avoid overfitting. These methods target exploration of the parameter space,
regularization of LSTMs, and penalization of confident output distributions.
Results show that the optimization methods improve the performance of the
biLSTM-CRF NER baseline system, setting a new state-of-the-art performance for
the CoNLL-2003 Spanish set with an F1 of 87.18.
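Zoneout, one of the LSTM regularizers named in the title, stochastically lets hidden units keep their previous value instead of updating to the newly computed one. A minimal NumPy sketch of the idea (the rate `p` and the helper name are illustrative, not from the paper):

```python
import numpy as np

def zoneout(h_prev, h_new, p, rng):
    # With probability p, each unit "zones out": it carries over its
    # previous hidden value instead of taking the newly computed one.
    mask = rng.random(h_new.shape) < p
    return np.where(mask, h_prev, h_new)

# Example: at p=0.15, roughly 15% of units retain their old state.
rng = np.random.default_rng(0)
h_prev = np.zeros(8)
h_new = np.ones(8)
mixed = zoneout(h_prev, h_new, 0.15, rng)
```

At test time the stochastic mask is typically replaced by its expectation, analogous to dropout.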
Maximum Entropy Regularization and Chinese Text Recognition
Chinese text recognition is more challenging than Latin text recognition due to
the large number of fine-grained Chinese characters and the severe class
imbalance, which causes a serious overfitting problem. We propose to apply
Maximum Entropy Regularization to regularize the training process: we simply
add a negative entropy term to the canonical cross-entropy loss, with no
additional parameters and no modification of the model. We theoretically derive
the convergence probability distribution and analyze how the regularization
influences the learning process. Experiments on Chinese character recognition,
Chinese text line recognition, and fine-grained image classification show
consistent improvement, demonstrating that the regularization benefits the
generalization and robustness of a recognition model.