How to Fine-Tune BERT for Text Classification?

Authors: Huang Xuanjing, Qiu Xipeng, Sun Chi, Xu Yige
Publication date: 05/02/2020

Language model pre-training has proven to be useful in learning universal language representations. As a state-of-the-art language model pre-training model, BERT (Bidirectional Encoder Representations from Transformers) has achieved amazing results in many language understanding tasks. In this paper, we conduct exhaustive experiments to investigate different fine-tuning methods of BERT on the text classification task and provide a general solution for BERT fine-tuning. Finally, the proposed solution obtains new state-of-the-art results on eight widely studied text classification datasets.

arXiv.org e-Print Archive

Crossref

Image-based Text Classification using 2D Convolutional Neural Networks

Authors: Chen Liming, Geist Matthieu, Giakoumis Dimitrios, Hamzaoui Raouf, Hanke Sten, Kalatzis Dimitrios, Kropf Johannes, Merdivan Erinç, Tzovaras Dimitrios, Vafeiadis Anastasios, Votis Konstantinos
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 29/05/2019

We propose a new approach to text classification in which we consider the input text as an image and apply 2D Convolutional Neural Networks to learn the local and global semantics of the sentences from the variations of the visual patterns of words. Our approach demonstrates that it is possible to get semantically meaningful features from images with text without using optical character recognition and sequential processing pipelines, techniques that traditional natural language processing algorithms require. To validate our approach, we present results for two applications: text classification and dialog modeling. Using a 2D Convolutional Neural Network, we were able to outperform the state-of-the-art accuracy results for a Chinese text classification task and achieved promising results for seven English text classification tasks. Furthermore, our approach outperformed the memory networks without match types when using out-of-vocabulary entities from Task 4 of the bAbI dialog dataset.
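
The idea in the abstract above (treat text as an image, then apply a 2D convolution) can be illustrated with a minimal pure-Python sketch. Everything here is an assumption made for illustration and is not taken from the paper: the 3x5 bitmap glyphs, the hand-set `VERTICAL_STROKE` kernel (a real network would learn its kernels), and the helper names `render`, `conv2d`, and `global_max_pool`.

```python
# Hypothetical 3x5 bitmap glyphs (illustrative only, not from the paper).
GLYPHS = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}
GLYPH_HEIGHT = 5


def render(text):
    """Rasterize text into rows of 0/1 pixels, with one blank column between glyphs."""
    rows = [[] for _ in range(GLYPH_HEIGHT)]
    for i, ch in enumerate(text):
        glyph = GLYPHS[ch]
        for r in range(GLYPH_HEIGHT):
            if i > 0:
                rows[r].append(0)  # spacing column between characters
            rows[r].extend(1 if c == "#" else 0 for c in glyph[r])
    return rows


def conv2d(image, kernel):
    """Stride-1, no-padding 2D cross-correlation (the operation CNN layers compute)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[y + dy][x + dx] * kernel[dy][dx]
                for dy in range(kh)
                for dx in range(kw)
            )
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def global_max_pool(feature_map):
    """Collapse a feature map to its single strongest response."""
    return max(max(row) for row in feature_map)


# Hand-set kernel that responds to isolated vertical strokes (an assumption;
# in a trained network these weights would be learned from data).
VERTICAL_STROKE = [
    [-1, 2, -1],
    [-1, 2, -1],
    [-1, 2, -1],
]

image = render("LIT")                      # 5 rows x 11 columns of pixels
features = conv2d(image, VERTICAL_STROKE)  # 3 rows x 9 columns of responses
strongest = global_max_pool(features)
```

No character recognition step is involved: the kernel sees only pixels, which is the property the abstract emphasizes. A classifier along these lines would stack several learned convolution and pooling layers and feed the pooled features to an output layer.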

Crossref

INRIA a CCSD electronic archive server

HAL-INSU

De Montfort University Open Research Archive