Semantic-based padding in convolutional neural networks for improving the performance in natural language processing. A case of study in sentiment analysis

Abstract

This is the author's version of a work that was accepted for publication in Neurocomputing. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Neurocomputing, Volume 378, 22 February 2020, DOI: 10.1016/j.neucom.2019.08.096

In this work, a methodology for applying semantic-based padding in Convolutional Neural Networks for Natural Language Processing tasks is proposed. A Convolutional Network requires a fixed-size input matrix, which leaves unused space whenever a sentence is shorter than that size; semantic-based padding makes effective use of this space by filling it with words present in the sentence. The proposed methodology has been evaluated intensively in Sentiment Analysis tasks using a variety of word embeddings. In all the experiments carried out, the proposed semantic-based padding improved on the results achieved when no padding strategy is applied. Moreover, when the model used pre-trained word embeddings, it surpassed the performance of the state of the art.

We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan Xp GPU used for this research. The work of the first author is financed by Grant PAID-01-2461 2015, from the Universitat Politecnica de Valencia. This work is partially supported by the Grant PROMETEO/2018/002 from GVA.

Giménez, M.; Palanca Cámara, J.; Botti Navarro, VJ. (2020). Semantic-based padding in convolutional neural networks for improving the performance in natural language processing. A case of study in sentiment analysis. Neurocomputing. 378:315-323. https://doi.org/10.1016/j.neucom.2019.08.096
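
The idea of semantic-based padding stated in the abstract can be illustrated with a minimal sketch. The abstract does not say how the words used for padding are chosen, so the code below assumes the simplest reading: the free rows of the fixed-size input matrix are filled by cycling through the sentence's own tokens. The function names, the `<pad>` token, and the toy random embeddings are all hypothetical and are not taken from the paper.

```python
import numpy as np

PAD = "<pad>"  # hypothetical name for the conventional padding token


def zero_padding(tokens, max_len):
    """Baseline: truncate to max_len, then fill the free slots with PAD."""
    tokens = tokens[:max_len]
    return tokens + [PAD] * (max_len - len(tokens))


def semantic_padding(tokens, max_len):
    """Assumed variant: fill the free slots by cycling through the sentence's own tokens."""
    if not tokens:
        return [PAD] * max_len
    return [tokens[i % len(tokens)] for i in range(max_len)]


def to_matrix(tokens, embeddings, dim):
    """Stack one word vector per token; PAD and unknown tokens map to a zero vector."""
    zero = np.zeros(dim)
    return np.stack([embeddings.get(t, zero) for t in tokens])


# Toy random embeddings standing in for real pre-trained word vectors.
rng = np.random.default_rng(0)
dim = 4
sentence = ["great", "movie"]
embeddings = {w: rng.normal(size=dim) for w in sentence}

baseline = to_matrix(zero_padding(sentence, 5), embeddings, dim)
semantic = to_matrix(semantic_padding(sentence, 5), embeddings, dim)
```

Both matrices have the same fixed shape, but the rows of `baseline` beyond the sentence are all zeros, whereas every row of `semantic` carries a word vector from the sentence. Other selection strategies would fit the abstract's description equally well.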
