
    Siamese hierarchical attention networks for extractive summarization

    In this paper, we present an extractive approach to document summarization based on Siamese Neural Networks. Specifically, we propose the use of Hierarchical Attention Networks to select the most relevant sentences of a text for its summary. We train Siamese Neural Networks on document-summary pairs to determine whether a summary is appropriate for its document. By means of a sentence-level attention mechanism, the most relevant sentences in the document can be identified; hence, once the network is trained, it can be used to generate extractive summaries. Experiments carried out on the CNN/DailyMail summarization corpus show the adequacy of the proposal. In summary, we propose a novel end-to-end neural network that addresses extractive summarization as a binary classification problem and obtains promising results in line with the state of the art on the CNN/DailyMail corpus.
    This work has been partially supported by the Spanish MINECO and FEDER funds under project AMIC (TIN2017-85854-C4-2-R). Work of Jose-Angel Gonzalez is also financed by Universitat Politecnica de Valencia under grant PAID-01-17.
    González-Barba, JÁ.; Segarra Soriano, E.; García-Granada, F.; Sanchís Arnal, E.; Hurtado Oliver, LF. (2019). Siamese hierarchical attention networks for extractive summarization. Journal of Intelligent & Fuzzy Systems, 36(5), 4599-4607. https://doi.org/10.3233/JIFS-179011
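
    As a rough illustration of the approach described above, here is a minimal PyTorch sketch: a shared hierarchical attention encoder maps documents and summaries to vectors, a binary classifier scores whether the pair matches, and the sentence-level attention weights are what rank sentences for extraction. All layer names, dimensions, and the classifier head are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of a Siamese hierarchical attention network for
# extractive summarization. Sizes and heads are illustrative assumptions.
import torch
import torch.nn as nn


class Attention(nn.Module):
    """Additive attention that pools a sequence into one vector."""
    def __init__(self, dim):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.context = nn.Linear(dim, 1, bias=False)

    def forward(self, h):                               # h: (batch, seq, dim)
        scores = self.context(torch.tanh(self.proj(h))) # (batch, seq, 1)
        weights = torch.softmax(scores, dim=1)
        return (weights * h).sum(dim=1), weights.squeeze(-1)


class HierarchicalEncoder(nn.Module):
    """Words -> sentence vectors -> document vector, attention at each level."""
    def __init__(self, vocab_size, emb_dim=100, hid_dim=100):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.word_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True, bidirectional=True)
        self.word_attn = Attention(2 * hid_dim)
        self.sent_rnn = nn.GRU(2 * hid_dim, hid_dim, batch_first=True, bidirectional=True)
        self.sent_attn = Attention(2 * hid_dim)

    def forward(self, docs):                   # docs: (batch, n_sents, n_words)
        b, s, w = docs.shape
        h, _ = self.word_rnn(self.emb(docs.view(b * s, w)))
        sent_vecs, _ = self.word_attn(h)       # (batch*n_sents, 2*hid)
        h, _ = self.sent_rnn(sent_vecs.view(b, s, -1))
        doc_vec, sent_weights = self.sent_attn(h)
        return doc_vec, sent_weights           # sentence weights rank sentences


class SiameseScorer(nn.Module):
    """One shared encoder scores whether a summary matches a document."""
    def __init__(self, vocab_size, hid_dim=100):
        super().__init__()
        self.encoder = HierarchicalEncoder(vocab_size, hid_dim=hid_dim)  # shared branch
        self.clf = nn.Linear(4 * hid_dim, 1)   # concat of two 2*hid vectors

    def forward(self, doc, summ):
        d, sent_weights = self.encoder(doc)
        s, _ = self.encoder(summ)
        logit = self.clf(torch.cat([d, s], dim=-1))
        # Train with binary cross-entropy on matched/mismatched pairs; at
        # inference, the top-weighted sentences form the extractive summary.
        return logit, sent_weights
```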

    High level cognitive information processing in neural networks

    Two related research efforts were addressed: (1) high-level connectionist cognitive modeling; and (2) local neural circuit modeling. The goals of the first effort were to develop connectionist models of high-level cognitive processes such as problem solving or natural language understanding, and to understand the computational requirements of such models. The goals of the second effort were to develop biologically realistic models of local neural circuits, and to understand the computational behavior of such models. In keeping with the nature of NASA's Innovative Research Program, all the work conducted under the grant was highly innovative. For instance, the following ideas, all summarized here, are contributions to the study of connectionist/neural networks: (1) the temporal-winner-take-all, relative-position encoding, and pattern-similarity association techniques; (2) the importation of logical combinators into connectionism; (3) the use of analogy-based reasoning as a bridge across the gap between the traditional symbolic paradigm and the connectionist paradigm; and (4) the application of connectionism to the domain of belief representation/reasoning. The work on local neural circuit modeling also departs significantly from the work of related researchers. In particular, its concentration on low-level neural phenomena that could support high-level cognitive processing is unusual within the area of biological local circuit modeling, and it also serves to expand the horizons of the artificial neural net field.
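
    Of the techniques listed, the temporal winner-take-all lends itself to a tiny illustration. The sketch below is a speculative toy reading, not the report's model: the integration rule, threshold, and race-to-fire framing are all assumptions. The idea is that units race to fire rather than being compared by magnitude, so stronger inputs win by firing earlier.

```python
# Toy temporal winner-take-all: units integrate their input over time,
# and the first unit to cross the threshold wins. Purely illustrative.
import numpy as np

def temporal_wta(inputs, threshold=1.0, max_steps=100):
    """Return (winner index, firing step); stronger inputs fire sooner."""
    inputs = np.asarray(inputs, dtype=float)
    charge = np.zeros_like(inputs)
    for step in range(max_steps):
        charge += inputs                     # each unit integrates its own drive
        fired = np.flatnonzero(charge >= threshold)
        if fired.size:                       # first unit(s) to cross win
            return int(fired[charge[fired].argmax()]), step
    return None, max_steps                   # no unit fired within the horizon

winner, latency = temporal_wta([0.2, 0.5, 0.1])
print(winner, latency)                       # -> 1 1 (strongest input fires first)
```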

    Deep learning in clinical natural language processing: a methodical review.

    OBJECTIVE: This article methodically reviews the literature on deep learning (DL) for natural language processing (NLP) in the clinical domain, providing quantitative analysis to answer 3 research questions concerning methods, scope, and context of current research. MATERIALS AND METHODS: We searched MEDLINE, EMBASE, Scopus, the Association for Computing Machinery Digital Library, and the Association for Computational Linguistics Anthology for articles using DL-based approaches to NLP problems in electronic health records. After screening 1,737 articles, we collected data on 25 variables across 212 papers. RESULTS: The number of publications on DL in clinical NLP more than doubled each year through 2018. Recurrent neural networks (60.8%) and word2vec embeddings (74.1%) were the most popular methods; the information extraction tasks of text classification, named entity recognition, and relation extraction were dominant (89.2%). However, there was a long tail of other methods and specific tasks. Most contributions were methodological variants or applications, but 20.8% were new methods of some kind. The earliest adopters were in the NLP community, but the medical informatics community was the most prolific. DISCUSSION: Our analysis shows growing acceptance of deep learning as a baseline for NLP research, and of DL-based NLP in the medical community. A number of common associations were substantiated (e.g., the preference for recurrent neural networks in sequence-labeling tasks such as named entity recognition), while others were surprisingly nuanced (e.g., the scarcity of French-language clinical NLP with deep learning). CONCLUSION: Deep learning has not yet fully penetrated clinical NLP but is growing rapidly. This review highlighted both the popular and unique trends in this active field.
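
    The dominant recipe the review identifies, word2vec-style embeddings feeding a recurrent network for sequence labeling such as named entity recognition, looks roughly like the following PyTorch sketch. Vocabulary size, tag set, and dimensions are illustrative assumptions, not drawn from any reviewed paper.

```python
# Minimal sketch of the pattern found dominant in the review: pretrained
# word2vec-style embeddings + a BiLSTM for clinical NER (sequence labeling).
import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, vocab_size, n_tags, emb_dim=200, hid_dim=128,
                 pretrained=None):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        if pretrained is not None:           # e.g. word2vec vectors trained on EHR text
            self.emb.weight.data.copy_(pretrained)
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True,
                            bidirectional=True)
        self.out = nn.Linear(2 * hid_dim, n_tags)

    def forward(self, tokens):               # tokens: (batch, seq_len) word ids
        h, _ = self.lstm(self.emb(tokens))
        return self.out(h)                   # per-token tag logits (BIO scheme)

# Usage: token-level cross-entropy against BIO tags such as B-Problem, I-Drug, O.
model = BiLSTMTagger(vocab_size=30000, n_tags=7)
logits = model(torch.randint(1, 30000, (4, 25)))   # -> (4, 25, 7)
```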

    Stochastic Language Generation in Dialogue using Recurrent Neural Networks with Convolutional Sentence Reranking

    The natural language generation (NLG) component of a spoken dialogue system (SDS) usually requires a substantial amount of handcrafting or a well-labeled dataset to be trained on. These limitations add significantly to development costs and make cross-domain, multilingual dialogue systems intractable. Moreover, human languages are context-aware, so the most natural response should be learned directly from data rather than depend on predefined syntaxes or rules. This paper presents a statistical language generator based on a joint recurrent and convolutional neural network structure which can be trained on dialogue act-utterance pairs without any semantic alignments or predefined grammar trees. Objective metrics suggest that this new model outperforms previous methods under the same experimental conditions. Results of an evaluation by human judges indicate that it produces not only high-quality but also linguistically varied utterances, which are preferred over n-gram and rule-based systems. Comment: To appear in SigDial 2015.
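
    The two-stage pipeline described, an RNN generator conditioned on a dialogue act that over-generates candidates, followed by a convolutional model that reranks them, can be sketched as below. Module names, feature sizes, and the scoring head are illustrative assumptions rather than the paper's exact architecture.

```python
# Sketch of a dialogue-act-conditioned RNN generator plus a convolutional
# sentence reranker. All dimensions and heads are illustrative assumptions.
import torch
import torch.nn as nn

class DAConditionedRNN(nn.Module):
    """Next-word prediction conditioned on a dialogue-act (DA) feature vector."""
    def __init__(self, vocab_size, da_dim, emb_dim=64, hid_dim=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim + da_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, tokens, da):            # tokens: (B, T), da: (B, da_dim)
        x = self.emb(tokens)
        da = da.unsqueeze(1).expand(-1, x.size(1), -1)  # feed DA at every step
        h, _ = self.rnn(torch.cat([x, da], dim=-1))
        return self.out(h)                    # logits used to sample candidates

class ConvReranker(nn.Module):
    """CNN sentence encoder scoring how well a candidate realizes the DA."""
    def __init__(self, vocab_size, da_dim, emb_dim=64, n_filters=100):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, n_filters, kernel_size=3, padding=1)
        self.score = nn.Linear(n_filters + da_dim, 1)

    def forward(self, tokens, da):
        x = self.emb(tokens).transpose(1, 2)  # (B, emb, T) for Conv1d
        feats = torch.relu(self.conv(x)).max(dim=-1).values  # max-over-time pool
        return self.score(torch.cat([feats, da], dim=-1))    # higher = better

# At decode time: sample N utterances from the RNN, keep the reranker's top one.
```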