Character-level Convolutional Networks for Text Classification

Authors: LeCun Yann, Zhang Xiang, Zhao Junbo
Publication date: 01/01/2015

This article offers an empirical exploration on the use of character-level convolutional networks (ConvNets) for text classification. We constructed several large-scale datasets to show that character-level convolutional networks could achieve state-of-the-art or competitive results. Comparisons are offered against traditional models such as bag of words, n-grams and their TFIDF variants, and deep learning models such as word-based ConvNets and recurrent neural networks.

Comment: An early version of this work entitled "Text Understanding from Scratch" was posted in Feb 2015 as arXiv:1502.01710. The present paper has considerably more experimental results and a rewritten introduction. Advances in Neural Information Processing Systems 28 (NIPS 2015)
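As a rough illustration of what "character-level" input means here, the sketch below turns a string into a matrix of one-hot character rows, which is the kind of input a character-level ConvNet could consume. The alphabet, the `max_len` value, and the name `one_hot_encode` are assumptions made for this example; the abstract does not specify any of them.

```python
import string

# Hypothetical alphabet; the abstract does not say which characters are used.
ALPHABET = string.ascii_lowercase + string.digits + " .,!?"
CHAR_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def one_hot_encode(text, max_len=16):
    """Return a max_len x len(ALPHABET) matrix of 0/1 values.

    Characters outside the alphabet, and padding positions past the end
    of the text, are left as all-zero rows. max_len is an assumed value.
    """
    matrix = [[0] * len(ALPHABET) for _ in range(max_len)]
    for pos, ch in enumerate(text.lower()[:max_len]):
        idx = CHAR_INDEX.get(ch)
        if idx is not None:
            matrix[pos][idx] = 1
    return matrix
```

Calling `one_hot_encode("Ab#")` sets one entry in each of the first two rows (for `a` and `b`) and leaves the row for `#` empty, since `#` is not in the assumed alphabet.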

arXiv.org e-Print Archive

CiteSeerX

A topic sentence-based instance transfer method for imbalanced sentiment classification of Chinese product reviews

Authors: Chao Kuo-Ming, Lan Tian, Shah Nazaraf, Tian Feng, Wu Fan, Yue Jia, Zheng Qinghua
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016

Coventry University Pure Portal

TGSum: Build Tweet Guided Multi-Document Summarization Dataset

Authors: Cao Ziqiang, Chen Chengyao, Li Sujian, Li Wenjie, Wei Furu, Zhou Ming
Publication date: 26/11/2015

The development of summarization research has been significantly hampered by the costly acquisition of reference summaries. This paper proposes an effective way to automatically collect large scales of news-related multi-document summaries with reference to social media's reactions. We utilize two types of social labels in tweets, i.e., hashtags and hyper-links. Hashtags are used to cluster documents into different topic sets. Also, a tweet with a hyper-link often highlights certain key points of the corresponding document. We synthesize a linked document cluster to form a reference summary which can cover most key points. To this aim, we adopt the ROUGE metrics to measure the coverage ratio, and develop an Integer Linear Programming solution to discover the sentence set reaching the upper bound of ROUGE. Since we allow summary sentences to be selected from both documents and high-quality tweets, the generated reference summaries could be abstractive. Both informativeness and readability of the collected summaries are verified by manual judgment. In addition, we train a Support Vector Regression summarizer on DUC generic multi-document summarization benchmarks. With the collected data as extra training resource, the performance of the summarizer improves a lot on all the test sets. We release this dataset for further research.

Comment: 7 pages, 1 figure in AAAI 201
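To make the idea of a ROUGE-based coverage ratio concrete, here is a minimal sketch of unigram ROUGE recall together with a simple greedy sentence picker. The greedy loop is a stand-in chosen for brevity, not the Integer Linear Programming solution the abstract describes, and the function names, the whitespace tokenization, and the `budget` parameter are all assumptions made for this example.

```python
from collections import Counter


def rouge1_recall(candidate_tokens, reference_tokens):
    """Clipped unigram overlap divided by the reference length."""
    ref = Counter(reference_tokens)
    cand = Counter(candidate_tokens)
    total = sum(ref.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand[tok]) for tok, count in ref.items())
    return overlap / total


def greedy_select(sentences, reference_tokens, budget):
    """Add, one at a time, the sentence that most improves recall.

    Greedy stand-in only; the abstract describes an ILP, which is not
    implemented here. Stops after `budget` sentences or when no
    remaining sentence improves the score.
    """
    chosen, tokens, best = [], [], 0.0
    remaining = list(sentences)
    for _ in range(budget):
        scored = [
            (rouge1_recall(tokens + s.split(), reference_tokens), s)
            for s in remaining
        ]
        if not scored:
            break
        score, sent = max(scored, key=lambda pair: pair[0])
        if score <= best:
            break
        chosen.append(sent)
        tokens += sent.split()
        remaining.remove(sent)
        best = score
    return chosen, best
```

In this sketch `reference_tokens` plays the role of the key points to be covered (for instance, tokens from tweets linking to a document), and `sentences` are the candidate summary sentences.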

arXiv.org e-Print Archive

CiteSeerX

The Hong Kong Polytechnic University Pao Yue-kong Library

Association for the Advancement of Artificial Intelligence: AAAI Publications

Sentiment Analysis: State of the Art

Authors: Chalothorn Tawunrat, Ellman Jeremy
Publication venue: Institute of Research Engineers and Doctors
Publication date: 01/08/2013

We present the state of the art in sentiment analysis, covering the purpose of sentiment analysis, the levels of sentiment analysis, and the processes that could be used to measure polarity and classify labels. Moreover, brief details about some resources for sentiment analysis are included.

Northumbria Research Link