116 research outputs found

Semi-supervised latent variable models for sentence-level sentiment analysis

Author: McDonald Ryan
Täckström Oscar
Publication date: 01/01/2011

We derive two variants of a semi-supervised model for fine-grained sentiment analysis. Both models leverage abundant natural supervision in the form of review ratings, as well as a small amount of manually crafted sentence labels, to learn sentence-level sentiment classifiers. The proposed model is a fusion of a fully supervised structured conditional model and its partially supervised counterpart. This allows for highly efficient estimation and inference algorithms with rich feature definitions. We describe the two variants as well as their component models and verify experimentally that both variants give significantly improved results for sentence-level sentiment analysis compared to all baselines.

RISE – Research Institutes of Sweden

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Swedish Institute of Computer Science Publications Database

Software institutes' Online Digital Archive

Automated Big Text Security Classification

Author: Alzhrani Khudran
Boult Terrance E.
Chow C. Edward
Rudd Ethan M.
Publication date: 21/10/2016

In recent years, traditional cybersecurity safeguards have proven ineffective against insider threats. Famous cases of sensitive information leaks caused by insiders, including the WikiLeaks release of diplomatic cables and the Edward Snowden incident, have greatly harmed the U.S. government's relationship with other governments and with its own citizens. Data Leak Prevention (DLP) is a solution for detecting and preventing information leaks from within an organization's network. However, state-of-the-art DLP detection models can detect only very limited types of sensitive information, and research in the field has been hindered by the lack of available sensitive texts. Many researchers have focused on document-based detection with artificially labeled "confidential documents" for which security labels are assigned to the entire document, when in reality only a portion of the document is sensitive. This type of whole-document security labeling increases the chances of preventing authorized users from accessing non-sensitive information within sensitive documents. In this paper, we introduce Automated Classification Enabled by Security Similarity (ACESS), a new and innovative detection model that penetrates the complexity of big text security classification/detection. To analyze the ACESS system, we constructed a novel dataset containing formerly classified paragraphs from diplomatic cables made public by the WikiLeaks organization. To our knowledge, this paper is the first to analyze a dataset that contains actual formerly sensitive information annotated at paragraph granularity. Comment: Pre-print of Best Paper Award IEEE Intelligence and Security Informatics (ISI) 2016 Manuscript

arXiv.org e-Print Archive

Crossref

Generating Aspect-oriented Multi-document Summarization with Event-Aspect Model

Author: GAO Wei
JIANG Jing
LI Peng
WANG Yinglin
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/07/2011

In this paper, we propose a novel approach to automatic generation of aspect-oriented summaries from multiple documents. We first develop an event-aspect LDA model to cluster sentences into aspects. We then use an extended LexRank algorithm to rank the sentences in each cluster. We use Integer Linear Programming for sentence selection. Key features of our method include automatic grouping of semantically related sentences and sentence ranking based on an extension of the random walk model. We also implement a new sentence compression algorithm which uses dependency trees instead of parse trees. We compare our method with four baseline methods. Quantitative evaluation based on the ROUGE metric demonstrates the effectiveness and advantages of our method.
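
The sentence-ranking step mentioned in this abstract can be illustrated with a minimal sketch of plain LexRank, i.e. a damped random walk over a sentence-similarity graph. This is not the authors' extended variant, which the abstract does not specify. The function names, the bag-of-words cosine similarity, the damping value, and the toy cluster are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lexrank(sentences, damping=0.85, iterations=50):
    """Score sentences by a damped random walk over their similarity graph."""
    bags = [Counter(s.lower().split()) for s in sentences]
    n = len(bags)
    # Pairwise similarities, with self-loops excluded.
    sim = [[cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
           for i in range(n)]
    # Row-normalise into transition probabilities; a sentence with no
    # similar neighbours jumps uniformly to any sentence.
    trans = []
    for row in sim:
        total = sum(row)
        trans.append([v / total for v in row] if total else [1.0 / n] * n)
    # Power iteration from a uniform starting distribution.
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(scores[i] * trans[i][j] for i in range(n))
            for j in range(n)
        ]
    return scores


# Hypothetical sentences standing in for one aspect cluster.
cluster = [
    "the earthquake damaged many buildings in the city",
    "many buildings in the city were damaged",
    "rescue teams arrived the next morning",
]
scores = lexrank(cluster)
ranked = sorted(range(len(cluster)), key=lambda i: scores[i], reverse=True)
```

Sentences that share vocabulary with many others in the cluster receive higher scores, so `ranked` lists the most central sentences first. A later selection step, such as the Integer Linear Programming stage named in the abstract, would then choose among them.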

CiteSeerX

Institutional Knowledge at Singapore Management University

Cross-Cultural Comparisons of Review Aspect Importance

Author: Kanayama Hiroshi
Nakayama Makoto
Nasukawa Tetsuya
Publication venue: Bright Publisher
Publication date: 01/09/2018

Previous text mining studies identified key common aspects in online restaurant reviews. However, it is not clear how important these aspects are for consumers. In this exploratory study, we used Yelp restaurant reviews on an ethnic food item, ramen noodles, and assessed the importance of each aspect to both U.S. and Japanese consumers. The results show that food and atmosphere are far more important than the other common aspects in both the U.S. and Japan. However, we found noticeable differences between consumers in the two countries in how the food aspect affects star ratings. Both implications and a future research agenda are discussed.

Directory of Open Access Journals

IJIIS - International Journal of Informatics and Information Systems