A novel concept-level approach for ultra-concise opinion summarization
Web 2.0 has resulted in a shift in how users consume and interact with information, and has introduced a wide range of new textual genres, such as reviews or microblogs, through which users communicate, exchange, and share opinions. The exploitation of all this user-generated content is of great value both for users and companies, in order to assist them in their decision-making processes. Given this context, the analysis and development of automatic methods that can help manage online information more quickly are needed. Therefore, this article proposes and evaluates a novel concept-level approach for ultra-concise abstractive opinion summarization. Our approach is characterized by the integration of syntactic sentence simplification, sentence regeneration and internal concept representation into the summarization process, thus being able to generate abstractive summaries, which is one of the most challenging issues for this task. In order to analyze different settings for our approach, the use of the sentence regeneration module was made optional, leading to two different versions of the system (one with sentence regeneration and one without). For testing them, a corpus of 400 English texts, gathered from reviews and tweets belonging to two different domains, was used.
Although both versions were shown to be reliable methods for generating this type of summary, the results obtained indicate that the version without sentence regeneration yielded better results, improving the results of a number of state-of-the-art systems by 9%, whereas the version with sentence regeneration proved to be more robust to noisy data. This research work has been partially funded by the University of Alicante, Generalitat Valenciana, Spanish Government and the European Commission through the projects “Tratamiento inteligente de la información para la ayuda a la toma de decisiones” (GRE12-44), “Explotación y tratamiento de la información disponible en Internet para la anotación y generación de textos adaptados al usuario” (GRE13-15), DIIM2.0 (PROMETEOII/2014/001), ATTOS (TIN2012-38536-C03-03), LEGOLANG-UAGE (TIN2012-31224), SAM (FP7-611312), and FIRST (FP7-287607).
Storia: Summarizing Social Media Content based on Narrative Theory using Crowdsourcing
People from all over the world use social media to share thoughts and
opinions about events, and understanding what people say through these channels
has been of increasing interest to researchers, journalists, and marketers
alike. However, while automatically generated summaries enable people to
consume large amounts of data efficiently, they do not provide the context
needed for a viewer to fully understand an event. Narrative structure can
provide templates for the order and manner in which this data is presented to
create stories that are oriented around narrative elements rather than
summaries made up of facts. In this paper, we use narrative theory as a
framework for identifying the links between social media content. To do this,
we designed crowdsourcing tasks to generate summaries of events based on
commonly used narrative templates. In a controlled study, for certain types of
events, people were more emotionally engaged with stories created with
narrative structure and were also more likely to recommend them to others
compared to summaries created without narrative structure.
New Thoughts on Guiding Students’ Values in the Microblog Era
The popularization of the Internet has ushered in the microblogging era, and college students are now the main population of microblog users. Microblogging attracts college students with its popularity, flexibility and originality. Due to the complexity of microblogging information, the lack of attention to microblogging education in colleges and universities, and some factors specific to college students themselves, microblogging has a certain impact on the values of college students. Specifically, it increases their awareness of social participation, makes their value orientation more pragmatic and vulgar, and makes their value standards relatively loose. Therefore, in the era of microblogging, it is necessary to adopt effective strategies to guide college students to establish correct values.
Interpretable classification and summarization of crisis events from microblogs
The widespread use of social media platforms has created convenient ways to obtain and spread up-to-date information during crisis events such as disasters. Time-critical analysis of crisis-related information helps humanitarian organizations and governmental bodies gain actionable information and plan for aid response. However, situational information is often immersed in a high volume of irrelevant content. Moreover, crisis-related messages also vary greatly in terms of information types, ranging from general situational awareness - such as information about warnings, infrastructure damage, and casualties - to individual needs. Different humanitarian organizations or governmental bodies usually demand information of different types for various tasks such as crisis preparation, resource planning, and aid response. To cope with information overload and efficiently support stakeholders in crisis situations, it is necessary to (a) classify data posted during crisis events into fine-grained humanitarian categories, and (b) summarize the situational data in near real-time.
In this thesis, we tackle the aforementioned problems and propose novel methods for the classification and summarization of user-generated posts from microblogs. Previous studies have introduced various machine learning techniques to assist humanitarian or governmental bodies, but they primarily focused on model performance. Unlike those works, we develop interpretable machine-learning models which can provide explanations of model decisions. Generally, we focus on three methods for reducing information overload in crisis situations: (i) post classification, (ii) post summarization, and (iii) interpretable models for post classification and summarization. We evaluate our methods using posts from the microblogging platform Twitter, so-called tweets. First, we expand publicly available labeled datasets with rationale annotations. Each tweet is annotated with a class label and rationales, which are short snippets from the tweet that explain its assigned label. Using the data, we develop trustworthy classification methods that give the best tradeoff between model performance and interpretability. Rationale snippets usually convey essential information in the tweets. Hence, we propose an integer linear programming-based summarization method that maximizes the coverage of rationale phrases to generate summaries of class-level tweet data. Next, we introduce an approach that can enhance latent embedding representations of tweets in vector space. Our approach helps improve the classification performance-interpretability tradeoff and detect near duplicates for designing a summarization model with low computational complexity. Experiments show that rationale labels are helpful for developing interpretable-by-design models. However, annotations are not always available, especially in real-time situations for new tasks and crisis events. In the last part of the thesis, we propose a two-stage approach to extract the rationales under minimal human supervision.
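The coverage objective mentioned in the abstract above can be illustrated with a small sketch. The abstract does not give the thesis's actual formulation, so everything here is an assumption: the function name `summarize`, the word budget `max_words`, and the sample tweets are hypothetical, and an exhaustive search over subsets stands in for an ILP solver. It solves the same kind of 0/1 selection problem (choose tweets so that the number of distinct rationale phrases covered is maximal, subject to a length limit), but only scales to a handful of tweets.

```python
from itertools import combinations


def summarize(tweets, rationales, max_words):
    """Select tweets that cover the most distinct rationale phrases.

    tweets:     list of tweet strings
    rationales: list of sets; rationales[i] holds the phrases annotated in tweets[i]
    max_words:  word budget for the whole summary (an assumed constraint)

    Exhaustive search over all subsets: exact, but only feasible for small inputs.
    """
    best_subset, best_covered = (), set()
    for size in range(1, len(tweets) + 1):
        for subset in combinations(range(len(tweets)), size):
            # Skip selections that exceed the length budget.
            length = sum(len(tweets[i].split()) for i in subset)
            if length > max_words:
                continue
            # Union of rationale phrases covered by this selection.
            covered = set().union(*(rationales[i] for i in subset))
            if len(covered) > len(best_covered):
                best_subset, best_covered = subset, covered
    return [tweets[i] for i in best_subset], best_covered


# Hypothetical sample data, not taken from the thesis.
tweets = [
    "Bridge collapsed on Main Street, two injured",
    "Two injured after bridge collapsed",
    "Shelter open at the city school",
]
rationales = [
    {"bridge collapsed", "two injured"},
    {"two injured", "bridge collapsed"},
    {"shelter open"},
]

summary, covered = summarize(tweets, rationales, max_words=11)
for line in summary:
    print(line)
```

With an 11-word budget, the first tweet cannot be combined with either of the others, so the search settles on the shorter second tweet plus the shelter tweet. A real system would hand the same objective and constraint to an ILP solver rather than enumerate subsets.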
A Proposal for Brand Analysis with Opinion Mining
The popularity of e‐commerce sites has increased the availability of product reviews, most of which are overlooked by customers because of their large number. Opinion mining, a discipline that aims to extract people's opinions regarding some topic from reviews, was developed to address this situation. However, the individual interpretation of the reviews is not enough to take advantage of the massive datasets available on the web; a meaningful summary of the set of opinions is necessary to give users an overall insight into the opinions. We propose a system to extract information from Amazon product reviews, which focuses on a time‐varying comparison among different brands in a given Amazon product department. In this system, the results are summarized so that users can get a representative and detailed overview of the opinions of (possibly) hundreds of other users regarding the strong and weak points of several brands. This information can be used by customers who want to find high‐quality products, or by the enterprises themselves, which could find the aspects with a higher impact on public perception.
A Survey of Location Prediction on Twitter
Locations, e.g., countries, states, cities, and points of interest, are
central to news, emergency events, and people's daily lives. Automatic
identification of locations associated with or mentioned in documents has been
explored for decades. As one of the most popular online social network
platforms, Twitter has attracted a large number of users who send millions of
tweets on a daily basis. Due to the world-wide coverage of its users and
real-time freshness of tweets, location prediction on Twitter has gained
significant attention in recent years. Research efforts are spent on dealing
with new challenges and opportunities brought by the noisy, short, and
context-rich nature of tweets. In this survey, we aim at offering an overall
picture of location prediction on Twitter. Specifically, we concentrate on the
prediction of user home locations, tweet locations, and mentioned locations. We
first define the three tasks and review the evaluation metrics. By summarizing
Twitter network, tweet content, and tweet context as potential inputs, we then
structurally highlight how the problems depend on these inputs. Each dependency
is illustrated by a comprehensive review of the corresponding strategies
adopted in state-of-the-art approaches. In addition, we also briefly review two
related problems, i.e., semantic location prediction and point-of-interest
recommendation. Finally, we list future research directions. Comment: Accepted to TKDE. 30 pages, 1 figure
Sentiment analysis on online social network
Every social networking site maintains a large amount of data. The volume of data constantly gathered on these sites makes it difficult for methods such as field agents, clipping services, and ad hoc research to keep up with social media data. This paper discusses the previous research on sentiment analysis.