Argumentation Mining in User-Generated Web Discourse
The goal of argumentation mining, an evolving research field in computational
linguistics, is to design methods capable of analyzing people's argumentation.
In this article, we go beyond the state of the art in several ways. (i) We deal
with actual Web data and take up the challenges given by the variety of
registers, multiple domains, and unrestricted noisy user-generated Web
discourse. (ii) We bridge the gap between normative argumentation theories and
argumentation phenomena encountered in actual data by adapting an argumentation
model tested in an extensive annotation study. (iii) We create a new gold
standard corpus (90k tokens in 340 documents) and experiment with several
machine learning methods to identify argument components. We offer the data,
source codes, and annotation guidelines to the community under free licenses.
Our findings show that argumentation mining in user-generated Web discourse is
a feasible but challenging task.
Comment: Cite as: Habernal, I. & Gurevych, I. (2017). Argumentation Mining in
User-Generated Web Discourse. Computational Linguistics 43(1), pp. 125-17
A large multilingual and multi-domain dataset for recommender systems
This paper presents a multi-domain interests dataset for training and testing Recommender Systems, and the methodology used to create the dataset
from Twitter messages in English and Italian. The English dataset includes an average of 90 preferences per user on music, books,
movies, celebrities, sport, politics and much more, for about half a million users. Preferences are either extracted from messages of
users who use Spotify, Goodreads and other similar content sharing platforms, or induced from their "topical" friends, i.e., followees
representing an interest rather than a social relation between peers. In addition, preferred items are matched with the Wikipedia articles
describing them. This unique feature of our dataset provides a means to derive a semantic categorization of the preferred items, exploiting
available semantic resources linked to Wikipedia such as the Wikipedia Category Graph, DBpedia, BabelNet and others.
BlogForever: D3.1 Preservation Strategy Report
This report describes preservation planning approaches and strategies recommended by the BlogForever project as a core component of a weblog repository design. More specifically, we start by discussing why we would want to preserve weblogs in the first place and what exactly we are trying to preserve. We further present a review of past and present work and highlight why current practices in web archiving do not adequately address the needs of weblog preservation. We make three distinct contributions in this volume: a) we propose transferable practical workflows for applying a combination of established metadata and repository standards in developing a weblog repository, b) we provide an automated approach to identifying significant properties of weblog content that uses the notion of communities, and discuss how this affects previous strategies, c) we propose a sustainability plan that draws upon community knowledge through innovative repository design.
SEMANTIC SOCIAL NETWORK ANALYSIS FOR THE ENTERPRISE
Business processes are generally fixed and strictly enforced, as reflected by the static nature of the underlying software systems and datasets. However, internal and external situations, organizational changes and various other factors trigger dynamism, which surfaces in the form of issues, complaints, Q&A, opinions, reviews, etc., across a plethora of communication channels, such as email, chat, discussion forums, and internal social networks. Careful and timely analysis and processing of such channels may lead to early detection of emerging trends, critical issues, opportunities, topics of interest, contributors, experts, etc. Social network analytics have been successfully applied in general-purpose online social network platforms, like Facebook and Twitter. However, for such techniques to be useful in a business context, it is mandatory to integrate them with the underlying business systems, processes and practices. This integration problem is increasingly recognized as a Big Data problem. We argue that Semantic Web technology applied together with social network analytics can address enterprise knowledge management while achieving this integration.
A novel data analytic model for mining user insurance demands from microblogs
This paper proposes a method based on the LDA model and Word2Vec for analyzing Microblog users' insurance demands. First, we use the LDA model to analyze the text data of Microblog users and obtain their candidate topics. Second, we use the CBOW model to vectorize the topic words and use word similarity calculation to expand them. We then use the K-means model to cluster the expanded words and redefine the topic categories. Next, we use the LDA model to extract the keywords of various insurance information on the "Pingan Insurance" website and, with the help of word vector similarity, analyze how likely users with different demands are to purchase various types of insurance. Finally, the validity of the method is verified against Microblog user information. The experimental results show that the accuracy, recall rate and F1 value of the LDA-CBOW extending method are improved compared with those of the traditional LDA model, which supports the feasibility of this method. The results of this paper will help insurance companies to accurately grasp the preferences of Microblog users, understand the potential insurance needs of users in a timely manner, and lay a foundation for personalized recommendation of insurance products.
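The similarity-based expansion and K-means clustering steps named in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the LDA step is omitted, the hand-made 2-D vectors in `VECTORS` are hypothetical stand-ins for CBOW embeddings, and the function names, the similarity threshold, and the choice of seed words as initial centroids are all assumptions made for the example.

```python
import math

# Hypothetical toy vectors standing in for CBOW word embeddings.
VECTORS = {
    "car": (1.0, 0.1),
    "driving": (0.9, 0.2),
    "accident": (0.8, 0.3),
    "hospital": (0.1, 1.0),
    "illness": (0.2, 0.9),
    "surgery": (0.15, 0.95),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_topic(seed_words, vectors, threshold=0.95):
    """Add every vocabulary word whose similarity to any seed word
    reaches the (assumed) threshold; return the result sorted."""
    expanded = set(seed_words)
    for word, vec in vectors.items():
        if any(cosine(vec, vectors[s]) >= threshold for s in seed_words):
            expanded.add(word)
    return sorted(expanded)


def kmeans(words, vectors, seeds, iterations=10):
    """Plain K-means with Euclidean distance. The vectors of the seed
    words serve as initial centroids, so k equals len(seeds)."""
    centroids = [vectors[s] for s in seeds]
    clusters = []
    for _ in range(iterations):
        # Assignment step: each word goes to its nearest centroid.
        clusters = [[] for _ in centroids]
        for w in words:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(vectors[w], centroids[i]))
            clusters[nearest].append(w)
        # Update step: each centroid moves to the mean of its members.
        for i, members in enumerate(clusters):
            if members:
                dims = zip(*(vectors[m] for m in members))
                centroids[i] = tuple(sum(d) / len(members) for d in dims)
    return clusters


if __name__ == "__main__":
    print(expand_topic(["car"], VECTORS))
    print(kmeans(sorted(VECTORS), VECTORS, ["car", "hospital"]))
```

With real data, the toy dictionary would be replaced by embeddings trained on the Microblog corpus, and the threshold and number of clusters would need to be tuned empirically.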
Contextual Social Networking
The thesis centers on the multi-faceted research question of how contexts may
be detected and derived that can be used for new context-aware Social Networking
services and for improving the usefulness of existing Social Networking services, giving
rise to the notion of Contextual Social Networking. In a first foundational part,
we characterize the closely related fields of Contextual, Mobile, and Decentralized
Social Networking using different methods and focusing on different detailed
aspects. A second part focuses on the question of how short-term and long-term
social contexts, as especially interesting forms of context for Social Networking, may
be derived. We focus on NLP-based methods for the characterization of social relations
as a typical form of long-term social context and on Mobile Social Signal
Processing methods for deriving short-term social contexts on the basis of geometry
of interaction and audio. We furthermore investigate how personal social agents
may combine such social context elements on various levels of abstraction. The third
part discusses new and improved context-aware Social Networking service concepts.
We investigate special forms of awareness services, new forms of social information
retrieval, social recommender systems, context-aware privacy concepts, and services
and platforms supporting Open Innovation and creative processes.
This version of the thesis does not contain the included publications, for reasons of
journal copyright. To obtain the version with all included
publications, contact Georg Groh, [email protected]