5,318 research outputs found
Thread Reconstruction in Conversational Data using Neural Coherence Models
Discussion forums are an important source of information. They are often used
to answer specific questions a user might have and to discover more about a
topic of interest. Discussions in these forums may evolve in intricate ways,
making it difficult for users to follow the flow of ideas. We propose a novel
approach for automatically identifying the underlying thread structure of a
forum discussion. Our approach is based on a neural model that computes
coherence scores of possible reconstructions and then selects the highest
scoring, i.e., the most coherent one. Preliminary experiments demonstrate
promising results outperforming a number of strong baseline methods.Comment: Neu-IR: Workshop on Neural Information Retrieval 201
How did the discussion go: Discourse act classification in social media conversations
We propose a novel attention based hierarchical LSTM model to classify
discourse act sequences in social media conversations, aimed at mining data
from online discussion using textual meanings beyond sentence level. The very
uniqueness of the task is the complete categorization of possible pragmatic
roles in informal textual discussions, contrary to extraction of
question-answers, stance detection or sarcasm identification which are very
much role specific tasks. Early attempt was made on a Reddit discussion
dataset. We train our model on the same data, and present test results on two
different datasets, one from Reddit and one from Facebook. Our proposed model
outperformed the previous one in terms of domain independence; without using
platform-dependent structural features, our hierarchical LSTM with word
relevance attention mechanism achieved F1-scores of 71\% and 66\% respectively
to predict discourse roles of comments in Reddit and Facebook discussions.
Efficiency of recurrent and convolutional architectures in order to learn
discursive representation on the same task has been presented and analyzed,
with different word and comment embedding schemes. Our attention mechanism
enables us to inquire into relevance ordering of text segments according to
their roles in discourse. We present a human annotator experiment to unveil
important observations about modeling and data annotation. Equipped with our
text-based discourse identification model, we inquire into how heterogeneous
non-textual features like location, time, leaning of information etc. play
their roles in charaterizing online discussions on Facebook
BlockTag: Design and applications of a tagging system for blockchain analysis
Annotating blockchains with auxiliary data is useful for many applications.
For example, e-crime investigations of illegal Tor hidden services, such as
Silk Road, often involve linking Bitcoin addresses, from which money is sent or
received, to user accounts and related online activities. We present BlockTag,
an open-source tagging system for blockchains that facilitates such tasks. We
describe BlockTag's design and present three analyses that illustrate its
capabilities in the context of privacy research and law enforcement
The state-of-the-art in personalized recommender systems for social networking
With the explosion of Web 2.0 application such as blogs, social and professional networks, and various other types of social media, the rich online information and various new sources of knowledge flood users and hence pose a great challenge in terms of information overload. It is critical to use intelligent agent software systems to assist users in finding the right information from an abundance of Web data. Recommender systems can help users deal with information overload problem efficiently by suggesting items (e.g., information and products) that match users’ personal interests. The recommender technology has been successfully employed in many applications such as recommending films, music, books, etc. The purpose of this report is to give an overview of existing technologies for building personalized recommender systems in social networking environment, to propose a research direction for addressing user profiling and cold start problems by exploiting user-generated content newly available in Web 2.0
- …