11,323 research outputs found
Listening between the Lines: Learning Personal Attributes from Conversations
Open-domain dialogue agents must be able to converse about many topics while
incorporating knowledge about the user into the conversation. In this work we
address the acquisition of such knowledge, for personalization in downstream
Web applications, by extracting personal attributes from conversations. This
problem is more challenging than the established task of information extraction
from scientific publications or Wikipedia articles, because dialogues often
give merely implicit cues about the speaker. We propose methods for inferring
personal attributes, such as profession, age or family status, from
conversations using deep learning. Specifically, we propose several Hidden
Attribute Models, which are neural networks leveraging attention mechanisms and
embeddings. Our methods are trained on a per-predicate basis to output rankings
of object values for a given subject-predicate combination (e.g., ranking the
doctor and nurse professions high when speakers talk about patients, emergency
rooms, etc). Experiments with various conversational texts including Reddit
discussions, movie scripts and a collection of crowdsourced personal dialogues
demonstrate the viability of our methods and their superior performance
compared to state-of-the-art baselines.Comment: published in WWW'1
Applying text timing in corporate spin-off disclosure statement analysis: understanding the main concerns and recommendation of appropriate term weights
Text mining helps in extracting knowledge and useful information from unstructured data. It detects and extracts information from mountains of documents and allowing in selecting data related to a particular data.
In this study, text mining is applied to the 10-12b filings done by the companies during Corporate Spin-off. The main purposes are (1) To investigate potential and/or major concerns found from these financial statements filed for corporate spin-off and (2) To identify appropriate methods in text mining which can be used to reveal these major concerns.
10-12b filings from thirty-four companies were taken and only the Risk Factors category was taken for analysis. Term weights such as Entropy, IDF, GF-IDF, Normal and None were applied on the input data and out of them Entropy and GF-IDF were found to be the appropriate term weights which provided acceptable results. These accepted term weights gave the results which was acceptable to human expert\u27s expectations. The document distribution from these term weights created a pattern which reflected the mood or focus of the input documents.
In addition to the analysis, this study also provides a pilot study for future work in predictive text mining for the analysis of similar financial documents. For example, the descriptive terms found from this study provide a set of start word list which eliminates the try and error method of framing an initial start list --Abstract, page iii
False News On Social Media: A Data-Driven Survey
In the past few years, the research community has dedicated growing interest
to the issue of false news circulating on social networks. The widespread
attention on detecting and characterizing false news has been motivated by
considerable backlashes of this threat against the real world. As a matter of
fact, social media platforms exhibit peculiar characteristics, with respect to
traditional news outlets, which have been particularly favorable to the
proliferation of deceptive information. They also present unique challenges for
all kind of potential interventions on the subject. As this issue becomes of
global concern, it is also gaining more attention in academia. The aim of this
survey is to offer a comprehensive study on the recent advances in terms of
detection, characterization and mitigation of false news that propagate on
social media, as well as the challenges and the open questions that await
future research on the field. We use a data-driven approach, focusing on a
classification of the features that are used in each study to characterize
false information and on the datasets used for instructing classification
methods. At the end of the survey, we highlight emerging approaches that look
most promising for addressing false news
- …