4,709 research outputs found

    Automatic domain ontology extraction for context-sensitive opinion mining

    Get PDF
    Automated analysis of the sentiments presented in online consumer feedbacks can facilitate both organizations’ business strategy development and individual consumers’ comparison shopping. Nevertheless, existing opinion mining methods either adopt a context-free sentiment classification approach or rely on a large number of manually annotated training examples to perform context sensitive sentiment classification. Guided by the design science research methodology, we illustrate the design, development, and evaluation of a novel fuzzy domain ontology based contextsensitive opinion mining system. Our novel ontology extraction mechanism underpinned by a variant of Kullback-Leibler divergence can automatically acquire contextual sentiment knowledge across various product domains to improve the sentiment analysis processes. Evaluated based on a benchmark dataset and real consumer reviews collected from Amazon.com, our system shows remarkable performance improvement over the context-free baseline

    Finding Academic Experts on a MultiSensor Approach using Shannon's Entropy

    Full text link
    Expert finding is an information retrieval task concerned with the search for the most knowledgeable people, in some topic, with basis on documents describing peoples activities. The task involves taking a user query as input and returning a list of people sorted by their level of expertise regarding the user query. This paper introduces a novel approach for combining multiple estimators of expertise based on a multisensor data fusion framework together with the Dempster-Shafer theory of evidence and Shannon's entropy. More specifically, we defined three sensors which detect heterogeneous information derived from the textual contents, from the graph structure of the citation patterns for the community of experts, and from profile information about the academic experts. Given the evidences collected, each sensor may define different candidates as experts and consequently do not agree in a final ranking decision. To deal with these conflicts, we applied the Dempster-Shafer theory of evidence combined with Shannon's Entropy formula to fuse this information and come up with a more accurate and reliable final ranking list. Experiments made over two datasets of academic publications from the Computer Science domain attest for the adequacy of the proposed approach over the traditional state of the art approaches. We also made experiments against representative supervised state of the art algorithms. Results revealed that the proposed method achieved a similar performance when compared to these supervised techniques, confirming the capabilities of the proposed framework

    A Case-Based Approach to Cross Domain Sentiment Classification

    Get PDF
    This paper considers the task of sentiment classification of subjective text across many domains, in particular on scenarios where no in-domain data is available. Motivated by the more general applicability of such methods, we propose an extensible approach to sentiment classification that leverages sentiment lexicons and out-of-domain data to build a case-based system where solutions to past cases are reused to predict the sentiment of new documents from an unknown domain. In our approach the case representation uses a set of features based on document statistics, while the case solution stores sentiment lexicons employed on past predictions allowing for later retrieval and reuse on similar documents. The case-based nature of our approach also allows for future improvements since new lexicons and classification methods can be added to the case base as they become available. On a cross domain experiment our method has shown robust results when compared to a baseline single-lexicon classifier where the lexicon has to be pre-selected for the domain in question

    Benchmarking News Recommendations in a Living Lab

    Get PDF
    Most user-centric studies of information access systems in literature suffer from unrealistic settings or limited numbers of users who participate in the study. In order to address this issue, the idea of a living lab has been promoted. Living labs allow us to evaluate research hypotheses using a large number of users who satisfy their information need in a real context. In this paper, we introduce a living lab on news recommendation in real time. The living lab has first been organized as News Recommendation Challenge at ACM RecSys’13 and then as campaign-style evaluation lab NEWSREEL at CLEF’14. Within this lab, researchers were asked to provide news article recommendations to millions of users in real time. Different from user studies which have been performed in a laboratory, these users are following their own agenda. Consequently, laboratory bias on their behavior can be neglected. We outline the living lab scenario and the experimental setup of the two benchmarking events. We argue that the living lab can serve as reference point for the implementation of living labs for the evaluation of information access systems

    A Survey of Location Prediction on Twitter

    Full text link
    Locations, e.g., countries, states, cities, and point-of-interests, are central to news, emergency events, and people's daily lives. Automatic identification of locations associated with or mentioned in documents has been explored for decades. As one of the most popular online social network platforms, Twitter has attracted a large number of users who send millions of tweets on daily basis. Due to the world-wide coverage of its users and real-time freshness of tweets, location prediction on Twitter has gained significant attention in recent years. Research efforts are spent on dealing with new challenges and opportunities brought by the noisy, short, and context-rich nature of tweets. In this survey, we aim at offering an overall picture of location prediction on Twitter. Specifically, we concentrate on the prediction of user home locations, tweet locations, and mentioned locations. We first define the three tasks and review the evaluation metrics. By summarizing Twitter network, tweet content, and tweet context as potential inputs, we then structurally highlight how the problems depend on these inputs. Each dependency is illustrated by a comprehensive review of the corresponding strategies adopted in state-of-the-art approaches. In addition, we also briefly review two related problems, i.e., semantic location prediction and point-of-interest recommendation. Finally, we list future research directions.Comment: Accepted to TKDE. 30 pages, 1 figur
    • 

    corecore