19 research outputs found

    A Comparative Analysis of Retrievability and PageRank Measures

    The accessibility of documents within a collection plays a pivotal role in Information Retrieval, signifying the ease of locating specific content in that collection. This accessibility can be achieved via two distinct avenues. The first is through a retrieval model using keyword or other feature-based search; the second is by navigating to a document through the links associated with it, if available. Metrics such as PageRank, Hub, and Authority illuminate the pathways through which documents can be discovered within the network of content, while the concept of retrievability is used to quantify the ease with which a document can be found by a retrieval model. In this paper, we compare these two perspectives, PageRank and retrievability, as they quantify the importance and discoverability of content in a corpus. Through empirical experimentation on benchmark datasets, we demonstrate a subtle similarity between retrievability and PageRank, which is particularly distinguishable for larger datasets. Comment: Accepted at FIRE 202

    Extracting Product Features from Online Consumer Reviews

    Along with the exponential growth of user-generated content online comes the need to make sense of such content. Online consumer reviews are one type of user-generated content that has become increasingly important. Thus, there is a demand for uncovering hidden patterns, unknown relationships and other useful information. The focal problem of this research is product feature extraction. Few existing studies have looked into the detailed categorization of review features or explored how to adjust extraction methods to account for the characteristics of different categories of features. This paper begins by introducing a new scheme of feature classification and then introduces new extraction methods for each type of feature separately. These methods were designed not only to recognize new features but also to filter out irrelevant ones. The experimental results show that our proposed methods outperform the state-of-the-art techniques.

    The research for effect of aspects extraction of Chinese commodity comments on supervised learning methods

    With the advent of Web 2.0, there are more and more websites for shopping. These websites often allow customers to comment on the commodities they have purchased. Therefore, there is an increasing number of online reviews. More importantly, these reviews contain a great deal of sentiment, which is meaningful for both merchants and customers. This paper focuses on extracting aspects from online product reviews using supervised learning methods. Through the experiments in this paper, we found that machine learning can be used for aspect extraction from Chinese online product reviews. Using ME with a presence character representation achieves 85.6% accuracy.

    Discovering Topical Aspects in Microblogs

    We address the problem of discovering topical phrases or "aspects" from microblogging sites like Twitter, that correspond to key talking points or buzz around a particular topic or entity of interest. Inferring such topical aspects enables various applications such as trend detection and opinion mining for business analytics. However, mining high-volume microblog streams for aspects poses unique challenges due to the inherent noise, redundancy and ambiguity in users' social posts. We address these challenges by using a probabilistic model that incorporates various global and local indicators such as "uniqueness", "diversity" and "burstiness" of phrases, to infer relevant aspects. Our model is learned using an EM algorithm that uses automatically generated noisy labels, without requiring manual effort or domain knowledge. We present results on three months of Twitter data across different types of entities to validate our approach.

    Subjectivity Analysis In Opinion Mining - A Systematic Literature Review

    Subjectivity analysis determines the existence of subjectivity in text using subjective clues. It is the first task in the opinion mining process. The difference between subjectivity analysis and polarity determination is that the latter processes subjective text to determine its orientation as positive or negative. Many techniques have been used to solve the problem of segregating subjective and objective text. This paper used a systematic literature review (SLR) to compile the studies undertaken in subjectivity analysis. An SLR is a literature review that collects and critically analyses multiple studies to answer the research questions. Eight research questions were drawn up for this purpose. Information such as technique, corpus, subjective clue representation and performance was extracted from 97 articles, known as primary studies. This information was analysed to identify the strengths and weaknesses of each technique, the elements affecting performance and the elements missing from subjectivity analysis. The SLR found that the majority of studies use a machine learning approach to identify and learn subjective text, because the subjectivity analysis problem is viewed as a classification problem. This approach outperformed the other approaches, though its performance is currently only at a satisfactory level. Therefore, more studies are needed to improve the performance of subjectivity analysis.