
    Finding Relevant Answers in Software Forums

    Online software forums provide a huge amount of valuable content. Developers and users often ask questions and receive answers from such forums. The availability of a vast amount of thread discussions in forums provides ample opportunities for knowledge acquisition and summarization. For a given search query, current search engines use a traditional information retrieval approach to extract webpages containing…

    Browsing and searching e-encyclopaedias

    Educational websites and electronic encyclopaedias employ many of the same design elements, such as hyperlinks, frames and search mechanisms. This paper asks to what extent recommendations from the world of web design can be applied to e-encyclopaedias, through an evaluation of users' browsing and searching behaviour in the free, web-based versions of Encyclopaedia Britannica, the Concise Columbia Encyclopaedia and Microsoft's Encarta. It is discovered that e-encyclopaedias have a unique set of design requirements, as users' expectations are inherited from the worlds of both web and print

    Improving Ontology Recommendation and Reuse in WebCORE by Collaborative Assessments

    In this work, we present an extension of CORE [8], a tool for Collaborative Ontology Reuse and Evaluation. The system receives an informal description of a specific semantic domain and determines which ontologies from a repository are the most appropriate to describe the given domain. For this task, the environment is divided into three modules. The first component receives the problem description as a set of terms, and allows the user to refine and enlarge it using WordNet. The second module applies multiple automatic criteria to evaluate the ontologies of the repository, and determines which ones best fit the problem description. A ranked list of ontologies is returned for each criterion, and the lists are combined by means of rank fusion techniques. Finally, the third component uses manual user evaluations in order to incorporate a human, collaborative assessment of the ontologies. The new version of the system incorporates several novelties, such as its implementation as a web application; the incorporation of an NLP module to manage the problem definitions; modifications to the automatic ontology retrieval strategies; and a collaborative framework for finding potentially relevant terms according to previous user queries. We also present some early experiments on ontology retrieval and evaluation, showing the benefits of our system
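
    The abstract says the per-criterion rankings are "combined by means of rank fusion techniques" without naming a specific method. Purely as a hedged illustration, the sketch below uses reciprocal rank fusion (RRF), one common fusion scheme; the ontology identifiers and the constant k are hypothetical and not taken from WebCORE.

```python
# Hypothetical sketch of fusing per-criterion ranked lists of ontologies with
# reciprocal rank fusion (RRF). This is a generic example of rank fusion, not
# the specific technique used by WebCORE.
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """ranked_lists: one list of ontology ids per criterion, best first."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, ontology_id in enumerate(ranking, start=1):
            scores[ontology_id] += 1.0 / (k + rank)
    # Ontologies ranked near the top by many criteria get the highest fused score.
    return sorted(scores, key=scores.get, reverse=True)

# Example: three automatic criteria each rank the repository's ontologies.
criterion_rankings = [
    ["onto_travel", "onto_geo", "onto_events"],
    ["onto_geo", "onto_travel", "onto_food"],
    ["onto_travel", "onto_food", "onto_geo"],
]
print(reciprocal_rank_fusion(criterion_rankings))
```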

    Closing the loop: assisting archival appraisal and information retrieval in one sweep

    In this article, we examine the similarities between the concept of appraisal, a process that takes place within the archives, and the concept of relevance judgement, a process fundamental to the evaluation of information retrieval systems. More specifically, we revisit selection criteria proposed as a result of archival research and work within the digital curation communities, and compare them to relevance criteria as discussed within information retrieval's literature-based discovery. We illustrate how closely these criteria relate to each other and discuss how understanding the relationships between these disciplines could form a basis for proposing automated selection for archival processes and initiating multi-objective learning with respect to information retrieval

    INEX Tweet Contextualization Task: Evaluation, Results and Lesson Learned

    Microblogging platforms such as Twitter are increasingly used for on-line client and market analysis. This motivated the proposal of a new Tweet Contextualization track at the CLEF INEX lab. The objective of this task was to help a user understand a tweet by providing them with a short explanatory summary (500 words). This summary should be built automatically using resources like Wikipedia and generated by extracting relevant passages and aggregating them into a coherent summary. The task ran for four years, and the results show that the best systems combine NLP techniques with more traditional methods. More precisely, the best performing systems combine passage retrieval, sentence segmentation and scoring, named entity recognition, text part-of-speech (POS) analysis, anaphora detection, diversity content measures and sentence reordering. This paper provides a full summary report on the four-year-long task. While the yearly overviews focused on system results, in this paper we provide a detailed report on the approaches proposed by the participants, which can be considered the state of the art for this task. As an important result of the four-year competition, we also describe the open access resources that have been built and collected. The evaluation measures for automatic summarization designed in DUC or MUC were not appropriate for evaluating tweet contextualization; we explain why and describe in detail the LogSim measure used to evaluate the informativeness of the produced contexts or summaries. Finally, we mention the lessons we learned, which are worth considering when designing such a task
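
    To make the task setup concrete, here is a minimal, hedged sketch of an extractive contextualization baseline: score candidate Wikipedia sentences against the tweet by term overlap and greedily assemble a summary of at most 500 words. It is a toy example, not any participant's system; the competitive systems additionally used named entity recognition, POS analysis, anaphora detection, diversity measures and sentence reordering.

```python
# Toy extractive baseline for tweet contextualization (illustration only).
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def overlap_score(tweet_tokens, sentence):
    # Count how often the tweet's (unique) terms occur in the sentence.
    sentence_counts = Counter(tokenize(sentence))
    return sum(sentence_counts[t] for t in set(tweet_tokens))

def contextualize(tweet, candidate_sentences, max_words=500):
    tweet_tokens = tokenize(tweet)
    ranked = sorted(candidate_sentences,
                    key=lambda s: overlap_score(tweet_tokens, s),
                    reverse=True)
    summary, used = [], 0
    for sentence in ranked:
        length = len(tokenize(sentence))
        if used + length > max_words:
            continue
        summary.append(sentence)
        used += length
    return " ".join(summary)
```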

    Evaluating Web Search Result Summaries

    The aim of our research is to produce and assess short summaries to aid users’ relevance judgements, for example for a search engine result page. In this paper we present our new metric for measuring summary quality based on representativeness and judgeability, and compare the summary quality of our system to that of Google. We discuss the basis for constructing our evaluation methodology in contrast to previous relevant open evaluations, arguing that the elements which make up an evaluation methodology (the tasks, data and metrics) are interdependent and that the way in which they are combined is critical to the effectiveness of the methodology. The paper discusses the relationship between these three factors as implemented in our own work, as well as in SUMMAC/MUC/DUC
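
    The abstract does not spell out how representativeness is computed, so the following is only a rough, hypothetical proxy for the general idea: measure what fraction of the source page's most frequent content terms survive in the result-page summary. The stopword list and top-k cutoff are invented for the example and are not the paper's metric.

```python
# Hypothetical representativeness proxy: coverage of the source document's
# dominant vocabulary by the summary. Not the metric proposed in the paper.
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "for", "on"}

def content_terms(text):
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def representativeness(document, summary, top_k=20):
    top_terms = {t for t, _ in Counter(content_terms(document)).most_common(top_k)}
    if not top_terms:
        return 0.0
    covered = top_terms & set(content_terms(summary))
    return len(covered) / len(top_terms)
```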

    Going Beyond Relevance: Role of effort in Information Retrieval

    The primary focus of Information Retrieval (IR) systems has been to optimize for relevance. Existing approaches to ranking documents or evaluating IR systems do not account for “user effort”. Currently, judges only determine whether the information provided in a given document would satisfy the underlying information need in a query. The current mechanism of obtaining relevance judgments does not account for the time and effort that an end user must put forth to consume a document's content. While a judge may spend a lot of time assessing a document, an impatient user may not devote the same amount of time and effort to consuming its content. This problem is exacerbated on smaller devices such as mobile phones. On phones or tablets, where interaction is limited, users may not put much effort into finding information. This thesis characterizes and incorporates effort in Information Retrieval. A comparison of explicit and implicit relevance judgments across several datasets reveals that certain documents are marked relevant by the judges but are of low utility to an end user. Experiments indicate that document-level effort features can reliably predict the mismatch between the dwell time and judging time of documents. Explicit and preference-based judgments were collected to determine which factors associated with effort agreed the most with user satisfaction. The ability to locate relevant information, or findability, was found to be in highest agreement with preference judgments. Findability judgments were also gathered to study the association of different annotator, query or document related properties with effort judgments. We also investigate how existing systems can be optimized for both relevance and effort. Finally, we investigate the role of effort on smaller devices with the help of cost-benefit models
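
    As a hedged illustration of the cost-benefit idea mentioned at the end of the abstract, the sketch below ranks documents by relevance discounted by an estimate of consumption effort. The features (reading time, findability) echo the abstract, but the scoring function and weights are invented placeholders, not the models developed in the thesis.

```python
# Toy effort-aware, cost-benefit style ranking (illustrative only).
from dataclasses import dataclass

@dataclass
class Doc:
    doc_id: str
    relevance: float        # judged relevance, e.g. graded 0..3
    reading_time_s: float   # estimated effort to consume the content
    findability: float      # 0..1, how easily relevant content is located

def effort_adjusted_score(doc, effort_weight=0.01):
    # Benefit grows with relevance and findability; cost grows with the time
    # an impatient user must invest, which a patient judge may overlook.
    benefit = doc.relevance * doc.findability
    cost = effort_weight * doc.reading_time_s
    return benefit - cost

docs = [
    Doc("long_pdf", relevance=3, reading_time_s=600, findability=0.4),
    Doc("short_answer", relevance=2, reading_time_s=60, findability=0.9),
]
for d in sorted(docs, key=effort_adjusted_score, reverse=True):
    print(d.doc_id, round(effort_adjusted_score(d), 2))
```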