
    Mining Missing Hyperlinks from Human Navigation Traces: A Case Study of Wikipedia

    Hyperlinks are an essential feature of the World Wide Web. They are especially important for online encyclopedias such as Wikipedia: an article can often only be understood in the context of related articles, and hyperlinks make it easy to explore this context. But important links are often missing, and several methods have been proposed to alleviate this problem by learning a linking model based on the structure of the existing links. Here we propose a novel approach to identifying missing links in Wikipedia. We build on the fact that the ultimate purpose of Wikipedia links is to aid navigation. Rather than merely suggesting new links that are in tune with the structure of existing links, our method finds missing links that would immediately enhance Wikipedia's navigability. We leverage data sets of navigation paths collected through a Wikipedia-based human-computation game in which users must find a short path from a start to a target article by only clicking links encountered along the way. We harness human navigational traces to identify a set of candidates for missing links and then rank these candidates. Experiments show that our procedure identifies missing links of high quality.
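
    The abstract describes the approach only at a high level, so the sketch below is a rough Python illustration of the general idea rather than the authors' exact procedure: collect (source, target) pairs that co-occur on navigation paths but are not yet linked, then rank them by how often players who visited the source soon went on to reach the target. The function name, toy paths and link set are invented for illustration.

        from collections import Counter

        def candidate_missing_links(paths, existing_links, max_gap=3):
            """Rank unlinked (source, target) pairs by how often the target is
            reached within `max_gap` clicks after the source on human paths."""
            pair_counts = Counter()    # target reached shortly after source
            visit_counts = Counter()   # how often each article is visited at all
            for path in paths:
                for i, src in enumerate(path):
                    visit_counts[src] += 1
                    for tgt in path[i + 1 : i + 1 + max_gap]:
                        if tgt != src and (src, tgt) not in existing_links:
                            pair_counts[(src, tgt)] += 1
            scored = [(pair, n / visit_counts[pair[0]]) for pair, n in pair_counts.items()]
            return sorted(scored, key=lambda item: item[1], reverse=True)

        # Toy usage: two game paths and the set of links that already exist.
        paths = [["Africa", "Nile", "Egypt", "Pyramid"],
                 ["Africa", "Egypt", "Cairo"]]
        existing = {("Africa", "Nile"), ("Nile", "Egypt"), ("Egypt", "Pyramid"),
                    ("Africa", "Egypt"), ("Egypt", "Cairo")}
        for (src, tgt), score in candidate_missing_links(paths, existing)[:3]:
            print(f"suggest linking {src} -> {tgt} (score {score:.2f})")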

    Toward a collective intelligence recommender system for education

    The development of Information and Communication Technology (ICT) has revolutionized the world and moved us into the information age; however, accessing and handling this large amount of information causes valuable losses of time. Teachers in Higher Education in particular use the Internet as a tool to consult materials and content for developing their subjects. The Internet offers very broad services, and it is sometimes difficult for users to find content quickly and easily. This problem grows over time, causing students to spend much of their time searching for information rather than on synthesis, analysis and construction of new knowledge. In this context, several questions have emerged: Is it possible to design learning activities that make information search valuable and encourage collective participation? What conditions must an ICT tool that supports information search meet in order to optimize students' time and learning? This article presents the use and application of a Recommender System (RS) designed on paradigms of Collective Intelligence (CI). The RS encourages collective learning and the authentic participation of students. The research combines a literature study with an analysis of the ICT tools that have emerged in the fields of CI and RS. Design-Based Research (DBR) was also used to compile and summarize the collective intelligence approaches and filtering techniques reported in the Higher Education literature, as well as to incrementally improve the tool. Several benefits were evidenced by the exploratory study carried out. Among them, the following stand out:
    • It improves student motivation, as it helps students discover new content of interest in an easy way.
    • It saves time in the search and classification of teaching material of interest.
    • It fosters specialized reading and inspires competence as a means of learning.
    • It gives the teacher the ability to generate reports on students' trends and behaviors and to assess the quality of learning material in real time.
    The authors consider that ICT tools combining the CI and RS paradigms presented in this work improve the construction of student knowledge and motivate students' collective development in cyberspace. In addition, the content-filtering model used supports the design of collective intelligence models and strategies in Higher Education.
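
    The abstract does not detail the filtering technique behind the RS, so the sketch below illustrates one common collective intelligence building block, user-based collaborative filtering, under the assumption that students rate learning resources. The ratings, names and resource identifiers are hypothetical and not taken from the paper.

        import math
        from collections import defaultdict

        # Hypothetical student -> {resource: rating} data, for illustration only.
        ratings = {
            "alice": {"intro_to_ai": 5, "linear_algebra": 4},
            "bob":   {"intro_to_ai": 4, "databases": 5},
            "carol": {"linear_algebra": 5, "databases": 4, "intro_to_ai": 2},
        }

        def cosine(a, b):
            """Cosine similarity between two sparse rating vectors (dicts)."""
            common = set(a) & set(b)
            if not common:
                return 0.0
            num = sum(a[k] * b[k] for k in common)
            den = (math.sqrt(sum(v * v for v in a.values()))
                   * math.sqrt(sum(v * v for v in b.values())))
            return num / den

        def recommend(user, ratings, top_n=3):
            """Suggest unseen resources, weighting peers' ratings by similarity."""
            scores = defaultdict(float)
            for other, their in ratings.items():
                if other == user:
                    continue
                sim = cosine(ratings[user], their)
                for item, r in their.items():
                    if item not in ratings[user]:
                        scores[item] += sim * r
            return sorted(scores.items(), key=lambda x: x[1], reverse=True)[:top_n]

        print(recommend("alice", ratings))   # resources alice has not rated yet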

    BlogForever: D2.5 Weblog Spam Filtering Report and Associated Methodology

    This report is a first attempt to define the BlogForever spam detection strategy. It comprises a survey of weblog spam technology and approaches to its detection. While the report was written to help identify possible approaches to spam detection as a component within the BlogForever software, the discussion has been extended to include observations on the historical, social and practical value of spam, and proposals for other ways of dealing with spam within the repository without necessarily removing it. It contains a general overview of spam types, ready-made anti-spam APIs available for weblogs, possible methods that have been suggested for preventing the introduction of spam into a blog, and research on spam that focuses on the weblog context, concluding with a proposal for a spam detection workflow that might form the basis of the spam detection component of the BlogForever software.
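
    The deliverable surveys detection approaches rather than prescribing a single classifier, so the sketch below shows just one standard technique such a workflow could build on: a bag-of-words naive Bayes filter over comment or post text, here using scikit-learn. The training examples are invented, and the choice of classifier is an assumption for the example, not a BlogForever decision.

        from sklearn.feature_extraction.text import CountVectorizer
        from sklearn.naive_bayes import MultinomialNB
        from sklearn.pipeline import make_pipeline

        # Tiny, invented training set: benign comments vs. obvious spam.
        train_texts = [
            "Great post, thanks for sharing your thoughts",
            "I disagree with the second point, here is why",
            "Buy cheap meds online no prescription click here",
            "Win money fast visit our casino site now",
        ]
        train_labels = ["ham", "ham", "spam", "spam"]

        model = make_pipeline(CountVectorizer(), MultinomialNB())
        model.fit(train_texts, train_labels)

        print(model.predict(["cheap pills click this link",
                             "nice write-up on blog archiving"]))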

    OntoMathPRO Ontology: A Linked Data Hub for Mathematics

    In this paper, we present an ontology of mathematical knowledge concepts that covers a wide range of the fields of mathematics and introduces a balanced representation between comprehensive and sensible models. We demonstrate the applications of this representation in information extraction, semantic search, and education. We argue that the ontology can be a core of future integration of math-aware data sets in the Web of Data and, therefore, provide mappings onto relevant datasets, such as DBpedia and ScienceWISE. Comment: 15 pages, 6 images, 1 table; Knowledge Engineering and the Semantic Web, 5th International Conference.
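
    As a concrete illustration of what mappings onto datasets such as DBpedia can look like in practice, the sketch below builds a tiny RDF graph with rdflib and queries it for mapped concepts. The namespace, concept URI and mapping property (skos:closeMatch) are placeholders chosen for the example, not the ontology's actual identifiers.

        from rdflib import Graph, Literal, Namespace
        from rdflib.namespace import RDF, RDFS, SKOS

        EX = Namespace("http://example.org/ontomathpro#")   # placeholder namespace
        DBR = Namespace("http://dbpedia.org/resource/")

        g = Graph()
        g.add((EX.PartialDifferentialEquation, RDF.type, SKOS.Concept))
        g.add((EX.PartialDifferentialEquation, RDFS.label,
               Literal("Partial differential equation", lang="en")))
        g.add((EX.PartialDifferentialEquation, SKOS.closeMatch,
               DBR["Partial_differential_equation"]))

        # List every concept that carries a mapping into DBpedia.
        query = """
        SELECT ?concept ?label ?dbpedia WHERE {
            ?concept a skos:Concept ;
                     rdfs:label ?label ;
                     skos:closeMatch ?dbpedia .
            FILTER STRSTARTS(STR(?dbpedia), "http://dbpedia.org/resource/")
        }
        """
        for row in g.query(query, initNs={"skos": SKOS, "rdfs": RDFS}):
            print(row.concept, row.label, row.dbpedia)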

    Deriving query suggestions for site search

    Modern search engines have been moving away from simplistic interfaces that aimed at satisfying a user's need with a single-shot query. Interactive features are now integral parts of web search engines. However, generating good query modification suggestions remains a challenging issue. Query log analysis is one of the major strands of work in this direction. Although much research has been performed on query logs collected on the web as a whole, query log analysis to enhance search on smaller and more focused collections has attracted less attention, despite its increasing practical importance. In this article, we report on a systematic study of different query modification methods applied to a substantial query log collected on a local website that already uses an interactive search engine. We conducted experiments in which we asked users to assess the relevance of potential query modification suggestions that have been constructed using a range of log analysis methods and different baseline approaches. The experimental results demonstrate the usefulness of log analysis to extract query modification suggestions. Furthermore, our experiments demonstrate that a more fine-grained approach than grouping search requests into sessions allows for extraction of better refinement terms from query log files. © 2013 ASIS&T
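
    The article compares several log analysis methods; the sketch below is a generic illustration of the pipeline rather than the study's exact method: split each user's time-ordered queries into sessions, then count the terms users add when they reformulate a query, yielding candidate refinement suggestions per query. The 30-minute session gap and the pairwise treatment of consecutive queries (a finer-grained unit than whole sessions) are assumptions for the example.

        from collections import defaultdict, Counter
        from datetime import timedelta

        def sessionize(log, gap_minutes=30):
            """Split (user, timestamp, query) records into sessions whenever the
            gap between a user's consecutive queries exceeds `gap_minutes`."""
            by_user = defaultdict(list)
            for user, ts, query in sorted(log):
                by_user[user].append((ts, query))
            sessions = []
            for events in by_user.values():
                current = [events[0][1]]
                for (prev_ts, _), (ts, query) in zip(events, events[1:]):
                    if ts - prev_ts > timedelta(minutes=gap_minutes):
                        sessions.append(current)
                        current = []
                    current.append(query)
                sessions.append(current)
            return sessions

        def refinement_suggestions(sessions):
            """Count terms users add when reformulating a query within a session."""
            suggestions = defaultdict(Counter)
            for session in sessions:
                for q1, q2 in zip(session, session[1:]):
                    for term in set(q2.split()) - set(q1.split()):
                        suggestions[q1][term] += 1
            return suggestions   # e.g. suggestions["library opening"].most_common(5)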

    Using Search Engine Technology to Improve Library Catalogs

    This chapter outlines how search engine technology can be used in online public access library catalogs (OPACs) to help improve users' experiences and identify users' intentions, shows how this technology can be applied in the library context, and explains how sophisticated ranking criteria can be applied to the online library catalog. A review of the literature and of current OPAC developments forms the basis of recommendations on how to improve OPACs. The findings are that the major shortcomings of current OPACs are that they are not sufficiently user-centered and that their results presentations lack sophistication. Further, these shortcomings are not addressed in current 2.0 developments. It is argued that OPAC development should be made search-centered before additional features are applied. While the recommendations on ranking functionality and the use of user intentions are only conceptual and not yet applied to a library catalog, practitioners will find recommendations for developing better OPACs in this chapter. In short, readers will find a systematic view of how search engines' strengths can be applied to improving libraries' online catalogs.
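
    The chapter's recommendations on ranking are conceptual, so the sketch below only illustrates the kind of composite ranking it argues OPACs should adopt: a weighted combination of text relevance with non-textual criteria such as circulation-based popularity and recency. The criteria, field names and weights are assumptions for the example, not values from the chapter.

        def rank_catalog_records(records, text_score, weights=(0.6, 0.25, 0.15)):
            """Rank OPAC records by a weighted mix of ranking criteria."""
            w_text, w_pop, w_fresh = weights

            def score(rec):
                return (w_text * text_score(rec)           # query/text match, in [0, 1]
                        + w_pop * rec["popularity"]         # e.g. normalized loan count
                        + w_fresh * rec["freshness"])       # e.g. decays with age
            return sorted(records, key=score, reverse=True)

        # Toy usage with two invented records and a dummy text scorer.
        records = [
            {"title": "Introduction to Cataloging", "popularity": 0.9, "freshness": 0.2},
            {"title": "Cataloging in the Digital Age", "popularity": 0.4, "freshness": 0.8},
        ]
        ranked = rank_catalog_records(records, text_score=lambda rec: 1.0)
        print([rec["title"] for rec in ranked])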