3,236 research outputs found

    Retrieving with good sense

    Get PDF
    Although always present in text, word sense ambiguity only recently became regarded as a problem to information retrieval which was potentially solvable. The growth of interest in word senses resulted from new directions taken in disambiguation research. This paper first outlines this research and surveys the resulting efforts in information retrieval. Although the majority of attempts to improve retrieval effectiveness were unsuccessful, much was learnt from the research. Most notably a notion of under what circumstance disambiguation may prove of use to retrieval

    The SIMBAD astronomical database

    Get PDF
    Simbad is the reference database for identification and bibliography of astronomical objects. It contains identifications, `basic data', bibliography, and selected observational measurements for several million astronomical objects. Simbad is developed and maintained by CDS, Strasbourg. Building the database contents is achieved with the help of several contributing institutes. Scanning the bibliography is the result of the collaboration of CDS with bibliographers in Observatoire de Paris (DASGAL), Institut d'Astrophysique de Paris, and Observatoire de Bordeaux. When selecting catalogues and tables for inclusion, priority is given to optimal multi-wavelength coverage of the database, and to support of research developments linked to large projects. In parallel, the systematic scanning of the bibliography reflects the diversity and general trends of astronomical research. A WWW interface to Simbad is available at: http://simbad.u-strasbg.fr/SimbadComment: 14 pages, 5 Postscript figures; to be published in A&A

    Virtual WWW Documents: a Concept to Explicit the Structure of WWW Sites

    Get PDF
    http://www.emse.fr/~beigbeder/PUBLIS/1999-BCS-IRSG-p185-doan-v1.pdfInternational audienceThis paper shows a new concept of a virtual WWW document (VWD), as a set of WWW pages representing a logical information space, generally dealing with one particular domain. The VWD is described using metadata in the XML syntax and will be accessed through a metadata.class file, stored at the root level of WWW sites. We'll suggest how the VWD can improve information retrieval on the WWW and reduce the network load generated by the robots. We describe a prototype implemented in JAVA, within an application in the environmental domain. The exchanges of such metadata lay in a flexible architecture based on two kinds of robots : generalists and specialists that collect and organize this metadata, in order to localize the resources on the WWW. They will contribute to the overall auto-organizing information process by exchanging their indices, therefore forwarding their knowledge each other

    Personalization of tagging systems

    Get PDF
    Social media systems have encouraged end user participation in the Internet, for the purpose of storing and distributing Internet content, sharing opinions and maintaining relationships. Collaborative tagging allows users to annotate the resulting user-generated content, and enables effective retrieval of otherwise uncategorised data. However, compared to professional web content production, collaborative tagging systems face the challenge that end-users assign tags in an uncontrolled manner, resulting in unsystematic and inconsistent metadata. This paper introduces a framework for the personalization of social media systems. We pinpoint three tasks that would benefit from personalization: collaborative tagging, collaborative browsing and collaborative s

    Guided generation of pedagogical concept maps from the Wikipedia

    Get PDF
    We propose a new method for guided generation of concept maps from open accessonline knowledge resources such as Wikies. Based on this method we have implemented aprototype extracting semantic relations from sentences surrounding hyperlinks in the Wikipedia’sarticles and letting a learner to create customized learning objects in real-time based oncollaborative recommendations considering her earlier knowledge. Open source modules enablepedagogically motivated exploration in Wiki spaces, corresponding to an intelligent tutoringsystem. The method extracted compact noun–verb–noun phrases, suggested for labeling arcsbetween nodes that were labeled with article titles. On average, 80 percent of these phrases wereuseful while their length was only 20 percent of the length of the original sentences. Experimentsindicate that even simple analysis algorithms can well support user-initiated information retrievaland building intuitive learning objects that follow the learner’s needs.Peer reviewe

    Applying Wikipedia to Interactive Information Retrieval

    Get PDF
    There are many opportunities to improve the interactivity of information retrieval systems beyond the ubiquitous search box. One idea is to use knowledge bases—e.g. controlled vocabularies, classification schemes, thesauri and ontologies—to organize, describe and navigate the information space. These resources are popular in libraries and specialist collections, but have proven too expensive and narrow to be applied to everyday webscale search. Wikipedia has the potential to bring structured knowledge into more widespread use. This online, collaboratively generated encyclopaedia is one of the largest and most consulted reference works in existence. It is broader, deeper and more agile than the knowledge bases put forward to assist retrieval in the past. Rendering this resource machine-readable is a challenging task that has captured the interest of many researchers. Many see it as a key step required to break the knowledge acquisition bottleneck that crippled previous efforts. This thesis claims that the roadblock can be sidestepped: Wikipedia can be applied effectively to open-domain information retrieval with minimal natural language processing or information extraction. The key is to focus on gathering and applying human-readable rather than machine-readable knowledge. To demonstrate this claim, the thesis tackles three separate problems: extracting knowledge from Wikipedia; connecting it to textual documents; and applying it to the retrieval process. First, we demonstrate that a large thesaurus-like structure can be obtained directly from Wikipedia, and that accurate measures of semantic relatedness can be efficiently mined from it. Second, we show that Wikipedia provides the necessary features and training data for existing data mining techniques to accurately detect and disambiguate topics when they are mentioned in plain text. Third, we provide two systems and user studies that demonstrate the utility of the Wikipedia-derived knowledge base for interactive information retrieval
    corecore