906 research outputs found

    ImageSieve: Exploratory search of museum archives with named entity-based faceted browsing

    Over the last few years, faceted search has emerged as an attractive alternative to the traditional "text box" search and has become one of the standard ways of interaction on many e-commerce sites. However, these applications of faceted search are limited to domains where the objects of interest have already been classified along several independent dimensions, such as price, year, or brand. While automatic approaches to generating faceted search interfaces have been proposed, it is not yet clear to what extent the automatically produced interfaces will be useful to real users, and whether their quality can match or surpass that of their manually produced predecessors. The goal of this paper is to introduce an exploratory search interface called ImageSieve, which shares many features with traditional faceted browsing but can function without traditional faceted metadata. ImageSieve uses automatically extracted and classified named entities, which play important roles in many domains (such as news collections and image archives). We describe one specific application of ImageSieve for image search. Here, named entities extracted from the descriptions of the retrieved images are used to organize a faceted browsing interface, which then helps users to make sense of and further explore the retrieved images. The results of a user study of ImageSieve demonstrate that a faceted search system based on named entities can help users explore large collections and find relevant information more effectively.
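
    The idea of organizing facets around named entities can be illustrated with a minimal sketch. It is not ImageSieve's implementation: it assumes the entities have already been extracted and typed (that step is not shown), and the records, entity types, and function names below are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical records: each image description is assumed to have had its
# named entities extracted and classified by type beforehand.
IMAGES = [
    {"id": 1, "entities": {("PERSON", "Jane Doe"), ("PLACE", "Springfield")}},
    {"id": 2, "entities": {("PLACE", "Springfield"), ("ORG", "City Museum")}},
    {"id": 3, "entities": {("PERSON", "Jane Doe"), ("ORG", "City Museum")}},
]


def build_facets(images):
    """Count, per entity type, how many images mention each entity."""
    facets = defaultdict(Counter)
    for image in images:
        for entity_type, name in image["entities"]:
            facets[entity_type][name] += 1
    return facets


def filter_by_entity(images, entity_type, name):
    """Keep only the images whose description mentions the chosen entity."""
    return [img for img in images if (entity_type, name) in img["entities"]]


facets = build_facets(IMAGES)
selected = filter_by_entity(IMAGES, "PLACE", "Springfield")
```

    Each entity type acts as one facet, and the counts indicate how many retrieved images a selection would keep, which is the information a faceted interface typically shows next to each value.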

    Virtual language observatory: The portal to the language resources and technology universe

    Over the years, the field of Language Resources and Technology (LRT) has developed a tremendous amount of resources and tools. However, there is no ready-to-use map that researchers could use to gain a good overview and steadfast orientation when searching for, say, corpora or software tools to support their studies. It is rather the case that information is scattered across project- or organisation-specific sites, which makes it hard if not impossible for less-experienced researchers to gather all relevant material. Clearly, the provision of metadata is central to resource and software exploration. However, in the LRT field, metadata comes in many forms, tastes and qualities, and therefore substantial harmonization and curation efforts are required to provide researchers with metadata-based guidance. To address this issue a broad alliance of LRT providers (CLARIN, the Linguist List, DOBES, DELAMAN, DFKI, ELRA) have initiated the Virtual Language Observatory portal to provide a low-barrier, easy-to-follow entry point to language resources and tools; it can be accessed via http://www.clarin.eu/vl

    Search in the eye of the beholder: using the personal social dataset and ontology-guided input to improve web search efficiency

    Proceedings of: Latin American Web Conference 2007 (LA-WEB 2007), 31 October-2 November 2007, Santiago (Chile). Among the challenges of searching the vast information source the Web has become, improving Web search efficiency by different strategies using semantics and the user-generated data from Web 2.0 applications remains a promising and interesting approach. In this paper, we present the Personal Social Dataset and Ontology-guided Input strategies and couple them together, providing a proof-of-concept implementation.

    Theory and Practice of Data Citation

    Citations are the cornerstone of knowledge propagation and the primary means of assessing the quality of research, as well as directing investments in science. Science is increasingly becoming "data-intensive", where large volumes of data are collected and analyzed to discover complex patterns through simulations and experiments, and most scientific reference works have been replaced by online curated datasets. Yet, given a dataset, there is no quantitative, consistent and established way of knowing how it has been used over time, who contributed to its curation, what results have been yielded or what value it has. The development of a theory and practice of data citation is fundamental for considering data as first-class research objects with the same relevance and centrality as traditional scientific products. Many works in recent years have discussed data citation from different viewpoints: illustrating why data citation is needed, defining the principles and outlining recommendations for data citation systems, and providing computational methods for addressing specific issues of data citation. The current panorama is many-faceted and an overall view that brings together diverse aspects of this topic is still missing. Therefore, this paper aims to describe the lay of the land for data citation, both from the theoretical (the why and what) and the practical (the how) angle. Comment: 24 pages, 2 tables, pre-print accepted in Journal of the Association for Information Science and Technology (JASIST), 201

    Lightweight Ontologies

    Ontologies are explicit specifications of conceptualizations. They are often thought of as directed graphs whose nodes represent concepts and whose edges represent relations between concepts. The notion of concept is understood as defined in Knowledge Representation, i.e., as a set of objects or individuals. This set is called the concept extension or the concept interpretation. Concepts are often lexically defined, i.e., they have natural language names which are used to describe the concept extensions (e.g., the concept mother denotes the set of all female parents). Therefore, when ontologies are visualized, their nodes are often shown with corresponding natural language concept names. The backbone structure of the ontology graph is a taxonomy in which the relations are “is-a”, whereas the remaining structure of the graph supplies auxiliary information about the modeled domain and may include relations like “part-of”, “located-in”, “is-parent-of”, and many others.
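
    The graph view described above can be sketched as a small data structure. This is only an illustration under assumptions: the class name, method names, and example concepts are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


class Ontology:
    """Directed graph: nodes are concept names, edges are labelled relations."""

    def __init__(self):
        # Maps (source concept, relation label) to the set of target concepts.
        self.edges = defaultdict(set)

    def add(self, source, relation, target):
        self.edges[(source, relation)].add(target)

    def ancestors(self, concept):
        """Return every concept reachable from `concept` via "is-a" edges."""
        seen = set()
        stack = [concept]
        while stack:
            current = stack.pop()
            # .get avoids creating empty entries in the defaultdict.
            for parent in self.edges.get((current, "is-a"), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen


onto = Ontology()
onto.add("mother", "is-a", "parent")       # taxonomy backbone
onto.add("parent", "is-a", "person")
onto.add("city", "located-in", "country")  # auxiliary, non-taxonomic relation

mother_ancestors = onto.ancestors("mother")
```

    Only the "is-a" edges are followed when walking the taxonomy, so auxiliary relations such as "located-in" stay in the graph without affecting the concept hierarchy.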

    Facilitating design learning through faceted classification of in-service information

    The maintenance and service records collected and maintained by engineering companies are a useful resource for the ongoing support of products. Such records are typically semi-structured and contain key information such as a description of the issue and the product affected. It is suggested that further value can be realised from the collection of these records by indicating recurrent and systemic issues which may not have been apparent previously. This paper presents a faceted classification approach to organising the information collection that might enhance retrieval and also facilitate learning from in-service experiences. The faceted classification may help to expedite responses to urgent in-service issues as well as allow patterns and trends in the records to be analysed, either automatically using suitable data mining algorithms or by manually browsing the classification tree. The paper describes the application of the approach to aerospace in-service records, where the potential for knowledge discovery is demonstrated.

    Multi-Faceted Search and Navigation of Biological Databases
