1,121 research outputs found

    Associative and Spatial Relationships in Thesaurus-based Retrieval

    The OASIS (Ontologically Augmented Spatial Information System) project explores terminology systems for thematic and spatial access in digital library applications. A prototype implementation uses data from the Royal Commission on the Ancient and Historical Monuments of Scotland, together with the Getty AAT and TGN thesauri. This paper describes its integrated spatial and thematic schema and discusses novel approaches to the application of thesauri in spatial and thematic semantic distance measures. Semantic distance measures can underpin interactive and automatic query expansion techniques by ranking lists of candidate terms. We first illustrate how hierarchical spatial relationships can be used to provide more flexible retrieval for queries incorporating place names in applications employing online gazetteers and geographical thesauri. We then employ a set of experimental scenarios to investigate key issues affecting use of the associative (RT) thesaurus relationships in semantic distance measures. Previous work has noted the potential of RTs in thesaurus search aids but the problem of increased noise in result sets has been emphasised. Specialising RTs allows the possibility of dynamically linking RT type to query context. Results presented in this paper demonstrate the potential for filtering on the context of the RT link and on subtypes of RT relationships
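The idea of ranking candidate expansion terms by semantic distance, with optional filtering on RT subtype, can be illustrated with a minimal sketch. This is not the OASIS measure: the toy terms, the RT subtype labels (`RT:part`, `RT:event`), the traversal costs, and the function names are all assumptions made up for illustration. Distance is taken here to be the cheapest path through the thesaurus graph.

```python
import heapq

# Hypothetical toy thesaurus: (term, related term, relationship type, traversal cost).
# BT/NT are broader/narrower links; RT subtypes and all costs are invented for this sketch.
EDGES = [
    ("castle", "fortification", "BT", 1.0),
    ("tower house", "castle", "BT", 1.0),
    ("castle", "moat", "RT:part", 2.0),
    ("castle", "siege", "RT:event", 2.0),
    ("fortification", "hill fort", "NT", 1.0),
]


def build_graph(edges):
    """Build an undirected adjacency list; each link is traversable both ways."""
    graph = {}
    for a, b, rel, cost in edges:
        graph.setdefault(a, []).append((b, rel, cost))
        graph.setdefault(b, []).append((a, rel, cost))
    return graph


def semantic_distances(graph, start, max_distance, allowed_rt=None):
    """Cheapest-path (Dijkstra) distance from start to every term within max_distance.

    RT links are followed only if their subtype is in allowed_rt
    (None means every RT subtype is allowed).
    """
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, term = heapq.heappop(queue)
        if dist > best.get(term, float("inf")):
            continue  # stale queue entry
        for neighbour, rel, cost in graph.get(term, []):
            if rel.startswith("RT") and allowed_rt is not None and rel not in allowed_rt:
                continue  # filter out RT subtypes that do not fit the query context
            new_dist = dist + cost
            if new_dist <= max_distance and new_dist < best.get(neighbour, float("inf")):
                best[neighbour] = new_dist
                heapq.heappush(queue, (new_dist, neighbour))
    return best


def rank_candidates(graph, term, max_distance=3.0, allowed_rt=None):
    """Return (candidate, distance) pairs, closest first, excluding the query term."""
    distances = semantic_distances(graph, term, max_distance, allowed_rt)
    distances.pop(term)
    return sorted(distances.items(), key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    graph = build_graph(EDGES)
    print(rank_candidates(graph, "castle"))
    print(rank_candidates(graph, "castle", allowed_rt={"RT:part"}))
```

Giving RT links a higher cost than hierarchical links, and passing `allowed_rt` per query, is one simple way to limit the extra noise that associative links can introduce. A real system would need to derive both the costs and the permitted subtypes from the thesaurus and the query context.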

    Metadata Augmentation for Semantic- and Context-Based Retrieval of Digital Cultural Objects

    Cultural objects are increasingly stored and generated in digital form, yet effective methods for their indexing and retrieval still remain an open area of research. The main problem arises from the disconnection between the content-based indexing approach used by computer scientists and the description-based approach used by information scientists. There is also a lack of representational schemes that allow the alignment of the semantics and context with keywords and low-level features that can be automatically extracted from the content of these cultural objects. This paper presents an integrated approach to address these problems, taking advantage of both computer science and information science approaches. The focus is on the rationale and conceptual design of the system and its various components. In particular, we discuss techniques for augmenting commonly used metadata with visual features and domain knowledge to generate high-level abstract metadata which in turn can be used for semantic and context-based indexing and retrieval. We use a sample collection of Vietnamese traditional woodcuts to demonstrate the usefulness of this approach

    The DIGMAP geo-temporal web gazetteer service

    This paper presents the DIGMAP geo-temporal Web gazetteer service, a system providing access to names of places, historical periods, and associated geo-temporal information. Within the DIGMAP project, this gazetteer serves as the unified repository of geographic and temporal information, assisting in the recognition and disambiguation of geo-temporal expressions over text, as well as in resource searching and indexing. We describe the data integration methodology, the handling of temporal information and some of the applications that use the gazetteer. Initial evaluation results show that the proposed system can adequately support several tasks related to geo-temporal information extraction and retrieval

    Il modello semantico di EuroWordNet come strumento per la strutturazione della relazione associativa nei thesauri.

    Thesauri are tools which semantically organize a domain of knowledge for operational purposes. Their relational semantics is concerned with methods that connect terms with related meanings, and it is important in supporting information retrieval, enhancing recall and helping to improve precision. In fact, the network of relations of a thesaurus has an important semantic function, providing a representation of the meaning of each thesaurus term and a map of the conceptual structure of a subject area. The traditional thesaurus format, as described in international standards, includes the hierarchical, associative and equivalence relationships. However, a rather widespread opinion is that this format should be refined in order to cope with the current needs of information organization. This paper discusses the possibility of refining the associative relation into a small number of sub-kinds by adopting the semantic model of EuroWordNet (EWN) as applied in one of its national versions, ItalWordNet (IWN), within the Mariterm project, whose terminological database contains terms belonging to the maritime domain. It also stresses that RT designation and refinement appear to be domain dependent, in the sense that they are associated with the specific features of a knowledge field

    The best of both worlds: highlighting the synergies of combining manual and automatic knowledge organization methods to improve information search and discovery.

    Research suggests organizations across all sectors waste a significant amount of time looking for information and often fail to leverage the information they have. In response, many organizations have deployed some form of enterprise search to improve the 'findability' of information. Debates persist as to whether thesauri and manual indexing or automated machine learning techniques should be used to enhance discovery of information. In addition, the extent to which a knowledge organization system (KOS) enhances discoveries or indeed blinds us to new ones remains a moot point. A representative organization in the oil and gas industry was used as a case study. Drawing on prior research, a theoretical model is presented which aims to overcome the shortcomings of each approach. This synergistic model could help to re-conceptualize the 'manual' versus 'automatic' debate in many enterprises, accommodating a broader range of information needs. This may enable enterprises to develop more effective information and knowledge management strategies and ease the tension between what are often perceived as mutually exclusive competing approaches. Certain aspects of the theoretical model may be transferable to other industries, which is an area for further research

    Web based knowledge extraction and consolidation for automatic ontology instantiation

    The Web is probably the largest and richest information repository available today. Search engines are the common access routes to this valuable source. However, the role of these search engines is often limited to the retrieval of lists of potentially relevant documents. The burden of analysing the returned documents and identifying the knowledge of interest is therefore left to the user. The Artequakt system aims to deploy natural language tools to automatically extract and consolidate knowledge from web documents and instantiate a given ontology, which dictates the type and form of knowledge to extract. Artequakt focuses on the domain of artists, and uses the harvested knowledge to generate tailored biographies. This paper describes the latest developments of the system and discusses the problem of knowledge consolidation