827 research outputs found

    The Semantic Web: Apotheosis of annotation, but what are its semantics?

    This article discusses what kind of entity the proposed Semantic Web (SW) is, principally by reference to the relationship of natural language structure to knowledge representation (KR). There are three distinct views on this issue. The first is that the SW is basically a renaming of the traditional AI KR task, with all its problems and challenges. The second view is that the SW will be, at a minimum, the World Wide Web with its constituent documents annotated so as to yield their content, or meaning structure, more directly. This view makes natural language processing central as the procedural bridge from texts to KR, usually via some form of automated information extraction. The third view is that the SW is about trusted databases as the foundation of a system of Web processes and services. There's also a fourth view, which is much more difficult to define and discuss: If the SW just keeps moving as an engineering development and is lucky, then real problems won't arise. This article is part of a special issue called Semantic Web Update

    Multilingual Metadata for Cultural Heritage Materials: The Case of the Tse-Tsung Chow Collection of Chinese Scrolls and Fan Paintings

    Purpose – The purpose of this paper is to explore multilingual access in digital libraries and to present a case study of creating bilingual metadata records for the Tse-Tsung Chow Collection of Chinese Scrolls and Fan Paintings. The project, undertaken at the University of Wisconsin-Milwaukee Libraries, provides access to digital copies of calligraphic and painted Chinese scrolls and fans from the collection donated by Prof Tse-Tsung Chow (Cezong Zhou). Design/methodology/approach – This paper examines the current approaches to multilingual indexing and retrieval in digital collections and presents a model of creating bilingual parallel records that combines translation with controlled vocabulary mapping. Findings – Creating multilingual metadata records for cultural heritage materials is in an early phase of development. Bilingual metadata created through human translation and controlled vocabulary mapping represents one of the approaches to multilingual access in digital libraries. Multilingual indexing of collections of international origin addresses the linguistic needs of the target audience, connects the digitized objects to their respective cultures and contributes to richer descriptive records. The approach that relies on human translation and research can be undertaken in small-scale digitization projects of rare cultural heritage materials. Language and subject expertise are required to create bilingual metadata records. Research limitations/implications – This paper presents the results of a case study. The approach to multilingual access involves research and relies on human translation, so it can only be undertaken in small-scale projects. Practical implications – This case study of creating parallel records with a combination of translation and vocabulary mapping can be useful for designing similar bilingual digital collections. Social implications – This paper also discusses the obligations of holding institutions in undertaking digital conversion of the cultural heritage materials that originated in other countries, especially in regard to providing metadata records that reflect the language of the originating community. Originality/value – The research and practice in multilingual indexing of cultural heritage materials are very limited. There are no standardized models of how to approach building multilingual digita

    Development of a Comprehensive Network for Scientific and Technical Information in Japan

    Published or submitted for publication

    Ontologies on the semantic web

    As an informational technology, the World Wide Web has enjoyed spectacular success. In just ten years it has transformed the way information is produced, stored, and shared in arenas as diverse as shopping, family photo albums, and high-level academic research. The “Semantic Web” was touted by its developers as equally revolutionary but has not yet achieved anything like the Web’s exponential uptake. This 17 000 word survey article explores why this might be so, from a perspective that bridges both philosophy and IT

    Report of the Stanford Linked Data Workshop

    The Stanford University Libraries and Academic Information Resources (SULAIR), with the Council on Library and Information Resources (CLIR), conducted a week-long workshop on the prospects for a large-scale, multi-national, multi-institutional prototype of a Linked Data environment for discovery of and navigation among the rapidly, chaotically expanding array of academic information resources. As preparation for the workshop, CLIR sponsored a survey by Jerry Persons, Chief Information Architect emeritus of SULAIR, that was published originally for workshop participants as background to the workshop and is now publicly available. The original intention of the workshop was to devise a plan for such a prototype. However, such was the diversity of knowledge, experience, and views of the potential of Linked Data approaches that the workshop participants turned to two more fundamental goals: building common understanding and enthusiasm on the one hand, and identifying opportunities and challenges to be confronted in the preparation of the intended prototype and its operation on the other. In pursuit of those objectives, the workshop participants produced: 1. a value statement addressing the question of why a Linked Data approach is worth prototyping; 2. a manifesto for Linked Libraries (and Museums and Archives and ...); 3. an outline of the phases in a life cycle of Linked Data approaches; 4. a prioritized list of known issues in generating, harvesting & using Linked Data; 5. a workflow with notes for converting library bibliographic records and other academic metadata to URIs; 6. examples of potential “killer apps” using Linked Data; and 7. a list of next steps and potential projects. This report includes a summary of the workshop agenda, a chart showing the use of Linked Data in cultural heritage venues, and short biographies and statements from each of the participants

    A Legal Perspective on Training Models for Natural Language Processing

    A significant concern in processing natural language data is the often unclear legal status of the input and output data/resources. In this paper, we investigate this problem by discussing a typical activity in Natural Language Processing: the training of a machine learning model from an annotated corpus. We examine which legal rules apply at relevant steps and how they affect the legal status of the results, especially in terms of copyright and copyright-related rights

    Foundation, Implementation and Evaluation of the MorphoSaurus System: Subword Indexing, Lexical Learning and Word Sense Disambiguation for Medical Cross-Language Information Retrieval

    In everyday medical practice, which involves a great deal of documentation and search work, the majority of textually encoded information is now available electronically. The development of powerful methods for efficient retrieval is therefore a priority. Judged from the perspective of medical language, common text retrieval systems lack morphological functionality (inflection, derivation, and composition), lexical-semantic functionality, and the ability to analyze large document collections across languages. This doctoral thesis covers the theoretical foundations of the MorphoSaurus system (an acronym for morpheme thesaurus). Its methodological core is a thesaurus organized around morphemes of medical expert and lay language, whose entries are linked across languages by semantic relations. Building on this, a procedure is presented that segments (complex) words into morphemes, which are then replaced by language-independent, concept-class-like symbols. The resulting representation is the basis for cross-language, morpheme-oriented text retrieval. In addition to the core technology, a method for the automatic acquisition of lexicon entries is presented, which extends existing morpheme lexicons to further languages. Taking cross-language phenomena into account then leads to a novel procedure for resolving semantic ambiguities. The performance of morpheme-oriented text retrieval is tested empirically in extensive, standardized evaluations and compared with common approaches
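
The segmentation step described in this abstract can be illustrated with a minimal sketch. It is not the MorphoSaurus implementation: the lexicon entries, the `#`-prefixed concept symbols, and the greedy longest-match strategy are all assumptions chosen for illustration. It only shows how words from different languages might map to the same language-independent symbols.

```python
# Hypothetical subword lexicon: every entry and symbol below is invented
# for illustration and is not taken from the MorphoSaurus thesaurus.
LEXICON = {
    "gastr": "#STOMACH", "magen": "#STOMACH", "stomach": "#STOMACH",
    "itis": "#INFLAMM", "entzuend": "#INFLAMM", "inflamm": "#INFLAMM",
    "hepat": "#LIVER", "leber": "#LIVER", "liver": "#LIVER",
}


def segment(word, lexicon=LEXICON):
    """Split a word into known subwords by greedy longest match and
    return the concept symbol of each match. Characters that start no
    known subword are skipped."""
    word = word.lower()
    symbols = []
    i = 0
    while i < len(word):
        # Try the longest candidate starting at position i first.
        for j in range(len(word), i, -1):
            if word[i:j] in lexicon:
                symbols.append(lexicon[word[i:j]])
                i = j
                break
        else:
            # No subword starts here; move on by one character.
            i += 1
    return symbols


def index_terms(text):
    """Turn a whitespace-separated text into a flat list of concept symbols."""
    terms = []
    for token in text.split():
        terms.extend(segment(token))
    return terms
```

Under these assumptions, `segment("Gastritis")` and `segment("Magenentzuendung")` yield the same symbol sequence, so a query in one language can match a document in another.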

    A multilingual/multicultural semantic-based approach to improve Data Sharing in a SDI for Nature Conservation

    The paper proposes an approach to transcend multicultural and multilingual barriers in the use and reuse of geographical data at the European level. The approach aims at sharing scientific terms in the field of nature conservation with the goal of assisting different user communities with metadata compilation and information discovery. A multi-thesauri solution is proposed, based on a Common Thesaurus Framework for Nature Conservation, where different well-known Knowledge Organization Systems are assembled and shared. It has been designed according to semantic web and W3C recommendations employing SKOS standard models and Linked Data to publish the thesauri as a whole in machine-understandable format. The outcome is a powerful framework satisfying the requirements of modularity and openness for further thesaurus extension and updating, interlinking among thesauri, and exploitability from other systems. The paper supports the employment of Linked Data to deal with terminologies in complex domains such as nature conservation and it proposes a hands-on recipe to publish thesauri in the framework
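
As a rough sketch of what publishing a multilingual thesaurus concept with SKOS can look like, the snippet below serializes one concept as Turtle. The SKOS namespace and the `skos:Concept`, `skos:prefLabel`, and `skos:exactMatch` terms come from the W3C SKOS vocabulary; the `example.org` URIs, the labels, and the helper name `concept_to_turtle` are hypothetical and not taken from the paper.

```python
# SKOS namespace as defined by the W3C SKOS vocabulary.
SKOS_PREFIX = "@prefix skos: <http://www.w3.org/2004/02/skos/core#> ."


def _escape(text):
    """Escape backslashes and double quotes for a Turtle string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def concept_to_turtle(uri, labels, exact_matches=()):
    """Serialize one concept as a Turtle statement block.

    labels maps a language tag to a preferred label; exact_matches lists
    URIs of equivalent concepts in other thesauri (all hypothetical here).
    """
    parts = ["a skos:Concept"]
    parts += [
        f'skos:prefLabel "{_escape(text)}"@{lang}'
        for lang, text in sorted(labels.items())
    ]
    parts += [f"skos:exactMatch <{match}>" for match in exact_matches]
    return f"<{uri}> " + " ;\n    ".join(parts) + " ."


if __name__ == "__main__":
    print(SKOS_PREFIX)
    print(concept_to_turtle(
        "http://example.org/thesaurus/wetland",       # hypothetical URI
        {"en": "wetland", "it": "zona umida"},        # illustrative labels
        ["http://example.org/other-thesaurus/c42"],   # hypothetical link
    ))
```

Building the strings by hand keeps the sketch dependency-free; a real deployment would more likely use an RDF library to handle serialization and validation.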