    Sampo-UI: A Full Stack JavaScript Framework for Developing Semantic Portal User Interfaces

    This paper presents a new software framework, SAMPO-UI, for developing user interfaces for semantic portals. The goal is to provide the end-user with multiple application perspectives on Linked Data knowledge graphs, and a two-step usage cycle based on faceted search combined with ready-to-use tooling for data analysis. For the software developer, the SAMPO-UI framework makes it possible to create highly customizable, user-friendly, and responsive user interfaces using current state-of-the-art JavaScript libraries and data from SPARQL endpoints, while saving substantial coding effort. SAMPO-UI is published on GitHub under the open MIT License and has been utilized in several internal and external projects. The framework has been used thus far in creating six published and five forthcoming portals, mostly related to the Cultural Heritage domain, that have had tens of thousands of end-users on the Web.
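    The faceted search over SPARQL endpoints mentioned above can be illustrated with a minimal sketch. It is not SAMPO-UI code: the function name `build_facet_query`, the `?facetN` variable naming, and the example URIs are all assumptions made for illustration. The sketch combines facet selections with AND across facets and OR within a single facet.

```python
from typing import Dict, List


def build_facet_query(result_class: str,
                      selections: Dict[str, List[str]],
                      limit: int = 50) -> str:
    """Build a SPARQL SELECT query for instances of `result_class` that match
    every facet selection (AND across facets, OR within one facet).

    Hypothetical helper for illustration only; not part of SAMPO-UI.
    """
    lines = ["SELECT DISTINCT ?item WHERE {", f"  ?item a <{result_class}> ."]
    # Sort facets so that the generated query text is deterministic.
    for index, (prop, values) in enumerate(sorted(selections.items())):
        if not values:
            continue  # an empty facet places no constraint on the results
        var = f"?facet{index}"
        value_list = " ".join(f"<{value}>" for value in values)
        # VALUES restricts the facet variable to the selected URIs.
        lines.append(f"  VALUES {var} {{ {value_list} }}")
        lines.append(f"  ?item <{prop}> {var} .")
    lines.append("}")
    lines.append(f"LIMIT {limit}")
    return "\n".join(lines)


# Example with made-up URIs: manuscripts produced in Helsinki or Turku.
example_query = build_facet_query(
    "http://example.org/Manuscript",
    {"http://example.org/productionPlace": [
        "http://example.org/place/helsinki",
        "http://example.org/place/turku",
    ]},
)
```

    The resulting string could be sent to any SPARQL endpoint; a real portal would additionally need pagination, facet value counts, and escaping of user-supplied input.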

    Using the Semantic Web in digital humanities : Shift from data publishing to data-analysis and serendipitous knowledge discovery

    This paper discusses a shift of focus in research on Cultural Heritage semantic portals, based on Linked Data, and envisions and proposes new directions of research. Three generations of portals are identified. Ten years ago, the research focus in semantic portal development was on data harmonization, aggregation, search, and browsing ('first generation systems'). At the moment, the rise of Digital Humanities research has started to shift the focus to providing the user with integrated tools for solving research problems in interactive ways ('second generation systems'). This paper envisions and argues that the next step ahead to 'third generation systems' is based on Artificial Intelligence: future portals not only provide tools for the human to solve problems but are used for finding research problems in the first place, for addressing them, and even for solving them automatically under the constraints set by the human researcher. Such systems should preferably be able to explain their reasoning, which is an important aspect in the source-critical humanities research tradition. The second and third generation systems set new challenges for both computer scientists and humanities researchers.

    A Metadata-Enabled Scientific Discourse Platform

    Scientific papers and scientific conferences are still, despite the emergence of several new dissemination technologies, the de facto standard in which scientific knowledge is consumed and discussed. While there is no shortage of services and platforms that aid this process (e.g. scholarly search engines, websites, blogs, conference management programs), a widely accepted platform used to capture and enrich the interactions of the research community has yet to appear. As such, we aim to create new ways for the members of research communities, and other interested people working in them, to interact before, during, and after their conferences. Furthermore, to serve as a base for these interactions, we want not only to obtain, format, and manage a body of legacy and new papers related to this community but also to bring useful information and services together in the environment of a discourse platform.

    Resources Annotation, Retrieval and Presentation: a semantic annotation management system

    This paper addresses the problem of managing resource metadata. A variety of responses are discussed, and we describe one possible way forward, which uses a semantic annotation management tool. The term 'semantic' describes the ability to create, retrieve, query, and navigate knowledge about things identified by a Web URI. The tool is built on RDF, through the integration of Jena, an open-source RDF API provided by HP Labs. Thanks to RDF capabilities, this tool offers new search features such as hierarchical browsing based on the structure of RDF vocabularies and faceted browsing using property lists defined by the end-user. Navigation inside annotations uses intuitive modes such as left/right and backward/forward movements. Presentation is controlled by the user using a subset of the Fresnel language to specify how RDF graphs are presented. This work is ongoing; certain open issues are raised.

    Easy Creation of Semantics-Enhanced Digital Artwork Collections

    In this paper we propose an approach for the cost-effective use of semantic technologies to improve the efficiency of searching and browsing digital artwork collections. It is based on the semi-automatic creation of a Topic Map-based virtual art gallery portal using existing Topic Maps tools. Such a ‘cheap’ solution could enable small art museums or art-related educational programs that lack sufficient funding for software development and publication infrastructure to take advantage of the emerging semantic technologies. The proposed approach has been used for creating the WSSU Diggs Gallery Portal.

    Methods for Building Semantic Portals

    Semantic portals are information systems which collect information from several sources and combine them using semantic web technologies into a user interface that solves information needs of users. Creating such portals requires methods and tools from multiple disciplines, including knowledge representation, information retrieval, information extraction, and user interface design. This thesis explores methods for building and improving semantic portals and other semantic web applications with contributions in three areas. The studies included in the thesis draw from the design science methodology in information systems research. First, a method for creating faceted search user interfaces for semantic portals utilizing controlled vocabularies with a complex hierarchical structure is presented. The results show that the method allows the creation of user-centric search facets that hide the complex hierarchies from the user, resulting in a user-friendly faceted search interface. Second, the creation of structured metadata from text documents is enhanced by adapting a state-of-the-art automatic subject indexing system to Finnish-language texts. The results show that using a suitable combination of existing tools, automatic subject indexing quality comparable to that of human indexers can be attained in a highly inflected language such as Finnish. Finally, the quality of controlled vocabularies such as thesauri and lightweight ontologies is examined by developing a set of quality criteria for vocabularies expressed using the SKOS standard, and methods for correcting structural problems in SKOS vocabularies are presented. The results show that most published SKOS vocabularies suffer from quality issues and violate the SKOS integrity conditions. However, the great majority of such problems were corrected by the methods presented in this dissertation.
The methods have been implemented in several real-world applications, including the HealthFinland health information portal, the ARPA information extraction toolkit, and the ONKI ontology library system.
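    To make the notion of SKOS integrity conditions concrete, the following minimal sketch checks two of the label conditions in the SKOS Reference: S13 (skos:prefLabel, skos:altLabel, and skos:hiddenLabel are pairwise disjoint) and S14 (at most one skos:prefLabel per language tag). It is not the method of the thesis: the tuple-based triple representation, the function name `find_label_violations`, and the example data are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Iterable, List, Tuple

SKOS = "http://www.w3.org/2004/02/skos/core#"
PREF, ALT, HIDDEN = SKOS + "prefLabel", SKOS + "altLabel", SKOS + "hiddenLabel"

# Assumed representation: (subject URI, property URI, (lexical form, language tag)).
LabelTriple = Tuple[str, str, Tuple[str, str]]


def find_label_violations(triples: Iterable[LabelTriple]) -> List[str]:
    """Report violations of SKOS integrity conditions S13 and S14.

    Hypothetical helper for illustration only.
    """
    pref_by_lang = defaultdict(set)    # (subject, lang) -> prefLabel forms
    props_by_label = defaultdict(set)  # (subject, form, lang) -> label properties
    for subject, prop, (form, lang) in triples:
        if prop not in (PREF, ALT, HIDDEN):
            continue  # only the three SKOS label properties are relevant here
        props_by_label[(subject, form, lang)].add(prop)
        if prop == PREF:
            pref_by_lang[(subject, lang)].add(form)

    problems = []
    # S14: a resource has no more than one prefLabel per language tag.
    for (subject, lang), forms in sorted(pref_by_lang.items()):
        if len(forms) > 1:
            problems.append(f"S14: {subject} has {len(forms)} prefLabels for '{lang}'")
    # S13: the same literal must not appear under two different label properties.
    for (subject, form, lang), props in sorted(props_by_label.items()):
        if len(props) > 1:
            problems.append(
                f"S13: {subject} uses '{form}'@{lang} in {len(props)} label properties")
    return problems


# Made-up example data: ex:cat violates both conditions, ex:dog is clean.
example_problems = find_label_violations([
    ("ex:cat", PREF, ("cat", "en")),
    ("ex:cat", PREF, ("feline", "en")),
    ("ex:cat", ALT, ("cat", "en")),
    ("ex:dog", PREF, ("dog", "en")),
    ("ex:dog", PREF, ("koira", "fi")),
])
```

    A production checker would read the vocabulary with an RDF library and cover the remaining integrity conditions, such as the disjointness of skos:related and skos:broaderTransitive.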

    Dataset search: a survey

    Generating value from data requires the ability to find, access, and make sense of datasets. There are many efforts underway to encourage data sharing and reuse, from scientific publishers asking authors to submit data alongside manuscripts to data marketplaces, open data portals, and data communities. Google recently beta released a search service for datasets, which allows users to discover data stored in various online repositories via keyword queries. These developments foreshadow an emerging research field around dataset search or retrieval that broadly encompasses frameworks, methods, and tools that help match a user's data need against a collection of datasets. Here, we survey the state of the art of research and commercial systems in dataset retrieval. We identify what makes dataset search a research field in its own right, with unique challenges and methods, and highlight open problems. We look at approaches and implementations from related areas that dataset search draws upon, including information retrieval, databases, and entity-centric and tabular search, in order to identify possible paths to resolve these open problems as well as immediate next steps that will take the field forward.