
    A Survey of the First 20 Years of Research on Semantic Web and Linked Data

    This paper is a survey of the research topics in the field of Semantic Web, Linked Data and Web of Data. This study looks at the contributions of this research community over its first twenty years of existence. Compiling several bibliographical sources and bibliometric indicators, we identify the main research trends and we reference some of their major publications to provide an overview of that initial period. We conclude with some perspectives on future research challenges.

    Yavaa: supporting data workflows from discovery to visualization

    Recent years have witnessed an increasing number of data silos being opened up both within organizations and to the general public: Scientists publish their raw data as supplements to articles or even as standalone artifacts to enable others to verify and extend their work. Governments pass laws to open up formerly protected data treasures to improve accountability and transparency as well as to enable new business ideas based on this public good. Even companies share structured information about their products and services to advertise their use and thus increase revenue. Exploiting this wealth of information holds many challenges for users, though. Oftentimes data is provided as tables whose seemingly endless rows of daunting numbers are barely accessible. InfoVis can mitigate this gap. However, the visualization options offered are generally very limited, and next to no support is given in applying any of them. The same holds true for data wrangling: there are only very few options to adjust the data to current needs, and barely any protection is in place to prevent even the most obvious mistakes. When it comes to data from multiple providers, the situation gets even bleaker. Only recently have tools emerged to search reasonably for datasets across institutional borders. Easy-to-use ways to combine these datasets are still missing, though. Finally, results generally lack proper documentation of their provenance. So even the most compelling visualizations can be called into question when it remains unclear how they came about. The foundations for a vivid exchange and exploitation of open data are set, but the barrier to entry remains relatively high, especially for non-expert users. This thesis aims to lower that barrier by providing tools and assistance, reducing the amount of prior experience and skills required. It covers the whole workflow, ranging from identifying proper datasets, through possible transformations, to the export of the result in the form of suitable visualizations.

    Contaminant Hydrogeology Knowledge Base (CHKb) of Georgia, USA

    Hydrogeologists collect data through studies that originate from a diverse and growing set of instruments that measure, for example, geochemical constituents of surface water and groundwater. Databases store and publish the collected data on the Web, and the volume of data is quickly increasing, which makes accessing data problematic and time-consuming for individuals. One way to overcome this problem is to develop an ontology to formally and explicitly represent the domain (e.g., contaminant hydrogeology) knowledge. Using OWL and RDF, a contaminant hydrogeology ontology (CHO) is developed to manage hydrological spatial data for Georgia, USA. CHO is a conceptual computer model for the contaminant hydrogeology domain in which concepts (e.g., contaminant, aquifer) and their relationships (e.g., pollutes) are formally and explicitly defined. A cyberinfrastructure for exposing CHO and datasets (i.e., CHKb) as Linked Data on the Web is developed. The cyberinfrastructure supports storing, managing, querying, and visualizing CHKb, which can be accessed at the URL cho.gsu.edu.

    Improving reproducibility and reuse of modelling results in the life sciences

    Research results are complex and include a variety of heterogeneous data. This entails major computational challenges: (i) to manage simulation studies, (ii) to ensure model exchangeability, stability and validity, and (iii) to foster communication between partners. I describe techniques to improve the reproducibility and reuse of modelling results. First, I introduce a method to characterise differences in computational models. Second, I present approaches to obtain shareable and reproducible research results. Altogether, my methods and tools foster exchange and reuse of modelling results. My implementations have been successfully integrated into international applications and foster the sharing of scientific results.

    Conceptual modeling of multimedia databases

    The gap between the semantic content of multimedia data and its underlying physical representation is one of the main problems in modern multimedia research in general and in the field of multimedia database modeling in particular. We believe that one of the principal reasons for this problem is the attempt to conceptually represent multimedia data in a way that is similar to its low-level representation by applications dealing with encoding standards, feature-based multimedia analysis, etc. In our opinion, such a conceptual representation of multimedia contributes to the semantic gap by separating the representation of multimedia information from the representation of the universe of discourse of the application to which the multimedia information pertains. In this research work we address the problem of conceptual modeling of multimedia data in a way that deals with the above-mentioned limitations. First, we introduce two different paradigms of conceptual understanding of the essence of multimedia data, namely multimedia as data and multimedia as metadata. The multimedia as data paradigm, which views multimedia data as the subject of modeling in its own right, is inherent to so-called multimedia-centric applications, where multimedia information itself represents the main part of the universe of discourse. Examples of such applications are digital photo collections or digital movie archives. On the other hand, the multimedia as metadata paradigm, which is inherent to so-called multimedia-enhanced applications, views multimedia data as just another (optional) source of information about whatever universe of discourse the application pertains to. An example of a multimedia-enhanced application is a human-resource database augmented with employee photos. Here the universe of discourse is the totality of company employees, while their photos simply represent an additional (possibly optional) kind of information describing the universe of discourse. The multimedia conceptual modeling approach that we present in this work addresses multimedia-centric applications as well as, in particular, multimedia-enhanced applications. The model that we propose builds upon MADS (Modeling Application Data with Spatio-temporal features), which is a rich conceptual model defined in our laboratory and which is, in particular, characterized by structural completeness, spatio-temporal modeling capabilities, and multirepresentation support. The proposed multimedia model is provided in the form of a new modeling dimension of MADS, whose orthogonality principle allows the new multimedia modeling dimension to be integrated with the already existing modeling features of MADS. The following multimedia modeling constructs are provided: multimedia datatypes, simple and complex representational constraints (relationships), a multimedia partitioning mechanism, and multimedia multirepresentation features. Following the description of our conceptual multimedia modeling approach based on MADS, we present the peculiarities of logical multimedia modeling and of conceptual-to-logical inter-layer transformations. We provide a set of mapping guidelines intended to help the schema designer arrive at rich logical multimedia document representations of the application domain that conform to the conceptual multimedia schema. The practical interest of our research is illustrated by a mock-up application, which has been developed to support the theoretical ideas described in this work. In particular, we show how the abstract conceptual set-based representations of multimedia data elements, as well as simple and complex multimedia representational relationships, can be implemented using Oracle DBMS.

    Scripts in a Frame: A Framework for Archiving Deferred Representations

    Web archives provide a view of the Web as seen by Web crawlers. Because of rapid advancements and adoption of client-side technologies like JavaScript and Ajax, coupled with the inability of crawlers to execute these technologies effectively, Web resources become harder to archive as they become more interactive. At Web scale, we cannot capture client-side representations using the current state-of-the-art toolsets because of the migration from Web pages to Web applications. Web applications increasingly rely on JavaScript and other client-side programming languages to load embedded resources and change client-side state. We demonstrate that Web crawlers and other automatic archival tools are unable to archive the resulting JavaScript-dependent representations (what we term deferred representations), resulting in missing or incorrect content in the archives and the general inability to replay the archived resource as it existed at the time of capture. Building on prior studies on Web archiving, client-side monitoring of events and embedded resources, and studies of the Web, we establish an understanding of the trends contributing to the increasing unarchivability of deferred representations. We show that JavaScript leads to lower-quality mementos (archived Web resources) due to the archival difficulties it introduces. We measure the historical impact of JavaScript on mementos, demonstrating that the increased adoption of JavaScript and Ajax correlates with the increase in missing embedded resources. To measure memento and archive quality, we propose and evaluate a metric that assesses memento quality in a way closer to Web users’ perception. We propose a two-tiered crawling approach that enables crawlers to capture embedded resources dependent upon JavaScript. Measuring the performance differences between crawl approaches, we propose a classification method that mitigates the performance impacts of the two-tiered crawling approach, and we measure the frontier size improvements observed with the two-tiered approach. Using the two-tiered crawling approach, we measure the number of client-side states associated with each URI-R and propose a mechanism for storing the mementos of deferred representations. In short, this dissertation details a body of work that explores the following: why JavaScript and deferred representations are difficult to archive (establishing the term deferred representation to describe JavaScript-dependent representations); the extent to which JavaScript impacts archivability along with its impact on current archival tools; a metric for measuring the quality of mementos, which we use to describe the impact of JavaScript on archival quality; the performance trade-offs between traditional archival tools and technologies that better archive JavaScript; and a two-tiered crawling approach for discovering and archiving currently unarchivable descendants (representations generated by client-side user events) of deferred representations to mitigate the impact of JavaScript on our archives. In summary, what we archive is increasingly different from what we as interactive users experience. Using the approaches detailed in this dissertation, archives can create mementos closer to what users experience rather than archiving the crawlers’ experiences on the Web.