30,466 research outputs found

    A vocabulary-independent generation framework for DBpedia and beyond

    Get PDF
    The dbpedia Extraction Framework, the generation framework behind one of the Linked Open Data cloud’s central hubs, has limitations which lead to quality issues with the dbpedia dataset. Therefore, we provide a new take on its Extraction Framework that allows for a sustainable and general-purpose Linked Data generation framework by adapting a semantic-driven approach. The proposed approach decouples, in a declarative manner, the extraction, transformation, and mapping rules execution. This way, among others, interchanging different schema annotations is supported, instead of being coupled to a certain ontology as it is now, because the dbpedia Extraction Framework allows only generating a certain dataset with a single semantic representation. In this paper, we shed more light to the added value that this aspect brings. We provide an extracted dbpedia dataset using a different vocabulary, and give users the opportunity to generate a new dbpedia dataset using a custom combination of vocabularies

    A unified view of data-intensive flows in business intelligence systems : a survey

    Get PDF
    Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that still are to be addressed, and how the current solutions can be applied for addressing these challenges.Peer ReviewedPostprint (author's final draft

    Dimensional enrichment of statistical linked open data

    Get PDF
    On-Line Analytical Processing (OLAP) is a data analysis technique typically used for local and well-prepared data. However, initiatives like Open Data and Open Government bring new and publicly available data on the web that are to be analyzed in the same way. The use of semantic web technologies for this context is especially encouraged by the Linked Data initiative. There is already a considerable amount of statistical linked open data sets published using the RDF Data Cube Vocabulary (QB) which is designed for these purposes. However, QB lacks some essential schema constructs (e.g., dimension levels) to support OLAP. Thus, the QB4OLAP vocabulary has been proposed to extend QB with the necessary constructs and be fully compliant with OLAP. In this paper, we focus on the enrichment of an existing QB data set with QB4OLAP semantics. We first thoroughly compare the two vocabularies and outline the benefits of QB4OLAP. Then, we propose a series of steps to automate the enrichment of QB data sets with specific QB4OLAP semantics; being the most important, the definition of aggregate functions and the detection of new concepts in the dimension hierarchy construction. The proposed steps are defined to form a semi-automatic enrichment method, which is implemented in a tool that enables the enrichment in an interactive and iterative fashion. The user can enrich the QB data set with QB4OLAP concepts (e.g., full-fledged dimension hierarchies) by choosing among the candidate concepts automatically discovered with the steps proposed. Finally, we conduct experiments with 25 users and use three real-world QB data sets to evaluate our approach. The evaluation demonstrates the feasibility of our approach and shows that, in practice, our tool facilitates, speeds up, and guarantees the correct results of the enrichment process.Peer ReviewedPostprint (author's final draft

    BIM semantic-enrichment for built heritage representation

    Get PDF
    In the built heritage context, BIM has shown difficulties in representing and managing the large and complex knowledge related to non-geometrical aspects of the heritage. Within this scope, this paper focuses on a domain-specific semantic-enrichment of BIM methodology, aimed at fulfilling semantic representation requirements of built heritage through Semantic Web technologies. To develop this semantic-enriched BIM approach, this research relies on the integration of a BIM environment with a knowledge base created through information ontologies. The result is knowledge base system - and a prototypal platform - that enhances semantic representation capabilities of BIM application to architectural heritage processes. It solves the issue of knowledge formalization in cultural heritage informative models, favouring a deeper comprehension and interpretation of all the building aspects. Its open structure allows future research to customize, scale and adapt the knowledge base different typologies of artefacts and heritage activities

    Designing Improved Sediment Transport Visualizations

    Get PDF
    Monitoring, or more commonly, modeling of sediment transport in the coastal environment is a critical task with relevance to coastline stability, beach erosion, tracking environmental contaminants, and safety of navigation. Increased intensity and regularity of storms such as Superstorm Sandy heighten the importance of our understanding of sediment transport processes. A weakness of current modeling capabilities is the ability to easily visualize the result in an intuitive manner. Many of the available visualization software packages display only a single variable at once, usually as a two-dimensional, plan-view cross-section. With such limited display capabilities, sophisticated 3D models are undermined in both the interpretation of results and dissemination of information to the public. Here we explore a subset of existing modeling capabilities (specifically, modeling scour around man-made structures) and visualization solutions, examine their shortcomings and present a design for a 4D visualization for sediment transport studies that is based on perceptually-focused data visualization research and recent and ongoing developments in multivariate displays. Vector and scalar fields are co-displayed, yet kept independently identifiable utilizing human perception\u27s separation of color, texture, and motion. Bathymetry, sediment grain-size distribution, and forcing hydrodynamics are a subset of the variables investigated for simultaneous representation. Direct interaction with field data is tested to support rapid validation of sediment transport model results. Our goal is a tight integration of both simulated data and real world observations to support analysis and simulation of the impact of major sediment transport events such as hurricanes. We unite modeled results and field observations within a geodatabase designed as an application schema of the Arc Marine Data Model. Our real-world focus is on the Redbird Artificial Reef Site, roughly 18 nautical miles offshor- Delaware Bay, Delaware, where repeated surveys have identified active scour and bedform migration in 27 m water depth amongst the more than 900 deliberately sunken subway cars and vessels. Coincidently collected high-resolution multibeam bathymetry, backscatter, and side-scan sonar data from surface and autonomous underwater vehicle (AUV) systems along with complementary sub-bottom, grab sample, bottom imagery, and wave and current (via ADCP) datasets provide the basis for analysis. This site is particularly attractive due to overlap with the Delaware Bay Operational Forecast System (DBOFS), a model that provides historical and forecast oceanographic data that can be tested in hindcast against significant changes observed at the site during Superstorm Sandy and in predicting future changes through small-scale modeling around the individual reef objects
    corecore