515 research outputs found

    Incorporation of ontologies in data warehouse/business intelligence systems - A systematic literature review

    Get PDF
    Semantic Web (SW) techniques, such as ontologies, are used in Information Systems (IS) to cope with the growing need for sharing and reusing data and knowledge in various research areas. Despite the increasing emphasis on unstructured data analysis in IS, structured data and its analysis remain critical for organizational performance management. This systematic literature review aims at analyzing the incorporation and impact of ontologies in Data Warehouse/Business Intelligence (DW/BI) systems, contributing to the current literature by providing a classification of works based on the field of each case study, SW techniques used, and the authors’ motivations for using them, with a focus on DW/BI design, development and exploration tasks. A search strategy was developed, including the definition of keywords, inclusion and exclusion criteria, and the selection of search engines. Ontologies are mainly defined using the Ontology Web Language standard to support multiple DW/BI tasks, such as Dimensional Modeling, Requirement Analysis, Extract-Transform-Load, and BI Application Design. Reviewed authors present a variety of motivations for ontology-driven solutions in DW/BI, such as eliminating or solving data heterogeneity/semantics problems, increasing interoperability, facilitating integration, or providing semantic content for requirements and data analysis. Further, implications for practice and research agenda are indicated.info:eu-repo/semantics/publishedVersio

    Publishing a Scorecard for Evaluating the Use of Open-Access Journals Using Linked Data Technologies

    Get PDF
    Open access journals collect, preserve and publish scientific information in digital form, but it is still difficult not only for users but also for digital libraries to evaluate the usage and impact of this kind of publications. This problem can be tackled by introducing Key Performance Indicators (KPIs), allowing us to objectively measure the performance of the journals related to the objectives pursued. In addition, Linked Data technologies constitute an opportunity to enrich the information provided by KPIs, connecting them to relevant datasets across the web. This paper describes a process to develop and publish a scorecard on the semantic web based on the ISO 2789:2013 standard using Linked Data technologies in such a way that it can be linked to related datasets. Furthermore, methodological guidelines are presented with activities. The proposed process was applied to the open journal system of a university, including the definition of the KPIs linked to the institutional strategies, the extraction, cleaning and loading of data from the data sources into a data mart, the transforming of data into RDF (Resource Description Framework), and the publication of data by means of a SPARQL endpoint using the OpenLink Virtuoso application. Additionally, the RDF data cube vocabulary has been used to publish the multidimensional data on the web. The visualization was made using CubeViz a faceted browser to present the KPIs in interactive charts.This work has been partially supported by the Prometeo Project by SENESCYT, Ecuadorian Government

    A Goal and Ontology Based Approach for Generating ETL Process Specifications

    Get PDF
    Data warehouse (DW) systems development involves several tasks such as defining requirements, designing DW schemas, and specifying data transformation operations. Indeed, the success of DW systems is very much dependent on the proper design of the extracting, transforming, and loading (ETL) processes. However, the common design-related problems in the ETL processes such as defining user requirements and data transformation specifications are far from being resolved. These problems are due to data heterogeneity in data sources, ambiguity of user requirements, and the complexity of data transformation activities. Current approaches have limitations on the reconciliation of DW requirement semantics towards designing the ETL processes. As a result, this has prolonged the process of the ETL processes specifications generation. The semantic framework of DW systems established from this study is used to develop the requirement analysis method for designing the ETL processes (RAMEPs) from the different perspectives of organization, decision-maker, and developer by using goal and ontology approaches. The correctness of RAMEPs approach was validated by using modified and newly developed compliant tools. The RAMEPs was evaluated in three real case studies, i.e., Student Affairs System, Gas Utility System, and Graduate Entrepreneur System. These case studies were used to illustrate how the RAMEPs approach can be implemented for designing and generating the ETL processes specifications. Moreover, the RAMEPs approach was reviewed by the DW experts for assessing the strengths and weaknesses of this method, and the new approach is accepted. The RAMEPs method proves that the ETL processes specifications can be derived from the early phases of DW systems development by using the goal-ontology approach

    Building Data Warehouses with Semantic Web Data

    Get PDF
    The Semantic Web (SW) deployment is now a realization and the amount of semantic annotations is ever increasing thanks to several initiatives that promote a change in the current Web towards the Web of Data, where the semantics of data become explicit through data representation formats and standards such as RDF/(S) and OWL. However, such initiatives have not yet been accompanied by e cient intelligent applications that can exploit the implicit semantics and thus, provide more insightful analysis. In this paper, we provide the means for e ciently analyzing and exploring large amounts of semantic data by combining the inference power from the annotation semantics with the analysis capabilities provided by OLAP-style aggregations, navigation, and reporting. We formally present how semantic data should be organized in a well-de ned conceptual MD schema, so that sophisticated queries can be expressed and evaluated. Our proposal has been evaluated over a real biomedical scenario, which demonstrates the scalability and applicability of the proposed approach

    Corporate Smart Content Evaluation

    Get PDF
    Nowadays, a wide range of information sources are available due to the evolution of web and collection of data. Plenty of these information are consumable and usable by humans but not understandable and processable by machines. Some data may be directly accessible in web pages or via data feeds, but most of the meaningful existing data is hidden within deep web databases and enterprise information systems. Besides the inability to access a wide range of data, manual processing by humans is effortful, error-prone and not contemporary any more. Semantic web technologies deliver capabilities for machine-readable, exchangeable content and metadata for automatic processing of content. The enrichment of heterogeneous data with background knowledge described in ontologies induces re-usability and supports automatic processing of data. The establishment of “Corporate Smart Content” (CSC) - semantically enriched data with high information content with sufficient benefits in economic areas - is the main focus of this study. We describe three actual research areas in the field of CSC concerning scenarios and datasets applicable for corporate applications, algorithms and research. Aspect- oriented Ontology Development advances modular ontology development and partial reuse of existing ontological knowledge. Complex Entity Recognition enhances traditional entity recognition techniques to recognize clusters of related textual information about entities. Semantic Pattern Mining combines semantic web technologies with pattern learning to mine for complex models by attaching background knowledge. This study introduces the afore-mentioned topics by analyzing applicable scenarios with economic and industrial focus, as well as research emphasis. Furthermore, a collection of existing datasets for the given areas of interest is presented and evaluated. The target audience includes researchers and developers of CSC technologies - people interested in semantic web features, ontology development, automation, extracting and mining valuable information in corporate environments. The aim of this study is to provide a comprehensive and broad overview over the three topics, give assistance for decision making in interesting scenarios and choosing practical datasets for evaluating custom problem statements. Detailed descriptions about attributes and metadata of the datasets should serve as starting point for individual ideas and approaches

    The landscape of multimedia ontologies in the last decade

    Get PDF
    Many efforts have been made in the area of multimedia to bridge the socalled “semantic-gap” with the implementation of ontologies from 2001 to the present. In this paper, we provide a comparative study of the most well-known ontologies related to multimedia aspects. This comparative study has been done based on a framework proposed in this paper and called FRAMECOMMON. This framework takes into account process-oriented dimension, such as the methodological one, and outcome-oriented dimensions, like multimedia aspects, understandability, and evaluation criteria. Finally, we derive some conclusions concerning this one decade state-of-art in multimedia ontologies

    Revisiting Urban Dynamics through Social Urban Data

    Get PDF
    The study of dynamic spatial and social phenomena in cities has evolved rapidly in the recent years, yielding new insights into urban dynamics. This evolution is strongly related to the emergence of new sources of data for cities (e.g. sensors, mobile phones, online social media etc.), which have potential to capture dimensions of social and geographic systems that are difficult to detect in traditional urban data (e.g. census data). However, as the available sources increase in number, the produced datasets increase in diversity. Besides heterogeneity, emerging social urban data are also characterized by multidimensionality. The latter means that the information they contain may simultaneously address spatial, social, temporal, and topical attributes of people and places. Therefore, integration and geospatial (statistical) analysis of multidimensional data remain a challenge. The question which, then, arises is how to integrate heterogeneous and multidimensional social urban data into the analysis of human activity dynamics in cities?  To address the above challenge, this thesis proposes the design of a framework of novel methods and tools for the integration, visualization, and exploratory analysis of large-scale and heterogeneous social urban data to facilitate the understanding of urban dynamics. The research focuses particularly on the spatiotemporal dynamics of human activity in cities, as inferred from different sources of social urban data. The main objective is to provide new means to enable the incorporation of heterogeneous social urban data into city analytics, and to explore the influence of emerging data sources on the understanding of cities and their dynamics.  In mitigating the various heterogeneities, a methodology for the transformation of heterogeneous data for cities into multidimensional linked urban data is, therefore, designed. The methodology follows an ontology-based data integration approach and accommodates a variety of semantic (web) and linked data technologies. A use case of data interlinkage is used as a demonstrator of the proposed methodology. The use case employs nine real-world large-scale spatiotemporal data sets from three public transportation organizations, covering the entire public transport network of the city of Athens, Greece.  To further encourage the consumption of linked urban data by planners and policy-makers, a set of webbased tools for the visual representation of ontologies and linked data is designed and developed. The tools – comprising the OSMoSys framework – provide graphical user interfaces for the visual representation, browsing, and interactive exploration of both ontologies and linked urban data.  After introducing methods and tools for data integration, visual exploration of linked urban data, and derivation of various attributes of people and places from different social urban data, it is examined how they can all be combined into a single platform. To achieve this, a novel web-based system (coined SocialGlass) for the visualization and exploratory analysis of human activity dynamics is designed. The system combines data from various geo-enabled social media (i.e. Twitter, Instagram, Sina Weibo) and LBSNs (i.e. Foursquare), sensor networks (i.e. GPS trackers, Wi-Fi cameras), and conventional socioeconomic urban records, but also has the potential to employ custom datasets from other sources.  A real-world case study is used as a demonstrator of the capacities of the proposed web-based system in the study of urban dynamics. The case study explores the potential impact of a city-scale event (i.e. the Amsterdam Light festival 2015) on the activity and movement patterns of different social categories (i.e. residents, non-residents, foreign tourists), as compared to their daily and hourly routines in the periods  before and after the event. The aim of the case study is twofold. First, to assess the potential and limitations of the proposed system and, second, to investigate how different sources of social urban data could influence the understanding of urban dynamics.  The contribution of this doctoral thesis is the design and development of a framework of novel methods and tools that enables the fusion of heterogeneous multidimensional data for cities. The framework could foster planners, researchers, and policy makers to capitalize on the new possibilities given by emerging social urban data. Having a deep understanding of the spatiotemporal dynamics of cities and, especially of the activity and movement behavior of people, is expected to play a crucial role in addressing the challenges of rapid urbanization. Overall, the framework proposed by this research has potential to open avenues of quantitative explorations of urban dynamics, contributing to the development of a new science of cities

    Revisiting Urban Dynamics through Social Urban Data:

    Get PDF
    The study of dynamic spatial and social phenomena in cities has evolved rapidly in the recent years, yielding new insights into urban dynamics. This evolution is strongly related to the emergence of new sources of data for cities (e.g. sensors, mobile phones, online social media etc.), which have potential to capture dimensions of social and geographic systems that are difficult to detect in traditional urban data (e.g. census data). However, as the available sources increase in number, the produced datasets increase in diversity. Besides heterogeneity, emerging social urban data are also characterized by multidimensionality. The latter means that the information they contain may simultaneously address spatial, social, temporal, and topical attributes of people and places. Therefore, integration and geospatial (statistical) analysis of multidimensional data remain a challenge. The question which, then, arises is how to integrate heterogeneous and multidimensional social urban data into the analysis of human activity dynamics in cities? To address the above challenge, this thesis proposes the design of a framework of novel methods and tools for the integration, visualization, and exploratory analysis of large-scale and heterogeneous social urban data to facilitate the understanding of urban dynamics. The research focuses particularly on the spatiotemporal dynamics of human activity in cities, as inferred from different sources of social urban data. The main objective is to provide new means to enable the incorporation of heterogeneous social urban data into city analytics, and to explore the influence of emerging data sources on the understanding of cities and their dynamics.  In mitigating the various heterogeneities, a methodology for the transformation of heterogeneous data for cities into multidimensional linked urban data is, therefore, designed. The methodology follows an ontology-based data integration approach and accommodates a variety of semantic (web) and linked data technologies. A use case of data interlinkage is used as a demonstrator of the proposed methodology. The use case employs nine real-world large-scale spatiotemporal data sets from three public transportation organizations, covering the entire public transport network of the city of Athens, Greece.  To further encourage the consumption of linked urban data by planners and policy-makers, a set of webbased tools for the visual representation of ontologies and linked data is designed and developed. The tools – comprising the OSMoSys framework – provide graphical user interfaces for the visual representation, browsing, and interactive exploration of both ontologies and linked urban data.   After introducing methods and tools for data integration, visual exploration of linked urban data, and derivation of various attributes of people and places from different social urban data, it is examined how they can all be combined into a single platform. To achieve this, a novel web-based system (coined SocialGlass) for the visualization and exploratory analysis of human activity dynamics is designed. The system combines data from various geo-enabled social media (i.e. Twitter, Instagram, Sina Weibo) and LBSNs (i.e. Foursquare), sensor networks (i.e. GPS trackers, Wi-Fi cameras), and conventional socioeconomic urban records, but also has the potential to employ custom datasets from other sources. A real-world case study is used as a demonstrator of the capacities of the proposed web-based system in the study of urban dynamics. The case study explores the potential impact of a city-scale event (i.e. the Amsterdam Light festival 2015) on the activity and movement patterns of different social categories (i.e. residents, non-residents, foreign tourists), as compared to their daily and hourly routines in the periods  before and after the event. The aim of the case study is twofold. First, to assess the potential and limitations of the proposed system and, second, to investigate how different sources of social urban data could influence the understanding of urban dynamics. The contribution of this doctoral thesis is the design and development of a framework of novel methods and tools that enables the fusion of heterogeneous multidimensional data for cities. The framework could foster planners, researchers, and policy makers to capitalize on the new possibilities given by emerging social urban data. Having a deep understanding of the spatiotemporal dynamics of cities and, especially of the activity and movement behavior of people, is expected to play a crucial role in addressing the challenges of rapid urbanization. Overall, the framework proposed by this research has potential to open avenues of quantitative explorations of urban dynamics, contributing to the development of a new science of cities

    Creation and extension of ontologies for describing communications in the context of organizations

    Get PDF
    Thesis submitted to Faculdade de Ciências e Tecnologia of the Universidade Nova de Lisboa, in partial fulfillment of the requirements for the degree of Master in Computer ScienceThe use of ontologies is nowadays a sufficiently mature and solid field of work to be considered an efficient alternative in knowledge representation. With the crescent growth of the Semantic Web, it is expectable that this alternative tends to emerge even more in the near future. In the context of a collaboration established between FCT-UNL and the R&D department of a national software company, a new solution entitled ECC – Enterprise Communications Center was developed. This application provides a solution to manage the communications that enter, leave or are made within an organization, and includes intelligent classification of communications and conceptual search techniques in a communications repository. As specificity may be the key to obtain acceptable results with these processes, the use of ontologies becomes crucial to represent the existing knowledge about the specific domain of an organization. This work allowed us to guarantee a core set of ontologies that have the power of expressing the general context of the communications made in an organization, and of a methodology based upon a series of concrete steps that provides an effective capability of extending the ontologies to any business domain. By applying these steps, the minimization of the conceptualization and setup effort in new organizations and business domains is guaranteed. The adequacy of the core set of ontologies chosen and of the methodology specified is demonstrated in this thesis by its effective application to a real case-study, which allowed us to work with the different types of sources considered in the methodology and the activities that support its construction and evolution

    Ontology driven integration platform for clinical and translational research

    Get PDF
    Semantic Web technologies offer a promising framework for integration of disparate biomedical data. In this paper we present the semantic information integration platform under development at the Center for Clinical and Translational Sciences (CCTS) at the University of Texas Health Science Center at Houston (UTHSC-H) as part of our Clinical and Translational Science Award (CTSA) program. We utilize the Semantic Web technologies not only for integrating, repurposing and classification of multi-source clinical data, but also to construct a distributed environment for information sharing, and collaboration online. Service Oriented Architecture (SOA) is used to modularize and distribute reusable services in a dynamic and distributed environment. Components of the semantic solution and its overall architecture are described
    • …
    corecore