713 research outputs found

    XML content warehousing: Improving sociological studies of mailing lists and web data

    In this paper, we present the guidelines for an XML-based approach for the sociological study of Web data such as the analysis of mailing lists or databases available online. The use of an XML warehouse is a flexible solution for storing and processing this kind of data. We propose an implemented solution and show possible applications with our case study of profiles of experts involved in W3C standard-setting activity. We illustrate the sociological use of semi-structured databases by presenting our XML Schema for mailing-list warehousing. An XML Schema allows many adjunctions or crossings of data sources, without modifying existing data sets, while allowing possible structural evolution. We also show that the existence of hidden data implies increased complexity for traditional SQL users. XML content warehousing allows altogether exhaustive warehousing and recursive queries through contents, with far less dependence on the initial storage. We finally present the possibility of exporting the data stored in the warehouse to commonly-used advanced software devoted to sociological analysis.

    A Survey on Array Storage, Query Languages, and Systems

    Since scientific investigation is one of the most important providers of massive amounts of ordered data, there is a renewed interest in array data processing in the context of Big Data. To the best of our knowledge, a unified resource that summarizes and analyzes array processing research over its long existence is currently missing. In this survey, we provide a guide for past, present, and future research in array processing. The survey is organized along three main topics. Array storage discusses all the aspects related to array partitioning into chunks. The identification of a reduced set of array operators to form the foundation for an array query language is analyzed across multiple such proposals. Lastly, we survey real systems for array processing. The result is a thorough survey on array data storage and processing that should be consulted by anyone interested in this research topic, independent of experience level. The survey is not complete though. We greatly appreciate pointers towards any work we might have forgotten to mention. Comment: 44 pages.

    Modeling views in the layered view model for XML using UML

    In data engineering, view formalisms are used to provide flexibility to users and user applications by allowing them to extract and elaborate data from the stored data sources. Conversely, since the introduction of Extensible Markup Language (XML), it is fast emerging as the dominant standard for storing, describing, and interchanging data among various web and heterogeneous data sources. In combination with XML Schema, XML provides rich facilities for defining and constraining user-defined data semantics and properties, a feature that is unique to XML. In this context, it is interesting to investigate traditional database features, such as view models and view design techniques for XML. However, traditional view formalisms are strongly coupled to the data language and its syntax, thus it proves to be a difficult task to support views in the case of semi-structured data models. Therefore, in this paper we propose a Layered View Model (LVM) for XML with conceptual and schemata extensions. Here our work is three-fold; first we propose an approach to separate the implementation and conceptual aspects of the views that provides a clear separation of concerns, thus, allowing analysis and design of views to be separated from their implementation. Secondly, we define representations to express and construct these views at the conceptual level. Thirdly, we define a view transformation methodology for XML views in the LVM, which carries out automated transformation to a view schema and a view query expression in an appropriate query language. Also, to validate and apply the LVM concepts, methods and transformations developed, we propose a view-driven application development framework with the flexibility to develop web and database applications for XML, at varying levels of abstraction.

    A unified view of data-intensive flows in business intelligence systems : a survey

    Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet complex requirements of next generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus must have a clear understanding of the foundations of data-intensive flows and the challenges of moving towards next generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing challenges that still are to be addressed, and how the current solutions can be applied for addressing these challenges. Peer reviewed. Postprint (author's final draft).

    OPEN SOURCE HBIM FOR CULTURAL HERITAGE: A PROJECT PROPOSAL

    Current technologies are changing the ways Cultural Heritage is researched, analysed, conserved and developed, allowing innovative new approaches. The possibility of integrating Cultural Heritage data, such as archaeological information, inside a three-dimensional environment (such as a Building Information Modelling system) brings huge benefits for its management, monitoring and valorisation. There are now many commercial BIM solutions. However, these tools are conceived and developed mostly for architectural design or technical installations. A better solution could be a dynamic and open platform that treats Cultural Heritage needs as a priority. Open source protocols could guarantee better and more complete data usability and accessibility. This choice would allow the software to be adapted to Cultural Heritage needs and not the opposite, thus avoiding methodological stretches. This work focuses on analysing and experimenting with the specific characteristics of this kind of open source software (DBMS, CAD, servers) applied to a Cultural Heritage example, in order to verify its flexibility and reliability and then create a dynamic open source HBIM prototype. It could be a starting point for the future creation of a complete open source HBIM solution that could be adapted to other Cultural Heritage research and analyses.

    A Survey of the State of Dataspaces

    Published in International Journal of Computer and Information Technology. This paper presents a survey of the state of dataspaces. With dataspaces becoming the modern technique of systems integration, the achievement of complete dataspace development is a critical issue. This has led to the design and implementation of dataspace systems using various approaches. Dataspaces are data integration approaches that target data coexistence in the spatial domain. Unlike traditional data integration techniques, they do not require upfront semantic integration of data. In this paper, we outline and compare the properties and implementations of dataspaces, including approaches to optimizing dataspace development. We finally present actual dataspace development recommendations to provide a global overview of this significant research topic.

    Data management in cloud environments: NoSQL and NewSQL data stores

    Advances in Web technology and the proliferation of mobile devices and sensors connected to the Internet have resulted in immense processing and storage requirements. Cloud computing has emerged as a paradigm that promises to meet these requirements. This work focuses on the storage aspect of cloud computing, specifically on data management in cloud environments. Traditional relational databases were designed in a different hardware and software era and are facing challenges in meeting the performance and scale requirements of Big Data. NoSQL and NewSQL data stores present themselves as alternatives that can handle huge volumes of data. Because of the large number and diversity of existing NoSQL and NewSQL solutions, it is difficult to comprehend the domain and even more challenging to choose an appropriate solution for a specific task. Therefore, this paper reviews NoSQL and NewSQL solutions with the objective of: (1) providing a perspective in the field, (2) providing guidance to practitioners and researchers to choose the appropriate data store, and (3) identifying challenges and opportunities in the field. Specifically, the most prominent solutions are compared focusing on data models, querying, scaling, and security-related capabilities. Features driving the ability to scale read requests and write requests, or scaling data storage, are investigated, in particular partitioning, replication, consistency, and concurrency control. Furthermore, use cases and scenarios in which NoSQL and NewSQL data stores have been used are discussed and the suitability of various solutions for different sets of applications is examined. Consequently, this study has identified challenges in the field, including the immense diversity and inconsistency of terminologies, limited documentation, sparse comparison and benchmarking criteria, and nonexistence of standardized query languages.

    From reality to parametric models of cultural heritage assets for HBIM

    The ability to manage large amounts of metric information coming from a LiDAR survey and the ability to produce high-quality 3D models from it are still open problems. Is it possible to create detailed models, geometrically and metrically correct, without using a large (and often redundant) amount of metric data, such as massive point clouds? Obviously yes, but there are several ways to create a fitting 3D model for a specific research purpose. A good solution is given by NURBS-based algorithms, which ensure highly detailed modelling. However, NURBS models can't be used directly on BIM platforms, because they need to be parametrized. In this sense, a parametric model is based on real measurements, but each object can be interpreted and approximated on the basis of an objective and subjective (critical) view and of the LODs (levels of detail or development) required by a particular analysis. This kind of modelling of Cultural Heritage assets, fundamental for HBIM creation, needs to be correctly planned, especially for the classification and definition of historical features connected to an information system, because information and its semantic dimension are now key to documentation analysis. After this brief introduction, this work focuses on the analysis of the FreeCAD open BIM software and of Rhinoceros as a NURBS 3D modeller for Cultural Heritage, and on whether and how their tools could be integrated to manage dynamic, highly detailed data for the creation of an HBIM platform.

    HBIM for conservation: A new proposal for information modeling

    Thanks to its capability of archiving and organizing all the information about a building, HBIM (Historical Building Information Modeling) is considered a promising resource for planned conservation of historical assets. However, it remains scarcely adopted by the subjects in charge of conservation, mainly because of its rather complex 3D modeling requirements and a lack of shared regulatory references and guidelines as far as semantic data are concerned. In this study, we developed an HBIM methodology to support documentation, management, and planned conservation of historic buildings, with particular focus on non-geometric information: organized and coordinated storage and management of historical data, easy analysis and query, time management, flexibility, user-friendliness, and information sharing. The system is based on a standalone, specifically designed database linked to the 3D model of the asset, built with BIM software, and it is highly adaptable to different assets. The database is accessible both with a developed desktop application, which acts as a plug-in for the BIM software, and through a web interface, implemented to ensure data sharing and easy usability by skilled and unskilled users. The paper describes the implemented system in detail, covering the semantic breakdown of the building, database design, and system architecture and capabilities. Two case studies, the Cathedral of Parma and Ducal Palace of Mantua (Italy), are then presented to show the results of the system's application.