245 research outputs found

    A unified view of data-intensive flows in business intelligence systems: a survey

    Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet the complex requirements of next-generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time and operational data flows that integrate source data at runtime. Both academia and industry thus need a clear understanding of the foundations of data-intensive flows and of the challenges of moving towards next-generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next-generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing the challenges that are still to be addressed and how current solutions can be applied to address them.
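
    As a rough illustration of the two flow styles contrasted above, the sketch below (plain Python, not taken from the paper; schemas and function names are illustrative) places a batched extract-transform-load step next to an operational flow that integrates a single source record at runtime.

```python
# Minimal sketch: batched ETL versus runtime integration.
from datetime import date

def extract(source_rows):
    """Pull raw rows from a source system (here, an in-memory list)."""
    return list(source_rows)

def transform(rows):
    """Normalize raw rows into the analysis-ready target schema."""
    return [
        {"order_id": r["id"],
         "amount_eur": round(r["amount"], 2),
         "day": date.fromisoformat(r["ts"][:10])}
        for r in rows
    ]

def load(warehouse, rows):
    """Append transformed rows to the warehouse table."""
    warehouse.extend(rows)

warehouse = []

# Batched ETL: run periodically over a full source extract.
source = [{"id": 1, "amount": 19.99, "ts": "2024-01-05T10:00:00"}]
load(warehouse, transform(extract(source)))

# Operational flow: integrate one record at runtime, bypassing the
# batch cycle so queries see fresh data immediately.
def integrate_at_runtime(warehouse, record):
    load(warehouse, transform([record]))

integrate_at_runtime(warehouse, {"id": 2, "amount": 5.0, "ts": "2024-01-05T10:01:00"})
print(len(warehouse))  # 2
```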

    Business Intelligence for Small and Middle-Sized Enterprises

    Data warehouses are the core of decision support systems, which nowadays are used by all kinds of enterprises around the world. Although many studies have been conducted on the need for decision support systems (DSSs) in small businesses, most of them adopt existing solutions and approaches that are appropriate for large-scale enterprises but inadequate for small and middle-sized enterprises. Small enterprises require cheap, lightweight architectures and tools (hardware and software) providing online data analysis. To ensure these features, we review web-based business intelligence approaches. For real-time analysis, the traditional OLAP architecture is cumbersome and storage-costly; therefore, we also review in-memory processing. Consequently, this paper discusses the existing approaches and tools working in main memory and/or with web interfaces (including freeware tools) that are relevant for small and middle-sized enterprises in decision making.
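
    To make the in-memory idea concrete, here is a minimal sketch (not from the paper) that computes an OLAP-style roll-up directly in an in-memory SQLite database, with no persistent cube; the schema and figures are invented.

```python
# In-memory roll-up with the standard-library sqlite3 module.
import sqlite3

con = sqlite3.connect(":memory:")  # nothing is persisted to disk
con.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [("North", "A", 120.0), ("North", "B", 80.0), ("South", "A", 200.0)],
)

# A typical OLAP-style aggregation, computed entirely in main memory.
for region, total in con.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
):
    print(region, total)  # North 200.0 / South 200.0
```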

    Technologies for managing metadata in scientific articles

    The use of Semantic Web technologies has been increasing, and it is now common to apply them in many settings. This article evaluates how these technologies can contribute to improving the indexing of articles in scientific journals. It first gives a conceptual review of metadata, then studies the most important technologies for using metadata on the Web, and on that basis selects one of them to apply to the case study of scientific-article indexing, determining the relevant metadata from those used by high-impact research journals and building a model for indexing scientific articles with Semantic Web technologies.
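
    As a hedged illustration of the kind of model the article builds, the sketch below expresses Dublin Core metadata for a hypothetical article as RDF triples using the rdflib library (pip install rdflib); the URI and field values are invented.

```python
# Describing an article with Dublin Core terms as RDF triples.
from rdflib import Graph, Literal, URIRef
from rdflib.namespace import DC

article = URIRef("http://example.org/articles/42")  # hypothetical identifier

g = Graph()
g.add((article, DC.title, Literal("Technologies for managing metadata")))
g.add((article, DC.creator, Literal("J. Doe")))
g.add((article, DC.date, Literal("2015")))
g.add((article, DC.subject, Literal("Semantic Web")))

# Serialize the metadata so an indexer can harvest it.
print(g.serialize(format="turtle"))
```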

    Ontology evolution: a process-centric survey

    Ontology evolution aims at keeping an ontology up to date with respect to changes in the domain that it models or to novel requirements of the information systems that it enables. The recent industrial adoption of Semantic Web techniques, which rely on ontologies, has increased the importance of ontology evolution research. Typical approaches to ontology evolution are designed as multiple-stage processes combining techniques from a variety of fields (e.g., natural language processing and reasoning). However, the few existing surveys on this topic lack an in-depth analysis of the various stages of the ontology evolution process. This survey extends the literature by adopting a process-centric view of ontology evolution. Accordingly, we first provide an overall process model synthesized from an overview of the existing models in the literature. Then we survey the major approaches to each of the steps in this process and conclude with the future challenges for the techniques aiming to solve each particular stage.
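
    To ground one stage of that process, here is a deliberately small, illustrative sketch of change detection between two ontology versions, reduced to comparing sets of named classes; real approaches operate on full axiom sets, and the class names are invented.

```python
# Elementary change detection between two ontology versions.
def detect_changes(old_classes: set[str], new_classes: set[str]) -> dict:
    """Return the named classes added and removed between versions."""
    return {
        "added": sorted(new_classes - old_classes),
        "removed": sorted(old_classes - new_classes),
    }

v1 = {"Person", "Organization", "Publication"}
v2 = {"Person", "Organization", "Article", "Journal"}
print(detect_changes(v1, v2))
# {'added': ['Article', 'Journal'], 'removed': ['Publication']}
```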

    LST-Bench: Benchmarking Log-Structured Tables in the Cloud

    Log-Structured Tables (LSTs), also commonly referred to as table formats, have recently emerged to bring consistency and isolation to object stores. With the separation of compute and storage, object stores have become the go-to choice for highly scalable and durable storage. However, this comes with its own set of challenges, such as the lack of the recovery and concurrency management that traditional database management systems provide. This is where LSTs such as Delta Lake, Apache Iceberg, and Apache Hudi come into play, providing an automatic metadata layer that manages tables defined over object stores and effectively addressing these challenges. A paradigm shift in the design of these systems necessitates updating the methodologies used to evaluate them. In this paper, we examine the characteristics of LSTs and propose extensions to existing benchmarks, including workload patterns and metrics, to accurately capture their performance. We introduce our framework, LST-Bench, which enables users to execute benchmarks tailored to the evaluation of LSTs. Our evaluation demonstrates how these benchmarks can be used to assess the performance, efficiency, and stability of LSTs. The code for LST-Bench is open source and available at https://github.com/microsoft/lst-bench/
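
    LST-Bench’s actual workloads and metrics are defined in the linked repository; the stand-in sketch below only illustrates the general measurement pattern such a benchmark relies on: run a workload phase repeatedly, record per-iteration latency, and derive a simple stability metric. run_phase here is a dummy, whereas LST-Bench drives real engines against Delta Lake, Iceberg, or Hudi tables.

```python
# Stand-in for a repeated benchmark phase with a stability metric.
import statistics
import time

def run_phase(iteration: int) -> None:
    """Placeholder for one workload iteration (e.g., a MERGE batch)."""
    time.sleep(0.01 * (1 + iteration % 3))  # fake, deliberately varying work

latencies = []
for i in range(6):
    start = time.perf_counter()
    run_phase(i)
    latencies.append(time.perf_counter() - start)

mean = statistics.mean(latencies)
cv = statistics.stdev(latencies) / mean  # lower = more stable
print(f"mean={mean:.3f}s  cv={cv:.2f}")
```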

    Enhancing systems biology models through semantic data integration

    Studying and modelling biology at a systems level requires a large amount of data of different experimental types. Historically, each of these types is stored in its own distinct format, with its own internal structure for holding the data produced by those experiments. While the use of community data standards can reduce the need for specialised, independent formats by providing a common syntax, standards uptake is not universal and a single standard cannot yet describe all biological data. In the work described in this thesis, a variety of integrative methods have been developed to reuse and restructure existing systems biology data. SyMBA is a simple Web interface which stores experimental metadata in a published, common format. The creation of accurate quantitative SBML models is a time-intensive manual process: modellers need to understand both the systems they are modelling and the intricacies of the SBML format, yet the amount of relevant data for even a relatively small and well-scoped model can be overwhelming. Saint is a Web application which accesses a number of external Web services and provides suggested annotations for SBML and CellML models. MFO was developed to formalise all of the knowledge within the multiple SBML specification documents in a manner that is accessible both to humans and to computational tools. Rule-based mediation, a form of semantic data integration, is a useful way of reusing and re-purposing heterogeneous datasets which cannot be, or are not, structured according to a common standard. This method of ontology-based integration is generic and can be used in any context, but it has been implemented specifically to integrate systems biology data and to enrich systems biology models through the creation of new biological annotations. The work described in this thesis is one step towards the formalisation of biological knowledge useful to systems biology: experimental metadata has been transformed into common structures, a Web application has been created for retrieving data appropriate to the annotation of systems biology models, and multiple data models have been formalised and made accessible to semantic integration techniques.
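
    As a small illustration of the programmatic SBML handling that such tools build on, the sketch below constructs and serializes a toy model with the python-libsbml package (pip install python-libsbml); the model contents are invented and no real annotation is performed.

```python
# Building a minimal SBML model programmatically.
import libsbml

doc = libsbml.SBMLDocument(3, 1)  # SBML Level 3 Version 1
model = doc.createModel()
model.setId("toy_model")

comp = model.createCompartment()
comp.setId("cell")

sp = model.createSpecies()
sp.setId("glucose")
sp.setName("D-glucose")
sp.setCompartment("cell")

# Serialize; an annotation tool would now attach cross-references
# (e.g., ChEBI identifiers) to species like this one.
print(libsbml.writeSBMLToString(doc))
```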

    Adaptive object-modeling: patterns, tools and applications

    Doctoral Programme thesis. Informatics. Universidade do Porto. Faculdade de Engenharia. 201

    Access to data in a small island state: the case for Malta

    In a rapidly developing world where the introduction of massive online information systems has enabled both scientists and the general public to interact with remotely located data from across the globe, the reality of access to data, and eventually to information, is slowly bringing forth the realisation that decades-old barriers to data access still need to be overcome. Whilst the massive volumes of data at hand can easily create the perception that everything one could require is available at the touch of a button, reality speaks with another voice: the data is there, but its reliability is another matter. The fundamentals of research lie in the availability of reliable data, a gap that has left disciplines struggling with issues of repeatability of scientific outcomes. Technology and legislative measures have caught up with the realities facing researchers.

    ERP system implementation

    Abstract. The aim of the thesis is to implement the Odoo Enterprise Resource Planning (ERP) system to cover the operations of the target company as widely as possible within the schedule. The ERP system will be utilized in areas such as purchasing, warehousing, manufacturing, product development, and documentation. Based on a literature review, the scope and schedule of the implementation are defined and a plan for the implementation process is created. The scope of the implementation is defined by identifying the most important functions and processes in the target company and by getting acquainted with the new ERP system. The transition plan consists of customizing the new system to suit the operations of the target company, transferring data between the old and the new database, and training employees to use the new system. The ERP implementation was carried out on time after an early adjustment of the implementation schedule. The implementation process required in-depth study of the ERP system and a review of the company’s processes. Document management was improved, and the company’s stock can now be managed through the ERP system. The implementation caused little disruption to operational activities. The results of this study can be used to further develop the ERP system and the company’s processes, and the plan is to continue developing the ERP system to integrate more of the company’s processes into it.
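
    As a hedged sketch of the data-transfer step mentioned above, the snippet below uses Odoo’s documented XML-RPC external API to create one record migrated from a legacy database; the URL, database name, credentials, and product values are placeholders, not details from the thesis.

```python
# Creating one migrated record through Odoo's external XML-RPC API.
import xmlrpc.client

URL = "https://example-company.odoo.com"              # placeholder
DB, USER, PASSWORD = "example_db", "admin", "secret"  # placeholders

common = xmlrpc.client.ServerProxy(f"{URL}/xmlrpc/2/common")
uid = common.authenticate(DB, USER, PASSWORD, {})

models = xmlrpc.client.ServerProxy(f"{URL}/xmlrpc/2/object")
product_id = models.execute_kw(
    DB, uid, PASSWORD,
    "product.product", "create",
    [{"name": "Steel bracket", "default_code": "SB-001"}],  # example data
)
print("created product", product_id)
```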