245 research outputs found
A unified view of data-intensive flows in business intelligence systems: a survey
Data-intensive flows are central processes in today’s business intelligence (BI) systems, deploying different technologies to deliver data, from a multitude of data sources, in user-preferred and analysis-ready formats. To meet the complex requirements of next-generation BI systems, we often need an effective combination of the traditionally batched extract-transform-load (ETL) processes that populate a data warehouse (DW) from integrated data sources, and more real-time, operational data flows that integrate source data at runtime. Both academia and industry thus need a clear understanding of the foundations of data-intensive flows and of the challenges of moving towards next-generation BI environments. In this paper we present a survey of today’s research on data-intensive flows and the related fundamental fields of database theory. The study is based on a proposed set of dimensions describing the important challenges of data-intensive flows in the next-generation BI setting. As a result of this survey, we envision an architecture of a system for managing the lifecycle of data-intensive flows. The results further provide a comprehensive understanding of data-intensive flows, recognizing the challenges that are still to be addressed and how current solutions can be applied to address them.
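The batch ETL pattern at the heart of this setting can be sketched minimally in Python; every source, field name, and figure below is an illustrative assumption, not something drawn from the survey:

```python
# Minimal batch ETL sketch: extract rows from heterogeneous sources,
# transform them into an analysis-ready shape, and load them into a
# "warehouse" table. All data and names here are invented.

def extract(sources):
    # Pull raw records from each source (in-memory lists stand in for
    # databases, files, or APIs).
    for source in sources:
        yield from source

def transform(records):
    # Normalize records into a user-preferred, analysis-ready format.
    for r in records:
        yield {"customer": r["name"].strip().title(), "amount": float(r["amt"])}

def load(rows, warehouse):
    # Append the transformed rows to the target warehouse table.
    warehouse.extend(rows)

crm = [{"name": " alice ", "amt": "10.5"}]  # hypothetical CRM export
erp = [{"name": "BOB", "amt": "3"}]         # hypothetical ERP export
dw = []
load(transform(extract([crm, erp])), dw)
print(dw)  # two cleaned, uniformly-typed rows
```

A real-time or operational flow, by contrast, would run the same transform step per incoming record at query time rather than in a scheduled batch.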
Business Intelligence for Small and Middle-Sized Enterprises
Data warehouses are the core of decision support systems (DSSs), which nowadays are used by all kinds of enterprises around the world. Although many studies have been conducted on the need for decision support systems in small businesses, most of them adopt existing solutions and approaches that are appropriate for large-scale enterprises but inadequate for small and middle-sized enterprises. Small enterprises require cheap, lightweight architectures and tools (hardware and software) providing online data analysis. To ensure these features, we review web-based business intelligence approaches. For real-time analysis, the traditional OLAP architecture is cumbersome and storage-costly; therefore, we also review in-memory processing. Consequently, this paper discusses the existing approaches and tools working in main memory and/or with web interfaces (including freeware tools) that are relevant to small and middle-sized enterprises in decision making.
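The lightweight, in-memory analysis the authors argue for can be illustrated with a plain-Python roll-up over a small fact table; the schema and figures are invented for illustration and do not come from the paper:

```python
# Tiny in-memory "OLAP-style" roll-up: group sales facts by a dimension
# and aggregate a measure, with no dedicated cube store. Data is invented.
from collections import defaultdict

sales = [
    {"region": "North", "product": "A", "revenue": 120.0},
    {"region": "North", "product": "B", "revenue": 80.0},
    {"region": "South", "product": "A", "revenue": 200.0},
]

def rollup(facts, dim, measure):
    # Sum `measure` over `facts`, grouped by the dimension attribute `dim`.
    totals = defaultdict(float)
    for row in facts:
        totals[row[dim]] += row[measure]
    return dict(totals)

print(rollup(sales, "region", "revenue"))  # {'North': 200.0, 'South': 200.0}
```

Holding the fact table in main memory makes such ad-hoc aggregations cheap, which is exactly the trade-off against a storage-heavy precomputed OLAP cube.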
Technologies for Metadata Management in Scientific Articles (Tecnologías para el manejo de metadatos en artículos científicos)
The use of Semantic Web technologies has been growing, and they are now applied in many different contexts. This article evaluates how these technologies can contribute to improving the indexing of articles in scientific journals. It begins with a conceptual review of metadata, then studies the most important technologies for using metadata on the Web, and on that basis selects one of them to apply to the case study of scientific article indexing. The metadata are determined based on those used by high-impact research journals, and a model for indexing scientific articles using Semantic Web technologies is built.
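One common mechanism for the kind of article indexing the study discusses is exposing Dublin Core metadata as HTML meta tags, which many indexers read. The helper and field values below are invented for illustration and are not the model proposed in the article:

```python
# Sketch: render Dublin Core article metadata as HTML <meta> tags.
# Metadata values and the helper name are illustrative assumptions.
from html import escape

def dc_meta_tags(metadata):
    # metadata: mapping of Dublin Core element name -> value
    return "\n".join(
        f'<meta name="DC.{escape(k)}" content="{escape(v, quote=True)}">'
        for k, v in metadata.items()
    )

article = {
    "title": "Metadata technologies for scientific articles",  # invented
    "creator": "Jane Doe",                                     # invented
    "date": "2024-01-01",                                      # invented
}
print(dc_meta_tags(article))
```

Richer Semantic Web approaches (e.g., RDF serializations) carry the same element names but allow the metadata to be linked and queried across repositories.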
Ontology evolution: a process-centric survey
Ontology evolution aims at keeping an ontology up to date with respect to changes in the domain it models or novel requirements of the information systems it enables. The recent industrial adoption of Semantic Web techniques, which rely on ontologies, has increased the importance of ontology evolution research. Typical approaches to ontology evolution are designed as multiple-stage processes combining techniques from a variety of fields (e.g., natural language processing and reasoning). However, the few existing surveys on this topic lack an in-depth analysis of the various stages of the ontology evolution process. This survey extends the literature by adopting a process-centric view of ontology evolution. Accordingly, we first provide an overall process model synthesized from an overview of the existing models in the literature. Then we survey the major approaches to each step in this process and conclude with future challenges for techniques addressing each particular stage.
LST-Bench: Benchmarking Log-Structured Tables in the Cloud
Log-Structured Tables (LSTs), also commonly referred to as table formats,
have recently emerged to bring consistency and isolation to object stores. With
the separation of compute and storage, object stores have become the go-to for
highly scalable and durable storage. However, this comes with its own set of
challenges, such as the lack of recovery and concurrency management that
traditional database management systems provide. This is where LSTs such as
Delta Lake, Apache Iceberg, and Apache Hudi come into play, providing an
automatic metadata layer that manages tables defined over object stores,
effectively addressing these challenges. A paradigm shift in the design of
these systems necessitates the updating of evaluation methodologies. In this
paper, we examine the characteristics of LSTs and propose extensions to
existing benchmarks, including workload patterns and metrics, to accurately
capture their performance. We introduce our framework, LST-Bench, which enables
users to execute benchmarks tailored for the evaluation of LSTs. Our evaluation
demonstrates how these benchmarks can be utilized to evaluate the performance,
efficiency, and stability of LSTs. The code for LST-Bench is open-sourced and
available at https://github.com/microsoft/lst-bench/.
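The core idea of a workload-and-metrics benchmark loop can be sketched generically; this is not LST-Bench's actual API (see the repository for that), only a minimal illustration of measuring per-iteration latency and summarizing it:

```python
# Generic benchmark-harness sketch: run a workload repeatedly, record
# per-iteration latency, and report a summary statistic. The function
# names are illustrative and unrelated to LST-Bench's real interface.
import statistics
import time

def run_benchmark(workload, iterations=5):
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        workload()  # e.g., a sequence of table reads/writes in a real run
        latencies.append(time.perf_counter() - start)
    return {"median_s": statistics.median(latencies), "runs": iterations}

result = run_benchmark(lambda: sum(range(100_000)))
print(result["runs"])  # 5
```

A benchmark tailored to LSTs would additionally vary workload patterns (e.g., concurrent writers, time-travel reads) and collect stability metrics across iterations, as the paper proposes.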
Enhancing systems biology models through semantic data integration
Studying and modelling biology at a systems level requires a large amount of data of different experimental types. Historically, each of these types is stored in its own distinct format, with its own internal structure for holding the data produced by those experiments. While the use of community data standards can reduce the need for specialised, independent formats by providing a common syntax, standards uptake is not universal and a single standard cannot yet describe all biological data. In the work described in this thesis, a variety of integrative methods have been developed to reuse and restructure already extant systems biology data.

SyMBA is a simple Web interface which stores experimental metadata in a published, common format. The creation of accurate quantitative SBML models is a time-intensive manual process. Modellers need to understand both the systems they are modelling and the intricacies of the SBML format. However, the amount of relevant data for even a relatively small and well-scoped model can be overwhelming. Saint is a Web application which accesses a number of external Web services and which provides suggested annotation for SBML and CellML models.

MFO was developed to formalise all of the knowledge within the multiple SBML specification documents in a manner which is both human and computationally accessible.

Rule-based mediation, a form of semantic data integration, is a useful way of reusing and re-purposing heterogeneous datasets which cannot, or are not, structured according to a common standard. This method of ontology-based integration is generic and can be used in any context, but has been implemented specifically to integrate systems biology data and to enrich systems biology models through the creation of new biological annotations.

The work described in this thesis is one step towards the formalisation of biological knowledge useful to systems biology.
Experimental metadata has been transformed into common structures, a Web application has been created for the retrieval of data appropriate to the annotation of systems biology models, and multiple data models have been formalised and made accessible to semantic integration techniques.
Adaptive object-modeling : patterns, tools and applications
Doctoral Programme thesis. Informatics. Universidade do Porto, Faculdade de Engenharia. 201
Access to data in a small island state: the case for Malta
In a rapidly developing world where the introduction of massive online information systems has
enabled both scientists and the general public to interact with remotely-located data from
across the globe, the reality of access to data, and eventually to information, is slowly bringing
forth the realisation that decades-old barriers to data access still need to be overcome. Whilst
the massive volumes of data at hand can easily lead one to believe that everything one could
require is available at the touch of a button, reality speaks with another voice: the data is
there, but its reliability is another matter. The fundamentals of research lie in the
availability of reliable data, and its absence has left disciplines struggling with issues of
repeatability of scientific outcomes. Technology and legislative measures have caught up with
the realities facing researchers.
ERP system implementation
Abstract. The aim of the thesis is to implement the Odoo Enterprise Resource Planning (ERP) system to cover the operations of the target company as widely as possible within the schedule. The ERP system will be utilized in areas such as purchasing, warehousing, manufacturing, product development, and documentation.
Based on the literature review, the scope and the schedule of the implementation are defined and a plan for the implementation process is created. The scope of implementation is defined by identifying the most important functions and processes in the target company and becoming acquainted with the new ERP system. The transition plan consists of customizing the new system to suit the operations of the target company, transferring data between the old and the new database, and training employees to use the new system.
The ERP implementation was carried out on time after an early adjustment of the implementation schedule. The implementation process required in-depth study of the ERP system and a review of the company’s processes. Document management was improved, and the company’s stock can now be managed through the ERP system. The implementation process caused little disruption to operational activities. The results of this study can be used to further develop the ERP system and the company’s processes, and the plan is to continue developing the ERP system to integrate more of the company’s processes into it.