41,474 research outputs found

    Data Pipeline Management in Practice: Challenges and Opportunities

    Data pipelines involve a complex chain of interconnected activities that starts with a data source and ends in a data sink. Data pipelines are important for data-driven organizations because a data pipeline can process data in multiple formats from distributed data sources with minimal human intervention, accelerate data life cycle activities, and enhance productivity in data-driven enterprises. However, implementing data pipelines brings both challenges and opportunities, and practical industry experiences are seldom reported. The findings of this study are derived from a qualitative multiple-case study and interviews with representatives of three companies. The challenges include data quality issues, infrastructure maintenance problems, and organizational barriers. On the other hand, data pipelines are implemented to enable traceability and fault tolerance and to reduce human errors by maximizing automation, thereby producing high-quality data. Based on multiple-case study research with five use cases from three case companies, this paper identifies the key challenges and benefits associated with the implementation and use of data pipelines.

    Data transformation as a means towards dynamic data storage and polyglot persistence

    Legacy applications have been built around the concept of storing their data in one relational data store. However, with the current differentiation in data store technologies as a consequence of the NoSQL paradigm, new and possibly more performant storage solutions are available to all applications. The concept of dynamic storage ensures that application data are always stored in the most suitable data store at a given time to increase application performance. Additionally, polyglot persistence aims to push this performance even further by storing each data type of an application in the data store technology best suited for it. To move legacy applications to dynamic storage and polyglot persistence, schema and data transformations between data store technologies are needed. This usually implies application redesigns as well to support the new data stores. This paper proposes such a transformation approach through a canonical model. It is based on the Lambda architecture to ensure that no application downtime is needed during the transformation process. After the transformation, the application can continue to query in the original query language, thus requiring no application code changes.

    Middleware platform for distributed applications incorporating robots, sensors and the cloud

    Cyber-physical systems (CPSs) in the factory of the future will consist of cloud-hosted software governing an agile production process executed by autonomous mobile robots and controlled by analyzing the data from a vast number of sensors. CPSs thus operate on a distributed production floor infrastructure, and the set-up continuously changes with each new manufacturing task. In this paper, we present our OSGi-based middleware that abstracts the deployment of service-based CPS software components on the underlying distributed platform comprising robots, actuators, sensors and the cloud. Moreover, our middleware provides specific support to develop components based on artificial neural networks, a technique that recently became very popular for sensor data analytics and robot actuation. We demonstrate a system where a robot takes actions based on the input from sensors in its vicinity.

    Intelligent energy management based on SCADA system in a real Microgrid for smart building applications

    Energy management is one of the main challenges in Microgrids (MGs) applied to Smart Buildings (SBs). Hence, more studies that consider both modeling and operating aspects are needed so that the results of the system can be used for different applications. This paper presents a novel energy management architecture model based on complete Supervisory Control and Data Acquisition (SCADA) system duties in an educational building with an MG Laboratory (Lab) testbed, named LAMBDA, at the Electrical and Energy Engineering Department of the Sapienza University of Rome. The LAMBDA MG Lab simulates an SB at small scale and is connected to the DIAEE electrical network. The LAMBDA MG is composed of a Photovoltaic generator (PV), a Battery Energy Storage System (BESS), a smart switchboard (SW), and loads classified as critical, essential, and normal, some of which are manageable and controllable (lighting, air conditioning, and smart plugs operating in the Lab). The aim of the LAMBDA implementation is to make the DIAEE smart for energy-saving purposes. In the LAMBDA Lab, the communication architecture consists of a set of master/slave units and actuators that communicate through two main international standards: Modbus (an industrial serial standard for electrical and technical monitoring systems) and Konnex (an open standard for commercial and domestic building automation). Making the electrical department smart reduces the power required from the main grid. To this end, results have been investigated in two modes. First, the real-time mode, based on the SCADA system, reveals the real daily power consumption and production of the different sources and loads. Next, the simulation part shows the behavior of the main grid, the loads, and BESS charging and discharging under the energy management system. Finally, the proposed model has been examined in different scenarios and evaluated from the economic aspect.