7 research outputs found

    Implementation of Data Abstraction Layer Using Kafka on SEMAR Platform for Air Quality Monitoring

    Urbanization and fast-growing industries have degraded air quality in urban areas, in some cases to dangerous levels. In addition, the largest share of energy emissions comes from the transportation sector, specifically road transportation. The system currently used by the government cannot meet the need for an air quality detection system capable of distributing and displaying large volumes of data in real time. This research offers a solution: a data abstraction layer in cloud computing, built on a microservice architecture and integrated with mobile sensors to detect air quality in real time. The solution consists of integrated cloud computing services using Smart Environment Monitoring and Analytical in Real-time (SEMAR) and Vehicles as Mobile Sensor Networks (VaaMSN) to detect air quality. SEMAR was built from microservices comprising data abstraction, communication, data analytics with a business analytics process, data storage with a Big Data service, and real-time visualization as maps, charts, and tables through a dashboard website. Our experiments show that the data abstraction layer microservice can be installed at the SEMAR stage, with an average delay in sending information of around 0.09 ms (90 μs), which indicates that the system can be considered real-time. With specific, real-time locations in the data visualization, the government can use this method as a new alternative method of air quality

    Implementation of Integration VaaMSN and SEMAR for Wide Coverage Air Quality Monitoring

    The current air quality monitoring system cannot cover a large area, is not real-time, and has not implemented big data analysis technology with high accuracy. The purpose of integrating a Mobile Sensor Network with an Internet of Things system is to build an air quality monitoring system that is able to monitor over a wide coverage area. This system consists of Vehicle as a Mobile Sensors Network (VaaMSN) as edge computing and Smart Environment Monitoring and Analytic in Real-time (SEMAR) cloud computing. VaaMSN is a package consisting of an air quality sensor, GPS, a 4G WiFi modem, and a single-board computer. SEMAR cloud computing has a time-series database for real-time visualization and a Big Data environment, and its analytics use the Support Vector Machine (SVM) and Decision Tree (DT) algorithms. The outputs of the system are map, table, and graph visualizations. The experimental evaluation shows that the accuracy of both algorithms exceeds 90%. However, the Mean Square Error (MSE) of the SVM algorithm is about 0.03076293, while the DT algorithm's MSE is 10x smaller than that of the SVM algorithm
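
The MSE figure quoted above can be read against a minimal sketch of how the metric is computed. The sample values are hypothetical and are not taken from the paper; the abstract does not say how its targets were encoded.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 2.0, 3.0, 2.0]
mse = mean_squared_error(observed, predicted)
```

A lower MSE means the predictions sit closer to the observed values on average, which is why the abstract reports it alongside accuracy.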

    Quantitative Estimation of Causality and Predictive Modeling for Precipitation Observation Sites and River Gage Sensors

    This project investigates two questions: how strongly precipitation measurement sensors correlate with river gage sensors, and whether peak river gage heights during precipitation events can be predicted. If the correlations can be quantified, then a predictive model can be explored to predict peak water levels at river gage sensors in response to precipitation inputs. Answering both research questions can provide early flood detection benefits and quantitative time assessments for flood risks. An extensive data-driven study was conducted across a geographical area of the U.S., spanning the period 2008-2016, to identify river gage sensors that are closely correlated with nearby rainfall events. More than 1000 precipitation observation sites were identified, and for each precipitation site, nearby river gage stations/sensors were ranked using a cross-correlation measure. The cross-correlation measures provide information such as which river gage sensors are most sensitive to nearby precipitation inputs. Predictive machine learning models were also developed around each rainfall-river gage pair to learn from historical rainfall and river gage levels, and then predict peak river gage heights. The predictive models generated were accurate and verified a strong causality between precipitation events and river gages that were sensitive to such events. A web-based and map-based decision support and visualization tool was also developed to depict the causality between precipitation and river gage sites and to graphically display the results of the predictive models. This study found about 3500 strongly correlated rain station and river gage pairs. Machine learning models for these pairs yield high accuracy, at 80 percent and above
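
The ranking step described above can be sketched as follows. The abstract does not define its cross-correlation measure, so this sketch assumes a Pearson correlation evaluated at several time lags; the function names, the `max_lag` parameter, and the sample series are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_lagged_correlation(rain, gage, max_lag):
    """Return (lag, r) with the highest correlation between rain[t] and gage[t + lag]."""
    best = (0, -1.0)
    for lag in range(max_lag + 1):
        n = min(len(rain), len(gage) - lag)
        if n < 2:
            break
        r = pearson(rain[:n], gage[lag:lag + n])
        if r > best[1]:
            best = (lag, r)
    return best


def rank_gages(rain, gages, max_lag=3):
    """Sort gage names by their best lagged correlation with one rain site, strongest first."""
    scored = {name: best_lagged_correlation(rain, series, max_lag)
              for name, series in gages.items()}
    return sorted(scored.items(), key=lambda item: item[1][1], reverse=True)


# Hypothetical series: gage_a follows the rain one step later, gage_b does not.
rain = [0, 5, 0, 0, 3, 0, 0, 0]
gages = {
    "gage_a": [1, 1, 11, 1, 1, 7, 1, 1],
    "gage_b": [4, 3, 4, 3, 4, 3, 4, 3],
}
ranking = rank_gages(rain, gages)
```

Here `gage_a` ranks first with its best lag at one time step, which is the kind of "most sensitive gage" information the study describes.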

    A Best of Both Worlds Approach to Complex, Efficient, Time Series Data Delivery

    Part 5: Architectures, Infrastructures, Platforms and Services. Point time series are a key data type for the description of real or modelled environmental phenomena. Delivering this data in useful ways can be challenging when the data volume is large, when computational work (such as aggregation, subsetting, or re-sampling) needs to be performed, or when complex metadata is needed to place data in context for understanding. Some aspects of these problems are especially relevant to the environmental domain: large sensor networks that sample continuous environmental phenomena frequently over long periods of time generate very large datasets, and rich metadata is often required to understand the context of observations. Nevertheless, time series data, and most of these challenges, are prevalent beyond the environmental domain, for example in the financial and industrial domains. A review of recent technologies illustrates an emerging trend toward high-performance, lightweight databases specialized for time series data. These databases tend to have non-existent or minimal formal metadata capabilities. In contrast, the environmental domain boasts standards such as the Sensor Observation Service (SOS) that have mature and comprehensive metadata models, but existing implementations have suffered from slow performance. In this paper we describe our hybrid approach to achieving efficient delivery of large time series datasets with complex metadata. We use three subsystems within a single system-of-systems: a proxy (Python), an efficient time series database (InfluxDB), and a SOS implementation (52 North SOS). Together these present a regular SOS interface. The proxy processes standard SOS queries and issues them to either 52 North SOS or InfluxDB for processing. Responses are returned directly from 52 North SOS, or indirectly from InfluxDB via the Python proxy, where they are processed into WaterML. This allows the scalability and performance advantages of the time series database to be married with the sophisticated metadata handling of SOS. Testing indicates that a recent version of 52 North SOS configured with a Postgres/PostGIS database performs well, but an implementation incorporating InfluxDB and 52 North SOS in a hybrid architecture performs approximately 12 times faster
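
The routing role of the proxy can be sketched as below. The abstract does not state which requests go to which backend, so this sketch assumes that `GetObservation` requests are served from the time series database and everything else from the SOS implementation. The backends and the encoder are stub callables, not real InfluxDB, 52 North SOS, or WaterML code, and every name is hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Params = Dict[str, str]
Rows = List[Tuple[str, float]]

# Assumption: only observation retrieval is served from the time series database.
TSDB_REQUESTS = {"GetObservation"}


def make_proxy(
    sos_backend: Callable[[Params], str],
    tsdb_backend: Callable[[Params], Rows],
    encode: Callable[[Rows], str],
) -> Callable[[Params], str]:
    """Build a handler that routes one SOS request to the appropriate backend."""

    def handle(params: Params) -> str:
        if params.get("request") in TSDB_REQUESTS:
            # Time series rows are re-encoded before being returned to the client.
            return encode(tsdb_backend(params))
        # Metadata-style requests pass straight through to the SOS implementation.
        return sos_backend(params)

    return handle


# Stub backends standing in for the real services, for illustration only.
def fake_sos(params: Params) -> str:
    return "sos:" + params.get("request", "")


def fake_tsdb(params: Params) -> Rows:
    return [("2020-01-01T00:00:00Z", 1.5)]


def fake_encode(rows: Rows) -> str:
    return "encoded:%d" % len(rows)


proxy = make_proxy(fake_sos, fake_tsdb, fake_encode)
```

Passing the backends in as callables keeps the routing rule separate from the transport details, so the client still sees a single SOS-style entry point.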

    Time series database in Industrial IoT and its testing tool

    Data gathering lies at the core of the Industrial Internet of Things. The data is time- and event-based, so time series data is a key concept in the Industrial Internet of Things, and a dedicated time series database is required to process and store it. Developing a solution and choosing the right time series database for it can be difficult. Inefficient comparison of time series databases can lead to wrong choices and consequently to delays and financial losses. This thesis improves the tools for comparing different time series databases in the context of the Industrial Internet of Things. In addition, the thesis identifies the functional and non-functional requirements of a time series database in the Industrial Internet of Things, and designs and implements a performance test bench. A practical example of how time series databases can be compared using the identified requirements and the developed test bench is also provided. The example is used to examine how selected time series databases fulfill these requirements. Eight functional requirements and eight non-functional requirements were identified. Functional requirements included, e.g., aggregation support, information models, and hierarchical configurations. Non-functional requirements included, e.g., scalability, performance, and lifecycle. The developed test bench took the Industrial Internet of Things point of view by testing the database in three scenarios: write-heavy, read-heavy, and concurrent write and read operations. In the practical example, ABB's cpmPlus History, InfluxDB, and TimescaleDB were evaluated. Both the requirement evaluation and the performance testing showed that cpmPlus History performed best, InfluxDB second best, and TimescaleDB worst. cpmPlus History showed extensive support for the requirements and the best performance in all performance test cases. InfluxDB showed high performance for data writing, while TimescaleDB showed better performance for data reading. (Finnish title: Aikasarjatietokanta teollisuuden esineiden internetissä ja sen testipenkki.)
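
The three test scenarios named above can be sketched with a small harness. This is not the thesis's test bench: `InMemoryStore` is a hypothetical stand-in for a real database client, and the operation counts are arbitrary.

```python
import threading
import time


class InMemoryStore:
    """Hypothetical stand-in for a time series database client."""

    def __init__(self):
        self._points = []
        self._lock = threading.Lock()

    def write(self, timestamp, value):
        with self._lock:
            self._points.append((timestamp, value))

    def read_range(self, start, end):
        with self._lock:
            return [p for p in self._points if start <= p[0] < end]

    def count(self):
        with self._lock:
            return len(self._points)


def run_scenario(store, writes, reads, concurrent=False):
    """Run a workload and return the elapsed wall-clock time in seconds."""

    def writer():
        for i in range(writes):
            store.write(i, float(i))

    def reader():
        for _ in range(reads):
            store.read_range(0, 10)

    started = time.perf_counter()
    if concurrent:
        # Writes and reads run at the same time on separate threads.
        threads = [threading.Thread(target=writer), threading.Thread(target=reader)]
        for thread in threads:
            thread.start()
        for thread in threads:
            thread.join()
    else:
        writer()
        reader()
    return time.perf_counter() - started


# Arbitrary operation counts, chosen only to show the three workload shapes.
results = {
    "write_heavy": run_scenario(InMemoryStore(), writes=900, reads=100),
    "read_heavy": run_scenario(InMemoryStore(), writes=100, reads=900),
    "concurrent": run_scenario(InMemoryStore(), writes=500, reads=500, concurrent=True),
}
```

Swapping `InMemoryStore` for a client that exposes the same two methods would let the same workloads run against different databases, which is the comparison the thesis sets out to support.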

    Rancang Bangun Sistem Penentuan Keputusan untuk Distribusi Penyedian Kontainer dengan Multi Kriteria Secara Dinamis

    Docker containers are now very widely used in the technology world. A Docker container is a form of operating-system-level virtualization for running several isolated Linux systems (containers) on a single host. Containers serve to isolate applications or services and their dependencies. Each isolated service or application needs one container on an existing host server, and each container uses resources on that host for as long as it is running. Therefore, as the number of services or applications provided grows, the number of containers grows as well. This causes problems because server or host resources are limited, so several servers are needed to provide an ever-increasing number of services or applications. However, when multiple servers or hosts are used to run services, the resources and performance of each server often differ. This can cause problems in distributing containers across servers: if container provisioning does not take into account the resource availability and performance of each server, it becomes inefficient. A way is therefore needed to decide which server is best to use when a container provisioning request arrives. This final project uses one Multi Criteria Decision Making (MCDM) method, the Analytical Hierarchy Process (AHP), as the decision-making method. AHP makes decisions based on several parameters taken from a number of different servers. The parameters used are the resource availability of each host server, such as available RAM (memory), CPU, and file storage. From these parameters, a server is chosen to provision the container
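
The AHP-based selection described above can be sketched as follows. The abstract gives neither the pairwise comparison values nor the weighting procedure, so this sketch assumes the common column-normalization approximation of the AHP priority vector. The comparison matrix, server names, and resource figures are hypothetical.

```python
def ahp_weights(pairwise):
    """Approximate AHP priority weights: normalize each column, then average each row."""
    n = len(pairwise)
    col_sums = [sum(pairwise[r][c] for r in range(n)) for c in range(n)]
    return [sum(pairwise[r][c] / col_sums[c] for c in range(n)) / n for r in range(n)]


def choose_server(servers, weights, criteria):
    """Score each server by weighted, normalized resource availability; return the best."""
    # Assumes every criterion has a non-zero total across servers.
    totals = {c: sum(res[c] for res in servers.values()) for c in criteria}
    scores = {
        name: sum(w * res[c] / totals[c] for w, c in zip(weights, criteria))
        for name, res in servers.items()
    }
    return max(scores, key=scores.get), scores


criteria = ["ram", "cpu", "storage"]

# Hypothetical judgments: RAM is twice as important as CPU, four times as storage.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]

# Hypothetical free resources per host (GB of RAM, CPU cores, GB of storage).
servers = {
    "server_a": {"ram": 8, "cpu": 4, "storage": 100},
    "server_b": {"ram": 2, "cpu": 8, "storage": 200},
}

weights = ahp_weights(pairwise)
best, scores = choose_server(servers, weights, criteria)
```

With these made-up numbers `server_a` is chosen, because its larger share of free RAM outweighs `server_b`'s advantage in CPU and storage under the assumed weights.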