8 research outputs found

    Prediksi Penambahan Piutang Iuran Jaminan Sosial Ketenagakerjaan menggunakan Algoritma K-Nearest Neighbor

    Get PDF
    There are several issues with Social Security Organizing Agency (BPJS) employment at the moment, one of which is contribution receivable. To reduce the BPJS contribution receivables, BPJS has done various ways. However, the resulting effort is not maximal enough to reduce the number of receivables in BPJS. This study aims to provide input by predicting the addition of receivables from social security contributions made by several companies or organizations. This study used the K-Nearest Neighbor (KNN) Algorithm with a cross-validation technique. KNN is a very simple classification method in classifying an image based on the closest distance to its neighbors. This study conducted data processing from BPJS use, which amounted to 1193 data. The data is then preprocessed so that the processed data is clean from missing and noise, this data uses 70:30 data splitting. After the preprocessing and splitting of data were carried out, the next step was to do modeling using KNN, so the cross-validation to improve the accuracy of results obtained from the KNN algorithm. The results obtained from this research get the highest accuracy of 92% with the Optimal K value being 6, then the ROC curve gets 94% accuracy. From these results, it can be said that the use of cross-validation can increase the accuracy of this study

    Performance of Ensemble Classification for Agricultural and Biological Science Journals with Scopus Index

    Get PDF
    The ensemble method is considered an advanced method in both prediction and classification. The application of this method is estimated to have a more optimal output than the previous classification method. This article aims to determine the ensemble's performance to classify journal quartiles. The subject of agriculture was chosen because Indonesia is an agricultural country, and the interest of researchers in this field shows a positive response. The data is downloaded through the Scimago Journal and Country Rank with the accumulation in 2020. Labels have four classes: Q1, Q2, Q3, and Q4. The ensemble applied is Boosting and Bagging with Decision Tree (DT) and Gaussian Naïve Bayes (GNB) algorithms compiled from 2144 instances. The Boosting meta-ensembles used are Adaboost and XGBoost. From this study, the Bagging Decision Tree has the highest accuracy score at 71.36, followed by XGBoost Decision Tree with 69.51. The third is XGBoost Gaussian Naïve Bayes with 68.82, Adaboost Decision Tree with 60.42, Adaboost Gaussian Naïve Bayes with 58.2, and Bagging Gaussian Naïve Bayes with 56.12 results. This paper shows that the Bagging Decision Tree is the ensemble method that works optimally in this subject classification. This result suggests that the ensemble method can still fail to produce an ideal outcome that approaches the SJR system

    Big data driven vehicle battery management method: A novel cyber-physical system perspective

    Get PDF
    The establishment of an accurate battery model is of great significance to improve the reliability of electric vehicles (EVs). However, the battery is a complex electrochemical system with hardly observable and simulatable internal chemical reactions, and it is challenging to estimate the state of battery accurately. This paper proposes a novel flexible and reliable battery management method based on the battery big data platform and Cyber-Physical System (CPS) technology. First of all, to integrate the battery big data resources in the cloud, a Cyber-physical battery management framework is defined and served as the basic data platform for battery modeling issues. And to improve the quality of the collected battery data in the database, this work reports the first attempt to develop an adaptive data cleaning method for the cloud battery management issue. Furthermore, a deep learning algorithm-based feature extraction model, as well as a feature-oriented battery modeling method, is developed to mitigate the under-fitting problem and improve the accuracy of the cloud-based battery model. The actual operation data of electric buses is used to validate the proposed methodologies. The maximum data restoring error can be limited within 1.3% in the experiments, which indicates that the proposed data cleaning method is able to improve the cloud battery data quality effectively. Meanwhile, the maximum SoC estimation error in the proposed feature-oriented battery modeling method is within 2.47%, which highlights the effectiveness of the proposed method.</p

    Data cleaning and restoring method for vehicle battery big data platform

    Get PDF
    Battery is one of the most important and costly devices in electric vehicles (EVs). Developing an efficient battery management method is of great significance to enhancing vehicle safety and economy. Recently developed big-data and cloud platform computing technologies bring a bright perspective for efficient utilization and protection of vehicle batteries. However, a reliable data transmission network and a high-quality cloud battery dataset are indispensable to enable this benefit. This paper makes the first effort to systematically solve data quality problems in cloud-based vehicle battery monitoring and management by developing a novel integrated battery data cleaning framework. In the first stage, the outlier samples are detected by analyzing the temporal features in the battery data time series. The outlier data in the dataset can be accurately detected to avoid their impacts on battery monitoring and management. Then, the abnormal samples, including the noise polluted data and missing value, are restored by a novel future fusion data restoring model. The real electric bus operation data collected by a cloud-based battery monitoring and management platform are used to verify the performance of the developed data cleaning method. More than 93.3% of outlier samples can be detected, and the data restoring error can be limited to 2.11%, which validates the effectiveness of the developed methods. The proposed data cleaning method provides an effective data quality assessment tool in cloud-based vehicle battery management, which can further boost the practical application of the vehicle big data platform and Internet of vehicle.</p

    Resource Allocation of Industry 4.0 Micro-Service Applications across Serverless Fog Federation

    Full text link
    The Industry 4.0 revolution has been made possible via AI-based applications (e.g., for automation and maintenance) deployed on the serverless edge (aka fog) computing platforms at the industrial sites -- where the data is generated. Nevertheless, fulfilling the fault-intolerant and real-time constraints of Industry 4.0 applications on resource-limited fog systems in remote industrial sites (e.g., offshore oil fields) that are uncertain, disaster-prone, and have no cloud access is challenging. It is this challenge that our research aims at addressing. We consider the inelastic nature of the fog systems, software architecture of the industrial applications (micro-service-based versus monolithic), and scarcity of human experts in remote sites. To enable cloud-like elasticity, our approach is to dynamically and seamlessly (i.e., without human intervention) federate nearby fog systems. Then, we develop serverless resource allocation solutions that are cognizant of the applications' software architecture, their latency requirements, and distributed nature of the underlying infrastructure. We propose methods to seamlessly and optimally partition micro-service-based application across the federated fog. Our experimental evaluation express that not only the elasticity is overcome in a serverless manner, but also our developed application partitioning method can serve around 20% more tasks on-time than the existing methods in the literature.Comment: Accepted in the Future Generation Computer Systems (FGCS) Journa

    A Data-driven Methodology Towards Mobility- and Traffic-related Big Spatiotemporal Data Frameworks

    Get PDF
    Human population is increasing at unprecedented rates, particularly in urban areas. This increase, along with the rise of a more economically empowered middle class, brings new and complex challenges to the mobility of people within urban areas. To tackle such challenges, transportation and mobility authorities and operators are trying to adopt innovative Big Data-driven Mobility- and Traffic-related solutions. Such solutions will help decision-making processes that aim to ease the load on an already overloaded transport infrastructure. The information collected from day-to-day mobility and traffic can help to mitigate some of such mobility challenges in urban areas. Road infrastructure and traffic management operators (RITMOs) face several limitations to effectively extract value from the exponentially growing volumes of mobility- and traffic-related Big Spatiotemporal Data (MobiTrafficBD) that are being acquired and gathered. Research about the topics of Big Data, Spatiotemporal Data and specially MobiTrafficBD is scattered, and existing literature does not offer a concrete, common methodological approach to setup, configure, deploy and use a complete Big Data-based framework to manage the lifecycle of mobility-related spatiotemporal data, mainly focused on geo-referenced time series (GRTS) and spatiotemporal events (ST Events), extract value from it and support decision-making processes of RITMOs. This doctoral thesis proposes a data-driven, prescriptive methodological approach towards the design, development and deployment of MobiTrafficBD Frameworks focused on GRTS and ST Events. Besides a thorough literature review on Spatiotemporal Data, Big Data and the merging of these two fields through MobiTraffiBD, the methodological approach comprises a set of general characteristics, technical requirements, logical components, data flows and technological infrastructure models, as well as guidelines and best practices that aim to guide researchers, practitioners and stakeholders, such as RITMOs, throughout the design, development and deployment phases of any MobiTrafficBD Framework. This work is intended to be a supporting methodological guide, based on widely used Reference Architectures and guidelines for Big Data, but enriched with inherent characteristics and concerns brought about by Big Spatiotemporal Data, such as in the case of GRTS and ST Events. The proposed methodology was evaluated and demonstrated in various real-world use cases that deployed MobiTrafficBD-based Data Management, Processing, Analytics and Visualisation methods, tools and technologies, under the umbrella of several research projects funded by the European Commission and the Portuguese Government.A população humana cresce a um ritmo sem precedentes, particularmente nas áreas urbanas. Este aumento, aliado ao robustecimento de uma classe média com maior poder económico, introduzem novos e complexos desafios na mobilidade de pessoas em áreas urbanas. Para abordar estes desafios, autoridades e operadores de transportes e mobilidade estão a adotar soluções inovadoras no domínio dos sistemas de Dados em Larga Escala nos domínios da Mobilidade e Tráfego. Estas soluções irão apoiar os processos de decisão com o intuito de libertar uma infraestrutura de estradas e transportes já sobrecarregada. A informação colecionada da mobilidade diária e da utilização da infraestrutura de estradas pode ajudar na mitigação de alguns dos desafios da mobilidade urbana. Os operadores de gestão de trânsito e de infraestruturas de estradas (em inglês, road infrastructure and traffic management operators — RITMOs) estão limitados no que toca a extrair valor de um sempre crescente volume de Dados Espaciotemporais em Larga Escala no domínio da Mobilidade e Tráfego (em inglês, Mobility- and Traffic-related Big Spatiotemporal Data —MobiTrafficBD) que estão a ser colecionados e recolhidos. Os trabalhos de investigação sobre os tópicos de Big Data, Dados Espaciotemporais e, especialmente, de MobiTrafficBD, estão dispersos, e a literatura existente não oferece uma metodologia comum e concreta para preparar, configurar, implementar e usar uma plataforma (framework) baseada em tecnologias Big Data para gerir o ciclo de vida de dados espaciotemporais em larga escala, com ênfase nas série temporais georreferenciadas (em inglês, geo-referenced time series — GRTS) e eventos espacio- temporais (em inglês, spatiotemporal events — ST Events), extrair valor destes dados e apoiar os RITMOs nos seus processos de decisão. Esta dissertação doutoral propõe uma metodologia prescritiva orientada a dados, para o design, desenvolvimento e implementação de plataformas de MobiTrafficBD, focadas em GRTS e ST Events. Além de uma revisão de literatura completa nas áreas de Dados Espaciotemporais, Big Data e na junção destas áreas através do conceito de MobiTrafficBD, a metodologia proposta contem um conjunto de características gerais, requisitos técnicos, componentes lógicos, fluxos de dados e modelos de infraestrutura tecnológica, bem como diretrizes e boas práticas para investigadores, profissionais e outras partes interessadas, como RITMOs, com o objetivo de guiá-los pelas fases de design, desenvolvimento e implementação de qualquer pla- taforma MobiTrafficBD. Este trabalho deve ser visto como um guia metodológico de suporte, baseado em Arqui- teturas de Referência e diretrizes amplamente utilizadas, mas enriquecido com as característi- cas e assuntos implícitos relacionados com Dados Espaciotemporais em Larga Escala, como no caso de GRTS e ST Events. A metodologia proposta foi avaliada e demonstrada em vários cenários reais no âmbito de projetos de investigação financiados pela Comissão Europeia e pelo Governo português, nos quais foram implementados métodos, ferramentas e tecnologias nas áreas de Gestão de Dados, Processamento de Dados e Ciência e Visualização de Dados em plataformas MobiTrafficB
    corecore