27 research outputs found

    Propensity Score Matching in the Use of Web-Scraped Data to Improve Official Statistics [Propensity Score Matching Pada Pemanfaatan Data Hasil Web Scraping Untuk Perbaikan Statistik Resmi]

    The Central Statistics Agency (BPS) welcomes the challenge of utilizing big data. One BPS publication that big data can support is the inflation figure, which is compiled from the consumer price survey. One part of that survey is the HK-4 Survey, which records house contract rates. So far, the house contract rates produced by BPS have been underestimated, that is, lower than the actual situation. This study improves them by matching BPS data with data web-scraped from house rental sites using Propensity Score Matching (PSM). The data cover DKI Jakarta, Bandung, and Semarang from September to October 2023. The study aims to find the best PSM matching model for improving official statistics (house contract rates) by combining several propensity score estimation methods and matching algorithms. The matches from the best model are then used to calculate corrected house contract rates. The results show that the best matching model generally uses logistic regression to estimate the propensity score and nearest neighbor matching with replacement at a 1:1 ratio. The corrected contract rates are far above the official ones (DKI Jakarta corrected by 87.27%, Bandung by 316.15%, and Semarang by 60.04%). Web scraping can improve official statistics because it saves cost and time, enhances the quality of official statistical data, and supports better decision-making in various sectors.
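
    The matching step named in this abstract (nearest neighbor, 1:1, with replacement) can be sketched as follows. This is a minimal illustration, not the study's implementation: it assumes the propensity scores have already been estimated (for example by logistic regression), and the function name, the "treated"/"control" roles, and the score values are all hypothetical.

```python
from typing import List, Sequence


def match_nearest_with_replacement(
    treated_scores: Sequence[float], control_scores: Sequence[float]
) -> List[int]:
    """Return, for each treated unit, the index of the control unit whose
    propensity score is closest (1:1 matching, controls may be reused)."""
    if not control_scores:
        raise ValueError("at least one control unit is required")
    matches = []
    for score in treated_scores:
        # With replacement: every treated unit searches the full control pool.
        nearest = min(
            range(len(control_scores)),
            key=lambda j: abs(control_scores[j] - score),
        )
        matches.append(nearest)
    return matches


# Hypothetical propensity scores, for illustration only.
treated = [0.81, 0.42, 0.40]
control = [0.10, 0.45, 0.78, 0.95]
matches = match_nearest_with_replacement(treated, control)
print(matches)
```

    Because matching is with replacement, one control unit can be the match for several treated units; in this toy input the second control unit is chosen twice.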

    COMPARISON OF SARIMA, SVR, AND GA-SVR METHODS FOR FORECASTING THE NUMBER OF RAINY DAYS IN BENGKULU CITY

    The number of rainy days is the count of days with rain in one month. In recent years, the number of rainy days has decreased in some parts of Indonesia. One area at risk of a fairly large decrease is Bengkulu City. The decrease is one of the impacts of climate change. Communities feel these season-related impacts, especially people working in the agricultural sector, because compiling a planting calendar requires considering the seasons to estimate water availability. This study aimed to forecast the number of rainy days in Bengkulu City, using data for the period January 2000 to December 2020, with the Seasonal Autoregressive Integrated Moving Average (SARIMA), Support Vector Regression (SVR), and Genetic Algorithm Support Vector Regression (GA-SVR) methods. The criterion for selecting the best model was the Mean Absolute Deviation (MAD). The MAD was 4.16 for the SARIMA model, 5.07 for the SVR model, and 3.67 for the GA-SVR model. Based on these results, the GA-SVR model, which has the lowest MAD, is the best model for forecasting the number of rainy days in Bengkulu City.
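
    The model selection rule in this abstract can be sketched as below, assuming MAD is used in its usual forecasting sense, the mean of the absolute differences between observed and forecast values. The observations and forecasts are made-up numbers, not the study's data, and the model labels are placeholders.

```python
from typing import Dict, Sequence


def mean_absolute_deviation(
    actual: Sequence[float], forecast: Sequence[float]
) -> float:
    """Mean of |actual - forecast| over all periods."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


# Hypothetical monthly rainy-day counts and two hypothetical forecasts.
observed = [12, 15, 9, 20]
forecasts: Dict[str, Sequence[float]] = {
    "model_a": [10, 15, 12, 18],
    "model_b": [12, 14, 10, 19],
}

mad_by_model = {
    name: mean_absolute_deviation(observed, values)
    for name, values in forecasts.items()
}
# The model with the smallest MAD is preferred.
best_model = min(mad_by_model, key=mad_by_model.get)
print(mad_by_model, best_model)
```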

    AdaBoost and Random Forest Methods for Predicting JKN-KIS Participants in Arrears [Metode AdaBoost dan Random Forest untuk Prediksi Peserta JKN-KIS yang Menunggak]

    Contributions from participants, employers, and/or the government are among the most important elements in implementing the National Health Insurance Program-Healthy Indonesia Card (JKN-KIS). All Indonesian residents were required to participate in the JKN-KIS program, which is divided into four types of participation, one of which is Non-Wage Recipient Participants (PBPU), whose contributions are paid independently. However, based on December 2021 data, 60% of PBPU participants were late with their monthly payments to the point of falling into arrears. Arrears in contribution payments cause several problems, ranging from claim payment difficulties to deficits. This research used big data owned by the Healthcare and Social Security Agency (BPJS Kesehatan) and machine learning based on ensemble trees, namely AdaBoost and random forest, to predict which participants fall into arrears. The results showed that ensemble tree-based machine learning predicted PBPU participants in arrears with high accuracy, as evidenced by AUC values above 80% for both models. The random forest model has a better F1-score and AUC than AdaBoost, with an F1-score of 85.43% and an AUC of 87.20% in predicting JKN-KIS participants who are in arrears.
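
    For readers unfamiliar with the F1-score reported here, the sketch below computes it from confusion-matrix counts as the harmonic mean of precision and recall. The counts are hypothetical and are not taken from the study.

```python
def f1_score_from_counts(
    true_pos: int, false_pos: int, false_neg: int
) -> float:
    """F1 = 2 * precision * recall / (precision + recall)."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 80 participants correctly flagged as in arrears,
# 20 flagged wrongly, 10 in arrears but missed.
f1 = f1_score_from_counts(true_pos=80, false_pos=20, false_neg=10)
print(round(f1, 4))
```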

    Pattern Recognition of Food Security in Indonesia Using Biclustering Plaid Model

    Biclustering comes in various algorithms, and selecting the most suitable one can be challenging because performance can vary significantly with the characteristics of the data. The Plaid model, one of the popular biclustering algorithms, has gained recognition for its efficiency and versatility across various applications, including food security. Indonesia faces complex food security challenges, and its geographic and socioeconomic diversity demands region-specific solutions. Identifying province-specific food security patterns is crucial for effective policymaking and resource allocation, ultimately promoting food sufficiency and stability at the regional level. This study assesses the performance of the Plaid model in identifying food security patterns at the provincial level in Indonesia. To optimize the biclusters, we explore various parameter tuning scenarios (the choice of model, the number of layers, and the threshold values for row and column release). The selection criteria are the change ratio of the initial matrix's mean square residue to the mean square residue of the Plaid model, the average mean square residue, and the number of biclusters. The constant column model was selected, with a mean square residue change ratio of 0.52 and an average Plaid model mean square residue of 4.81; it generates 6 overlapping biclusters. The results show that each bicluster has unique characteristics. Notably, Bicluster 1, which consists of 2 provinces, exhibits the lowest food security levels, marked by variables X1, X2, X4, and X7. Furthermore, variables X1, X4, and X7 appear consistently across several biclusters, which highlights the importance of prioritizing these three variables to improve the food security status of the regions.
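
    The abstract does not define the mean square residue it uses. A common definition, assumed here, is the Cheng and Church residue: the average of (a_ij - row mean - column mean + overall mean) squared over the cells of a bicluster. The sketch below computes it for a small matrix; the function name and the data are illustrative only.

```python
from typing import List


def mean_square_residue(matrix: List[List[float]]) -> float:
    """Cheng-and-Church-style mean square residue of a (sub)matrix."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    row_means = [sum(row) / n_cols for row in matrix]
    col_means = [
        sum(matrix[i][j] for i in range(n_rows)) / n_rows
        for j in range(n_cols)
    ]
    overall_mean = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = matrix[i][j] - row_means[i] - col_means[j] + overall_mean
            total += residue ** 2
    return total / (n_rows * n_cols)


# Rows differ only by a constant shift, so the pattern is perfectly coherent.
coherent = [[1, 2, 3], [2, 3, 4], [4, 5, 6]]
# A small matrix that deviates from a purely additive pattern.
less_coherent = [[1, 2], [3, 5]]

msr_coherent = mean_square_residue(coherent)
msr_less_coherent = mean_square_residue(less_coherent)
print(msr_coherent, msr_less_coherent)
```

    A lower residue indicates a more coherent bicluster, which is why a ratio of residues before and after fitting can serve as a selection criterion.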

    Comparison of BEKK GARCH and MEWMA Methods on IDX Composite and Exchange Rate Volatility

    At the beginning of 2020, the world was preoccupied with a new virus, COVID-19. In Indonesia, the COVID-19 virus was first identified on March 2nd, 2020. The global pandemic had several impacts, one of which was on the country's economy, visible in the decline of the IDX Composite and the weakening of the US Dollar to Rupiah exchange rate. The IDX Composite and the US Dollar to Rupiah exchange rate rise and fall from day to day, and these fluctuations give rise to volatility. There are several methods for modeling the volatility of multivariate data. One is the Multivariate Generalized Autoregressive Conditional Heteroskedasticity (MGARCH) model; another is the Multivariate Exponential Weighted Moving Average (MEWMA) model. The analysis of three training data sets found that the RMSE of the BEKK GARCH method was greater than that of the MEWMA method, and that VAR(2)-MEWMA, applied to all three training sets, gave consistent volatility predictions for the IDX Composite return and the US Dollar to Rupiah exchange rate return. The MEWMA method can therefore be said to have better predictive ability, so VAR(2)-MEWMA is used to model the IDX Composite return and US Dollar to Rupiah exchange rate return data from November 2019 to August 2021 and to predict the volatility for the following month, September 2021. The MEWMA model performs quite well in predicting the volatility of both return series.
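
    The abstract does not state the MEWMA recursion. A widely used form, assumed here, updates the covariance matrix as S_t = lam * S_(t-1) + (1 - lam) * r * r', where r is the latest vector of returns (or, in a VAR-MEWMA setup, presumably the VAR residuals) and lam is a decay factor. The sketch below performs one update step; the function name, the decay value, and the inputs are illustrative assumptions.

```python
from typing import List, Sequence


def ewma_covariance_update(
    prev_cov: List[List[float]], returns: Sequence[float], lam: float = 0.94
) -> List[List[float]]:
    """One step of an exponentially weighted covariance recursion."""
    if not 0.0 < lam < 1.0:
        raise ValueError("lam must lie strictly between 0 and 1")
    n = len(returns)
    return [
        [
            lam * prev_cov[i][j] + (1.0 - lam) * returns[i] * returns[j]
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical starting covariance and one hypothetical return vector
# (first entry: stock index return, second: exchange rate return).
cov_prev = [[1.0, 0.0], [0.0, 1.0]]
latest_returns = [1.0, 2.0]
cov_next = ewma_covariance_update(cov_prev, latest_returns, lam=0.5)
print(cov_next)
```

    The diagonal entries of the updated matrix are the variance (volatility squared) estimates, and the off-diagonal entry is the covariance between the two series.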

    Algorithm for Predicting Compound Protein Interaction Using Tanimoto Similarity and Klekota-roth Fingerprint

    This research aimed to develop a method for predicting interactions between chemical compounds contained in herbs and proteins related to a particular disease. The algorithm is based on the binary local models algorithm, with the protein similarity component omitted. The Klekota-Roth fingerprint is used to represent the compounds. During development, three similarity functions were compared: Tanimoto, Cosine, and Dice. Youden’s index was used to determine the optimum threshold value. The results showed that the Tanimoto similarity function yielded higher similarity values and a higher AUC than the other two functions, and the optimum threshold value obtained was 0.65. Therefore, the Tanimoto similarity function and a threshold of 0.65 were selected for the prediction method. The average evaluation accuracy of the developed algorithm is only about 50%. This low accuracy is presumably because the prediction method uses only compound similarity and does not include protein similarity.
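
    The three similarity functions compared in this abstract have standard definitions for binary fingerprints, sketched below with each fingerprint represented as the set of its "on" bit positions. The bit sets are made up, and applying the 0.65 threshold with "greater than or equal" is an assumption, since the abstract does not say how ties are handled.

```python
import math
from typing import Set


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Shared on-bits divided by on-bits present in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dice(a: Set[int], b: Set[int]) -> float:
    """Twice the shared on-bits divided by the total on-bit count."""
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


def cosine(a: Set[int], b: Set[int]) -> float:
    """Shared on-bits divided by the geometric mean of the on-bit counts."""
    denom = math.sqrt(len(a) * len(b))
    return len(a & b) / denom if denom else 0.0


THRESHOLD = 0.65  # optimum threshold reported in the abstract

# Hypothetical on-bit positions of two compound fingerprints.
compound_a = {1, 2, 3, 4, 5}
compound_b = {2, 3, 4, 5, 6}

score = tanimoto(compound_a, compound_b)
is_similar = score >= THRESHOLD
print(score, dice(compound_a, compound_b), cosine(compound_a, compound_b), is_similar)
```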

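    The Effect of Workplace Spirituality on Employee Engagement through Job Satisfaction in SMEs in Bogor City [Pengaruh Spiritualitas Kerja terhadap Keterlekatan Karyawan melalui Kepuasan Kerja pada UKM Kota Bogor]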

    Quality human resources are needed in global economic competition. Workplace spirituality is one solution companies have developed, because it can create a conducive environment in which employees work as well as possible. The purpose of this study is to analyze the influence of workplace spirituality on employee engagement through job satisfaction in the food and beverage cluster of Small and Medium Enterprises (SMEs) in the city of Bogor. The research used Structural Equation Modeling PLS for data analysis. The sample consists of SMEs that have at least 5 employees and are registered with the Department of Industry and Trade (Disperindag) and the Department of Cooperatives and SMEs of Bogor City. In total, 25 SMEs were eligible, represented by 65 respondents comprising employees and SME owners, selected by purposive sampling. The results showed that workplace spirituality has a direct positive effect on employee engagement and an indirect effect, through job satisfaction, on employee engagement with the organization. Meanwhile, job satisfaction has a direct positive effect on employee engagement with the organization. Increasing employee engagement in SMEs is therefore suggested through supporting activities such as communicating and facilitating the need for spirituality in the workplace.

    Simultaneous clustering analysis with molecular docking in network pharmacology for type 2 antidiabetic compounds

    Databases of drug compounds and human proteins play a very important role in identifying protein targets and compounds in drug discovery. Recently, a network pharmacology approach was established, shifting the research paradigm from the current “one disease-one target-one drug” to a new “drug-target-disease network”. Ligand-protein interactions can be analyzed quantitatively using simultaneous clustering and molecular docking. Docking can quickly and cheaply predict the ligand-protein binding free energy (ΔG) in structure-based virtual screening, while simultaneous clustering finds subgroups of compounds that exhibit a high correlation with subgroups of target proteins. This study focuses on the interaction between the 306 compounds from medicinal plants (brotowali Tinospora crispa, ginger Zingiber officinale, pare Momordica charantia, sembung Blumea balsamifera) and synthetic drugs (FDA-approved), and the 21 significant human proteins associated with type 2 diabetes. We found that compounds from brotowali (B018), sembung (S031), pare (P231), and ginger (J036, J033) were close to the synthetic drugs and could possibly be developed as antidiabetic drug candidates. Likewise, the proteins AKT1, WFS1, APOE, EP300, PTH, GCG, and UBC, which cluster together and have a high association with INS, can be seen as target proteins that play a role in type 2 diabetes.