15 research outputs found

    Geographically and Temporally Weighted Autoregressive to Modeling the Levels of Poverty Population in Java in 2012-2018

    Geographically and temporally weighted regression (GTWR) is a method applied when the observations exhibit spatial and temporal heterogeneity. The GTWR model considers only the local influence of the explanatory variables on the spatio-temporal response variable. Adding an autoregressive component of the response variable to the GTWR model yields the geographically and temporally weighted autoregressive (GTWAR) model. This study applies GTWAR modeling to data on the proportion of poor people by district/city in Java in 2012-2018. The results show that GTWAR produces a smaller Akaike Information Criterion (AIC) and a higher coefficient of determination (R²) than GTWR.

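    LASSO : SOLUSI ALTERNATIF SELEKSI PEUBAH DAN PENYUSUTAN KOEFISIEN MODEL REGRESI LINIER [English: LASSO: An Alternative Solution for Variable Selection and Coefficient Shrinkage in Linear Regression Models]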

    A new method, known as LASSO, has recently been developed for variable selection and coefficient shrinkage in linear regression. The method offers an alternative for data with highly correlated independent variables, where least squares produces high-variance estimates. Based on simulation, LASSO is not better than forward selection when many of the parameters are zero, nor better than ridge regression when all parameter values are close to zero. However, because the true parameters are unknown in practice and LASSO gives consistent estimates across all conditions, LASSO is preferable to ridge regression or forward selection. Keywords: LASSO, least squares, forward selection, ridge, cross validation
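
To illustrate the contrast between selection and shrinkage that the abstract describes, here is a minimal sketch for the special case of an orthonormal design, where both estimators have closed forms in terms of the least-squares coefficients. The orthonormality assumption, the penalty parametrization, the made-up coefficients, and the names `lasso_orthonormal` and `ridge_orthonormal` are illustrative and are not taken from the paper.

```python
def lasso_orthonormal(ols_coefs, lam):
    """Soft-threshold each OLS coefficient: sign(b) * max(|b| - lam, 0).

    Assumes X'X = I and the objective 0.5*||y - Xb||^2 + lam*||b||_1.
    """
    out = []
    for b in ols_coefs:
        magnitude = max(abs(b) - lam, 0.0)
        if magnitude == 0.0:
            out.append(0.0)  # coefficient is dropped from the model
        else:
            out.append(magnitude if b > 0 else -magnitude)
    return out


def ridge_orthonormal(ols_coefs, lam):
    """Scale each OLS coefficient by 1 / (1 + lam).

    Assumes X'X = I and the objective 0.5*||y - Xb||^2 + 0.5*lam*||b||^2.
    """
    return [b / (1.0 + lam) for b in ols_coefs]


ols = [3.0, -0.4, 0.2, -2.0]  # made-up OLS coefficients for illustration
lasso = lasso_orthonormal(ols, lam=0.5)
ridge = ridge_orthonormal(ols, lam=0.5)
print("lasso:", lasso)
print("ridge:", ridge)
```

In this sketch LASSO sets the two small coefficients exactly to zero, while ridge shrinks every coefficient but keeps all of them in the model.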

    Land Use Change Modelling Using Logistic Regression, Random Forest and Additive Logistic Regression in Kubu Raya Regency, West Kalimantan

    Kubu Raya Regency is a regency in the province of West Kalimantan with a wetland ecosystem that includes high-density swamp or peatland along with an extensive area of mangroves. Wetland ecosystems are essential for fauna, as a source of livelihood for the surrounding community, and as a storage reservoir for carbon stocks. Most of the land in Kubu Raya Regency is peatland. As a consequence, peat has long been used for agriculture and as a source of livelihood for the community. Along with the vast area of peat, the regency also has a potentially high risk of peat fires. This study aims to predict land use changes in Kubu Raya Regency using three statistical machine learning models: Logistic Regression (LR), Random Forest (RF) and Additive Logistic Regression (ALR). Land cover map data were acquired from the Ministry of Environment and Forestry and subsequently reclassified into six types of land cover at a resolution of 100 m. The land cover data were used to classify land use or land cover class for Kubu Raya Regency for the years 2009, 2015 and 2020. Based on model performance, RF provides higher accuracy and F1 score than LR and ALR. The outcome of this study is expected to provide knowledge and recommendations that may aid future sustainable development planning and management for Kubu Raya Regency.

    TEXT CLUSTERING ONLINE LEARNING OPINION DURING COVID-19 PANDEMIC IN INDONESIA USING TWEETS

    To prevent the spread of the coronavirus, restrictions on social activities, including school activities, were implemented, drawing both support and opposition in the community. Opinions about online learning are widely conveyed, mainly on Twitter. The tweets obtained can be used to extract information, using text clustering to group topics about online learning during the pandemic in Indonesia. K-Means is often used and performs well in text clustering. However, the high dimensionality of textual data can make computation difficult, so a sampling method is proposed. This paper examines whether clustering a sample of tweets can be more efficient than clustering the whole dataset. After pre-processing, five sample sizes (250, 500, 2500, 10000 and 20000) are selected from 28300 tweets for K-Means clustering. Results showed that over 10 iterations, the three main cluster topics appeared 90%-100% of the time for sample sizes of 2500, 10000 and 20000. Meanwhile, sample sizes of 250 and 500 tended to produce the three main cluster topics only 20%-60% of the time. This means that around 8% to 35% of the tweets can yield representative clusters with efficient computation, four times faster than using the entire dataset.
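
The percentage range quoted in the abstract can be related to the sample sizes with a short calculation. The sketch below only converts each sample size to a share of the 28300 tweets; reading the "8% to 35%" range as the 2500 and 10000 samples is an interpretation, not something the abstract states. The name `sample_share` is illustrative.

```python
TOTAL_TWEETS = 28300
SAMPLE_SIZES = [250, 500, 2500, 10000, 20000]


def sample_share(sample_size, total=TOTAL_TWEETS):
    """Return the sample size as a percentage of the full dataset."""
    return 100.0 * sample_size / total


# Share of the dataset covered by each sample, rounded to one decimal place.
shares = {n: round(sample_share(n), 1) for n in SAMPLE_SIZES}
for n, pct in shares.items():
    print(f"{n:>6} tweets = {pct:5.1f}% of the dataset")
```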

    Features Analysis of the Research and Development Industry in Indonesia

    R&D is one of the key drivers of technological progress and contributes to increased productivity and profit growth. Indonesia's ratio of Gross Domestic Expenditure on R&D (GERD) to GDP, one of the Global Competitiveness Index indicators, reached only 0.28% in 2018 and is dominated by the government sector, while the industrial sector accounts for only 7.34%. One reason for this small value is that data collection on R&D in the business sector in Indonesia has not been carried out optimally. A classification model is needed to determine the data collection target so that the results improve. The main objective of this study is to classify R&D industry actors in Indonesia using XGBoost and then analyze the features of R&D industry actors using SHAP. XGBoost is a black-box model that is difficult to interpret, and SHAP is one of the available interpretation methods. The XGBoost classification obtained accuracy, AUC, and F1-score values of 79.61%, 0.7646, and 84.44%, respectively. Based on the Shapley values from the SHAP method, the average growth in R&D expenditure had the highest contribution. This feature's contribution to the prediction is even higher when mean R&D expenditure growth is positive (greater than 0). Another contributing feature is the ratio of researchers to R&D human resources; a ratio above 75% contributes negatively. Finally, the exports and State-Owned Enterprise (BUMN) features have the smallest contributions. https://dorl.net/dor/20.1001.1.20088302.2022.20.2.4.

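    MODEL OTENTIKASI KOMPOSISI OBAT BAHAN ALAM BERDASARKAN SPEKTRA INFRAMERAH DAN KOMPONEN UTAMA STUDI KASUS : OBAT BAHAN ALAM/FITOFARMAKA PENURUN TEKANAN DARAH [English: Authentication Model for the Composition of Natural Medicines Based on Infrared Spectra and Principal Components. Case Study: Blood-Pressure-Lowering Natural Medicine/Phytopharmaceutical]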

    The chemical composition of natural medicine extracts is complex, so their authenticity cannot be tested with a single approach. One analytical technique that can comprehensively describe the chemical characteristics of a material is FTIR spectroscopy. FTIR spectra arise from the interaction between infrared energy and the chemical components of a mixture, so an FTIR spectrum is a characteristic identity of that mixture. In this study, the authenticity of the composition of a natural medicine is determined by principal component analysis of its infrared spectra. The study was conducted on a blood-pressure-lowering natural medicine/phytopharmaceutical (Tensigard®, consisting of celery extract and kumis kucing leaf extract). Infrared spectra were measured for drug formulas whose percentage compositions were determined using a simplex lattice design. Infrared spectra were also measured for formulas in which the kumis kucing extract was replaced (adulterated) with a synthetic drug (reserpine) and sambiloto extract. The plot of the first principal component scores against the second principal component scores can be used to detect the composition of the drug, but it cannot detect adulteration of the composition by other materials. Keywords: phytopharmaceutical authentication model, simplex lattice design, principal components, Tensigard

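    PENGENDALIAN KOEFISIEN REGRESI LEAST ABSOLUTE DEVIATION PADA RENTANG BERMAKNA MENGGUNAKAN PROGRAM LINIER [English: Controlling Least Absolute Deviation Regression Coefficients within a Meaningful Range Using Linear Programming]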

    Regression analysis is typically used to model the mean of a response variable as a function of some independent variables, using the least squares (LS) method. In general, the LS method describes the measure of central tendency well, but it is not robust against outliers. Therefore, in certain cases, a regression analysis that minimizes the sum of absolute residuals (least absolute deviation, LAD) is required, as it is more robust against outliers. Conventionally, the values of the regression coefficients are not constrained and depend entirely on the data processed. In some cases, the sign and the value of the regression coefficients need to be controlled so that they fall in a meaningful range. The results of this study showed that modifying the constraints of the LAD regression is able to keep the regression coefficients in the meaningful range. Bootstrap results showed that the distribution of the controlled regression coefficients differed from the distribution of the uncontrolled regression coefficients.
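
One standard way to write LAD regression as a linear program, with range constraints on the coefficients added, is sketched below. This is a textbook formulation and not necessarily the one used in the study; the split residuals $e_i^+, e_i^-$ and the bounds $\ell_j, u_j$ are notation assumed here for illustration.

```latex
\begin{aligned}
\min_{\beta,\, e^+,\, e^-} \quad & \sum_{i=1}^{n} \left( e_i^+ + e_i^- \right) \\
\text{subject to} \quad & y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j = e_i^+ - e_i^-, \quad i = 1, \dots, n, \\
& e_i^+ \ge 0, \quad e_i^- \ge 0, \quad i = 1, \dots, n, \\
& \ell_j \le \beta_j \le u_j, \quad j = 1, \dots, p.
\end{aligned}
```

At the optimum at most one of $e_i^+$ and $e_i^-$ is positive for each observation, so the objective equals the sum of absolute residuals. Choosing, for example, $\ell_j = 0$ forces the $j$-th coefficient to be nonnegative.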

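    Deep Learning Image Classification Rontgen Dada pada Kasus Covid-19 Menggunakan Algoritma Convolutional Neural Network [English: Deep Learning Image Classification of Chest X-Rays in Covid-19 Cases Using the Convolutional Neural Network Algorithm]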

    This research proposes using a Convolutional Neural Network (CNN) with the VGGNet-19 and ResNet-50 architectures for COVID-19 diagnosis through chest X-ray image analysis. Modifications were made by comparing dropout regularization values of 50% and 80% for both architectures and adapting the classification layer to 4 classes. Furthermore, the model's performance was compared based on dataset size. The dataset comprised 21,165 images, with 10% held out for testing and the remaining 90% divided into training data (80%) and validation data (20%). The model's performance was evaluated using repeated 5-fold cross-validation. The training process employed a learning rate of 0.0001, stochastic gradient descent (SGD) optimization, and ten iterations. The results indicate that adding dropout layers with a 50% probability to both architectures effectively addressed overfitting and improved the model's performance. Better performance was achieved with larger dataset sizes, which gave a significant improvement in model performance. The classification results indicate that the ResNet-50 architecture achieved an average accuracy of 94.4%, average recall of 94.1%, average precision of 95.5%, average specificity of 97%, and average F1-score of 94.8%. Meanwhile, the VGGNet-19 architecture achieved an average accuracy of 91%, average recall of 89%, average precision of 95.0%, average specificity of 96.8%, and average F1-score of 92.7%. Utilizing these models can assist in identifying the causes of patient mortality and offer valuable information for medical and epidemiological decision-making.
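
The nested split described in the abstract can be worked through numerically. The sketch below is only an arithmetic illustration: the abstract gives percentages, so rounding down to whole images and the name `split_counts` are assumptions, and the study's actual per-set counts may differ by an image or two.

```python
TOTAL_IMAGES = 21165


def split_counts(total, test_pct=10, val_pct_of_rest=20):
    """Return (train, validation, test) image counts for a nested split.

    The test set is taken first; the remainder is split into training and
    validation. Integer floor division is an assumption made for this sketch.
    """
    n_test = total * test_pct // 100
    rest = total - n_test
    n_val = rest * val_pct_of_rest // 100
    n_train = rest - n_val
    return n_train, n_val, n_test


n_train, n_val, n_test = split_counts(TOTAL_IMAGES)
print(f"train={n_train}, validation={n_val}, test={n_test}")
```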

    Nowcasting Indonesia's GDP Growth: Are Fiscal Data Useful?

    Since their introduction by Giannone et al. (2008), GDP nowcasting models have been used in many countries, including Indonesia. The variables selected usually include housing and construction, income, manufacturing, labor, surveys, international trade, retail, and consumption. Interestingly, fiscal variables are excluded even though government expenditure is an integral part of the basic GDP identity. Employing the quarter-to-quarter real GDP growth nowcasting technique of Bok et al. (2018), this paper tests the usefulness of including fiscal variables, in addition to 61 non-fiscal variables, in nowcasting Indonesia's GDP. The results show that, even though fiscal data have low correlation coefficients with GDP, including fiscal data may help produce a better early estimate of GDP growth.