59 research outputs found

    REGRESI KUADRAT TERKECIL PARSIAL MULTI RESPON UNTUK STATISTICAL DOWNSCALING (Multi Response Partial Least Square for Statistical Downscaling)

    In climatology, partial least squares regression (PLSR) can be used as an alternative technique for statistical downscaling based on global circulation model (GCM) output. PLSR can forecast not only a single response but also multiple responses, accommodating the correlation among the responses. PLSR is compared with principal component regression (PCR). The results show that PLSR performs better than PCR and can forecast rainfall at several rainfall stations simultaneously about as well as at a single station. Keywords: statistical downscaling, PLSR, PCR, multi response

    PENDEKATAN REGRESI KUADRAT TERKECIL PARSIAL ROBUST MULTIRESPONS DALAM MODEL KALIBRASI (Robust Multi-Response Partial Least Squares Regression Approach in Calibration Models)

    Partial least squares regression (Regresi Kuadrat Terkecil Parsial, RKTP) is a predictive technique that can handle high-dimensional independent variables, particularly when multicollinearity is present. RKTP scores are computed by maximizing a covariance criterion between the x and y variables, so the response is involved in the analysis from the outset. SIMPLS is one RKTP algorithm, introduced by De Jong (1993). Because SIMPLS is based on the empirical cross-covariance matrix between the response and independent variables and on least squares linear regression, it is not resistant to outlying observations. Handling outliers requires an estimation method that is resistant to them, known as a robust method. Two robust RKTP methods, RSIMCD and RSIMPLS, which are built from robust covariance matrices for high-dimensional data and from robust linear regression, can overcome the influence of outlying observations. A robust RMSECV is then obtained to build the calibration model, and a robust RMSEP is used to validate it. Diagnostic plots are produced to visualize and classify the outliers.
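
The covariance criterion mentioned in this abstract can be illustrated with a minimal sketch for the simplest case of a single response. There, the first weight vector is proportional to the cross-product of the centered predictors with the centered response; with several responses it is instead the dominant left singular vector of the cross-covariance matrix, which this sketch does not cover. The function name `first_pls_weight` and the toy data are assumptions for illustration, not taken from the paper, and no robust estimation is attempted.

```python
import math

def first_pls_weight(X, y):
    """First PLS weight vector and scores for a single response (classical, non-robust)."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Center predictors and response.
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    yc = [yi - y_mean for yi in y]
    # Cross-product X^T y gives the direction of maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores are the projections of the centered rows onto the weight vector.
    scores = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    return w, scores

# Made-up data: four observations, two predictors.
X_demo = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
y_demo = [1.0, 2.0, 3.0, 4.0]
w_demo, t_demo = first_pls_weight(X_demo, y_demo)
```

Because the weight is built directly from sample cross-products, a single extreme observation can dominate it, which is the sensitivity that the robust variants are meant to address.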

    CLUSTERING RURAL DEVELOPMENT TYPOLOGY IN EAST JAVA PROVINCE USING LATENT CLASS ANALYSIS

    To deliver sustained and equitable regional development in Indonesia, the government must understand the characteristics of each region based on its features; rural areas must therefore be classified to make development programs more precise. Since each rural area has specific characteristics that may determine its rural level, the classification must ensure that development policy fits each area. In this paper, we classify the typology of rural development, measured by rural potential characteristics, education, and socioeconomic conditions. We select villages in East Java province as the research area because East Java is well known as a center of agriculture in Java; however, in 2011-2014, according to BPS, its poverty rate put East Java in 15th position nationally. The classification uses latent class analysis, which models the data with a particular statistical distribution to identify unobserved cluster membership among subjects from observed categorical or continuous variables. The method can handle overlapping data by setting different characteristics, and the accuracy of the modeling results can be tested. The Expectation Maximization (EM) algorithm is used to estimate the parameters of the latent class model. The research uses the PODES 2011 dataset, which contains information on the characteristics and facilities of 8502 villages. Latent class analysis generates five clusters of rural development, whereas the current classification from the Ministry of Home Affairs uses only three typologies. The results add detail to the current three classifications by dividing each typology into several finer classes. Key words: Latent Class Analysis, Maximum Likelihood, Expectation Maximization Algorithm, Rural Development Typology

    RIDGE AND LASSO PERFORMANCE IN SPATIAL DATA WITH HETEROGENEITY AND MULTICOLLINEARITY

    Spatial heterogeneity is a distinct issue in the analysis of spatial data. GWR (Geographically Weighted Regression) is a statistical technique that explores spatial nonstationarity by forming different regression models at different points in the observation space. Multicollinearity is a condition in which the independent variables in a model are linearly related. It is a problem for parameter estimation because it produces an unstable model. This problem may arise in GWR models, where a linear relationship among the independent variables at a given location is called local multicollinearity. GWRR (Geographically Weighted Ridge Regression) and GWL (Geographically Weighted Lasso) use the ridge and lasso concepts to shrink the regression coefficients of the GWR model. GWRR and GWL are considered capable of overcoming local multicollinearity and producing more stable models with lower variance. In this study, GWRR and GWL are used to model Gross Regional Domestic Product (GRDP) in Java using an exponential kernel weighting function. The results show that GWL predicts GRDP better, with lower RMSE and higher value than GWRR. Keywords: Spatial Heterogeneity, GWR, Local Multicollinearity, Ridge, Lasso
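
As a rough illustration of how a geographically weighted ridge fit combines kernel weights with coefficient shrinkage, the sketch below assumes the exponential kernel takes the common form w = exp(-d / b) for distance d and bandwidth b, and fits a single centered predictor without an intercept. The function names, the kernel form, and the toy numbers are assumptions for illustration; the abstract does not give the exact formulation used in the study.

```python
import math

def exponential_kernel_weights(distances, bandwidth):
    """Weights that decay with distance from the regression point (assumed form exp(-d/b))."""
    return [math.exp(-d / bandwidth) for d in distances]

def local_ridge_slope(x, y, weights, lam):
    """Weighted ridge slope for one predictor; x and y are assumed already centered."""
    sxy = sum(w * xi * yi for w, xi, yi in zip(weights, x, y))
    sxx = sum(w * xi * xi for w, xi in zip(weights, x))
    # The penalty lam inflates the denominator and shrinks the slope toward zero.
    return sxy / (sxx + lam)

# Made-up example: three observations at distances 0, 1 and 2 from the regression point.
x_demo = [-1.0, 0.0, 1.0]
y_demo = [-2.0, 0.0, 2.0]
w_demo = exponential_kernel_weights([0.0, 1.0, 2.0], bandwidth=1.0)
slope_unpenalized = local_ridge_slope(x_demo, y_demo, w_demo, lam=0.0)
slope_ridge = local_ridge_slope(x_demo, y_demo, w_demo, lam=0.5)
```

Repeating this fit with a fresh set of weights at every location gives the location-specific coefficients that characterize GWR-type models. A lasso variant would replace the quadratic penalty with an absolute-value penalty, which can set coefficients exactly to zero.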

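    REGRESI PROSES GAUSSIAN UNTUK PEMODELAN KALIBRASI SPEKTROSKOPI (STUDI KASUS : PENGUKURAN KONSENTRASI KURKUMIN, SEBUAH SENYAWA PENCIRI PADA TANAMAN OBAT TEMU LAWAK) (Gaussian Process Regression for Spectroscopic Calibration Modeling; Case Study: Measuring the Concentration of Curcumin, a Marker Compound in the Medicinal Plant Temu Lawak)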

    Multivariate calibration models have been developed using regression methods based on principal component regression and partial least squares. This study proposes Gaussian process regression as an alternative method. A Gaussian process is derived from the perspective of Bayesian nonparametric regression, with its hyperparameters estimated by maximum likelihood. To cope with the large number of independent variables involved, the variables are reduced using principal component analysis. Gaussian process regression is more flexible than the earlier methods in the sense that, with a suitable choice of covariance function, it can capture both linear and nonlinear structure in the data sets under study

    NONLINEAR PRINCIPAL COMPONENT ANALYSIS AND PRINCIPAL COMPONENT ANALYSIS WITH SUCCESSIVE INTERVAL IN K-MEANS CLUSTER ANALYSIS

    K-means clustering is a cluster analysis for continuous variables that uses Euclidean distance, which assumes that the observed variables are uncorrelated with one another. Correlated categorical data can be handled either by Nonlinear Principal Component Analysis or by converting the categorical data to numerical data with the method of successive intervals and then applying Principal Component Analysis. Comparing the ratio of within-cluster to between-cluster variance for K-means clustering of poverty data from East Nusa Tenggara Province shows that Principal Component Analysis with successive intervals yields a smaller variance ratio than Nonlinear Principal Component Analysis. The variables that influence cluster formation are toilet, fuel, and job. Keywords: K-Means Cluster Analysis, Nonlinear Principal Component Analysis, Principal Component Analysis, Successive interval
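
The comparison criterion in this abstract, a ratio of within-cluster to between-cluster variance, could be computed along the following lines. The sketch assumes the ratio is taken between sums of squared Euclidean distances (within: points to their cluster centroid; between: centroids to the grand mean, weighted by cluster size); the paper's exact definition is not given in the abstract, and the function name and toy points are hypothetical.

```python
def variance_ratio(points, labels):
    """Within-cluster sum of squares divided by between-cluster sum of squares."""
    dim = len(points[0])

    def mean(rows):
        return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    grand_mean = mean(points)
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    within = 0.0
    between = 0.0
    for rows in clusters.values():
        centroid = mean(rows)
        within += sum(sq_dist(p, centroid) for p in rows)
        between += len(rows) * sq_dist(centroid, grand_mean)
    return within / between

# Made-up points: two tight groups far apart along the first coordinate.
pts = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]
good_ratio = variance_ratio(pts, [0, 0, 1, 1])  # groups match the structure
poor_ratio = variance_ratio(pts, [0, 1, 0, 1])  # groups cut across the structure
```

A smaller ratio indicates compact, well-separated clusters, which is the sense in which one preprocessing route is preferred over the other in the abstract.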

    APPLICATION OF PENALIZED SPLINE-SPATIAL AUTOREGRESSIVE MODEL TO HIV CASE DATA IN INDONESIA

    Spatial regression analysis is a statistical method for modeling that takes spatial effects into account. Spatial models generally use a parametric approach that assumes a linear relationship between the explanatory and response variables. Nonparametric regression is better suited to data with a nonlinear relationship because it does not require the linearity assumption. One nonparametric regression method is penalized spline regression (P-Spline). The P-Spline has a simple mathematical relationship with the mixed linear model, which allows it to be combined with other statistical models. PS-SAR combines the P-Spline with the SAR spatial model so that spatial data can be analyzed with a semiparametric approach. Based on data from monitoring the development of the HIV situation in 2018, the number of HIV cases in Indonesia shows a clustered pattern that indicates spatial dependence. In addition, the relationship between the number of positive cases and the explanatory factors tends to be nonlinear. This study therefore applies the PS-SAR model to HIV case data in Indonesia. The resulting model is evaluated using the estimated spatial autoregressive coefficient, MSE, MAPE, and Pseudo R2. The PS-SAR model has a spatial autoregressive coefficient similar to that of the SAR model and smaller MSE and MAPE than the SAR model

    Regression for Exploring Rainfall Pattern in Indramayu Regency

    Quantile regression is an important tool for estimating the conditional quantiles of a response Y given a vector of covariates X. It can be used to measure the effect of covariates not only at the center of a distribution but also in its upper and lower tails. The regression coefficients for each quantile can be estimated by minimizing an objective function that is a weighted average of absolute errors. Each quantile regression characterizes a particular aspect of the conditional distribution, so different quantile regressions can be combined to describe the underlying conditional distribution more completely. Quantile regression is especially useful when the conditional distribution is not normal in shape, for example when it is asymmetric or truncated. In general, rainfall in Indramayu regency during 1972-2001 at 23 stations is highly variable in amount across time (month) and space. The first objective of the research is therefore to reduce the variability in space by classifying the rainfall stations. The second objective is to model the variability in time using quantile regression for each cluster of rainfall stations. The results show that there are two clusters of rainfall stations, and the first cluster has a higher amount of rainfall than the second. The quantile regression coefficients for the 50 and 75 percent quantiles are similar, but those for the 5 and 90 percent quantiles are very different. Exploring rainfall patterns with quantile regression can detect normal or extreme rainfall, which is very useful in agriculture
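
The objective function described in this abstract, a weighted average of absolute errors, is commonly written as the check (pinball) loss: positive errors are weighted by the quantile level tau and negative errors by 1 - tau. The sketch below shows the intercept-only case, where minimizing this loss recovers a sample quantile; a full quantile regression would minimize the same loss over regression coefficients. The function names and the rainfall values are made up for illustration.

```python
def pinball_loss(y, q_hat, tau):
    """Average check loss of a constant prediction q_hat at quantile level tau."""
    total = 0.0
    for yi in y:
        err = yi - q_hat
        # Under-predictions cost tau per unit, over-predictions cost (1 - tau).
        total += tau * err if err >= 0 else (tau - 1.0) * err
    return total / len(y)

def best_constant_quantile(y, tau):
    """Pick the observed value that minimizes the pinball loss."""
    return min(sorted(set(y)), key=lambda c: pinball_loss(y, c, tau))

# Made-up, right-skewed monthly rainfall values.
rainfall = [0.0, 2.0, 5.0, 9.0, 40.0]
median_fit = best_constant_quantile(rainfall, 0.5)
upper_fit = best_constant_quantile(rainfall, 0.9)
```

With tau = 0.5 the loss is symmetric and the minimizer is the median; raising tau penalizes under-prediction more heavily and pushes the fit toward the upper tail, which is how separate quantile fits expose extreme rainfall.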

    PRE-PROCESSING DATA ON MULTICLASS CLASSIFICATION OF ANEMIA AND IRON DEFICIENCY WITH THE XGBOOST METHOD

    Anemia and iron deficiency are health problems in Indonesia and globally. In multiclass classification, data problems such as missing data, too many variables, and imbalanced classes often occur. The data are therefore pre-processed using MissForest imputation, Boruta feature selection, and SMOTE to help improve the classification model's performance in predicting particular classes. After pre-processing, the classification model is built with the XGBoost algorithm. Pre-processing the data was found to improve the model's performance in multiclass classification of anemia and iron deficiency in women in Indonesia, with an accuracy of 0.815 and an AUC of 0.9693

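    PERBANDINGAN METODE KEKAR BIWEIGHT MIDCOVARIANCE DAN MINIMUM COVARIANCE DETERMINANT DALAM ANALISIS KORELASI KANONIK (Comparison of the Robust Biweight Midcovariance and Minimum Covariance Determinant Methods in Canonical Correlation Analysis)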

    Canonical Correlation Analysis (CCA) is a multivariate linear method used to identify and quantify associations between two sets of random variables. Its standard computation is based on sample covariance matrices, which are, however, very sensitive to outlying observations, so robust methods are needed. There are two robust methods, i.e. the robust Biweight Midcovariance (BICOV) and Minimum Covariance Determinant (MCD) methods. The objective of this research is to compare the performance of both methods based on mean square error. The simulated data are generated under various conditions, varying the proportion of outliers and the kind of outlier: shift, scale, and radial. The performance of the robust BICOV method in CCA is the best compared to MCD and Classi