20 research outputs found

Multi-objective evolutionary GAN for tabular data synthesis

Authors: Allmendinger Richard; Elliot Mark; Little Claire; Nasution Bahrul Ilmi; Ran Nian
Publication date: 15/04/2024

Synthetic data has a key role to play in data sharing by statistical agencies and other generators of statistical data products. Generative Adversarial Networks (GANs), typically applied to image synthesis, are also a promising method for tabular data synthesis. However, tabular data poses challenges that images do not: for example, it may contain both continuous and discrete variables and may require conditional sampling. Critically, the data should possess high utility and low disclosure risk (the risk of re-identifying a population unit or learning something new about them), which provides an opportunity for multi-objective (MO) optimization. Inspired by MO GANs for images, this paper proposes a smart MO evolutionary conditional tabular GAN (SMOE-CTGAN). This approach models conditional synthetic data by applying conditional vectors in training, and uses concepts from MO optimization to balance disclosure risk against utility. Our results indicate that SMOE-CTGAN is able to discover synthetic datasets with different risk and utility levels for multiple national census datasets. Using an Improvement Score, we also find a sweet spot in the early stage of training where competitive utility and extremely low risk are achieved. The full code can be downloaded from https://github.com/HuskyNian/SMO_EGAN_pytorch
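
The trade-off between disclosure risk and utility described in this abstract can be illustrated with a non-dominated (Pareto) filter. The sketch below is not taken from the SMOE-CTGAN code; the function names and the (risk, utility) scores are hypothetical, and it only assumes that lower risk and higher utility are preferred.

```python
from typing import List, Tuple

# Each candidate synthetic dataset is summarised as (risk, utility).
# These scores are made up for illustration only.
Candidate = Tuple[float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """Return True if a is no worse than b on both objectives and better on one."""
    risk_a, util_a = a
    risk_b, util_b = b
    no_worse = risk_a <= risk_b and util_a >= util_b
    strictly_better = risk_a < risk_b or util_a > util_b
    return no_worse and strictly_better


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


candidates = [(0.10, 0.60), (0.30, 0.80), (0.20, 0.55), (0.05, 0.40), (0.30, 0.70)]
front = pareto_front(candidates)
print(front)
```

Each surviving candidate represents a different balance of risk and utility, so a data provider could pick from the front according to their own risk tolerance.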

The University of Manchester - Institutional Repository

Revisiting social vulnerability analysis in Indonesia data

Authors: Agustina Neli; Kurniawan Robert; Nasution Bahrul Ilmi; Yuniarto Budi
Publication date: 23/12/2021

This paper presents a dataset on social vulnerability in Indonesia. The dataset contains several dimensions drawn from previous studies. The data were compiled mainly from the 2017 National Socioeconomic Survey (SUSENAS) conducted by BPS-Statistics Indonesia. We use the survey weights to obtain estimates that account for the multistage sampling design. We also obtained additional information on population size and population growth from the BPS-Statistics Indonesia 2017 population projection. Furthermore, we provide a distance matrix as supplementary information, along with the population counts needed to run Fuzzy Geographically Weighted Clustering (FGWC). The data can be used for further analysis of social vulnerability in support of disaster management. The data can be accessed at https://raw.githubusercontent.com/bmlmcmc/naspaclust/main/data/sovi_data.csv
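
As a rough illustration of estimating with survey weights, the sketch below computes a weighted mean. It is not the authors' procedure: the function name, the indicator values, and the weights are hypothetical, and a full multistage analysis would also need design-based variance estimation, which is omitted here.

```python
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Estimate a population mean, with each respondent counted by its survey weight."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical indicator (1 = household lacks electricity, 0 = has electricity)
# and hypothetical weights (number of households each respondent represents).
lacks_electricity = [1, 0, 1, 0]
survey_weights = [100, 300, 50, 50]

estimate = weighted_mean(lacks_electricity, survey_weights)
print(f"Estimated share without electricity: {estimate:.2f}")
```

The unweighted share in this toy sample would be 0.50, while the weighted estimate is lower because the respondents with electricity represent more households.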

PubMed Central

The University of Manchester - Institutional Repository

The Effects of COVID-19 and Workplace Mobility to Stock Price and Exchange Rate in Indonesia: An Econometric Approach

Authors: Kurniawan Novianto Budi; Nasution Bahrul Ilmi; Ragamustari Safendrri Komara
Publication date: 06/07/2022

The University of Manchester - Institutional Repository

Investment and Unemployment Reduction: An Empirical Study of Indonesia using Panel Data Regression

Authors: Nasution Bahrul Ilmi; Siregar Sri Indriyani; Tarigan Adelia Christine Br
Publication date: 15/02/2021

The University of Manchester - Institutional Repository

Using Harris hawk optimization towards support vector regression to ozone prediction

Authors: Caraka Rezzy Eko; Kurniawan Robert; Nasution Bahrul Ilmi; Setiawan I. Nyoman
Publication date: 30/01/2022

JABODETABEK (Jakarta, Bogor, Depok, Tangerang, and Bekasi) is an area affected by air pollution, especially ozone concentrations that often exceed the threshold or reach unhealthy levels, and it seeks to prevent and control pollution as well as restore air quality. This study therefore aims to build a predictive model of ozone concentration using Harris hawks optimization-support vector regression (HHO-SVR) in 14 sub-districts in JABODETABEK. Data on ozone concentration (the response variable) and meteorological factors (the predictor variables) were collected from a website that provides the data. Other predictor variables, such as time and the significant lags of ozone concentration detected with the partial autocorrelation function, were also used. The variables were then selected using recursive feature elimination-support vector regression (RFE-SVR) to obtain the significant predictors of ozone concentration. The prediction model was then built using HHO-SVR, that is, support vector regression (SVR) whose parameter values are optimized with the Harris hawks optimization (HHO) algorithm. The models were evaluated with several metrics to determine the best one: mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), coefficient of determination (R²), variance ratio (VR), and the Diebold–Mariano test. The results indicate that lag 1, lag 2, air temperature, humidity, and UV index are the significant predictor variables selected by RFE-SVR for most sub-districts. In general, the HHO process takes longer than other metaheuristic algorithms. On average, 7 of the 14 sub-districts using the HHO-SVR model yielded the best predictions with MAE below 10, RMSE and MAPE below 20, R² around 0.97, and VR around 0.98. The Diebold–Mariano test results also show that the prediction accuracy and performance stability of the HHO-SVR model are better, especially for the Ciputat and South Bekasi sub-districts. This suggests that HHO-SVR is well suited to predicting ozone concentrations in these two sub-districts.
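
Four of the evaluation metrics named in this abstract (MAE, RMSE, MAPE, and the coefficient of determination) can be computed as in the sketch below. This is a minimal illustration under stated assumptions, not the study's code: the function name and the ozone values are hypothetical, MAPE is expressed in percent, and the variance ratio and Diebold–Mariano test are left out.

```python
import math
from typing import Dict, Sequence


def regression_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute MAE, RMSE, MAPE (in percent), and R squared for paired values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE is undefined when an observed value is zero; assumed nonzero here.
    mape = 100 * sum(abs(e / t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Hypothetical observed and predicted ozone concentrations.
observed = [50.0, 60.0, 80.0, 100.0]
predicted = [45.0, 66.0, 76.0, 105.0]

metrics = regression_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```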

PubMed Central

The University of Manchester - Institutional Repository

Data Analysis and Synthesis of COVID-19 Patients using Deep Generative Models: A Case Study of Jakarta, Indonesia

Authors: Bhaswara Irfan Dwiki; Kanggrawan Juan Intan; Nasution Bahrul Ilmi; Nugraha Yudhistira
Publication date: 26/09/2022

The University of Manchester - Institutional Repository

Air Pollution Index (API) Analysis at Jakarta in 2019-2020 using Fuzzy C-Means and Gaussian Mixture Model

Authors: Aminanto Muhammad Erza; Kanggrawan Juan Intan; Nasution Bahrul Ilmi; Nugraha Yudhistira; Situmorang Melva Hilda Stephanie
Publication venue: Association for Computing Machinery
Publication date: 22/11/2022

The University of Manchester - Institutional Repository