271 research outputs found

Imputation of Rainfall Data Using the Sine Cosine Function Fitting Neural Network

Author: Chan Chiu Po
Fenza Giuseppe
Herrera-Viedma Enrique
Krejcar Ondrej
Kuok Kuok King
Selamat Ali
Publication venue: 'Universidad Internacional de La Rioja'
Publication date: 01/01/2021

Missing rainfall data have reduced the quality of hydrological data analysis because they are the essential input for hydrological modeling. Much research has focused on rainfall data imputation. However, the compatibility of precipitation (rainfall) and non-precipitation (meteorology) as input data has received less attention. First, we propose a novel pre-processing mechanism for non-precipitation data by using principal component analysis (PCA). Before the imputation, PCA is used to extract the most relevant features from the meteorological data. The final output of the PCA is combined with the rainfall data from the nearest neighbor gauging stations and then used as the input to the neural network for missing data imputation. Second, a sine cosine algorithm is presented to optimize the neural network for infilling the missing rainfall data. The proposed sine cosine function fitting neural network (SC-FITNET) was compared with the sine cosine feedforward neural network (SC-FFNN), feedforward neural network (FFNN) and long short-term memory (LSTM) approaches. The results showed that the proposed SC-FITNET outperformed LSTM, SC-FFNN and FFNN imputation in terms of mean absolute error (MAE), root mean square error (RMSE) and correlation coefficient (R), with an average accuracy of 90.9%. This study revealed that as the percentage of missingness increased, the precision of the four imputation methods decreased. In addition, this study also revealed that PCA has potential in pre-processing meteorological data into an understandable format for missing data imputation.
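The three evaluation metrics named in this abstract (MAE, RMSE and the correlation coefficient R) can be sketched as follows. This is a minimal illustration using their standard textbook definitions, not the authors' code; the function names and the sample values are hypothetical.

```python
import math

def mae(observed, imputed):
    """Mean absolute error between observed and imputed values."""
    return sum(abs(o - e) for o, e in zip(observed, imputed)) / len(observed)

def rmse(observed, imputed):
    """Root mean square error between observed and imputed values."""
    squared = sum((o - e) ** 2 for o, e in zip(observed, imputed))
    return math.sqrt(squared / len(observed))

def pearson_r(observed, imputed):
    """Pearson correlation coefficient between the two series."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(imputed) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, imputed))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_e = sum((e - mean_e) ** 2 for e in imputed)
    return cov / math.sqrt(var_o * var_e)

# Hypothetical rainfall values (mm), for illustration only.
observed = [0.0, 2.0, 4.0, 6.0]
imputed = [1.0, 2.0, 3.0, 6.0]

print(mae(observed, imputed))
print(rmse(observed, imputed))
print(pearson_r(observed, imputed))
```

Lower MAE and RMSE and a higher R indicate a closer match between imputed and observed values.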

Directory of Open Access Journals

Universiti Teknologi Malaysia Institutional Repository

Re-UNIR

A systematic review of data quality issues in knowledge discovery tasks

Author: Corrales David Camilo
Corrales Juan Carlos
Ledezma Agapito Ismael
Publication venue: 'Universidad de Medellin'
Publication date: 07/11/2015

The volume of data is growing rapidly because organizations continuously capture data to support better decision-making. The most fundamental challenge is to explore these large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks; however, much of the data is of poor quality. We present a systematic review of data quality issues in knowledge discovery tasks and a case study applied to the agricultural disease known as coffee rust.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Crossref

Universidad de Medellín: Revistas Científicas

Repositorio Institucional Universidad de Medellín

DIALNET

Metaheuristic Algorithms to Enhance the Performance of a Feedforward Neural Network in Addressing Missing Hourly Precipitation

Author: Bakri Muhammad Khusairy
Gato-Trinidad Shirley
Kuok King Kuok
Lai Wai Yan
Rahman Md. Rezaur
Publication venue: 'Penerbit UTHM'
Publication date: 04/04/2023

This study investigates three metaheuristic algorithms, namely the Grey Wolf Optimizer (GWO), Multi-Verse Optimizer (MVO), and Moth-Flame Optimisation (MFO), coupled with a feedforward neural network (FNN) to address missing hourly rainfall observations, overcoming the limitation of conventional artificial neural network training algorithms, which often become trapped in local optima. The proposed GWOFNN, MVOFNN, and MFOFNN were compared against the conventional Levenberg-Marquardt Feedforward Neural Network (LMFNN) in addressing artificially introduced missing hourly rainfall records from the Kuching Third Mile Station. The findings show that the proposed approaches are superior to LMFNN in predicting the 20% missing hourly rainfall observations in terms of mean absolute error (MAE) and coefficient of correlation (r). The best-performing ANN model is GWOFNN, followed by MVOFNN, MFOFNN, and lastly LMFNN.
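Artificially introducing missing records, as this abstract describes, might look like the sketch below. The abstract does not say how the missing records were chosen, so uniform random selection, the seed, the helper name `mask_records`, and the sample series are all assumptions made for illustration.

```python
import random

def mask_records(series, fraction, seed=0):
    """Return a copy of `series` with a fraction of entries set to None,
    plus the sorted indices that were masked.

    Assumption: positions are drawn uniformly at random; the abstract
    does not state the selection scheme actually used.
    """
    rng = random.Random(seed)  # fixed seed keeps the masking reproducible
    n_missing = round(len(series) * fraction)
    missing_idx = sorted(rng.sample(range(len(series)), n_missing))
    masked = list(series)
    for i in missing_idx:
        masked[i] = None
    return masked, missing_idx

# Hypothetical hourly rainfall record (mm), 50 hours long.
hourly_rainfall = [float(h % 7) for h in range(50)]
masked, missing_idx = mask_records(hourly_rainfall, fraction=0.20)

print(len(missing_idx))  # number of artificially removed hours
```

The retained true values at `missing_idx` can then be compared against a model's estimates to score the imputation.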

Journals of Universiti Tun Hussein Onn Malaysia (UTHM)

International Journal of Integrated Engineering

A framework for cloud cover prediction using machine learning with data imputation

Author: Mandal Nabanita
Sarode Tanuja
Publication venue: Institute of Advanced Engineering and Science
Publication date: 01/02/2024

The climatic conditions of a region are affected by multiple factors, including dew point temperature, humidity, wind speed, and wind direction. These factors are closely related to each other. In this paper, the correlation between these factors is studied and an approach is proposed for data imputation. The idea is to utilize all these features to predict the total cloud cover of a region instead of removing the missing values. Total cloud cover prediction is significant because it affects the agriculture, aviation, and energy sectors. Based on the imputed data obtained as the output of the proposed method, a machine learning-based model is proposed. The foundation of this model is the bi-directional long short-term memory (LSTM) approach. It is trained for 8 stations using two different approaches. In the first approach, 80% of the data is used for training and 20% for testing. In the second approach, 90% of the data is used for training and 10% for testing. It is observed that the model gives a lower prediction error in the first approach.

Institute of Advanced Engineering and Science

A Comparison of Multiple Imputation Methods for Recovering Missing Data in Hydrological Studies

Author: Hamzah Fatimah Bibi
Mohd Hamzah Firdaus
Mohd Razali Siti Fatin
Samad Hafiza
Publication venue: 'Ital Publication'
Publication date: 01/09/2021

Missing data is a common problem in hydrological studies; therefore, data reconstruction is critical, especially when it is crucial to employ all available resources, even incomplete records. Furthermore, missing data could have an impact on statistical analysis results, and the amount of variability in the data would not be fittingly anticipated. As a result, this study compared the performance of three imputation methods in predicting recurrence in streamflow datasets: robust random regression imputation (RRRI), k-nearest neighbours (k-NN), and classification and regression tree (CART). Furthermore, entire historical daily streamflow data from 2012 to 2014 (as training dataset) were utilised to assess and validate the effectiveness of the imputation methods in addressing missing streamflow data. Following that, all three methods, coupled with multiple linear regression (MLR), were used to restore streamflow rates in Malaysia's Langat River Basin from 1978 to 2016. The effectiveness of the estimation techniques was evaluated using metrics including the Nash-Sutcliffe efficiency coefficient (CE), root-mean-square error (RMSE), and mean absolute percentage error (MAPE). The results confirmed that RRRI coupled with MLR (RRRI-MLR) had the lowest RMSE and MAPE values, outperforming all other techniques tested for filling missing data in daily streamflow datasets. This indicates that the RRRI-MLR is the best method for dealing with missing data in streamflow datasets. DOI: 10.28991/cej-2021-03091747
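Two of the metrics named in this abstract, the Nash-Sutcliffe efficiency coefficient (CE) and MAPE, can be sketched using their standard definitions. This is an illustrative sketch rather than the study's code; the function names and streamflow values are hypothetical, and MAPE as written assumes no observed value is zero.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus the ratio of squared model error
    to the squared deviation of observations from their mean."""
    mean_obs = sum(observed) / len(observed)
    err = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - err / spread

def mape(observed, simulated):
    """Mean absolute percentage error, in percent.
    Assumes every observed value is non-zero."""
    ratios = (abs(o - s) / abs(o) for o, s in zip(observed, simulated))
    return 100.0 * sum(ratios) / len(observed)

# Hypothetical daily streamflow values (m^3/s), for illustration only.
observed_flow = [10.0, 20.0, 30.0, 40.0]
restored_flow = [12.0, 18.0, 33.0, 39.0]

print(nash_sutcliffe(observed_flow, restored_flow))
print(mape(observed_flow, restored_flow))
```

A CE of 1 means a perfect match, while a CE of 0 means the estimates are no better than using the observed mean.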

Civil Engineering Journal (C.E.J)

Interpolation of Missing Precipitation Data Using Kernel Estimations for Hydrologic Modeling

Author: Hyojin Lee
Kwangmin Kang
Publication venue: 'Hindawi Limited'
Publication date: 01/01/2015

Crossref

Comparison of Imputation and Support Vector Regression Parameters for Weather Forecasting (PERBANDINGAN IMPUTASI DAN PARAMETER SUPPORT VECTOR REGRESSION UNTUK PERAMALAN CUACA)

Author: Cholidhazia Putri
Priyatno Arif Mudi
Syuhada Fahmi
Wiratmo Agung
Publication venue: 'Universitas Muria Kudus'
Publication date: 29/11/2019

Rainfall is important information in transportation, agriculture, industry, and other fields. Knowing rainfall information allows appropriate action to be taken in these fields, so that no losses arise from errors in rainfall information. This paper aims to find a suitable method for rainfall forecasting with respect to the imputation method used for data preprocessing and the parameter values of Support Vector Regression (SVR). The experimental results identify the best imputation preprocessing method for use with SVR based on Mean Squared Error (MSE) and Mean Absolute Error (MAE). Based on the MSE results, k-nearest neighbor is the best imputation method for data preprocessing. The preprocessed data produced this result in the experiment on polynomial SVR with parameter C 1000, tolerance 0.001, epsilon 0.01, and unlimited iterations. On the other hand, the MAE results show that the Artificial Neural Network (ANN) is the best method for imputation preprocessing. ANN with radial basis function kernel, gamma 0.001, C 1000, tolerance 0.001, and unlimited iterations. The ANN was tested on RBF SVR with gamma 0.001, C 1000, tolerance 0.001, and unlimited iterations.

E-Journal Universitas Muria Kudus