37 research outputs found

    Spatial and temporal torrential rainfall guided cluster pattern based on dimension reduction methods

    This thesis identifies the spatial and temporal cluster patterns for torrential rainfall data in Peninsular Malaysia. Two dimension reduction methods are used to improve the cluster patterns of the torrential rainfall data. Firstly, a robust dimension reduction method in Principal Component Analysis (PCA) is used to rectify the issue of unbalanced clusters in rainfall patterns due to the skewed nature of rainfall data. A robust measure in PCA using Tukey’s biweight correlation to downweight observations is introduced, and the optimum breakdown point for extracting the number of components in PCA using this approach is proposed. The simulated data indicate an optimum breakdown point at 70% cumulative percentage of variance, which gives a good balance in extracting the number of components and avoids variations of low frequency or insignificant spatial scale in the clusters. The results show a significant improvement with the robust PCA over the PCA based on Pearson correlation in terms of the average number of clusters obtained and the cluster quality. Secondly, based on the decomposing properties of Singular Spectrum Analysis (SSA), a two-way approach is introduced to identify the range of the local time scale for a cluster of torrential rainfall patterns by discriminating the noise in a time series trend. First, an appropriate window length for the trajectory matrix and adjustments to the coinciding singular values obtained from the decomposed time series matrix, based on a restricted singular value decomposition (SVD) using iterative oblique SSA (Iterative O-SSA), are proposed. In addition, a guided clustering method called Robust Sparse k-means (RSk-means) is suggested to discriminate the eigenvectors from this iterative procedure, so that the trend and noise components are identified more objectively. On simulated skewed and short rainfall time series data, the modified SSA shows the strongest separability between the reconstructed components and effectively identifies the local time scale.

    A comparison on classical-hybrid conjugate gradient method under exact line search

    One of the popular approaches to modifying the Conjugate Gradient (CG) method is hybridization. In this paper, a new hybrid CG method is introduced and its performance is compared with two classical CG methods, namely the Rivaie-Mustafa-Ismail-Leong (RMIL) and Syarafina-Mustafa-Rivaie (SMR) methods. The proposed hybrid CG is formed as a convex combination of the RMIL and SMR methods. Their performance is analyzed under the exact line search. The comparison showed that the hybrid CG is promising and outperformed the classical RMIL and SMR methods in terms of the number of iterations and central processing unit (CPU) time.
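The idea of a convex-combination hybrid under exact line search can be illustrated with a minimal sketch. The abstract does not give the RMIL or SMR formulas, so the code below substitutes the well-known Fletcher-Reeves and Polak-Ribiere coefficients as stand-ins; the fixed weight `theta`, the function name `hybrid_cg_quadratic`, and the quadratic test problem are all assumptions made for illustration, not the authors' method.

```python
import numpy as np

def hybrid_cg_quadratic(A, b, x0, theta=0.5, tol=1e-10, max_iter=100):
    """Minimise 0.5*x^T A x - b^T x for symmetric positive definite A.

    The CG coefficient is a convex combination of two classical choices.
    Fletcher-Reeves and Polak-Ribiere are stand-ins here (assumption);
    the source uses RMIL and SMR, whose formulas it does not state.
    """
    x = np.asarray(x0, dtype=float).copy()
    g = A @ x - b          # gradient of the quadratic at x
    d = -g                 # first direction: steepest descent
    for k in range(max_iter):
        if np.linalg.norm(g) < tol:
            return x, k
        # Exact line search: closed form for a quadratic objective.
        alpha = -(g @ d) / (d @ A @ d)
        x = x + alpha * d
        g_new = A @ x - b
        beta_a = (g_new @ g_new) / (g @ g)          # Fletcher-Reeves
        beta_b = (g_new @ (g_new - g)) / (g @ g)    # Polak-Ribiere
        # Convex combination with weight theta in [0, 1] (assumed fixed).
        beta = (1.0 - theta) * beta_a + theta * beta_b
        d = -g_new + beta * d
        g = g_new
    return x, max_iter

# Hypothetical 2x2 test problem.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
x_star, iterations = hybrid_cg_quadratic(A, b, x0=np.zeros(2))
```

On a quadratic with exact line search, successive gradients are orthogonal, so both stand-in coefficients coincide and any `theta` gives the same iterates; differences between hybrids only appear on non-quadratic problems.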

    Identification of rainfall patterns on hydrological simulation using robust principal component analysis

    A robust dimension reduction method in Principal Component Analysis (PCA) was used to rectify the issue of unbalanced clusters in rainfall patterns due to the skewed nature of rainfall data. A robust measure in PCA using Tukey’s biweight correlation to downweight observations was introduced, and the optimum breakdown point for extracting the number of components in PCA using this approach was proposed. A simulated data matrix that mimicked the real data set was used to determine an appropriate breakdown point for robust PCA and to compare the performance of both approaches. The simulated data indicated that a breakdown point of 70% cumulative percentage of variance gave a good balance in extracting the number of components. The results showed a more significant and substantial improvement with the robust PCA than with the PCA based on Pearson correlation in terms of the average number of clusters obtained and the cluster quality.
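The 70% cumulative-variance rule for choosing the number of components can be sketched as follows. This is a minimal illustration that uses the ordinary Pearson correlation matrix, not the Tukey's biweight correlation described in the abstract; the function names and the simulated gamma-distributed data are assumptions.

```python
import numpy as np

def n_components_for_variance(eigenvalues, threshold=0.70):
    """Smallest number of components whose cumulative share of
    variance reaches `threshold` (assumed to be below 1.0)."""
    eigenvalues = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    # argmax returns the first index where the condition is True.
    return int(np.argmax(cumulative >= threshold)) + 1

def pearson_pca_eigenvalues(X):
    """Eigenvalues of the Pearson correlation matrix, largest first.
    A robust variant would swap in a robust correlation matrix here."""
    corr = np.corrcoef(X, rowvar=False)
    return np.linalg.eigvalsh(corr)[::-1]

# Hypothetical skewed data standing in for rainfall: 200 days, 6 stations.
rng = np.random.default_rng(0)
X = rng.gamma(shape=2.0, scale=5.0, size=(200, 6))
k = n_components_for_variance(pearson_pca_eigenvalues(X), threshold=0.70)
```

For example, eigenvalues of 5, 3, 1 and 1 give cumulative shares of 50%, 80%, 90% and 100%, so two components are retained at the 70% threshold.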

    Performance comparison of group chain sampling plan and modified group chain sampling plan based on mean product lifetime for Rayleigh distribution

    The performance of a sampling plan from the group sampling family is measured by its minimum number of groups and probability of lot acceptance. Once the minimum number of groups is determined, the corresponding probability of acceptance can be obtained for various sets of design parameters. This article compares the performance of two acceptance sampling plans, namely the group chain sampling plan (GChSP) and the modified group chain sampling plan (MGChSP-1), based on the mean product lifetime for the Rayleigh distribution. GChSP and MGChSP were developed based on the operating procedures of both the chain sampling plan (1955) and the group sampling plan (2009). The findings showed that the MGChSP performed better than the GChSP.

    Development and validation of early childhood care and education pre-service lecturer instrument

    This paper presents the development and validation of the Early Childhood Care and Education (ECCE) Pre-Service Lecturer Instrument, constructed to determine lecturers' level of competencies toward the quality of early childhood carers-educators’ professionalism in Malaysia. Components which affect early childhood quality were characterized through inclusive literature reviews alongside interviews conducted with experts and experienced lecturers. In this study, two experts were selected to review the instrument so as to enhance its validity, while 70 more lecturers in Malaysia were involved. Principal component analysis yielded four scales pertaining to the quality of early childhood professionalism, namely: (1) disposition, (2) knowledge, (3) skills, and (4) practices. The component loadings of the respective instrument items ranged between 0.56 and 0.79, while the alpha reliability coefficients of the respective scales ranged between 0.90 and 0.94. In short, the findings from this study corroborated the weight and consistency of the ECCE Pre-Service Lecturer Instrument.

    Internet of things (IoT); security requirements, attacks and counter measures

    The Internet of Things (IoT) is a network of connected and communicating nodes. Recent developments in IoT have led to advancements such as the smart home, industrial IoT and smart healthcare. This smart life has brought security challenges along with numerous benefits. Monitoring and control in IoT are easily done using smartphones and web browsers. Different attacks are launched on IoT layers on a daily basis, and to ensure system security there are seven basic security requirements which must be met. Here we have used these requirements for classification and subdivided them on the basis of attacks, followed by the degree of their severity, the affected system components and the respective countermeasures. This work will not only give guidelines regarding the detection and removal of attacks but will also highlight the impact of these attacks on the system, which will be a decision point for safeguarding the system from high-impact attacks on a priority basis.

    A Systematic Review of Anomaly Detection within High Dimensional and Multivariate Data

    In data analysis, recognizing unusual patterns (outlier analysis or anomaly detection) plays a crucial role in identifying critical events. Because of its widespread use in many applications, it remains an important and extensive research branch in data mining. As a result, numerous techniques for finding anomalies have been developed, and more are still being worked on. Researchers can gain vital knowledge by identifying anomalies, which helps them make more meaningful data analyses. However, anomaly detection is even more challenging when the datasets are high-dimensional and multivariate. In the literature, anomaly detection in general has received much attention, but anomaly detection specifically in high-dimensional and multivariate conditions has received less. This paper systematically reviews the existing related techniques and presents extensive coverage of the challenges and perspectives of anomaly detection within high-dimensional and multivariate data. At the same time, it provides a clear insight into the techniques developed for anomaly detection problems. This paper aims to help select the best technique for a given purpose. It has been found that PCA, DOBIN, the Stray algorithm, and DAE-KNN have a high learning rate compared to Random projection, ROBEM, and OCP methods. Overall, most methods have shown an excellent ability to tackle the curse of dimensionality and multivariate features in performing anomaly detection. Moreover, a comparison of each algorithm for anomaly detection is also provided to produce a better algorithm. Finally, a line of future study would be to extend this work by comparing the methods on other domain-specific datasets and offering a comprehensive anomaly interpretation in describing the truth of anomalies.

    A comparative study of different imputation methods for daily rainfall data in east-coast Peninsular Malaysia

    Rainfall data are the most significant values in hydrology and climatology modelling. However, the datasets are prone to missing values due to various issues. This study aims to impute the missing rainfall values using various imputation methods such as Replace by Mean, Nearest Neighbor, Random Forest (RF), Non-linear Iterative Partial Least Squares (NIPALS) and Markov Chain Monte Carlo (MCMC). Daily rainfall datasets from 48 rainfall stations across east-coast Peninsular Malaysia were used in this study. The datasets were then fed into a Multiple Linear Regression (MLR) model. The performance of the abovementioned methods was evaluated using Root Mean Square Error (RMSE), Mean Absolute Error (MAE) and Nash-Sutcliffe Efficiency Coefficient (CE). The experimental results showed that the RF coupled with MLR (RF-MLR) approach was the most suitable for imputing the missing data in east-coast Peninsular Malaysia.
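The three evaluation measures named above have standard definitions, which can be sketched as follows. The function names and the toy values are assumptions for illustration; the abstract does not report the data or the exact evaluation protocol.

```python
import numpy as np

def rmse(observed, estimated):
    """Root Mean Square Error between observed and estimated values."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return float(np.sqrt(np.mean((observed - estimated) ** 2)))

def mae(observed, estimated):
    """Mean Absolute Error between observed and estimated values."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    return float(np.mean(np.abs(observed - estimated)))

def nash_sutcliffe(observed, estimated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 means the
    estimates are no better than the mean of the observations."""
    observed = np.asarray(observed, dtype=float)
    estimated = np.asarray(estimated, dtype=float)
    residual = np.sum((observed - estimated) ** 2)
    spread = np.sum((observed - observed.mean()) ** 2)
    return float(1.0 - residual / spread)

# Hypothetical daily rainfall values (mm) and their imputed estimates.
observed = [1.0, 2.0, 3.0, 4.0]
imputed = [2.0, 3.0, 4.0, 5.0]
scores = (rmse(observed, imputed), mae(observed, imputed),
          nash_sutcliffe(observed, imputed))
```

Lower RMSE and MAE and a Nash-Sutcliffe coefficient closer to 1 indicate a better imputation, which is how competing methods would be ranked under these measures.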

    Development of trace metals concentration model for river: application of principal component analysis and artificial neural network

    Rapid development along the Kuantan River has long been observed, as the river serves many communities as a drinking water source and for domestic, fisheries, recreational, and agricultural purposes. Due to rapid changes in technology and an upsurge in chemical usage, pollutant alterations have become more drastic with respect to space and time. Research on trace metals in river water is quite limited in Malaysia, probably due to their ppb-level existence and the need for special handling techniques. Hence, the aim of this study is to forecast heavy metal concentrations in Kuantan River waters using a 10-year (2007 – 2016) dataset of heavy metals provided by the Department of Environment, Malaysia. Principal Component Analysis (PCA) was used to analyze the data, which showed that As, Cr, Fe, Zn and Cd explain 67.3% of the total variance through three principal components. For the ANN computation, the significant metals extracted from the rotated PCA were selected and used in the ANN model. The developed approaches were trained and tested using 80% and 20% of the data, respectively. Then, the coefficient of determination (R²) was used to evaluate the model performance. Out of the five metals, only As showed an acceptable R² for the ANN models, with 0.8690 and 0.8088 for training and testing, respectively, probably due to the model’s limitation. Generally, this study illustrates the usefulness of PCA and ANN for the analysis and interpretation of complex data sets and for understanding the temporal and spatial variations in the Kuantan River for effective river water management.