2 research outputs found

    Random forest age estimation model based on length of left hand bone for Asian population

    In forensic anthropology, age estimation is used to help determine the age of a living person or of the body of a deceased person. However, existing estimation models are suitable only for a specific population. The models are also commonly subject to inter- and intra-observer variability because they rely on qualitative data, which makes age estimation dependent on forensic experts. This study proposes an age estimation model that uses bone length in the left hand of Asian subjects ranging from newborn to 18 years old. One soft computing model, Random Forest (RF), is used to develop the estimation model, and the results are compared with those of the Artificial Neural Network (ANN) and Support Vector Machine (SVM) models developed in previous case studies. The performance measures used in this study and the previous case study are R-square and Mean Square Error (MSE). The RF model shows results comparable to those of the ANN and SVM models. For male subjects, the RF model performs better than the ANN model but worse than the SVM model. For female subjects, the RF model outperforms both the ANN and SVM models. Overall, the RF model is the most suitable of the three for estimating the age of female subjects and the second best for male subjects. However, the application of this model is restricted to experimental purposes or forensic practice.
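The two performance measures named in the abstract, R-square and MSE, can be illustrated with a minimal Python sketch. The function names and the sample ages are assumptions made for illustration only; they are not taken from the study, and the abstract does not report the actual values.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between true and predicted values."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R-square is undefined when all true values are equal")
    return 1 - ss_res / ss_tot


# Hypothetical ages in years (illustrative only, not data from the study).
ages_true = [2.0, 6.0, 10.0, 14.0]
ages_pred = [3.0, 5.0, 11.0, 13.0]

print(mean_squared_error(ages_true, ages_pred))
print(r_squared(ages_true, ages_pred))
```

A lower MSE and an R-square closer to 1 both indicate a closer fit, which is how models such as RF, ANN, and SVM can be ranked on the same data.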

    Identification of pathway and gene markers using enhanced directed random walk for multiclass cancer expression data

    Cancer markers play a significant role in the diagnosis of the origin of cancers and in the detection of cancers from initial treatments. Identifying these markers is a challenging task owing to the heterogeneous nature of cancers, but it could help improve the survival rate of cancer patients, because dedicated treatment, or even prevention, can be provided according to the diagnosis. Previous investigations show that pathway topology information can help in detecting cancer markers from gene expression data. Such analysis reduces the complexity from thousands of genes to a few hundred pathways. However, most existing methods group different cancer subtypes together as disease samples and assume that all pathways contribute equally to the analysis. Several other methods ignore the interaction between multiple genes and the genes with missing edges, which could lead to poor performance in identifying cancer markers from gene expression data. This research therefore proposes an enhanced directed random walk to identify pathway and gene markers for multiclass cancer gene expression data. First, an improved pathway selection using analysis of variance (ANOVA) is performed, which allows multiple cancer subtypes to be considered. Subsequently, k-means clustering and the average silhouette method are integrated into the directed random walk so that the interaction of multiple genes is considered. The proposed methods are tested on benchmark gene expression datasets (breast, lung, and skin cancers) and biological pathways. Their performance is measured and compared in terms of classification accuracy and area under the receiver operating characteristic curve (AUC). The results indicate that the proposed methods are able to identify a list of pathway and gene markers from the datasets with better classification accuracy and AUC. The proposed methods improve classification performance by between 1% and 35% compared with existing methods. The cell cycle and p53 signaling pathways were found to be significantly associated with breast, lung, and skin cancers, while the cell cycle was highly enriched in squamous cell carcinoma and adenocarcinoma.
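The abstract does not say how ANOVA is applied during pathway selection. As a minimal sketch of the underlying statistic only, the following Python code computes a one-way ANOVA F value for one pathway across several cancer subtypes. The function name, the notion of a per-sample "pathway activity score", and the sample values are all assumptions made for illustration.

```python
def anova_f_statistic(groups):
    """One-way ANOVA F statistic for a list of groups of numeric values.

    F = (between-group sum of squares / (k - 1))
        / (within-group sum of squares / (n - k))
    where k is the number of groups and n the total number of samples.
    """
    k = len(groups)
    if k < 2 or any(len(g) == 0 for g in groups):
        raise ValueError("need at least two non-empty groups")
    n = sum(len(g) for g in groups)
    if n <= k:
        raise ValueError("need more samples than groups")

    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    if ss_within == 0:
        raise ValueError("F is undefined when within-group variance is zero")

    return (ss_between / (k - 1)) / (ss_within / (n - k))


# Hypothetical activity scores of one pathway in three cancer subtypes
# (illustrative values only, not data from the study).
subtype_scores = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0],
]
print(anova_f_statistic(subtype_scores))
```

Unlike a two-group test, this statistic compares all subtypes at once, which matches the abstract's stated aim of not collapsing the subtypes into a single disease class. A larger F suggests that the pathway separates the subtypes more strongly.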