26 research outputs found
An improved directed random walk framework for cancer classification using gene expression data
Early diagnosis is a major challenge in cancer research because it requires the
involvement of several fields. Deoxyribonucleic acid (DNA) microarray analysis is one of
the modern cancer diagnosis techniques that scientists use to measure changes in gene
expression levels. From a computing perspective, algorithms have been developed to ease
the diagnosis process, but their reliability remains limited. Numerous cancer studies
have combined different machine learning techniques to improve the accuracy of cancer
classification.
This study aims to improve the accuracy of cancer classification by introducing
an improved directed random walk (DRW) framework. The improved DRW
framework is proposed to identify risk pathways while correctly predicting the significant
genes. It is named significant directed walk (SDW) because of its ability to identify
significant genes for cancer. Six gene expression datasets are used to
study the effectiveness of the sub-algorithms, directed graph and classifier in SDW in
terms of cancer prediction and cancer classification. The sub-algorithms of SDW are
divided into a data pre-processing phase, specific tuning parameter selection,
weight as an additional variable, and exclusion of the unwanted adjacency matrix. In
addition, SDW incorporates four directed graphs to study the usability of the directed
graph. The best of the four is chosen as part of the structure of
SDW. This directed graph combines the KEGG pathway and the protein-protein interaction (PPI)
network and is named the walker network. The experimental results showed that the
combination of SDW with the walker network and linear regression performed best among
all configurations. SDW achieves an average accuracy of 95.03%, which is 8.97% higher
than conventional DRW across all cancer datasets. This study provides a foundation for
further research on early diagnosis of cancer with machine learning
techniques. These findings are expected to improve early diagnosis methods
for cancer classification.
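As an illustration of the kind of walk that DRW-style methods build on, the following is a minimal sketch of a random walk with restart on a directed graph. It is not the SDW implementation described in the abstract: the update rule w_next = (1 - r) * M^T w + r * w0 is the commonly used restart form, and the function name, the restart value, and the toy three-gene graph are all assumptions made for illustration.

```python
def directed_random_walk(adjacency, initial_weights, restart=0.5,
                         tol=1e-10, max_iter=1000):
    """Random walk with restart on a directed graph (illustrative sketch).

    adjacency[i][j] > 0 means there is an edge from node i to node j.
    initial_weights holds one non-negative score per node (hypothetical
    per-gene scores); restart is the probability of jumping back to the
    initial distribution at each step.
    """
    n = len(adjacency)

    # Row-normalise the adjacency matrix so each node's outgoing edges
    # sum to 1; nodes without outgoing edges keep an all-zero row.
    transition = []
    for row in adjacency:
        total = sum(row)
        transition.append([v / total if total else 0.0 for v in row])

    # Normalise the initial weights into a probability vector.
    weight_sum = sum(initial_weights)
    w0 = [v / weight_sum for v in initial_weights]

    w = w0[:]
    for _ in range(max_iter):
        # w_next[j] = (1 - r) * sum_i w[i] * M[i][j] + r * w0[j]
        w_next = [
            (1 - restart) * sum(w[i] * transition[i][j] for i in range(n))
            + restart * w0[j]
            for j in range(n)
        ]
        change = sum(abs(a - b) for a, b in zip(w_next, w))
        w = w_next
        if change < tol:
            break
    return w


# Toy chain of three genes: 0 -> 1 -> 2, with equal initial scores.
toy_graph = [
    [0, 1, 0],
    [0, 0, 1],
    [0, 0, 0],
]
scores = directed_random_walk(toy_graph, [1.0, 1.0, 1.0], restart=0.5)
print(scores)
```

In this toy chain the downstream node ends up with the largest score, because it receives weight both from the restart term and from its upstream neighbours.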
Specific Tuning Parameter for Directed Random Walk Algorithm Cancer Classification
Accurate classification of cancerous genes is a central challenge in clinical cancer research. Microarray-based gene biomarkers have demonstrated better performance than traditional clinical parameters. However, individual gene biomarkers are less robust because of their low reproducibility between different cohorts of patients. Several methods that incorporate pathway information, such as directed random walk, have been proposed to infer pathway activity. This paper discusses the implementation of a group-specific tuning parameter in the directed random walk algorithm. In this experiment, gene expression data and pathway data are used as input. More significant pathway activities can be identified in this way, which increases the accuracy of cancer classification. Lung cancer gene data are used as the experimental dataset, on which sDRW is used to determine significant pathways. More risk-active pathways are identified in this experiment.
Analysis of Attribute Selection and Classification Algorithm Applied to Hepatitis Patients
Data mining techniques are widely used for classification, attribute selection and prediction in bioinformatics because they help to discover meaningful new correlations, patterns and trends by sifting through large volumes of data using pattern recognition technologies as well as statistical and mathematical techniques. Hepatitis is one of the most important health problems in the world. Many studies have addressed the diagnosis of hepatitis, but medical diagnosis is a difficult and visual task that is mostly done by doctors. Therefore, this research analyses attribute selection and classification algorithms applied to hepatitis patients. To achieve this goal, the WEKA tool is used to conduct experiments with different attribute selectors and classification algorithms. The hepatitis dataset is taken from the UC Irvine repository. This research considers the attribute selectors CfsSubsetEval, WrapperSubsetEval, GainRatioSubsetEval and CorrelationAttributeEval. The classification algorithms used are NaiveBayesUpdatable, SMO, KStar, RandomTree and SimpleLogistic. The classification models are evaluated by time and accuracy. The study concludes that the best attribute selector is CfsSubsetEval and the best classifier is SMO, because SMO performs better than the other classification techniques for hepatitis patients.
Cost implication analysis of concrete and Masonry waste in construction project
Concrete and masonry waste are the main types of waste typically generated at a construction project. There is a lack of studies in the country regarding the cost implication of managing these types of construction waste. To address this need in Malaysia, the study measures the disposal cost of concrete and masonry waste. The study was carried out through site visits, using an indirect measurement approach to quantify the waste generated at the project. Based on the recorded number of trips for waste collection, the total expenditure to dispose of the waste was derived for three construction stages. Data were collected four times a week from July 2014 to July 2015. The total waste generated at the study site was 762.51 m³, and the cost incurred for the 187 truck trips required to move the waste from the project site to the nearby landfill was RM22,440.00. The findings will be useful to both researchers and policy makers concerned with construction waste.
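The reported totals allow a few unit figures to be derived. The short sketch below does that arithmetic; the variable names are illustrative, and the per-trip and per-cubic-metre values are averages computed here from the abstract's totals, not figures reported by the study.

```python
# Totals as reported in the abstract.
total_volume_m3 = 762.51   # concrete and masonry waste generated
truck_trips = 187          # trips from the site to the landfill
total_cost_rm = 22440.00   # total disposal cost in Malaysian ringgit

# Derived averages (assumes cost and load are spread evenly over trips).
cost_per_trip = total_cost_rm / truck_trips
volume_per_trip = total_volume_m3 / truck_trips
cost_per_m3 = total_cost_rm / total_volume_m3

print(f"Average cost per trip:   RM{cost_per_trip:.2f}")
print(f"Average load per trip:   {volume_per_trip:.2f} m3")
print(f"Average cost per m3:     RM{cost_per_m3:.2f}")
```

The average cost works out to RM120.00 per trip, which is consistent with a flat per-trip charge, although the abstract does not state how the cost was charged.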
A direct proof of significant directed random walk
This paper presents the relationship between the weight and the connectivity of nodes. An equation is formed to enhance the connectivity of nodes in a directed graph via weight. With the use of reference data, the adjacency matrix is further enhanced to increase the accessibility of nodes via a vector. The evolution of random walk is also described. Significant directed random walk is used to prove the importance of weight.
Performances analysis of heart disease dataset using different data mining classifications
Nowadays, heart disease is one of the major diseases that cause death. It is a matter of concern given today’s highly chaotic lifestyle, which leads to various diseases. Early prediction and identification of heart-related diseases have been investigated by many researchers. The death rate can be brought down further if heart disease can be predicted or identified earlier. Many studies explore different classification algorithms for the classification and prediction of heart disease. This research studied the prediction of heart disease using five different techniques in the WEKA tool, with the attributes of the dataset as input. It used 13 attributes, such as sex, blood pressure, cholesterol and other medical measurements, to detect the likelihood of a patient getting heart disease. The classification techniques J48, Decision Stump, Random Forest, Sequential Minimal Optimization (SMO), and Multilayer Perceptron are used to analyze the heart disease dataset. The performance measures for this study are classification accuracy, mean absolute error and the kappa statistic of the classifier. The results show that the Multilayer Perceptron neural network is the most suitable for early prediction of heart disease.
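Two of the performance measures named above, accuracy and the kappa statistic, can both be computed from a confusion matrix. The sketch below uses the standard definition of Cohen's kappa; the function name is illustrative and the confusion matrix is made-up example data, not a result from the study.

```python
def accuracy_and_kappa(confusion):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of instances of true class i that the
    classifier assigned to class j.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Observed agreement: fraction of instances on the diagonal.
    observed = sum(confusion[i][i] for i in range(k)) / total

    # Agreement expected by chance, from the row and column totals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(k)
    ) / total ** 2

    if expected == 1.0:
        # Degenerate case: every instance falls in a single class.
        return observed, 1.0
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Made-up two-class example (rows: true class, columns: predicted class).
example_confusion = [
    [40, 10],
    [5, 45],
]
accuracy, kappa = accuracy_and_kappa(example_confusion)
print(f"accuracy = {accuracy:.2f}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement that would be expected by chance, so it is lower than raw accuracy whenever chance agreement is above zero.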
Batu Pahat car workshops finder
Batu Pahat Car Workshops Finder is an application that lets drivers find the nearest car workshops from their current location and get GPS navigation to the selected workshop. The application is proposed because car breakdowns are common, and the problem becomes worse when the driver is in an unfamiliar place. Batu Pahat Car Workshops Finder is intended to solve this problem. It helps users get their car repaired when it suddenly breaks down while travelling and they cannot find any car service or repair station nearby. The application shows the available car workshops near the user’s current location along with their contact numbers.
WEB BASED MANAGEMENT SYSTEM FOR ENACTUS MALAYSIA NATIONAL CUP (E-EMNC)
Web Based Management System of Enactus Malaysia National Cup (E-EMNC) is an event management system
specially designed for the Enactus Malaysia National Cup. Without a proper event management system,
the Enactus Malaysia National Cup suffers from several issues such as missed payment updates,
long queues at the registration counter, misunderstood information and others. Thus, this event
management system was developed to overcome most of the problems that occur before, during and after the
event. Related objectives are stated to solve these problems. The System Development Life Cycle Waterfall model is used
as the methodology to develop this system. The system is developed using PHP for server-side scripting and
Bootstrap as the front-end framework. The prototype of the system was considered a success because it could guide
student leaders through submitting the report by showing the flow of the system. This system is expected to lighten the
burden of event committees and educate team leaders on filing documents through the internet.
A REVIEW ON FEATURE BASED APPROACH IN SEMANTIC SIMILARITY FOR MULTIPLE ONTOLOGY
Measuring semantic similarity between terms is an important step in information retrieval and information
integration, which require semantic content matching. Semantic similarity has long attracted great interest
in artificial intelligence, psychology and cognitive science. This paper reviews state-of-the-art
approaches, including the structure-based, information-content-based, feature-based
and hybrid approaches. We also discuss these approaches according to their advantages, disadvantages and issues related
to multiple ontologies, especially for methods in the feature-based approach.
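As a concrete example of a feature-based measure, the sketch below implements Tversky's ratio model, a well-known measure of this kind that compares the features two terms share with the features unique to each. It is offered only as an illustration and is not necessarily one of the methods covered by the review; the function name, the default alpha and beta values, and the feature sets are assumptions.

```python
def tversky_similarity(features_a, features_b, alpha=0.5, beta=0.5):
    """Tversky ratio model: |A & B| / (|A & B| + alpha*|A - B| + beta*|B - A|).

    features_a and features_b are collections of feature labels for two
    terms; alpha and beta weight the features unique to each term.
    """
    a, b = set(features_a), set(features_b)
    common = len(a & b)
    denominator = common + alpha * len(a - b) + beta * len(b - a)
    # Two empty feature sets are treated as having similarity 0 here.
    return common / denominator if denominator else 0.0


# Hypothetical feature sets for two terms from different ontologies.
car = {"wheels", "engine", "doors", "seats"}
bicycle = {"wheels", "pedals", "seats"}

similarity = tversky_similarity(car, bicycle)
print(f"similarity(car, bicycle) = {similarity:.3f}")
```

With alpha = beta = 0.5 the measure is symmetric and equals the Dice coefficient; unequal values make it asymmetric, which matters when one term is more specific than the other.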