Contributions to Ensembles of Models for Predictive Toxicology Applications. On the Representation, Comparison and Combination of Models in Ensembles.
The increasing variety of data mining tools offers a large palette
of types and representation formats for predictive models. Managing
these models, reusing them, and keeping model and data repositories
consistent then becomes a major challenge.
Sustainable access to these models and assessment of their quality become
limited to researchers. The Data and Model Governance
(DMG) approach makes it easier to process and support complex solutions.
In this thesis, contributions are proposed towards ensembles
of models with a focus on model representation, comparison and
usage.
Predictive Toxicology was chosen as an application field to demonstrate
the proposed approach to representing predictive models linked
to data for DMG. Further analysis methods, such as predictive model
comparison and predictive model combination for reusing
models from a collection, were also studied. Thus, in this thesis,
an original structure for the pool of models, called Predictive Toxicology
Markup Language (PTML), was proposed to
represent predictive toxicology models. PTML offers a representation scheme for
predictive toxicology data and models generated by data mining tools.
In this research, the proposed representation makes it possible
to compare models and select the relevant ones based on different
performance measures using the proposed similarity measuring techniques.
The relevant models were selected using a proposed cost
function, which is a composite of performance measures such as
Accuracy (Acc), False Negative Rate (FNR) and False Positive Rate
(FPR). The cost function ensures that only quality models are
selected as the candidate models for an ensemble.
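The abstract does not give the form of the cost function, so the following is only a minimal sketch under an assumed form: a weighted sum of (1 - Acc), FNR and FPR, where lower is better. The weights, the threshold and the names `ConfusionCounts`, `cost` and `select_candidates` are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Binary confusion-matrix counts for one model (hypothetical container)."""
    tp: int
    fp: int
    tn: int
    fn: int


def metrics(c: ConfusionCounts) -> tuple[float, float, float]:
    """Return (Acc, FNR, FPR) computed from the confusion counts."""
    total = c.tp + c.fp + c.tn + c.fn
    acc = (c.tp + c.tn) / total
    # Guard against classes with no examples.
    fnr = c.fn / (c.fn + c.tp) if (c.fn + c.tp) else 0.0
    fpr = c.fp / (c.fp + c.tn) if (c.fp + c.tn) else 0.0
    return acc, fnr, fpr


def cost(c: ConfusionCounts, w_acc: float = 1.0,
         w_fnr: float = 1.0, w_fpr: float = 1.0) -> float:
    """Assumed composite cost: weighted sum of error rate, FNR and FPR."""
    acc, fnr, fpr = metrics(c)
    return w_acc * (1.0 - acc) + w_fnr * fnr + w_fpr * fpr


def select_candidates(models: dict[str, ConfusionCounts],
                      threshold: float) -> list[str]:
    """Keep only models whose assumed cost does not exceed the threshold."""
    return [name for name, c in models.items() if cost(c) <= threshold]
```

With equal weights, a model with 40 true positives, 10 false positives, 40 true negatives and 10 false negatives has Acc 0.8, FNR 0.2 and FPR 0.2, so its assumed cost is 0.6.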
The proposed algorithm for optimising and combining the Acc,
FNR and FPR of ensemble models, using the double fault measure as
the diversity measure, improves Acc by 0.01 to 0.30 on all toxicology
data sets compared to other ensemble methods such as Bagging,
Stacking, Bayes and Boosting. The highest improvements in
Acc were for the Bee (0.30), Oral Quail (0.13) and Daphnia
(0.10) data sets. A small improvement (of about 0.01) in Acc was achieved
for Dietary Quail and Trout. Combining all
three performance measures also reduced the
distance between FNR and FPR by about 0.17 to 0.28 for the Bee, Daphnia, Oral Quail and
Trout data sets. For the Dietary Quail data set
the improvement was only about 0.01, but this data set is well
known as a difficult learning exercise. For the five UCI data sets tested,
similar results were achieved, with an Acc improvement of 0.10 to
0.11 and a further narrowing of the gap between FNR and FPR.
In conclusion, the results show that by combining performance
measures (Acc, FNR and FPR), as proposed within this thesis, the
Acc increased and the distance between FNR and FPR decreased.
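For reference, the double fault measure mentioned above is commonly defined as the fraction of examples that two classifiers both misclassify; lower values indicate a more diverse pair. The sketch below shows that standard definition only. How the thesis uses it inside its optimisation algorithm is not described in the abstract, and the function names are hypothetical.

```python
from itertools import combinations


def double_fault(y_true, pred_a, pred_b):
    """Fraction of examples misclassified by both classifiers."""
    both_wrong = sum(
        1 for t, a, b in zip(y_true, pred_a, pred_b) if a != t and b != t
    )
    return both_wrong / len(y_true)


def mean_pairwise_double_fault(y_true, predictions):
    """Average double fault over all pairs of classifiers in a pool."""
    pairs = list(combinations(predictions, 2))
    return sum(double_fault(y_true, a, b) for a, b in pairs) / len(pairs)
```

Averaging over all pairs is one simple way to score a whole pool of models; other aggregation schemes are equally possible.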
Phosphatidylinositol (4,5)-bisphosphate turnover by INP51 regulates the cell wall integrity pathway in "Saccharomyces cerevisiae"
Signal transduction pathways are important for the cell to transduce external or internal stimuli, with second messengers playing an important role as mediators of the stimuli. One important group of second messengers is the phosphoinositide family, present in organisms ranging from yeast to mammals. The dephosphorylation and phosphorylation cycle of the phosphatidylinositol species is thought to be important in signaling for recruitment or activation of proteins involved in vesicular transport and/or to control the organization of the actin cytoskeleton. In mammals, phosphatidylinositol (4,5)-bisphosphate (PI(4,5)P2) signaling is essential and regulated by various kinases and phosphatases. In the model organism Saccharomyces cerevisiae, PI(4,5)P2 signaling is also essential, but its regulation remains unclear. My dissertation focuses on the regulation of PI(4,5)P2 signaling in Saccharomyces cerevisiae. The organization of the actin cytoskeleton in Saccharomyces cerevisiae is regulated by different proteins such as calmodulin, CMD1, and here I present data that CMD1 plays a role in the regulation of the only phosphatidylinositol 4-phosphate 5-kinase, MSS4, in Saccharomyces cerevisiae. CMD1 regulates MSS4 activity through an unknown mechanism and thereby controls the organization of the actin cytoskeleton. MSS4 and CMD1 do not physically interact, but MSS4 seems to be part of a large molecular weight complex, as shown by gel filtration chromatography. This complex could contain regulators of MSS4 activity. The complex is not caused by dimerization of MSS4, since MSS4 does not interact with itself. Two pathways, the cell wall integrity pathway and the TORC2 (target of rapamycin complex 2) signaling cascade, are important for the organization of the actin cytoskeleton. Loss of TOR2 function results in a growth defect that can be suppressed by MSS4 overexpression.
To further characterize the link between MSS4 and the TORC2 signaling pathway and the cell wall integrity pathway we looked for targets of PI(4,5)P2. The TORC2 pathway and the cell wall integrity pathway signal to the GEF ROM2, an activator of the small GTPase RHO1. In our study we identified ROM2 as a target of PI(4,5)P2 signaling. We observed that the ROM2 localization changes in an mss4 conditional mutant. This suggests that the proper localization needs PI(4,5)P2. This could be mediated by the putative PI(4,5)P2 binding pleckstrin homology (PH) domain of ROM2. To better understand the regulation of PI(4,5)P2 levels in Saccharomyces cerevisiae we
focused on one of the PI(4,5)P2 5-phosphatases, INP51. Here we present evidence that
INP51 is a new negative regulator of the cell wall integrity pathway as well as the TORC2
pathway. INP51 probably regulates these two pathways through the turnover of PI(4,5)P2,
thereby inactivating the effector(s). The deletion of INP51 does not result in any phenotype,
but when it is combined with mutations of the cell wall integrity pathway we observe a synthetic
interaction.
INP51 together with the GTPase activating protein (GAP) SAC7, responsible for the
negative regulation of RHO1, negatively regulates the cell wall integrity pathway during
vegetative growth. One of the targets of the cell wall integrity pathway, the cell wall
component chitin, which is normally deposited at the bud end and bud neck and forms bud
scars, is delocalized in the mother cell in the sac7 inp51 double deletion mutant. In
addition, another downstream component of the cell wall integrity pathway, the MAP
kinase MPK1, has increased phosphorylation and protein level in the sac7 inp51 double
deletion mutant. This suggests that INP51 is important for the negative regulation of the
cell wall integrity pathway.
Furthermore, we show evidence that INP51 forms a complex with TAX4 or IRS4, two
EH-domain-containing proteins that positively regulate the activity of INP51 and in this
manner negatively regulate the cell wall integrity pathway. The EH-domain is known to
bind the NPF-motif. This motif is present in INP51 and is important for INP51 interaction
with TAX4 or IRS4. The EH-NPF interaction is a conserved mechanism to build up
protein networks. The interaction between an EH-domain-containing protein and a
PI(4,5)P2 5-phosphatase is conserved. This is demonstrated by the interaction of the epidermal growth factor
substrate EPS15 (EH) with the PI(4,5)P2 5-phosphatase synaptojanin, the
mammalian orthologue of the Saccharomyces cerevisiae INP proteins.
In summary, INP51, together with TAX4 and IRS4, forms complexes important for
regulation of PI(4,5)P2 levels. The complexes are linked to the TORC2 signaling pathway
and the cell wall integrity pathway, specifically regulating MPK1 activation and chitin
biosynthesis. The work presented in this dissertation facilitates the development of a model
of the complex regulation of PI(4,5)P2 signaling in Saccharomyces cerevisiae.
Improving Accuracy and Performance of Customer Churn Prediction Using Feature Reduction Algorithms
Prediction of customer churn is one of the most essential activities in Customer Relationship Management (CRM). However, state-of-the-art customer churn prediction approaches focus only on classifier selection to improve the accuracy and performance of churn prediction, and rarely consider feature reduction algorithms. Furthermore, numerous attributes contribute to customer churn, and it is crucial to determine the most substantial features in order to achieve the highest prediction accuracy and to improve prediction performance. Feature reduction decreases the dimensionality of the data and may allow learning algorithms to run faster and more effectively and to produce predictive models with higher accuracy. In this research, we investigated two (2) feature reduction algorithms, Correlation based Feature Selection (CFS) and Information Gain (IG), and built classification models based on three (3) different classifiers, namely Bayes Net, Simple Logistic and Decision Table. Experimental results demonstrate that the performance of the classifiers improves when feature reduction is applied to the customer churn data set. The CFS feature reduction algorithm with the Decision Table classifier yields the highest accuracy, 92.08%, and the lowest RMSE, 0.2554. This study recommends the use of feature reduction algorithms in the context of CRM to improve the accuracy and performance of customer churn prediction.
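As an illustration of the Information Gain (IG) criterion named above, the sketch below scores a discrete feature by the reduction in class-label entropy it yields, then ranks features by that score. It is a textbook formulation, not the implementation used in the study; the function names and the example feature names in the usage note are hypothetical, and continuous attributes would need to be discretised first.

```python
import math
from collections import Counter, defaultdict


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus their entropy conditioned on the feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def rank_features(columns, labels):
    """Return feature names sorted from highest to lowest information gain."""
    scores = {name: information_gain(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)
```

For example, with churn labels `[1, 1, 0, 0]`, a feature `plan` with values `["a", "a", "b", "b"]` separates the classes perfectly (gain of 1 bit), while a feature `region` with values `["x", "y", "x", "y"]` carries no information (gain of 0), so `plan` ranks first.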
A review on missing tags detection approaches in RFID system
A Radio Frequency Identification (RFID) system can automatically detect a very large number of tagged objects within a short time. Because of this advantage, it is used in many areas, especially supply chain management and manufacturing. It can track an individual object all the way from the manufacturing factory until it reaches the retail store. However, because detection depends on radio signals, readings of tagged objects can be missed when the signal is lost. Signal loss can be caused by a weak signal, interference or an unknown source. Missing tag detection in RFID systems is a significant problem because inaccurate readings generate misleading information that makes system reporting useless. Missed detections can also trigger false theft alarms, or leave objects undetected and unattended for some period. This paper reviews this issue and compares some of the proposed approaches, including Window Sub-range Transition Detection (WSTD), the Efficient Missing-Tag Detection Protocol (EMD) and the Multi-hashing based Missing Tag Identification (MMTI) protocol. The review gives insight into the current challenges and opens the way to new solutions to the problem of missing tag detection.
Shape-Based Single Object Classification Using Ensemble Method Classifiers
Nowadays, more and more images are available. Annotation and retrieval of the images pose classification problems, where each class is defined as the group of database images labelled with a common semantic label. Various systems have been proposed for content-based retrieval, as well as for image classification and indexing. In this paper, a hierarchical classification framework is proposed for bridging the semantic gap effectively and achieving multi-category image classification. A well-known pre-processing and post-processing method was applied to three problems: image segmentation, object identification and image classification. The method was applied to classify single-object images from Amazon and Google datasets. The classification was tested with four different classifiers: BayesNetwork (BN), Random Forest (RF), Bagging and Vote. The estimated classification accuracies ranged from 20% to 99% (using 10-fold cross validation). The Bagging classifier gave the best performance, followed by the Random Forest classifier.
Chemical composition, pH value, and points of zero charge of high calcium and high iron electric arc furnace slag
Electric arc furnace (EAF) slag is now used extensively as a filter medium in wastewater treatment. Steel slag is produced as a byproduct of steelmaking processes. However, different batches of steel slag have different compositions. Thus, this study determined the chemical composition, pH value and points of zero charge (PZC) of two different samples of EAF slag: high iron EAF slag (Slag HFe) and high calcium EAF slag (Slag HCa). The steel slag was characterized using X-ray Fluorescence Spectroscopy (XRF) analysis for the chemical composition, extraction with boiling water for the pH value, and the salt addition method for the PZC. Slag HFe consisted mainly of 38.2% ferric oxide and 20.4% calcium oxide, with a pH value of 10.20 and a PZC of pH 10.55. Slag HCa was composed of 1.64% ferric oxide and 49.5% calcium oxide, with a pH value of 11.11 and a PZC of pH 11.75. Therefore, Slag HCa was considered more basic than Slag HFe.
A Survey of Machine Learning Techniques for Behavioral-Based Biometric User Authentication
Authentication is a way to enable an individual to be uniquely identified, usually based on passwords and personal identification numbers (PINs). The main problem with such authentication techniques is the unwillingness of users to remember long and challenging combinations of numbers, letters, and symbols that can be lost, forged, stolen, or forgotten. In this paper, we investigate the current advances in the use of behavioral-based biometrics for user authentication. A behavioral-based biometric authentication application contains three major modules, namely, data capture, feature extraction, and classifier. Such an application focuses on extracting the behavioral features related to the user and using these features for authentication. The objective is to determine the classifier techniques that are most commonly used for data analysis during the authentication process. From the comparison, we expect to identify gaps for improving the performance of behavioral-based biometric authentication. Additionally, we highlight the set of classifier techniques that perform best for behavioral-based biometric authentication.
The Application of Apriori Algorithm in Predicting Flood Areas
Changes in the physical characteristics of the hydrological system have caused many natural phenomena, including flooding, one of the major problems that cause economic damage and affect people’s lives. Therefore, a systematic and comprehensive approach to flood area prediction is needed. This research proposes a flood area prediction model that applies the Apriori algorithm to hydrological data sets. The Department of Irrigation and Drainage Malaysia supplied the data sets and flood reports from 2009 to 2015 (November until January), covering 7 districts. The research begins with data selection, pre-processing and transformation; the cleaned data are then tested with the Apriori algorithm. The rules are evaluated using support, confidence and lift values to rank them and identify the best rules. The results show that each district generates best and crucial rules that associate villages with water levels. The results can thus be used in flood management and can give an early warning to villagers in flood-risk areas.
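The three rule-ranking values named above have standard definitions, sketched below for a rule "antecedent implies consequent" over a list of transactions. This covers only the rule metrics, not Apriori's candidate itemset generation, and the item labels in the usage note are hypothetical rather than taken from the study's data.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in the itemset."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= set(t)) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of the whole rule divided by support of the antecedent."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


def lift(transactions, antecedent, consequent):
    """Confidence divided by consequent support; above 1 means positive association."""
    return confidence(transactions, antecedent, consequent) / support(
        transactions, consequent
    )
```

For instance, given four hypothetical records `{"level=high", "flood"}`, `{"level=high", "flood"}`, `{"level=high"}` and `{"level=low"}`, the rule "level=high implies flood" has support 0.5, confidence 2/3 and lift 4/3.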
Heuristic Evaluation Of i-Dyslex Tool for Dyslexia Screening
Early detection of dyslexia is crucial for children to receive early and proper treatment. Various studies have focused on early detection of dyslexia; however, the results remain limited. Therefore, an easy and user-friendly dyslexia screening tool called i-Dyslex was developed. To make sure the tool is free from design and interface problems, a heuristic evaluation was carried out. This paper discusses the heuristic evaluation of the i-Dyslex tool for dyslexia screening by expert evaluators. The study adopted ten Usability Heuristics in the questionnaire. The overall result derived from the evaluation is an above-average mean score, with a neutral score (3.00) in one domain. The experts also provided several comments and feedback. Both the experts’ evaluation and the feedback were essential for further improvement of the i-Dyslex tool to ensure it meets user requirements and expectations.