18 research outputs found

    Learning Using Privileged Information: SVM+ and Weighted SVM

    Prior knowledge can be used to improve the predictive performance of learning algorithms or to reduce the amount of data required for training. The same goal is pursued within the learning using privileged information paradigm, which was recently introduced by Vapnik et al. and aims to utilize additional information available only at training time, a framework implemented by SVM+. We relate privileged information to importance weighting and show that the prior knowledge expressible with privileged features can also be encoded by weights associated with every training example. We show that a weighted SVM can always replicate an SVM+ solution, while the converse is not true, and we construct a counterexample highlighting the limitations of SVM+. Finally, we touch on the problem of choosing weights for weighted SVMs when privileged features are not available. Comment: 18 pages, 8 figures; integrated reviewer comments, improved typesetting
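
To make the idea of "weights associated with every training example" concrete, here is a minimal sketch of the standard primal objective of a linear weighted SVM, in which each example's hinge loss is scaled by its own cost. The function name and argument layout are illustrative assumptions and are not taken from the paper; the sketch only evaluates the objective and does not train a model or reproduce SVM+.

```python
def weighted_svm_objective(w, b, X, y, weights):
    """Evaluate 0.5 * ||w||^2 + sum_i c_i * max(0, 1 - y_i * (w . x_i + b)).

    w       : list of floats, weight vector of the linear classifier
    b       : float, bias term
    X       : list of feature vectors (lists of floats)
    y       : list of labels, each +1 or -1
    weights : list of non-negative per-example costs c_i (hypothetical name)
    """
    # Regularization term: 0.5 * ||w||^2
    reg = 0.5 * sum(wj * wj for wj in w)

    # Weighted hinge loss: each example contributes its own cost c_i
    loss = 0.0
    for xi, yi, ci in zip(X, y, weights):
        margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
        loss += ci * max(0.0, 1.0 - margin)
    return reg + loss


# Two positive examples; the second violates the margin and carries weight 2.
value = weighted_svm_objective(
    w=[1.0, 0.0], b=0.0,
    X=[[2.0, 0.0], [0.5, 0.0]],
    y=[1, 1],
    weights=[1.0, 2.0],
)
print(value)  # 1.5
```

Setting all weights to the same constant C recovers the usual soft-margin SVM objective, which is why per-example weights can be read as a way to encode prior knowledge about individual training points.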

    Training host-pathogen protein–protein interaction predictors

    Detection of protein–protein interactions (PPIs) plays a vital role in molecular biology. In particular, pathogenic infections are caused by interactions between host and pathogen proteins, so identifying host–pathogen interactions (HPIs) is important for discovering new drugs to counter infectious diseases. Conventional wet lab PPI detection techniques are limited in terms of cost and large-scale application. Hence, computational approaches have been developed to predict PPIs. This study aims to develop machine learning models to predict inter-species PPIs, with a special interest in HPIs. Specifically, we seek answers to three questions that arise while developing an HPI predictor: (1) How should negative training examples be selected? (2) Does assigning sample weights to individual negative examples based on their similarity to positive examples improve generalization performance? (3) How large should the negative sample be relative to the positive sample during training and evaluation? We compare two available methods for negative sampling, random and DeNovo sampling, and our experiments show that DeNovo sampling offers better accuracy. However, our experiments also show that generalization performance can be improved further by a soft DeNovo approach that, during training, assigns sample weights to negative examples inversely proportional to their similarity to known positive examples. Based on our findings, we have also developed an HPI predictor called HOPITOR (Host-Pathogen Interaction Predictor) that can predict interactions between human and viral proteins. The HOPITOR web server can be accessed at the URL: http://faculty.pieas.edu.pk/fayyaz/software.html#HoPItor
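
The weighting rule described above (negative-example weights inversely proportional to similarity to known positives) can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the exact formula, so the similarity scale, the smoothing constant `eps`, the mean-one normalization, and the function name are all hypothetical choices rather than the authors' method.

```python
def soft_negative_weights(similarities, eps=0.1):
    """Weight each negative example inversely to its similarity to positives.

    similarities : list of non-negative floats, one per negative example,
                   e.g. the highest similarity to any known positive
                   (the similarity measure itself is an assumption here)
    eps          : positive smoothing constant that avoids division by zero
                   (hypothetical parameter, not from the paper)
    Returns weights rescaled so that their mean is 1.
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    if any(s < 0 for s in similarities):
        raise ValueError("similarities must be non-negative")

    # Inverse proportionality: more similar to a positive -> smaller weight
    raw = [1.0 / (s + eps) for s in similarities]

    # Rescale to mean 1 so the overall influence of the negatives is unchanged
    mean_raw = sum(raw) / len(raw)
    return [r / mean_raw for r in raw]


neg_weights = soft_negative_weights([0.1, 0.4, 0.9])
```

Negatives that closely resemble known positives are the ones most likely to be unlabeled true interactions, so down-weighting them softens the effect of label noise instead of discarding those examples outright.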

    Public Procurement Crisis of Iraq and its Impact on Construction Projects

    The public procurement crisis in Iraq plays a fundamental role in delaying the implementation of construction projects at different stages of project bidding (before, during, and after). The procurement system of any country plays an important role in economic growth and revival. The paper aims to use a fuzzy logic inference model to predict the impact of the public procurement crisis. An initial analysis (relative importance index and Likert scale) was carried out to determine the most important parameters that affect construction projects, then the fuzzy analytical hierarchy process (FAHP) was used for setup, and finally the fuzzy decision maker's (FDM) verification of the parameters was used for comparison with reality. Sixty-five construction projects in Iraq were selected, and the most crucial crisis variables were used to calculate the weights and their importance; the fuzzy logic inference model was then used to verify the crisis parameters and the extent of their impact, in preparation for predicting the mathematical model of public procurement parameters. Once the algorithm had run, it was noted that the fast messy genetic algorithm showed little difference between training and testing (0.012% and 0.0057%), which makes it more reliable for predicting mean results from the models. The paper’s major conclusion is that 18 crisis factors in public procurement, across different stages, affect construction projects in Iraq.
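
For readers unfamiliar with the relative importance index (RII) mentioned above, a minimal sketch of the commonly used formula is RII = (sum of ratings) / (highest possible rating × number of respondents). The abstract does not state which Likert scale was used, so the 5-point scale and the sample ratings below are assumptions for illustration only.

```python
def relative_importance_index(ratings, highest_rating=5):
    """Compute RII = sum(ratings) / (highest_rating * number_of_respondents).

    ratings        : list of integer Likert ratings, one per respondent
    highest_rating : top of the Likert scale (5 is an assumed default)
    The result lies between 1 / highest_rating and 1; higher means the
    factor was judged more important.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r < 1 or r > highest_rating for r in ratings):
        raise ValueError("ratings must lie between 1 and highest_rating")
    return sum(ratings) / (highest_rating * len(ratings))


# Hypothetical ratings from five respondents for a single crisis factor
rii = relative_importance_index([5, 4, 3, 5, 4])
print(round(rii, 2))  # 0.84
```

Ranking factors by their RII is one simple way to shortlist the most important parameters before applying a heavier method such as FAHP.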

    Water filtration by using apple and banana peels as activated carbon

    A water filter is an important device for reducing the contaminants in raw water. Activated carbon made from charcoal is used to adsorb the contaminants, and fruit peels are a suitable alternative carbon source to substitute for charcoal. The main goal of this study is to determine the role of apple and banana peel powder as activated carbon in a water filter. The peels are dried and blended into a powder so that they can adsorb the contaminants, and the raw water is compared before and after filtering. After filtering, the pH reading was 6.8, which is within the normal range, and the recorded turbidity was 658 NTU. The filtered water was also clearer than the raw water. The study found that fruit peels such as banana and apple are an effective substitute for charcoal as a natural adsorbent.

    A novel approach to data mining using simplified swarm optimization

    Data mining has become an increasingly important approach for dealing with the rapid growth of data collected and stored in databases. In data mining, data classification and feature selection are considered the two main factors that drive people when making decisions. However, the traditional data classification and feature selection techniques used in data management are no longer sufficient for such massive data. This deficiency has prompted the need for a new intelligent data mining technique, based on stochastic population-based optimization, that can discover useful information from data. In this thesis, a novel Simplified Swarm Optimization (SSO) algorithm is proposed as a rule-based classifier and for feature selection. SSO is a simplified Particle Swarm Optimization (PSO) that has a self-organising ability to emerge in highly distributed control problem spaces, and is flexible, robust and cost-effective for complex computing environments. The proposed SSO classifier has been implemented to classify audio data. To the author’s knowledge, this is the first time that SSO and PSO have been applied to audio classification. Furthermore, two local search strategies, named Exchange Local Search (ELS) and Weighted Local Search (WLS), have been proposed to improve SSO performance. SSO-ELS has been used to classify the 13 benchmark datasets obtained from the UCI repository database, while SSO-WLS has been implemented in an Anomaly-based Network Intrusion Detection System (A-NIDS). In A-NIDS, a novel hybrid SSO-based Rough Set (SSORS) for feature selection has also been proposed. The empirical analysis showed promising results, with a high classification accuracy rate achieved by all proposed techniques on the audio data, UCI data and KDDCup 99 datasets. Therefore, the proposed SSO rule-based classifier with local search strategies offers a new paradigm shift in solving complex data mining problems that other benchmark classifiers may not be able to solve.