14 research outputs found

    Applicability of semi-supervised learning assumptions for gene ontology terms prediction

    Gene Ontology (GO) is one of the most important resources in bioinformatics, aiming to provide a unified framework for the biological annotation of genes and proteins across all species. Predicting GO terms is an essential task in bioinformatics, but in many cases the number of available labelled proteins is insufficient for training reliable machine learning classifiers. Semi-supervised learning methods offer a powerful solution that exploits the information contained in unlabelled data to improve the estimates of traditional supervised approaches. However, semi-supervised learning methods have to make strong assumptions about the nature of the training data, so the performance of the predictor depends heavily on these assumptions. This paper analyses the applicability of semi-supervised learning assumptions to the specific task of GO term prediction, with a focus on providing criteria for choosing the most suitable tools for specific GO terms. The results show that semi-supervised approaches significantly outperform the traditional supervised methods and that the highest performance is reached when applying the cluster assumption. In addition, it is experimentally demonstrated that the cluster and manifold assumptions are complementary to each other, and an analysis is provided of which GO terms are more likely to be predicted correctly under each assumption. Postprint (published version)

    Detecting genuine multipartite entanglement via machine learning

    In recent years, supervised and semi-supervised machine learning methods such as neural networks, support vector machines (SVM), and semi-supervised support vector machines (S4VM) have been widely used in quantum entanglement and quantum steering verification problems. However, few studies have focused on detecting genuine multipartite entanglement with machine learning. Here, we investigate supervised and semi-supervised machine learning for detecting genuine multipartite entanglement of three-qubit states. We randomly generate three-qubit density matrices and train an SVM to detect genuine multipartite entangled states. Moreover, we improve the training method of S4VM by optimizing the grouping of prediction samples and then performing iterative predictions. Numerical simulation confirms that this method can significantly improve the prediction accuracy. Comment: 9 pages, 8 figures
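
    The abstract above says only that three-qubit density matrices are generated at random; it does not state the sampling scheme. As a minimal sketch under that caveat, the code below assumes a Ginibre-style construction (a complex Gaussian matrix G, then G G† normalised to unit trace) and flattens the result into real features that a classifier such as an SVM could consume. The function names and the feature layout are hypothetical, and labelling each state as genuinely multipartite entangled or not needs a separate criterion that the abstract does not give.

```python
import random


def random_density_matrix(dim=8, rng=None):
    """Sample a dim x dim density matrix as G G^dagger / tr(G G^dagger).

    dim=8 corresponds to three qubits. The Ginibre-style sampling is an
    assumption; the source abstract does not say how states are drawn.
    """
    rng = rng or random.Random()
    g = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(dim)]
         for _ in range(dim)]
    # (G G^dagger)[i][j] = sum_k G[i][k] * conj(G[j][k]); this product is
    # Hermitian and positive semidefinite by construction.
    rho = [[sum(g[i][k] * g[j][k].conjugate() for k in range(dim))
            for j in range(dim)] for i in range(dim)]
    trace = sum(rho[i][i].real for i in range(dim))
    # Normalise so that the trace equals 1.
    return [[entry / trace for entry in row] for row in rho]


def features(rho):
    """Flatten real and imaginary parts into one real feature vector."""
    return [part for row in rho for z in row for part in (z.real, z.imag)]
```

    Each 8 x 8 matrix yields 128 real features here; a more compact encoding is possible because a Hermitian matrix has redundant entries.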

    Dark Web Data Classification Using Neural Network

    Several issues are associated with Dark Web structural pattern mining, including a large amount of redundant and irrelevant information, while the number of cybercrime types, such as illegal trade, forums, terrorist activity, and illegal online shopping, keeps increasing. Understanding online criminal behavior is challenging because of the vast amount of data available. An approach is needed that learns criminal behavior and improves the labeled data used for user profiling, because Dark Web structural pattern mining gives uncertain results on multidimensional data sets. Uncertain classification results make it impossible to predict user behavior. Because multidimensional data mixes features, it adversely affects classification. The flood of data associated with the Dark Web has so far prevented an appropriate solution. In the research design, a fusion NN (neural network)-S3VM model for criminal network activity prediction is proposed; NN-S3VM can improve the prediction

    Cost-sensitive online classification

    Ministry of Education, Singapore under its Academic Research Funding Tier

    A Cost-Sensitive Sparse Representation Based Classification for Class-Imbalance Problem


    Cost-Sensitive Double Updating Online Learning and Its Application to Online Anomaly Detection

    Although cost-sensitive classification and online learning have each been well studied in data mining and machine learning, there have been very few comprehensive studies of cost-sensitive online classification in the literature. In this paper, we formally investigate this problem by directly optimizing cost-sensitive measures for an online classification task. As the first comprehensive study, we propose the Cost-Sensitive Double Updating Online Learning (CSDUOL) algorithms, which explore a recent double updating technique to tackle the online optimization task of cost-sensitive classification by maximizing the weighted sum or minimizing the weighted misclassification cost. We theoretically analyze the cost-sensitive measure bounds of the proposed algorithms, extensively examine their empirical performance on cost-sensitive online classification tasks, and finally demonstrate the application of our technique to online anomaly detection tasks.
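
    The abstract names a weighted misclassification cost without defining it. As a rough illustration only, the sketch below runs a plain cost-sensitive perceptron over a data stream and accumulates such a cost. It is not the CSDUOL algorithm and does not use double updating. The function name, the cost weights `c_pos` and `c_neg`, and the cost-scaled update rule are all assumptions.

```python
def cost_sensitive_perceptron(stream, dim, c_pos=0.9, c_neg=0.1):
    """Online linear classifier whose update size depends on the true class.

    stream: iterable of (x, y), with x a list of floats and y in {+1, -1}.
    c_pos, c_neg: assumed costs of misclassifying a positive or a negative
    example (hypothetical values, not taken from the paper).
    Returns the final weight vector and the accumulated weighted cost.
    """
    w = [0.0] * dim
    weighted_cost = 0.0
    for x, y in stream:
        score = sum(wi * xi for wi, xi in zip(w, x))
        pred = 1 if score >= 0 else -1
        if pred != y:
            cost = c_pos if y == 1 else c_neg
            weighted_cost += cost
            # Scale the usual perceptron step by the cost of this mistake,
            # so errors on the expensive class move the weights further.
            w = [wi + cost * y * xi for wi, xi in zip(w, x)]
    return w, weighted_cost
```

    With `c_pos` larger than `c_neg`, a missed positive (for example, an anomaly) shifts the decision boundary more than a false alarm does, which is the basic idea behind making an online learner cost-sensitive.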
