196,629 research outputs found

    Mining a database of Fungi for Pharmacological Use via Minimum Message Length Encoding

    Abstract. This paper concerns the use of fungi in pharmaceutical design. More specifically, this research involves mining a database of fungi to determine which ones have waste products that are unusual in their spectral fingerprints, and are therefore worth testing for medicinal properties. The technique described in this paper involves Minimum Message Length encoding. Minimum Message Length (sometimes called Minimum Description Length) encoding is a method for choosing a binary coding for a set of data. The method's goal is to use the frequency of occurrence of each data point to ensure that frequently occurring data are given short codes. Minimum Message Length encoding provides a solution that is optimal in the sense that, if the entire data set is employed in the encoding, no other unambiguous prefix code will provide a shorter encoded version of the entire set. In this paper, the process is turned on its head. The problem addressed is: given a large database, how can we pick out the elements that are quite different from the others? The first step in our solution involves using the Minimum Message Length algorithm to generate a compact code for all, or a representative learning section, of the data. The data that require long descriptions in this code are likely to be the ones that possess unusual features. In this paper, we describe this process in some detail and explain its application to a database of fungi.
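
    The idea in the abstract above (records that need long descriptions under a frequency-based code are the unusual ones) can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes each record is a tuple of discrete symbols (hypothetical stand-ins for spectral features), and it uses the ideal Shannon code length, -log2(p), rather than building an actual prefix code. All function names are made up for this example.

```python
import math
from collections import Counter


def code_lengths(symbols):
    """Ideal code length in bits, -log2(p), for each observed symbol."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return {s: -math.log2(c / total) for s, c in counts.items()}


def description_length(record, lengths, unseen_bits):
    """Bits needed to encode one record; unseen symbols get a fixed penalty."""
    return sum(lengths.get(s, unseen_bits) for s in record)


def rank_by_description_length(records):
    """Sort non-empty records by average bits per symbol, longest first."""
    all_symbols = [s for r in records for s in r]
    lengths = code_lengths(all_symbols)
    # Assumption: an unseen symbol costs slightly more than the rarest seen one.
    unseen_bits = math.log2(len(all_symbols) + 1)
    scored = [
        (description_length(r, lengths, unseen_bits) / len(r), r)
        for r in records
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical records: three share common symbols, one contains rare ones.
records = [
    ("a", "b", "a"),
    ("a", "a", "b"),
    ("b", "a", "a"),
    ("a", "z", "q"),
]
ranked = rank_by_description_length(records)
most_unusual = ranked[0][1]
print(most_unusual)
```

    Here the record containing the rare symbols needs the most bits per symbol and is ranked first, which is the sense in which long descriptions flag candidates for closer inspection.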

    Software Defect Association Mining and Defect Correction Effort Prediction

    Much current software defect prediction work concentrates on the number of defects remaining in a software system. In this paper, we present association rule mining based methods to predict defect associations and defect-correction effort. This is to help developers detect software defects and assist project managers in allocating testing resources more effectively. We applied the proposed methods to the SEL defect data consisting of more than 200 projects over more than 15 years. The results show that for the defect association prediction, the accuracy is very high and the false negative rate is very low. Likewise, for the defect-correction effort prediction, the accuracy for both defect isolation effort prediction and defect correction effort prediction is also high. We compared the defect-correction effort prediction method with other types of methods (PART, C4.5, and Naïve Bayes) and show that accuracy has been improved by at least 23%. We also evaluated the impact of support and confidence levels on prediction accuracy, false negative rate, false positive rate, and the number of rules. We found that higher support and confidence levels may not result in higher prediction accuracy, and that a sufficient number of rules is a precondition for high prediction accuracy.
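
    The support and confidence levels mentioned in the abstract above are the standard measures for association rules. The sketch below shows how they are computed for a single rule. It is an illustration only: the defect-type labels are hypothetical and are not taken from the SEL data, and the function names are made up for this example.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate P(consequent | antecedent) as support(A and C) / support(A)."""
    joint = support(transactions, set(antecedent) | set(consequent))
    base = support(transactions, antecedent)
    return joint / base if base > 0 else 0.0


# Hypothetical data: each set lists the defect types seen together in one unit.
defects = [
    {"interface", "data_value"},
    {"interface", "data_value", "logic"},
    {"interface", "logic"},
    {"initialization"},
]

# Rule under test: interface -> data_value
s = support(defects, {"interface", "data_value"})
c = confidence(defects, {"interface"}, {"data_value"})
print(f"support={s:.2f} confidence={c:.2f}")
```

    Raising the minimum support or confidence threshold discards more candidate rules, which is consistent with the abstract's observation that stricter thresholds do not necessarily improve accuracy when too few rules remain.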

    Graph-based discovery of ontology change patterns

    Ontologies can support a variety of purposes, ranging from capturing conceptual knowledge to the organisation of digital content and information. However, information systems are always subject to change and ontology change management can pose challenges. We investigate ontology change representation and discovery of change patterns. Ontology changes are formalised as graph-based change logs. We use attributed graphs, which are typed over a generic graph with node and edge attribution. We analyse ontology change logs, represented as graphs, and identify frequent change sequences. Such sequences are applied as a reference in order to discover reusable, often domain-specific and usage-driven change patterns. We describe the pattern discovery algorithms and measure their performance using experimental results.

    On the Feature Discovery for App Usage Prediction in Smartphones

    With the increasing number of mobile Apps developed, they are now closely integrated into daily life. In this paper, we develop a framework to predict the mobile Apps that are most likely to be used given the current device status of a smartphone. Such an App usage prediction framework is a crucial prerequisite for fast App launching, intelligent user experience, and power management of smartphones. By analyzing real App usage log data, we discover two kinds of features: the Explicit Feature (EF) from sensing readings of built-in sensors, and the Implicit Feature (IF) from App usage relations. The IF feature is derived by constructing the proposed App Usage Graph (abbreviated as AUG) that models App usage transitions. In light of AUG, we are able to discover usage relations among Apps. Since users may have different usage behaviors on their smartphones, we further propose a personalized feature selection algorithm. We explore minimum description length (MDL) from the training data and select those features which need less length to describe the training data. The personalized feature selection can successfully reduce the log size and the prediction time. Finally, we adopt the kNN classification model to predict App usage. Note that with the proposed personalized feature selection algorithm, we only need to keep the selected features, which in turn reduces the prediction time and avoids the curse of dimensionality when using the kNN classifier. We conduct a comprehensive experimental study based on a real mobile App usage dataset. The results demonstrate the effectiveness of the proposed framework and show its predictive capability for App usage prediction. Comment: 10 pages, 17 figures, ICDM 2013 short paper.

    Substantiating rational parameters of a method for shrinkage ore stoping while developing thin- vein steeply inclined deposits

    The objective of the paper is to substantiate a rational ore-stoping technique using small wells in the context of thin-vein steeply inclined deposit mining. The technique is based upon repeated field studies and simulation of ore drawing processes for shrinkage ore stoping in terms of the oriented drilling of periphery holes. As a result, a design of a blast-hole charge with a low-density porous intermediate layer has been proposed, as well as a mechanism of shock-wave propagation within the rock mass in the process of thin steeply inclined vein stoping. Scientific novelty is represented by means of analytical results of scientific sources, and dependences of ore losses on the vein wall hypsometry resulting from shrinkage stoping in the context of the proposed technique. The practical relevance is to substantiate rational parameters of the proposed ore-stoping technique. The technique involves designs of blast-hole charges with a low-density porous intermediate layer in the stemming. Moreover, the technique proposes to place the intermediate low-density stemming layer right after a blast hole has been charged with explosives and live primers have been inserted.