
    Active Sampling-based Binary Verification of Dynamical Systems

    Nonlinear, adaptive, or otherwise complex control techniques are increasingly relied upon to ensure that systems operating in uncertain environments satisfy their safety requirements. However, the nonlinearity of the resulting closed-loop system complicates verification that the system does in fact satisfy those requirements at all possible operating conditions. While analytical proof-based techniques and finite abstractions can provably verify the closed-loop system's response at different operating conditions, they often produce conservative approximations due to restrictive assumptions and are difficult to construct in many applications. In contrast, popular statistical verification techniques relax these restrictions and instead rely upon simulations to construct statistical or probabilistic guarantees. This work presents a data-driven statistical verification procedure that constructs statistical learning models from simulated training data to separate the set of possible perturbations into "safe" and "unsafe" subsets. Binary evaluations of closed-loop requirement satisfaction at various realizations of the uncertainties are obtained through temporal logic robustness metrics, which are then used to construct predictive models of requirement satisfaction over the full set of possible uncertainties. As the accuracy of these predictive statistical models is inherently coupled to the quality of the training data, an active learning algorithm selects additional sample points in order to maximize the expected change in the data-driven model and thus, indirectly, minimize the prediction error. Various case studies demonstrate the closed-loop verification procedure and highlight improvements in prediction error over existing analytical and statistical verification techniques.
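
    The active sampling loop described above can be illustrated with a short sketch. This is not the paper's algorithm: a synthetic robustness function stands in for the temporal logic robustness metric, an RBF-kernel SVM stands in for the statistical learning model, and picking the candidate with the most uncertain prediction is used as a simple proxy for maximizing the expected model change.

```python
# Minimal sketch of an active-sampling verification loop (illustrative only).
# Assumptions: a synthetic robustness metric stands in for temporal-logic
# robustness, and an RBF-kernel SVM stands in for the statistical model.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def robustness(theta):
    """Placeholder robustness metric: positive = requirement satisfied."""
    return 1.0 - np.linalg.norm(theta - 0.5, axis=-1) * 2.5

# Initial training set over the 2-D uncertainty space [0, 1]^2.
X = rng.uniform(0.0, 1.0, size=(20, 2))
y = (robustness(X) > 0).astype(int)

model = SVC(kernel="rbf", probability=True).fit(X, y)

for _ in range(50):
    # Candidate perturbations; choose the one the current model is least sure
    # about, a common proxy for "maximum expected model change".
    cand = rng.uniform(0.0, 1.0, size=(500, 2))
    proba = model.predict_proba(cand)[:, 1]
    theta_new = cand[np.argmin(np.abs(proba - 0.5))]

    # "Simulate" the closed-loop system at the new point and retrain.
    X = np.vstack([X, theta_new])
    y = np.append(y, int(robustness(theta_new) > 0))
    model = SVC(kernel="rbf", probability=True).fit(X, y)

print("Predicted safe fraction:",
      model.predict(rng.uniform(0.0, 1.0, size=(2000, 2))).mean())
```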

    Supervised inference of gene-regulatory networks

    Background: Inference of protein interaction networks from various sources of data has become an important topic of both systems and computational biology. Here we present a supervised approach to the identification of gene expression regulatory networks. Results: The method is based on a kernel approach combined with genetic programming. As a data source, the method utilizes gene expression time series for the prediction of interactions among regulatory proteins and their target genes. The performance of the method was verified using Saccharomyces cerevisiae cell cycle and DNA/RNA/protein biosynthesis gene expression data. The results were compared with independent data sources. Finally, a prediction of novel interactions within yeast gene expression circuits was performed. Conclusion: Our results show that, in most cases, the algorithm's predictions agree with independent experiments when compared with the YEASTRACT database. In several cases, our algorithm predicts novel interactions that have not been reported.
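
    As a rough illustration of the supervised, kernel-based setting described above (not the paper's method: the genetic programming component is omitted and the expression data is synthetic), regulator-target prediction can be framed as pairwise classification over expression time series:

```python
# Illustrative sketch: supervised prediction of regulator-target interactions
# from expression time series with an RBF-kernel SVM. Synthetic data stands in
# for real measurements, and the genetic-programming component of the paper's
# method is omitted here.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
n_genes, n_timepoints = 30, 12

# Synthetic expression time series, one row per gene.
expr = rng.normal(size=(n_genes, n_timepoints))

def pair_features(reg, tgt):
    """Features for a (regulator, target) pair: both profiles plus lag-1 correlation."""
    lag_corr = np.corrcoef(expr[reg, :-1], expr[tgt, 1:])[0, 1]
    return np.concatenate([expr[reg], expr[tgt], [lag_corr]])

# Known interactions (positive pairs) and random non-interacting pairs (negatives).
positives = [(0, 1), (0, 2), (3, 4), (5, 6), (7, 8)]
negatives = [(i, j) for i, j in rng.integers(0, n_genes, size=(20, 2))
             if (i, j) not in positives and i != j]

X = np.array([pair_features(r, t) for r, t in positives + negatives])
y = np.array([1] * len(positives) + [0] * len(negatives))

clf = SVC(kernel="rbf", gamma="scale").fit(X, y)

# Score a candidate pair not seen during training.
print("Predicted label for pair (9, 10):", clf.predict([pair_features(9, 10)])[0])
```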

    Premise Selection for Mathematics by Corpus Analysis and Kernel Methods

    Smart premise selection is essential when using automated reasoning as a tool for large-theory formal proof development. A good method for premise selection in complex mathematical libraries is the application of machine learning to large corpora of proofs. This work develops learning-based premise selection in two ways. First, a newly available minimal dependency analysis of existing high-level formal mathematical proofs is used to build a large knowledge base of proof dependencies, providing precise data for ATP-based re-verification and for training premise selection algorithms. Second, a new machine learning algorithm for premise selection based on kernel methods is proposed and implemented. To evaluate the impact of both techniques, a benchmark consisting of 2078 large-theory mathematical problems is constructed, extending the older MPTP Challenge benchmark. The combined effect of the techniques results in a 50% improvement on the benchmark over the Vampire/SInE state-of-the-art system for automated reasoning in large theories.
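
    A toy sketch of the kernel-based premise selection idea follows. The symbol features, theorem names, and proof dependencies are invented for illustration; a real system learns from a large corpus of formal proofs rather than a handful of hand-written entries.

```python
# Toy sketch of similarity-based premise selection. Facts and conjectures are
# represented by symbol-occurrence vectors; premises are scored for a new
# conjecture by kernel-weighted frequency in the proofs of similar theorems.
# Symbol names and proof dependencies below are made up for illustration.
import numpy as np

symbols = ["subset", "union", "inter", "member", "empty", "card"]
sym_index = {s: i for i, s in enumerate(symbols)}

def featurize(syms):
    v = np.zeros(len(symbols))
    for s in syms:
        v[sym_index[s]] = 1.0
    return v

# Previously proved theorems: (symbols used, premises used in the proof).
proofs = [
    (["subset", "union"],        {"UNION_SUBSET", "SUBSET_REFL"}),
    (["subset", "inter"],        {"INTER_SUBSET", "SUBSET_REFL"}),
    (["member", "union"],        {"IN_UNION"}),
    (["member", "empty"],        {"NOT_IN_EMPTY"}),
    (["card", "union", "inter"], {"CARD_UNION_INTER"}),
]

def rank_premises(conjecture_syms, top_k=3):
    q = featurize(conjecture_syms)
    scores = {}
    for syms, premises in proofs:
        # Linear kernel between the conjecture and a proved theorem.
        k = float(q @ featurize(syms))
        for p in premises:
            scores[p] = scores.get(p, 0.0) + k
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

print(rank_premises(["subset", "union", "inter"]))
```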

    Accelerated Particle Swarm Optimization and Support Vector Machine for Business Optimization and Applications

    Business optimization is becoming increasingly important because all business activities aim to maximize the profit and performance of products and services under limited resources and appropriate constraints. Recent developments in support vector machines and metaheuristics show many advantages of these techniques. In particular, particle swarm optimization is now widely used in solving tough optimization problems. In this paper, we use a combination of the recently developed Accelerated PSO and a nonlinear support vector machine to form a framework for solving business optimization problems. We first apply the proposed APSO-SVM to production optimization, and then use it for income prediction and project scheduling. We also carry out some parametric studies and discuss the advantages of the proposed metaheuristic SVM.
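
    The APSO-SVM combination can be sketched as follows. This is a simplified stand-in for the paper's framework: accelerated PSO tunes the (C, gamma) hyperparameters of an RBF SVM on a synthetic classification task, using the update x <- (1 - beta) * x + beta * g_best + alpha * noise; the parameter ranges and swarm settings are illustrative assumptions.

```python
# Simplified sketch of an APSO-SVM loop: accelerated particle swarm optimization
# tunes the (C, gamma) hyperparameters of an RBF SVM. Data, parameter ranges, and
# swarm settings are illustrative, not those of the paper.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(2)
X, y = make_classification(n_samples=200, n_features=8, random_state=0)

def fitness(log_c, log_gamma):
    clf = SVC(C=10.0 ** log_c, gamma=10.0 ** log_gamma, kernel="rbf")
    return cross_val_score(clf, X, y, cv=3).mean()

# Particles live in log10-space: (log10 C, log10 gamma) in [-2, 3] x [-4, 1].
particles = rng.uniform([-2, -4], [3, 1], size=(15, 2))
scores = np.array([fitness(*p) for p in particles])
g_best = particles[np.argmax(scores)].copy()
best_score = scores.max()
alpha, beta = 0.3, 0.5

for it in range(20):
    # Accelerated PSO update: pull toward the global best plus a shrinking random step.
    particles = (1 - beta) * particles + beta * g_best + alpha * rng.normal(size=particles.shape)
    particles = np.clip(particles, [-2, -4], [3, 1])
    scores = np.array([fitness(*p) for p in particles])
    if scores.max() > best_score:
        best_score = scores.max()
        g_best = particles[np.argmax(scores)].copy()
    alpha *= 0.9

print("Best (C, gamma):", 10.0 ** g_best, "CV accuracy: %.3f" % best_score)
```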

    Intelligent OS X malware threat detection with code inspection

    With the increasing market share of the Mac OS X operating system, there is a corresponding increase in the number of malicious programs (malware) designed to exploit vulnerabilities on Mac OS X platforms. However, existing manual and heuristic OS X malware detection techniques are not capable of coping with such a high rate of malware. While machine learning techniques offer promising results in the automated detection of Windows and Android malware, there have been limited efforts in extending them to OS X malware detection. In this paper, we propose a supervised machine learning model that applies a kernel-based Support Vector Machine (SVM) and a novel weighting measure based on application library calls to detect OS X malware. For training and evaluating the model, a dataset combining 152 malware and 450 benign samples is created. Using common supervised machine learning algorithms on this dataset, we obtain over 91% detection accuracy with a 3.9% false alarm rate. We also utilize the Synthetic Minority Over-sampling Technique (SMOTE) to create three synthetic datasets with different distributions, based on the refined version of the collected dataset, to investigate the impact of different sample sizes on malware detection accuracy. Using the SMOTE datasets, we achieve over 96% detection accuracy with a false alarm rate of less than 4%. All malware classification experiments are evaluated using cross-validation. Our results show that increasing the sample size in the synthetic datasets has a direct positive effect on detection accuracy, while it increases the false alarm rate compared to the original dataset.
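
    The pipeline pattern described above (library-call features, kernel SVM, SMOTE oversampling, cross-validated evaluation) can be sketched as below. The features are synthetic stand-ins for real application library-call counts, and the choices of scaler and class weighting are assumptions rather than the paper's exact configuration.

```python
# Minimal sketch of the described pipeline: library-call frequency features,
# an RBF-kernel SVM, SMOTE oversampling of the minority class, and
# cross-validated evaluation. Synthetic features stand in for real OS X data.
import numpy as np
from imblearn.over_sampling import SMOTE
from imblearn.pipeline import Pipeline
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(3)
n_features = 40  # one count per tracked library call

# Imbalanced synthetic dataset: 450 benign vs 152 malware samples.
X_benign = rng.poisson(2.0, size=(450, n_features))
X_malware = rng.poisson(2.0, size=(152, n_features)) + rng.poisson(1.0, size=(152, n_features))
X = np.vstack([X_benign, X_malware]).astype(float)
y = np.array([0] * 450 + [1] * 152)

pipe = Pipeline([
    ("smote", SMOTE(random_state=0)),  # oversample the malware class in each training fold
    ("scale", StandardScaler()),
    ("svm", SVC(kernel="rbf", class_weight="balanced")),
])

acc = cross_val_score(pipe, X, y, cv=5, scoring="accuracy")
print("Cross-validated accuracy: %.3f +/- %.3f" % (acc.mean(), acc.std()))
```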

    Prediction of drug–target interaction networks from the integration of chemical and genomic spaces

    Motivation: The identification of interactions between drugs and target proteins is a key area in genomic drug discovery. Therefore, there is a strong incentive to develop new methods capable of detecting these potential drug–target interactions efficiently.

    Locality-Convolution Kernel and Its Application to Dependency Parse Ranking

    We propose a Locality-Convolution (LC) kernel in application to dependency parse ranking. The LC kernel measures parse similarities locally, within a small window constructed around each matching feature. Inside the window it makes use of a position-sensitive function to take into account the order of the feature appearance. The similarity between two windows is calculated by computing the product of their common attributes, and the kernel value is the sum of the window similarities. We applied the introduced kernel together with the Regularized Least-Squares (RLS) algorithm to a dataset containing dependency parses obtained from a manually annotated biomedical corpus of 1100 sentences. Our experiments show that RLS with the LC kernel performs better than the baseline method. The results outline the importance of local correlations and the order of feature appearance within the parse. Final validation demonstrates a statistically significant increase in parse ranking performance.
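
    One plausible reading of the LC kernel can be sketched in a few lines: for each feature shared by two parses, a small window is taken around its occurrence in each parse, and the windows are compared with a position-sensitive weight. The window radius and the Gaussian offset weighting below are assumptions, not the paper's exact definition.

```python
# Illustrative reading of a locality-convolution style kernel over two feature
# sequences (e.g., linearized dependency parse features). For each shared
# feature, local windows around its occurrences are compared with a
# position-sensitive weight; the kernel is the sum of the window similarities.
import math

def window(seq, center, radius):
    return [(i - center, f) for i, f in enumerate(seq) if abs(i - center) <= radius]

def window_sim(w1, w2, sigma=1.0):
    """Sum over common features, weighted by how closely their offsets agree."""
    sim = 0.0
    for off1, f1 in w1:
        for off2, f2 in w2:
            if f1 == f2:
                sim += math.exp(-((off1 - off2) ** 2) / (2 * sigma ** 2))
    return sim

def lc_kernel(seq_a, seq_b, radius=2):
    k = 0.0
    for i, f in enumerate(seq_a):
        for j, g in enumerate(seq_b):
            if f == g:  # matching feature: compare its local windows
                k += window_sim(window(seq_a, i, radius), window(seq_b, j, radius))
    return k

parse_a = ["nsubj", "det", "dobj", "prep", "pobj"]
parse_b = ["det", "nsubj", "dobj", "pobj"]
print("K(a, b) =", lc_kernel(parse_a, parse_b))
```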

    On consensus biomarker selection

    Background: Recent developments in mass spectrometry technology have enabled the analysis of complex peptide mixtures. A lot of effort is currently devoted to the identification of biomarkers in human body fluids like serum or plasma, based on which new diagnostic tests for different diseases could be constructed. Various biomarker selection procedures have been exploited in recent studies. It has been noted that they often lead to different biomarker lists and, as a consequence, the patient classification may also vary. Results: Here we propose a new approach to the biomarker selection problem: to apply several competing feature ranking procedures and compute a consensus list of features based on their outcomes. We validate our methods on two proteomic datasets for the diagnosis of ovarian and prostate cancer. Conclusion: The proposed methodology can improve the classification results and at the same time provide a unified biomarker list for further biological examination and interpretation.
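
    The consensus idea can be sketched directly: run several competing feature-ranking procedures on the same data and average the resulting ranks into one consensus list. The three rankers below (absolute t-statistic, random forest importance, L1-logistic weights) and the synthetic data are illustrative choices, not necessarily those used in the paper.

```python
# Sketch of consensus feature ranking: several ranking procedures are run on the
# same data and their per-feature ranks are averaged into one consensus list.
import numpy as np
from scipy.stats import ttest_ind
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=120, n_features=50, n_informative=5, random_state=0)

# Each ranker returns a score per feature; higher = more discriminative.
t_scores = np.abs(ttest_ind(X[y == 0], X[y == 1], axis=0).statistic)
rf_scores = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y).feature_importances_
l1_scores = np.abs(LogisticRegression(penalty="l1", solver="liblinear", C=0.5).fit(X, y).coef_[0])

def to_ranks(scores):
    # Rank 1 = best feature.
    order = np.argsort(-scores)
    ranks = np.empty_like(order)
    ranks[order] = np.arange(1, len(scores) + 1)
    return ranks

consensus = (to_ranks(t_scores) + to_ranks(rf_scores) + to_ranks(l1_scores)) / 3.0
top = np.argsort(consensus)[:5]
print("Top consensus features:", top, "mean ranks:", consensus[top])
```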