24 research outputs found

    DroidFusion: A Novel Multilevel Classifier Fusion Approach for Android Malware Detection

    Get PDF
    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI linkAndroid malware has continued to grow in volume and complexity posing significant threats to the security of mobile devices and the services they enable. This has prompted increasing interest in employing machine learning to improve Android malware detection. In this paper, we present a novel classifier fusion approach based on a multilevel architecture that enables effective combination of machine learning algorithms for improved accuracy. The framework (called DroidFusion), generates a model by training base classifiers at a lower level and then applies a set of ranking-based algorithms on their predictive accuracies at the higher level in order to derive a final classifier. The induced multilevel DroidFusion model can then be utilized as an improved accuracy predictor for Android malware detection. We present experimental results on four separate datasets to demonstrate the effectiveness of our proposed approach. Furthermore, we demonstrate that the DroidFusion method can also effectively enable the fusion of ensemble learning algorithms for improved accuracy. Finally, we show that the prediction accuracy of DroidFusion, despite only utilizing a computational approach in the higher level, can outperform stacked generalization, a well-known classifier fusion method that employs a meta-classifier approach in its higher level

    Exploiting the Parallelism Exposed by Partial Evaluation

    Get PDF
    We describe the key role played by partial evaluation in the Supercomputing Toolkit, a parallel computing system for scientific applications that effectively exploits the vast amount of parallelism exposed by partial evaluation. The Supercomputing Toolkit parallel processor and its associated partial evaluation-based compiler have been used extensively by scientists at MIT, and have made possible recent results in astrophysics showing that the motion of the planets in our solar system is chaotically unstable

    Partial Evaluation for Scientific Computing: The Supercomputer Toolkit Experience

    Get PDF
    We describe the key role played by partial evaluation in the Supercomputer Toolkit, a parallel computing system for scientific applications that effectively exploits the vast amount of parallelism exposed by partial evaluation. The Supercomputer Toolkit parallel processor and its associated partial evaluation-based compiler have been used extensively by scientists at M.I.T., and have made possible recent results in astrophysics showing that the motion of the planets in our solar system is chaotically unstable

    Concurrent Scheme

    Get PDF
    Journal ArticleThis paper describes an evolution of the Scheme language to support parallelism with tight coupling of control and data. Mechanisms are presented to address the difficult and related problems of mutual exclusion and data sharing which arise in concurrent language systems. The mechanisms are tailored to preserve Scheme semantics as much as possible while allowing for efficient implementation. Prototype implementations of the resulting language are described which have been completed. A third implementation is underway for the Mayfly, a distributed memory, twisted-torus communication topology, parallel processor, under development at the Hewlett-Packard Research Laboratories. The language model is particularly well suited for the Mayfly processor, as will be shown

    Exploiting the Parallelism Exposed by Partial Evaluation

    Get PDF
    We describe an approach to parallel compilation that seeks to harness the vast amount of fine-grain parallelism that is exposed through partial evaluation of numerically-intensive scientific programs. We have constructed a compiler for the Supercomputer Toolkit parallel processor that uses partial evaluation to break down data abstractions and program structure, producing huge basic blocks that contain large amounts of fine-grain parallelism. We show that this fine-grain prarllelism can be effectively utilized even on coarse-grain parallel architectures by selectively grouping operations together so as to adjust the parallelism grain-size to match the inter-processor communication capabilities of the target architecture

    A Performance Evaluation of Pairing-Based Broadcast Encryption Systems

    Get PDF
    In a broadcast encryption system, a sender can encrypt a message for any subset of users who are listening on a broadcast channel. The goal of broadcast encryption is to leverage the broadcasting structure to achieve better efficiency than individually encrypting to each user; in particular, reducing the bandwidth (i.e., ciphertext size) required to transmit securely, although other factors such as public and private key size and the time to execute setup, encryption and decryption are also important. In this work, we conduct a detailed performance evaluation of eleven public-key, pairing-based broadcast encryption schemes offering different features and security guarantees, including public-key, identity-based, traitor-tracing, private linear and augmented systems. We implemented each system using the MCL Java pairings library, reworking some of the constructions to achieve better efficiency. We tested their performance on a variety of parameter choices, resulting in hundreds of data points to compare, with some interesting results from the classic Boneh-Gentry-Waters scheme (CRYPTO 2005) to Zhandry\u27s recent generalized scheme (CRYPTO 2020), and more. We combine this performance data and knowledge of the systems\u27 features with data we collected on practical usage scenarios to determine which schemes are likely to perform best for certain applications, such as video streaming services, online gaming, live sports betting and smartphone streaming. This work can inform both practitioners and future cryptographic designs in this area

    Do we need hundreds of classifiers to solve real world classification problems?

    Get PDF
    We evaluate 179 classifiers arising from 17 families (discriminant analysis, Bayesian, neural networks, support vector machines, decision trees, rule-based classifiers, boosting, bagging, stacking, random forests and other ensembles, generalized linear models, nearest-neighbors, partial least squares and principal component regression, logistic and multinomial regression, multiple adaptive regression splines and other methods), implemented in Weka, R (with and without the caret package), C and Matlab, including all the relevant classifiers available today. We use 121 data sets, which represent the whole UCI data base (excluding the large- scale problems) and other own real problems, in order to achieve significant conclusions about the classifier behavior, not dependent on the data set collection. The classifiers most likely to be the bests are the random forest (RF) versions, the best of which (implemented in R and accessed via caret) achieves 94.1% of the maximum accuracy overcoming 90% in the 84.3% of the data sets. However, the difference is not statistically significant with the second best, the SVM with Gaussian kernel implemented in C using LibSVM, which achieves 92.3% of the maximum accuracy. A few models are clearly better than the remaining ones: random forest, SVM with Gaussian and polynomial kernels, extreme learning machine with Gaussian kernel, C5.0 and avNNet (a committee of multi-layer perceptrons implemented in R with the caret package). The random forest is clearly the best family of classifiers (3 out of 5 bests classifiers are RF), followed by SVM (4 classifiers in the top-10), neural networks and boosting ensembles (5 and 3 members in the top-20, respectively)We would like to acknowledge support from the Spanish Ministry of Science and Innovation (MICINN), which supported this work under projects TIN2011-22935 and TIN2012-32262S

    Implementing Distributed Systems Using Linear Naming

    Get PDF
    Linear graph reduction is a simple computational model in which the cost of naming things is explicitly represented. The key idea is the notion of "linearity". A name is linear if it is only used once, so with linear naming you cannot create more than one outstanding reference to an entity. As a result, linear naming is cheap to support and easy to reason about. Programs can be translated into the linear graph reduction model such that linear names in the program are implemented directly as linear names in the model. Nonlinear names are supported by constructing them out of linear names. The translation thus exposes those places where the program uses names in expensive, nonlinear ways. Two applications demonstrate the utility of using linear graph reduction: First, in the area of distributed computing, linear naming makes it easy to support cheap cross-network references and highly portable data structures, Linear naming also facilitates demand driven migration of tasks and data around the network without requiring explicit guidance from the programmer. Second, linear graph reduction reveals a new characterization of the phenomenon of state. Systems in which state appears are those which depend on certain -global- system properties. State is not a localizable phenomenon, which suggests that our usual object oriented metaphor for state is flawed
    corecore