5,964 research outputs found

    Automatically combining static malware detection techniques

    Malware detection techniques come in many different flavors, and cover different effectiveness and efficiency trade-offs. This paper evaluates a number of machine learning techniques to combine multiple static Android malware detection techniques using automatically constructed decision trees. We identify the best methods to construct the trees. We demonstrate that those trees classify sample apps better and faster than individual techniques alone.

    Learning and Interpreting Multi-Multi-Instance Learning Networks

    We introduce an extension of the multi-instance learning problem where examples are organized as nested bags of instances (e.g., a document could be represented as a bag of sentences, which in turn are bags of words). This framework can be useful in various scenarios, such as text and image classification, but also supervised learning over graphs. As a further advantage, multi-multi-instance learning enables a particular way of interpreting predictions and the decision function. Our approach is based on a special neural network layer, called bag-layer, whose units aggregate bags of inputs of arbitrary size. We prove theoretically that the associated class of functions contains all Boolean functions over sets of sets of instances and we provide empirical evidence that functions of this kind can actually be learned on semi-synthetic datasets. We finally present experiments on text classification, on citation graphs, and social graph data, which show that our model obtains competitive results with respect to accuracy when compared to other approaches such as convolutional networks on graphs, while at the same time it supports a general approach to interpret the learnt model, as well as explain individual predictions. Comment: JML
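
    The bag-layer idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the ReLU activation, the element-wise max aggregation, and every function name and weight below are assumptions chosen for illustration. It shows how one layer can collapse a bag of any size into a fixed-size vector, and how stacking two such layers handles bags of bags.

```python
from typing import List, Sequence

Vector = List[float]
Matrix = List[List[float]]


def relu(v: Sequence[float]) -> Vector:
    """Element-wise ReLU (assumed activation, not taken from the paper)."""
    return [max(0.0, x) for x in v]


def affine(weights: Matrix, bias: Vector, x: Sequence[float]) -> Vector:
    """Compute W x + b for one instance."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def bag_layer(bag: List[Vector], weights: Matrix, bias: Vector) -> Vector:
    """Transform every instance, then aggregate with an element-wise max.

    The max makes the output independent of bag size and instance order.
    """
    activations = [relu(affine(weights, bias, x)) for x in bag]
    return [max(column) for column in zip(*activations)]


def nested_bag_forward(top_bag: List[List[Vector]],
                       inner_w: Matrix, inner_b: Vector,
                       outer_w: Matrix, outer_b: Vector) -> Vector:
    """Two stacked bag-layers: inner bags first, then the bag of results."""
    inner_repr = [bag_layer(sub_bag, inner_w, inner_b) for sub_bag in top_bag]
    return bag_layer(inner_repr, outer_w, outer_b)


# Hypothetical toy input: a "document" of two "sentences" with 2-d "words".
document = [[[1, 0], [0, 2]], [[3, -1], [0.5, 0.5]]]
identity = [[1, 0], [0, 1]]   # illustrative inner weights
sum_row = [[1, 1]]            # illustrative outer weights
output = nested_bag_forward(document, identity, [0, 0], sum_row, [0])
```

    Because the aggregation is a max, reordering the sentences or the words inside a sentence leaves `output` unchanged, which is the property that lets the layer accept bags of arbitrary size.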

    On Cognitive Preferences and the Plausibility of Rule-based Models

    It is conventional wisdom in machine learning and data mining that logical models such as rule sets are more interpretable than other models, and that among such rule-based models, simpler models are more interpretable than more complex ones. In this position paper, we question this latter assumption by focusing on one particular aspect of interpretability, namely the plausibility of models. Roughly speaking, we equate the plausibility of a model with the likelihood that a user accepts it as an explanation for a prediction. In particular, we argue that, all other things being equal, longer explanations may be more convincing than shorter ones, and that the predominant bias for shorter models, which is typically necessary for learning powerful discriminative models, may not be suitable when it comes to user acceptance of the learned models. To that end, we first recapitulate evidence for and against this postulate, and then report the results of an evaluation in a crowd-sourcing study based on about 3,000 judgments. The results do not reveal a strong preference for simple rules, whereas we can observe a weak preference for longer rules in some domains. We then relate these results to well-known cognitive biases such as the conjunction fallacy, the representativeness heuristic, or the recognition heuristic, and investigate their relation to rule length and plausibility. Comment: V4: Another rewrite of section on interpretability to clarify focus on plausibility and relation to interpretability, comprehensibility, and justifiabilit

    Metaphysics and Law

    The dichotomy between questions of fact and questions of law serves as a starting point for the following discussion of the nature of legal reasoning. In the course of the dialogue the author notes similarities and dissimilarities between legal reasoning and philosophical and mathematical reasoning. In the end we are left with a clearer insight into the distinctive features of the adjudicative process.

    A Customer Segmentation Mining System on the Web Platform

    In this paper, we introduce a knowledge discovery system developed on the World Wide Web platform. Its algorithm is based on the Fuzzy Inductive Learning Method (FILM), which can segment consumers' behavior from a set of noisy customer data. The system presents the acquired knowledge visually as a set of IF-THEN rules that can be run on top of an expert system. Moreover, the system provides advice in response to a user's request through the network and a friendly user interface. Finally, we evaluate the system by training it with a transaction database provided by a local automobile dealer.

    Data mining using Matlab

    Data mining is a relatively new field emerging in many disciplines. It is becoming more popular as technology advances and the need for efficient data analysis grows. The aim of data mining is not to provide strict rules by analysing the full data set; rather, it is used to predict with some certainty while analysing only a small portion of the data. This project seeks to compare the efficiency of a decision tree induction method with that of the neural network method. MATLAB has inbuilt data mining toolboxes; however, the decision tree induction method is not yet implemented. Decision tree induction has been implemented in several forms in the past. The greatest contribution to this method has been made by Dr. John Ross Quinlan, who brought it forward in the form of the ID3, C4.5 and C5 algorithms. The methodologies used within ID3 and C4.5 are well documented and therefore provide a strong platform for implementing this method in a higher-level language. The objectives of this study are to fully comprehend two methods of data mining, namely decision tree induction and neural networks. The decision tree induction method is to be implemented in the mathematical computer language MATLAB. The results found when analysing some suitable data will be compared with the results from the neural network toolbox already implemented in MATLAB. The data used to compare and contrast the two methods included voting records from the US House of Representatives, which consist of yes, no and undecided votes on sixteen separate issues. The voters are grouped into categories according to their political party, which can be either Republican or Democratic. The objective of using this data set is to predict which party a congressman is affiliated with by analysing their voting trends. The findings of this study reveal that the decision tree method can accurately predict outcomes if an ideal data set is used for building the tree. The neural network method is less accurate in some situations; however, it is more robust to unexpected data.
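
    The attribute-selection step at the heart of ID3, which the abstract above names, can be sketched as follows. This is a minimal illustration in Python rather than the study's MATLAB code, and the four voting rows, the attribute names `issue_a` and `issue_b`, and the function names are all invented for the example; they are not the House of Representatives data. ID3 splits on the attribute with the highest information gain, that is, the largest reduction in class entropy.

```python
from collections import Counter
from math import log2
from typing import Dict, List


def entropy(labels: List[str]) -> float:
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows: List[Dict[str, str]],
                     labels: List[str],
                     attribute: str) -> float:
    """Entropy of the labels minus the weighted entropy after splitting."""
    total = len(labels)
    partitions: Dict[str, List[str]] = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attribute], []).append(label)
    remainder = sum(len(part) / total * entropy(part)
                    for part in partitions.values())
    return entropy(labels) - remainder


def best_attribute(rows: List[Dict[str, str]], labels: List[str]) -> str:
    """Pick the attribute ID3 would split on at this node."""
    return max(rows[0], key=lambda a: information_gain(rows, labels, a))


# Hypothetical toy votes (invented for illustration only).
votes = [
    {"issue_a": "yes", "issue_b": "yes"},
    {"issue_a": "yes", "issue_b": "no"},
    {"issue_a": "no", "issue_b": "yes"},
    {"issue_a": "no", "issue_b": "no"},
]
parties = ["democrat", "democrat", "republican", "republican"]
split_on = best_attribute(votes, parties)
```

    In this toy data `issue_a` separates the two parties perfectly while `issue_b` carries no information, so the split falls on `issue_a`. A full ID3 implementation would recurse on each partition until the labels are pure or no attributes remain.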

    Automated Certification of Authorisation Policy Resistance

    Attribute-based Access Control (ABAC) extends traditional Access Control by considering an access request as a set of attribute name-value pairs, making it particularly useful in the context of open and distributed systems, where security-relevant information can be collected from different sources. However, ABAC enables attribute hiding attacks, allowing an attacker to gain some access by withholding information. In this paper, we first introduce the notion of policy resistance to attribute hiding attacks. We then propose the tool ATRAP (Automatic Term Rewriting for Authorisation Policies), based on the recent formal ABAC language PTaCL, which first automatically searches for resistance counter-examples using Maude, and then automatically searches for an Isabelle proof of resistance. We illustrate our approach with two simple examples of policies and propose an evaluation of ATRAP's performance. Comment: 20 pages, 4 figures, version including proofs of the paper that will be presented at ESORICS 201