How to Explain Individual Classification Decisions
After building a classifier with modern tools of machine learning we
typically have a black box at hand that is able to predict well for unseen
data. Thus, we get an answer to the question of what the most likely label of a
given unseen data point is. However, most methods provide no answer as to why
the model predicted a particular label for a single instance or which features
were most influential for that instance. The only methods currently able to
provide such explanations are decision trees. This paper proposes a procedure
which (based on a set of assumptions) allows one to explain the decisions of
any classification method.
Comment: 31 pages, 14 figures
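As a rough illustration of what a per-instance explanation could look like, the sketch below estimates how strongly each feature moves a black-box classifier's predicted probability around one data point, using central finite differences. This is an assumption made for illustration and is not claimed to be the procedure the paper proposes; `predict_proba`, its weights, and `local_sensitivity` are hypothetical names.

```python
import math

def predict_proba(x):
    """Hypothetical black-box classifier: probability of the positive class."""
    # A logistic model with made-up weights stands in for any trained classifier.
    weights = [2.0, -1.0, 0.0]
    z = sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

def local_sensitivity(predict, x, eps=1e-5):
    """Central finite-difference estimate of d predict / d x_i at the point x."""
    grads = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grads.append((predict(up) - predict(down)) / (2 * eps))
    return grads

# Explain a single instance: which features matter most at this point?
point = [0.5, 1.0, 3.0]
sens = local_sensitivity(predict_proba, point)
# Feature indices ordered from most to least influential (by absolute sensitivity).
ranking = sorted(range(len(point)), key=lambda i: abs(sens[i]), reverse=True)
```

The sign of each entry in `sens` indicates whether increasing that feature pushes the prediction toward or away from the positive class at this particular point. The values are local, so a different instance can yield a different ranking.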
Classification and Analysis of Regulatory Pathways Using Graph Property, Biochemical and Physicochemical Property, and Functional Property
Given a regulatory pathway system consisting of a set of proteins, can we predict which pathway class it belongs to? Such a problem is closely related to the biological function of the pathway in cells and hence is quite fundamental and essential in systems biology and proteomics. This is also an extremely difficult and challenging problem due to its complexity. To address this problem, a novel approach was developed that can be used to predict query pathways among the following six functional categories: (i) “Metabolism”, (ii) “Genetic Information Processing”, (iii) “Environmental Information Processing”, (iv) “Cellular Processes”, (v) “Organismal Systems”, and (vi) “Human Diseases”. The prediction method was established through the following procedures: (i) according to the general form of pseudo amino acid composition (PseAAC), each of the pathways concerned is formulated as a 5570-D (dimensional) vector; (ii) each of the components in the 5570-D vector was derived by a series of feature extractions from the pathway system according to its graphic property, biochemical and physicochemical property, as well as functional property; (iii) the minimum redundancy maximum relevance (mRMR) method was adopted to operate the prediction. A cross-validation by the jackknife test on a benchmark dataset consisting of 146 regulatory pathways indicated that an overall success rate of 78.8% was achieved by our method in identifying query pathways among the above six classes, indicating the outcome is quite promising and encouraging. To the best of our knowledge, the current study represents the first effort in attempting to identify the type of a pathway system or its biological function. It is anticipated that our report may stimulate a series of follow-up investigations in this new and challenging area.
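The jackknife test mentioned above can be sketched as leave-one-out cross-validation: each sample is held out once and predicted from all the others. The abstract does not name the underlying classifier, so the nearest-neighbor rule, the two-dimensional toy vectors, and the function names below are assumptions chosen only to keep the sketch small.

```python
def nearest_neighbor_label(train, query):
    """Return the label of the training vector closest to query (squared Euclidean)."""
    best = min(train, key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)))
    return best[1]

def jackknife_success_rate(samples):
    """Hold out each sample once, predict it from the rest, return the fraction correct."""
    correct = 0
    for i, (features, label) in enumerate(samples):
        rest = samples[:i] + samples[i + 1:]  # every sample except the held-out one
        if nearest_neighbor_label(rest, features) == label:
            correct += 1
    return correct / len(samples)

# Toy stand-in for the benchmark set: (feature vector, class label) pairs.
toy = [
    ((0.0, 0.0), "A"), ((0.1, 0.0), "A"), ((0.0, 0.1), "A"),
    ((1.0, 1.0), "B"), ((1.1, 1.0), "B"), ((0.5, 0.4), "B"),
]
rate = jackknife_success_rate(toy)  # 5 of the 6 held-out samples are predicted correctly
```

By the same counting, a success rate of 78.8% on 146 pathways is consistent with 115 correct held-out predictions (115/146 ≈ 0.788).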
Learning, Generalization, and Functional Entropy in Random Automata Networks
It has been shown \citep{broeck90:physicalreview,patarnello87:europhys} that
feedforward Boolean networks can learn to perform specific simple tasks and
generalize well if only a subset of the learning examples is provided for
learning. Here, we extend this body of work and show experimentally that random
Boolean networks (RBNs), where both the interconnections and the Boolean
transfer functions are chosen at random initially, can be evolved by using a
state-topology evolution to solve simple tasks. We measure the learning and
generalization performance, investigate the influence of the average node
connectivity and the system size, and introduce a new measure that allows us
to better describe the network's learning and generalization behavior. We show
that the connectivity of the maximum entropy networks scales as a power law of
the system size. Our results show that networks with higher average
connectivity (supercritical) achieve higher memorization and partial
generalization. However, near critical connectivity, the networks show a higher
perfect generalization on the even-odd task.
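For readers unfamiliar with random Boolean networks, a minimal sketch of the basic object follows: both the wiring and the Boolean transfer functions are drawn at random, and all nodes update synchronously. It assumes every node has exactly `k` inputs (a simplification of the average connectivity discussed above) and does not implement the evolutionary learning step; `make_rbn` and `step` are hypothetical names.

```python
import random

def make_rbn(n, k, seed=0):
    """Build a random Boolean network: each node gets k random inputs and a random truth table."""
    rng = random.Random(seed)
    # inputs[i] lists the k source nodes feeding node i (repeats and self-loops allowed).
    inputs = [[rng.randrange(n) for _ in range(k)] for _ in range(n)]
    # tables[i] maps each of the 2**k input patterns to an output bit for node i.
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronously update every node from the current state."""
    new_state = []
    for node_inputs, table in zip(inputs, tables):
        index = 0
        for src in node_inputs:
            index = (index << 1) | state[src]  # pack the input bits into a table index
        new_state.append(table[index])
    return new_state

# Run a small network for a few steps from a fixed initial state.
inputs, tables = make_rbn(n=8, k=2, seed=1)
state = [0, 1, 0, 1, 1, 0, 0, 1]
trajectory = [state]
for _ in range(5):
    state = step(state, inputs, tables)
    trajectory.append(state)
```

Fixing the seed makes the random wiring and truth tables reproducible, which is convenient when comparing networks of different size or connectivity.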