59,961 research outputs found

    Multiclass Cancer Classification by Using Fuzzy Support Vector Machine and Binary Decision Tree With Gene Selection

    We investigate multiclass cancer classification with gene selection from gene expression data. We propose two multiclass classifiers with gene selection: a fuzzy support vector machine (FSVM) with gene selection, and a binary classification tree based on SVM with gene selection. Using the F test and SVM-based recursive feature elimination (SVM-RFE) as gene selection methods, we test three combinations in our experiments: the SVM-based binary classification tree with the F test, the SVM-based binary classification tree with SVM-RFE, and FSVM with SVM-RFE. To accelerate computation, the strongest genes are preselected. The proposed techniques are applied to breast cancer data, small round blue-cell tumors, and acute leukemia data. Compared with existing multiclass cancer classifiers and with the two SVM-based binary classification tree variants described in this paper, FSVM with SVM-RFE can find the most important genes that affect certain types of cancer with high recognition accuracy.
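
The F-test gene selection step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the F test is the usual one-way ANOVA F statistic computed per gene across the cancer classes; the abstract does not give the exact formula. The names `f_statistic` and `rank_genes` and the toy data layout (a dict from gene name to per-sample values) are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def f_statistic(values: Sequence[float], labels: Sequence[str]) -> float:
    """One-way ANOVA F statistic of one gene's expression across class labels.

    Assumes at least two classes and more samples than classes.
    """
    groups: Dict[str, List[float]] = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)

    n, k = len(values), len(groups)
    grand_mean = sum(values) / n

    # Between-class and within-class sums of squares.
    between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                  for g in groups.values())
    within = sum((v - sum(g) / len(g)) ** 2
                 for g in groups.values() for v in g)

    if within == 0.0:
        # Perfect separation (or a constant gene): avoid dividing by zero.
        return float("inf") if between > 0 else 0.0
    return (between / (k - 1)) / (within / (n - k))


def rank_genes(expression: Dict[str, Sequence[float]],
               labels: Sequence[str], top: int) -> List[str]:
    """Return the `top` genes with the largest F statistic (hypothetical helper)."""
    scores = {gene: f_statistic(vals, labels) for gene, vals in expression.items()}
    return sorted(scores, key=lambda gene: scores[gene], reverse=True)[:top]


if __name__ == "__main__":
    labels = ["A", "A", "B", "B", "C", "C"]
    expression = {
        "g1": [1.0, 1.2, 5.0, 5.2, 9.0, 9.2],  # differs strongly between classes
        "g2": [3.0, 7.0, 3.0, 7.0, 3.0, 7.0],  # same mean in every class
    }
    print(rank_genes(expression, labels, top=1))
```

A filter of this kind scores each gene independently, which is presumably why it is cheap enough to use for preselection; recursive feature elimination instead retrains the classifier repeatedly and discards the lowest-weight genes.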

    BINARY QUANTITATIVE STRUCTURE-ACTIVITY RELATIONSHIP ANALYSIS IN RETROSPECTIVE STRUCTURE-BASED VIRTUAL SCREENING CAMPAIGNS TARGETING ESTROGEN RECEPTOR ALPHA

    Objective: The objective of this study is to construct predictive, unbiased structure-based virtual screening (SBVS) protocols to identify potent ligands for estrogen receptor alpha by combining molecular docking, protein-ligand interaction fingerprinting (PLIF), and binary quantitative structure-activity relationship (QSAR) analysis using the recursive partition and regression tree method. Methods: Employing the enhanced version of a directory of useful decoys, SBVS protocols using molecular docking simulations and PLIF were constructed and retrospectively validated. To avoid bias, the SMILES format of the compounds was used. The predictive abilities of the SBVS protocols were then compared based on the enrichment factor (EF) and the F-measure values. Results: The SBVS protocols resulting from this research were SBVS_1 (employing docking scores of the best pose of every compound to rank the results and selecting compounds within 1% false positives as positive), SBVS_2 (employing the decision tree resulting from the binary QSAR analysis using docking scores and PLIF bitstrings of the best pose of every compound as descriptors), and SBVS_3 (employing the decision tree resulting from the binary QSAR analysis using ensemble PLIF of the selected poses from the optimized docking score as the cutoff). The EF values of SBVS_1, SBVS_2, and SBVS_3 are 28.315, 576.084, and 713.472, respectively, while their F-measure values are 0.310, 0.573, and 0.769, respectively. Conclusion: Highly predictive, unbiased SBVS protocols to identify potent estrogen receptor alpha ligands were constructed. Further application in prospective screening is therefore highly suggested.

    Limit Laws for Functions of Fringe trees for Binary Search Trees and Recursive Trees

    We prove limit theorems for sums of functions of subtrees of binary search trees and random recursive trees. In particular, we give simple new proofs of the fact that the number of fringe trees of size $k=k_n$ in the binary search tree and the random recursive tree (of total size $n$) asymptotically has a Poisson distribution if $k\rightarrow\infty$, and that the distribution is asymptotically normal for $k=o(\sqrt{n})$. Furthermore, we prove similar results for the number of subtrees of size $k$ with some required property $P$, for example the number of copies of a certain fixed subtree $T$. Using the Cramér-Wold device, we show also that these random numbers for different fixed subtrees converge jointly to a multivariate normal distribution. As an application of the general results, we obtain a normal limit law for the number of $\ell$-protected nodes in a binary search tree or random recursive tree. The proofs use a new version of a representation by Devroye, and Stein's method (for both normal and Poisson approximation) together with certain couplings.
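
The quantity studied in this abstract, the number of fringe trees of size $k$ in a random binary search tree, can be computed on a simulated tree. The sketch below assumes the usual model in which the tree is built by inserting a uniformly random permutation of $1,\dots,n$, and takes a fringe tree to be a node together with all of its descendants. All names (`Node`, `build_bst`, `count_fringe_subtrees`) are hypothetical, and nothing here reproduces the paper's proofs.

```python
import random
from typing import Iterable, Optional


class Node:
    """A binary search tree node."""
    __slots__ = ("key", "left", "right")

    def __init__(self, key: int) -> None:
        self.key = key
        self.left: Optional["Node"] = None
        self.right: Optional["Node"] = None


def build_bst(keys: Iterable[int]) -> Optional[Node]:
    """Insert distinct keys one by one into an initially empty BST."""
    root: Optional[Node] = None
    for key in keys:
        if root is None:
            root = Node(key)
            continue
        node = root
        while True:
            if key < node.key:
                if node.left is None:
                    node.left = Node(key)
                    break
                node = node.left
            else:
                if node.right is None:
                    node.right = Node(key)
                    break
                node = node.right
    return root


def count_fringe_subtrees(root: Optional[Node], k: int) -> int:
    """Count nodes whose subtree (the node plus all descendants) has size k.

    Uses recursion, so it is meant for trees of moderate height.
    """
    count = 0

    def size(node: Optional[Node]) -> int:
        nonlocal count
        if node is None:
            return 0
        s = 1 + size(node.left) + size(node.right)
        if s == k:
            count += 1
        return s

    size(root)
    return count


if __name__ == "__main__":
    rng = random.Random(0)   # fixed seed, chosen only for repeatability
    n = 1000
    keys = list(range(1, n + 1))
    rng.shuffle(keys)
    tree = build_bst(keys)
    for k in (1, 2, 5, 10):
        print(k, count_fringe_subtrees(tree, k))
```

Repeating the simulation over many random permutations gives an empirical distribution of the count for each $k$, which can be compared informally with the Poisson and normal regimes described in the abstract.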

    On the Uniform Weak König’s Lemma

    The so-called weak König's lemma WKL asserts the existence of an infinite path $b$ in any infinite binary tree (given by a representing function $f$). Based on this principle one can formulate subsystems of higher-order arithmetic which allow one to carry out very substantial parts of classical mathematics but are $\Pi^0_2$-conservative over primitive recursive arithmetic PRA (and even weaker fragments of arithmetic). In [10] we established such conservation results relative to finite type extensions PRA$^\omega$ of PRA (together with a quantifier-free axiom of choice schema). In this setting one can also consider a uniform version UWKL of WKL, which asserts the existence of a functional $\Phi$ that uniformly selects, in a given infinite binary tree $f$, an infinite path $\Phi f$ of that tree. This uniform version of WKL is of interest in the context of explicit mathematics as developed by S. Feferman. The elimination process in [10] can actually be used to eliminate even this uniform weak König's lemma, provided that PRA$^\omega$ only has a quantifier-free rule of extensionality QF-ER instead of the full axioms (E) of extensionality for all finite types. In this paper we show that in the presence of (E), UWKL is much stronger than WKL: whereas WKL remains $\Pi^0_2$-conservative over PRA, PRA$^\omega$+(E)+UWKL contains (and is conservative over) full Peano arithmetic PA.

    Drawing Binary Tanglegrams: An Experimental Evaluation

    A binary tanglegram is a pair of binary trees whose leaf sets are in one-to-one correspondence; matching leaves are connected by inter-tree edges. For applications, for example in phylogenetics or software engineering, it is required that the individual trees are drawn crossing-free. A natural optimization problem, denoted the tanglegram layout problem, is thus to minimize the number of crossings between inter-tree edges. The tanglegram layout problem is NP-hard and is currently considered both in application domains and theory. In this paper we present an experimental comparison of a recursive algorithm of Buchin et al., our variant of their algorithm, the algorithm hierarchy sort of Holten and van Wijk, and an integer quadratic program that yields optimal solutions. Comment: see http://www.siam.org/proceedings/alenex/2009/alx09_011_nollenburgm.pd
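
The objective of the tanglegram layout problem can be made concrete with a small sketch. It assumes trees are written as nested pairs with string leaves, that a layout is chosen by deciding at each internal node whether to swap its two children, and that two inter-tree edges cross exactly when their endpoints appear in opposite relative order on the two sides. The exhaustive search is exponential and is not one of the algorithms compared in the paper; all names are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple, Union

# A tree is either a leaf label or a pair of subtrees (hypothetical encoding).
Tree = Union[str, Tuple["Tree", "Tree"]]


def count_crossings(left_order: Sequence[str], right_order: Sequence[str]) -> int:
    """Number of crossing pairs of inter-tree edges for two fixed leaf orders.

    Each leaf label is joined to the leaf with the same label on the other
    side, so crossings are inversions of the induced permutation.
    """
    position = {leaf: i for i, leaf in enumerate(right_order)}
    seq = [position[leaf] for leaf in left_order]
    return sum(1
               for i in range(len(seq))
               for j in range(i + 1, len(seq))
               if seq[i] > seq[j])


def leaf_orders(tree: Tree) -> Iterator[List[str]]:
    """Yield every leaf order obtainable by swapping children at internal nodes."""
    if isinstance(tree, str):
        yield [tree]
        return
    left, right = tree
    for a in leaf_orders(left):
        for b in leaf_orders(right):
            yield a + b   # children in the given order
            yield b + a   # children swapped


def min_crossings(s: Tree, t: Tree) -> int:
    """Brute-force optimum over all layouts of both trees (tiny inputs only)."""
    return min(count_crossings(a, b)
               for a in leaf_orders(s)
               for b in leaf_orders(t))


if __name__ == "__main__":
    s: Tree = (("a", "b"), ("c", "d"))
    t: Tree = (("a", "c"), ("b", "d"))
    print(min_crossings(s, t))
```

With $n$ leaves each tree has $2^{n-1}$ layouts, so the search above is useful only as a reference for checking heuristics on very small instances.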