    Easy over Hard: A Case Study on Deep Learning

    While deep learning is an exciting new technique, the benefits of this method need to be assessed with respect to its computational cost. This is particularly important for deep learning since these learners need hours (to weeks) to train the model. Such long training time limits the ability of (a) a researcher to test the stability of their conclusion via repeated runs with different random seeds; and (b) other researchers to repeat, improve, or even refute that original work. For example, recently, deep learning was used to find which questions in the Stack Overflow programmer discussion forum can be linked together. That deep learning system took 14 hours to execute. We show here that applying a very simple optimizer called DE to fine-tune SVM can achieve similar (and sometimes better) results. The DE approach terminated in 10 minutes, i.e., 84 times faster than the deep learning method. We offer these results as a cautionary tale to the software analytics community and suggest that not every new innovation should be applied without critical analysis. If researchers deploy some new and expensive process, that work should be baselined against some simpler and faster alternatives. Comment: 12 pages, 6 figures, accepted at FSE201
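
The tuning idea in this abstract can be illustrated with a minimal sketch, assuming DE refers to differential evolution. The abstract does not give the objective, search bounds, or DE settings, so everything below is hypothetical: `toy_validation_error` stands in for the validation error of an SVM as a function of two hyperparameters, and `BOUNDS`, `pop_size`, `f`, `cr`, and `generations` are illustrative choices. The first lines check the reported speedup from the two run times stated in the abstract.

```python
import random

# Speedup implied by the abstract: 14 hours versus 10 minutes.
speedup = (14 * 60) / 10  # 84.0


def toy_validation_error(params):
    """Hypothetical stand-in for an SVM's validation error.

    A real study would train an SVM with these hyperparameters and
    return its error on held-out data. This toy bowl has its minimum
    at (1, -2).
    """
    log_c, log_gamma = params
    return (log_c - 1.0) ** 2 + (log_gamma + 2.0) ** 2


def differential_evolution(objective, bounds, pop_size=20, f=0.75,
                           cr=0.3, generations=50, seed=0):
    """Minimise `objective` with a basic DE/rand/1/bin loop."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial population inside the search box.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds]
           for _ in range(pop_size)]
    scores = [objective(ind) for ind in pop]

    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct candidates other than the current one.
            a, b, c = rng.sample(
                [j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one mutated coordinate
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the box
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_score = objective(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_score <= scores[i]:
                pop[i], scores[i] = trial, trial_score

    best_index = min(range(pop_size), key=scores.__getitem__)
    return pop[best_index], scores[best_index]


# Hypothetical search box for (log10 C, log10 gamma).
BOUNDS = [(-3.0, 3.0), (-3.0, 3.0)]
best, score = differential_evolution(toy_validation_error, BOUNDS)
```

Because every candidate evaluation is independent, the cost of such a search is roughly the number of evaluations times the cost of training one simple learner, which is one plausible reason a tuned baseline can finish far sooner than a deep model.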

    Moment Angle Complexes and Big Cohen-Macaulayness

    Let Z_K be the moment angle complex associated to a simplicial complex K, with the canonical torus T-action. In this paper, we prove that, for any possibly disconnected subgroup G of T, the G-equivariant cohomology of Z_K over the integers Z is isomorphic to the Tor-module Tor_{H(BR;Z)}(Z[K],Z) as graded modules, where Z[K] is the Stanley-Reisner ring of K. Based on this, we prove that the surjectivity of the natural map H_T(Z_K;Z) to H_G(Z_K;Z) is equivalent to the vanishing of Tor^{H(BR;Z)}_1(Z[K],Z). Since the integral cohomology of various toric orbifolds can be identified with H_G(Z_K;Z), we study the conditions for the cohomology of a toric orbifold to be a quotient of its equivariant cohomology by linear terms. Comment: 21 pages. Comments are welcome

    HIV Drug Resistant Prediction and Featured Mutants Selection using Machine Learning Approaches

    HIV/AIDS is widespread and ranks as the sixth biggest killer worldwide. Moreover, due to the rapid replication rate of HIV and its lack of a proofreading mechanism, drug resistance is common and is one of the reasons treatment fails. Although drug resistance tests are available to patients and help in choosing more effective drugs, such experiments may take up to two weeks to finish and are expensive. With the rapid development of computing, drug resistance prediction using machine learning has become feasible. To predict HIV drug resistance accurately, two main tasks need to be solved: how to encode the protein structure, extracting the most useful information and feeding it into the machine learning tools; and which kinds of machine learning tools to choose. In our research, we first proposed a new protein encoding algorithm, which converts proteins of various sizes into a fixed-size vector. This algorithm enables feeding the protein structure information to most state-of-the-art machine learning algorithms. In the next step, we also proposed a new classification algorithm based on sparse representation. Following that, mean shift and quantile regression were included to help extract the feature information from the data. Our results show that encoding protein structure using our newly proposed method is very efficient, and has consistently higher accuracy regardless of the type of machine learning tool. Furthermore, our new classification algorithm based on sparse representation is the first application of sparse representation performed on biological data, and the result is comparable to other state-of-the-art classification algorithms, for example ANN, SVM and multiple regression.
Following that, the mean shift and quantile regression provided us with the potentially most important drug-resistant mutants, and such results might help biologists/chemists to determine which mutants are the most representative candidates for further research.

    Analysis of epistatic interactions and fitness landscapes using a new geometric approach

    BACKGROUND: Understanding interactions between mutations and how they affect fitness is a central problem in evolutionary biology that bears on such fundamental issues as the structure of fitness landscapes and the evolution of sex. To date, analyses of fitness landscapes have focused either on the overall directional curvature of the fitness landscape or on the distribution of pairwise interactions. In this paper, we propose and employ a new mathematical approach that allows a more complete description of multi-way interactions and provides new insights into the structure of fitness landscapes. RESULTS: We apply the mathematical theory of gene interactions developed by Beerenwinkel et al. to a fitness landscape for Escherichia coli obtained by Elena and Lenski. The genotypes were constructed by introducing nine mutations into a wild-type strain and constructing a restricted set of 27 double mutants. Despite the absence of mutants higher than second order, our analysis of this genotypic space points to previously unappreciated gene interactions, in addition to the standard pairwise epistasis. Our analysis confirms Elena and Lenski's inference that the fitness landscape is complex, so that an overall measure of curvature obscures a diversity of interaction types. We also demonstrate that some mutations contribute disproportionately to this complexity. In particular, some mutations are systematically better than others at mixing with other mutations. We also find a strong correlation between epistasis and the average fitness loss caused by deleterious mutations. In particular, the epistatic deviations from multiplicative expectations tend toward more positive values in the context of more deleterious mutations, emphasizing that pairwise epistasis is a local property of the fitness landscape. Finally, we determine the geometry of the fitness landscape, which reflects many of these biologically interesting features. 
CONCLUSION: A full description of complex fitness landscapes requires more information than the average curvature or the distribution of independent pairwise interactions. We have proposed a mathematical approach that, in principle, allows a complete description and, in practice, can suggest new insights into the structure of real fitness landscapes. Our analysis emphasizes the value of non-independent genotypes for these inferences.
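
The "epistatic deviations from multiplicative expectations" mentioned in this abstract can be made concrete with a small sketch of the standard pairwise case. It assumes one common convention, in which fitness values are expressed relative to the wild type and the expected double-mutant fitness is the product of the two single-mutant fitnesses. The function name and the fitness values are hypothetical and are not data from the study, whose geometric approach also covers interactions beyond this pairwise measure.

```python
def multiplicative_epistasis(w_a, w_b, w_ab, w_wt=1.0):
    """Pairwise epistasis as a deviation from multiplicative expectation.

    All fitness values are first scaled by the wild-type fitness w_wt.
    A positive result means the double mutant is fitter than the product
    of the single-mutant fitnesses predicts; a negative result means it
    is less fit than predicted.
    """
    rel_a = w_a / w_wt
    rel_b = w_b / w_wt
    rel_ab = w_ab / w_wt
    expected_ab = rel_a * rel_b  # multiplicative (no-interaction) expectation
    return rel_ab - expected_ab


# Hypothetical relative fitnesses for two deleterious mutations.
eps = multiplicative_epistasis(w_a=0.9, w_b=0.8, w_ab=0.75)
print(f"epistasis = {eps:+.3f}")
```

With these illustrative numbers the multiplicative expectation is 0.72, so the observed double-mutant fitness of 0.75 gives a small positive deviation.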

    Singletons and their maximal symmetry algebras

    Singletons are those unitary irreducible modules of the Poincaré or (anti) de Sitter group that can be lifted to unitary modules of the conformal group. Higher-spin algebras are the corresponding realizations of the universal enveloping algebra of the conformal algebra on these modules. These objects appear in a wide variety of areas of theoretical physics: AdS/CFT correspondence, electric-magnetic duality, higher-spin multiplets, infinite-component Majorana equations, higher-derivative symmetries, etc. Singletons and higher-spin algebras are reviewed through a list of their many equivalent definitions in order to approach them from various perspectives. The focus of this introduction is on the symmetries of a singleton: its maximal algebra and the manifest realization thereof. Comment: 34 pages, published (split into two distinct pieces) in the proceedings of the "7th spring school and workshop on quantum field theory & Hamiltonian systems" and of the "6th mathematical physics meeting: summer school and conference on modern mathematical physics", v2: references (and related comments) added

    Mutation, surface graphs, and alternating links in surfaces

    In this paper, we study alternating links in thickened surfaces in terms of the lattices of integer flows on their Tait graphs. We use this approach to give a short proof of the first two generalised Tait conjectures. We also prove that the flow lattice is an invariant of alternating links in thickened surfaces and is further invariant under disc mutation. For classical links, the flow lattice and d-invariants are complete invariants of the mutation class of an alternating link. For links in thickened surfaces, we show that this is no longer the case by finding a stronger mutation invariant, namely the Gordon-Litherland linking form. In particular, we find alternating knots in thickened surfaces which have isometric flow lattices but with non-isomorphic linking forms. Comment: 28 pages, 11 figures, and 4 tables