Search CORE

10,190 research outputs found

Clustering with shallow trees

Author: A Braunstein
A Flaxman
Altschul S F
Bradde S Braunstein A Flaxman A Zecchina R
L Foini
M Bailly-Bechet
R Zecchina
S Bradde
Publication venue: 'IOP Publishing'
Publication date: 01/01/2009
Field of study

We propose a new method for hierarchical clustering based on the optimisation of a cost function over trees of limited depth, and we derive a message--passing method that allows to solve it efficiently. The method and algorithm can be interpreted as a natural interpolation between two well-known approaches, namely single linkage and the recently presented Affinity Propagation. We analyze with this general scheme three biological/medical structured datasets (human population based on genetic information, proteins based on sequences and verbal autopsies) and show that the interpolation technique provides new insight.Comment: 11 pages, 7 figure

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

PORTO Publications Open Repository TOrino

Recommended from our members

Inference of single-cell phylogenies from lineage tracing data using Cassiopeia.

Author: Chan Michelle M
Hussmann Jeffrey A
Jones Matthew G
Khodaverdian Alex
Quinn Jeffrey J
Wang Robert
Weissman Jonathan S
Xu Chenling
Yosef Nir
Publication venue: eScholarship, University of California
Publication date: 01/04/2020
Field of study

The pairing of CRISPR/Cas9-based gene editing with massively parallel single-cell readouts now enables large-scale lineage tracing. However, the rapid growth in complexity of data from these assays has outpaced our ability to accurately infer phylogenetic relationships. First, we introduce Cassiopeia-a suite of scalable maximum parsimony approaches for tree reconstruction. Second, we provide a simulation framework for evaluating algorithms and exploring lineage tracer design principles. Finally, we generate the most complex experimental lineage tracing dataset to date, 34,557 human cells continuously traced over 15 generations, and use it for benchmarking phylogenetic inference approaches. We show that Cassiopeia outperforms traditional methods by several metrics and under a wide variety of parameter regimes, and provide insight into the principles for the design of improved Cas9-enabled recorders. Together, these should broadly enable large-scale mammalian lineage tracing efforts. Cassiopeia and its benchmarking resources are publicly available at www.github.com/YosefLab/Cassiopeia

eScholarship - University of California

Computational Molecular Biology

Author: Lenhof H.
Mutzel P.
Vingron M.
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/1996
Field of study

Computational Biology is a fairly new subject that arose in response to the computational problems posed by the analysis and the processing of biomolecular sequence and structure data. The field was initiated in the late 60's and early 70's largely by pioneers working in the life sciences. Physicists and mathematicians entered the field in the 70's and 80's, while Computer Science became involved with the new biological problems in the late 1980's. Computational problems have gained further importance in molecular biology through the various genome projects which produce enormous amounts of data. For this bibliography we focus on those areas of computational molecular biology that involve discrete algorithms or discrete optimization. We thus neglect several other areas of computational molecular biology, like most of the literature on the protein folding problem, as well as databases for molecular and genetic data, and genetic mapping algorithms. Due to the availability of review papers and a bibliography this bibliography

MPG.PuRe

Applications of network optimization

Author
Publication venue: Alfred P. Sloan School of Management, Massachusetts Institute of Technology, 1992.
Publication date: 29/04/2003
Field of study

Includes bibliographical references (p. 41-48).Ravindra K. Ahuja ... [et al.]

DSpace@MIT

Nesting Problems and Steiner Tree Problems

Author: Nielsen Benny Kjær
Publication venue
Publication date: 01/01/2008
Field of study

Copenhagen University Research Information System

Pairwise Alignment of Metamorphic Computer Viruses

Author: McGhee Scott
Publication venue: SJSU ScholarWorks
Publication date: 01/01/2007
Field of study

Computer viruses and other forms of malware pose a threat to virtually any software system (with only a few exceptions). A computer virus is a piece of software which takes advantage of known weaknesses in a software system, and usually has the ability to deliver a malicious payload. A common technique that virus writers use to avoid detection is to enable the virus to change itself by having some kind of self-modifying code. This kind of virus is commonly known as a metamorphic virus, and can be particularly difficult to detect [17]. Existing virus detection software is continually being improved upon in order to keep up with the rising complexity of today’s modern computer viruses. A new approach to detecting metamorphic viruses, which is an extension of an idea posed in a student writing project from a previous semester [17], will be considered in this project. If a large set of viruses in one “family” of metamorphic viruses can be treated as simple sequences of op-codes, then sequence analysis techniques used in other fields of study like bioengineering [4] could be used to develop a profile hidden Markov model (HMM). This profile would then be used to score an arbitrary op-code sequence (i.e. a program which may or may not be in the virus family) – if the output score exceeds a designated threshold it could be concluded that the input sequence was likely to have been from that same virus family. One of the most common techniques to detect viruses is called signature detection, which involves an analysis of known viruses to find signatures, or strings of bytes, which are found in viruses and not in most non-malicious code. If the virus is metamorphic it could potentially be difficult to find a single signature that will consistently be found in every version of a metamorphic virus. Since a profile HMM would score the overall similarity in structure to a virus “family”, it could theoretically detect the virus even if a reliable signature cannot be created. In order to develop a profile HMM for a virus family, the first step is to create a multiple sequence alignment (MSA) for the set of family viruses; this can then be used to “train” the profile HMM. This paper will concentrate on the techniques for creating MSA’s for real world virus op-code sequences which will best match the virus family, as well as to discuss the overall plausibility of the idea of using a profile HMM to detect metamorphic viruses. Creating and testing the profile HMM to detect the viruses will be the subject of another student project

SJSU ScholarWorks