Independence of the Continuum Hypothesis: an Intuitive Introduction
The independence of the continuum hypothesis is a result of broad impact: it
settles a basic question regarding the nature of N and R, two of the most
familiar mathematical structures; it introduces the method of forcing that has
become the main workhorse of set theory; and it has broad implications for
mathematical foundations and on the role of syntax versus semantics. Despite
its broad impact, it is not broadly taught. A main reason is the lack of
accessible expositions for nonspecialists, because the mathematical structures
and techniques employed in the proof are unfamiliar outside of set theory. This
manuscript aims to take a step in addressing this gap by providing an
exposition at a level accessible to advanced undergraduate mathematicians and
theoretical computer scientists, while covering all the technically challenging
parts of the proof.
Comment: Edited the example in the Reflection definition; changed fonts for rank() and nr(); changed fonts for CH to \mathrm{CH}; corrected a few spurious typos.
CONTRAST: a discriminative, phylogeny-free approach to multiple informant de novo gene prediction
CONTRAST is a gene predictor that directly incorporates information from multiple alignments and uses discriminative machine learning techniques to give large improvements in prediction over previous methods.
Using multiple alignments to improve seeded local alignment algorithms
Multiple alignments among genomes are becoming increasingly prevalent. This trend motivates the development of tools for efficient homology search between a query sequence and a database of multiple alignments. In this paper, we present an algorithm that uses the information implicit in a multiple alignment to dynamically build an index that is weighted most heavily towards the promising regions of the multiple alignment. We have implemented Typhon, a local alignment tool that incorporates our indexing algorithm, which our test results show to be more sensitive than algorithms that index only a sequence. This suggests that when applied on a whole-genome scale, Typhon should provide improved homology searches in time comparable to existing algorithms.
Computational genomics : mapping, comparison, and annotation of genomes
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2000. Includes bibliographical references (leaves 180-191). The field of genomics provides many challenges to computer scientists and mathematicians. The area of computational genomics has been expanding recently, and the timely application of computer science in this field is proving to be an essential component of the large international effort in genomics. In this thesis we address key issues in the different stages of genome research: planning of a genome sequencing project, obtaining and assembling sequence information, and ultimately study, cross-species comparison, and annotation of finished genomic sequence. We present applications of computational techniques to the above areas: (1) In relation to the early stages of a genome project, we address physical mapping, and we present results on the theoretical problem of finding minimum superstrings of hypergraphs, a combinatorial problem motivated by physical mapping. We also present a statistical and simulation study of "walking with clone-end sequences", an important method for sequencing a large genome. (2) Turning to the problem of obtaining the finished genomic sequence, we present ARACHNE, a prototype software system for assembling sequence data that are derived from sequencing a genome with the "shotgun" method. (3) Finally, we turn to the computational analysis of finished genomic sequence. We present GLASS, a software system for obtaining global pairwise alignments of orthologous finished sequences. We finally use GLASS to perform a comparative structure and sequence analysis of orthologous human and mouse genomic regions, and develop ROSETTA, the first cross-species comparison-based system for the prediction of protein coding regions in genomic sequences. by Serafim Batzoglou. Ph.D.
An equality theorem prover based on grammar rewriting
Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 1996. Includes bibliographical references (p. 61-62). by Serafim Batzoglou. M.Eng.
CONTRAfold: RNA secondary structure prediction without physics-based models
doi:10.1093/bioinformatics/btl24
Fast and scalable inference of multi-sample cancer lineages.
Somatic variants can be used as lineage markers for the phylogenetic reconstruction of cancer evolution. Since somatic phylogenetics is complicated by sample heterogeneity, novel specialized tree-building methods are required for cancer phylogeny reconstruction. We present LICHeE (Lineage Inference for Cancer Heterogeneity and Evolution), a novel method that automates the phylogenetic inference of cancer progression from multiple somatic samples. LICHeE uses variant allele frequencies of somatic single nucleotide variants obtained by deep sequencing to reconstruct multi-sample cell lineage trees and infer the subclonal composition of the samples. LICHeE is open source and available at http://viq854.github.io/lichee