101 research outputs found

Hot Hands, Streaks and Coin-flips: Numerical Nonsense in the New York Times

Author: Gusfield Dan
Publication date: 29/12/2015

The existence of "Hot Hands" and "Streaks" in sports and gambling is hotly debated, but there is no uncertainty about the recent batting average of the New York Times: it is now two-for-two in mangling and misunderstanding elementary concepts in probability and statistics, and in mixing up the key points of a recent paper that re-examines earlier work on the statistics of streaks. In so doing, its high-visibility articles have added to the general public's confusion about probability, making it seem mysterious and paradoxical when it needn't be. However, those articles make excellent case studies on how to get it wrong, and for discussions in high-school and college classes focusing on quantitative reasoning, data analysis, probability and statistics. What I have written here is intended for that audience.

arXiv.org e-Print Archive

eScholarship - University of California

Gödel for Goldilocks: A Rigorous, Streamlined Proof of (a variant of) Gödel's First Incompleteness Theorem

Author: Gusfield Dan
Publication date: 21/09/2014

Most discussions of Gödel's theorems fall into one of two types: either they emphasize perceived philosophical, cultural "meanings" of the theorems, and perhaps sketch some of the ideas of the proofs, usually relating Gödel's proofs to riddles and paradoxes, but do not attempt to present rigorous, complete proofs; or they do present rigorous proofs, but in the traditional style of mathematical logic, with all of its heavy notation and difficult definitions, and technical issues which reflect Gödel's original approach and broader logical issues. Many non-specialists are frustrated by these two extreme types of expositions and want a complete, rigorous proof that they can understand. Such an exposition is possible, because many people have realized that variants of Gödel's first incompleteness theorem can be rigorously proved by a simpler middle approach, avoiding philosophical discussions and hand-waving at one extreme, and also avoiding the heavy machinery of traditional mathematical logic, and many of the harder details of Gödel's original proof, at the other extreme. This is the just-right Goldilocks approach. In this exposition we give a short, self-contained Goldilocks exposition of Gödel's first theorem, aimed at a broad, undergraduate audience.

Comment: Version 2 corrects typos and one definition in the first version, and expands or contracts parts of the exposition, but the main content remains the same. Version 3 removes an unnecessary comment in Version

arXiv.org e-Print Archive

eScholarship - University of California

19th International Workshop on Algorithms in Bioinformatics (WABI 2019)

Author: Gusfield Dan
Huber Katharina
Publication venue: Schloss Dagstuhl – Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing, Saarbrücken/Wadern, Germany.
Publication date: 01/01/2019

Front Matter, Table of Contents, Preface, Conference Organization

Dagstuhl Research Online Publication Server

University of East Anglia digital repository

Linear time algorithms for finding and representing all the tandem repeats in a string

Author: Gusfield Dan
Stoye Jens
Publication venue: Elsevier BV
Publication date: 01/01/2004

Gusfield D, Stoye J. Linear time algorithms for finding and representing all the tandem repeats in a string. Journal of Computer and System Sciences. 2004;69(4):525-546.

A tandem repeat (or square) is a string αα, where α is a non-empty string. We present an O(|S|)-time algorithm that operates on the suffix tree T(S) for a string S, finding and marking the endpoint in T(S) of every tandem repeat that occurs in S. This decorated suffix tree implicitly represents all occurrences of tandem repeats in S, and can be used to efficiently solve many questions concerning tandem repeats and tandem arrays in S. This improves and generalizes several prior efforts to efficiently capture large subsets of tandem repeats.
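To make the definition of a tandem repeat concrete, here is a minimal brute-force sketch in Python. It is not the suffix-tree algorithm described in the abstract and does not run in linear time; the function name `tandem_repeats` and the `(start, half_length)` output format are assumptions chosen for illustration only.

```python
def tandem_repeats(s):
    """Return (start, half_length) for every occurrence of a square in s.

    Brute force for illustration only; NOT the linear-time suffix-tree
    method of the paper.
    """
    n = len(s)
    found = []
    for start in range(n):
        # half is |alpha|; the candidate square is s[start : start + 2*half]
        for half in range(1, (n - start) // 2 + 1):
            if s[start:start + half] == s[start + half:start + 2 * half]:
                found.append((start, half))
    return found


if __name__ == "__main__":
    print(tandem_repeats("abab"))  # [(0, 2)]
    print(tandem_repeats("aaa"))   # [(0, 1), (1, 1)]
```

Each reported pair identifies one occurrence: the substring of length `2 * half_length` beginning at `start` has the form αα.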

Elsevier - Publisher Connector

Publications at Bielefeld University

An efficiently computed lower bound on the number of recombinations in phylogenetic networks: Theory and empirical study

Author: Eddhu Satish
Gusfield Dan
Hickerson Dean
Publication venue: Elsevier B.V.
Publication date: 01/04/2007

Phylogenetic networks are models of sequence evolution that go beyond trees, allowing biological operations that are not tree-like. One of the most important biological operations is recombination between two sequences. An established problem [J. Hein, Reconstructing evolution of sequences subject to recombination using parsimony, Math. Biosci. 98 (1990) 185–200; J. Hein, A heuristic method to reconstruct the history of sequences subject to recombination, J. Molecular Evolution 36 (1993) 396–405; Y. Song, J. Hein, Parsimonious reconstruction of sequence evolution and haplotype blocks: finding the minimum number of recombination events, in: Proceedings of 2003 Workshop on Algorithms in Bioinformatics, Berlin, Germany, 2003, Lecture Notes in Computer Science, Springer, Berlin; Y. Song, J. Hein, On the minimum number of recombination events in the evolutionary history of DNA sequences, J. Math. Biol. 48 (2003) 160–186; L. Wang, K. Zhang, L. Zhang, Perfect phylogenetic networks with recombination, J. Comput. Biol. 8 (2001) 69–78; S.R. Myers, R.C. Griffiths, Bounds on the minimum number of recombination events in a sample history, Genetics 163 (2003) 375–394; V. Bafna, V. Bansal, Improved recombination lower bounds for haplotype data, in: Proceedings of RECOMB, 2005; Y. Song, Y. Wu, D. Gusfield, Efficient computation of close lower and upper bounds on the minimum number of needed recombinations in the evolution of biological sequences, Bioinformatics 21 (2005) i413–i422. Bioinformatics (Suppl. 1), Proceedings of ISMB, 2005; D. Gusfield, S. Eddhu, C. Langley, Optimal, efficient reconstruction of phylogenetic networks with constrained recombination, J. Bioinform. Comput. Biol. 2(1) (2004) 173–213; D. Gusfield, Optimal, efficient reconstruction of root-unknown phylogenetic networks with constrained and structured recombination, J. Comput. Systems Sci. 70 (2005) 381–398] is to find a phylogenetic network that derives an input set of sequences, minimizing the number of recombinations used. No efficient, general algorithm is known for this problem. Several papers consider the problem of computing a lower bound on the number of recombinations needed. In this paper we establish a new, efficiently computed lower bound. This result is useful in methods to estimate the number of needed recombinations, and also to prove the optimality of algorithms for constructing phylogenetic networks under certain conditions [D. Gusfield, S. Eddhu, C. Langley, Optimal, efficient reconstruction of phylogenetic networks with constrained recombination, J. Bioinform. Comput. Biol. 2(1) (2004) 173–213; D. Gusfield, Optimal, efficient reconstruction of root-unknown phylogenetic networks with constrained and structured recombination, J. Comput. Systems Sci. 70 (2005) 381–398; D. Gusfield, Optimal, efficient reconstruction of root-unknown phylogenetic networks with constrained recombination, Technical Report, Department of Computer Science, University of California, Davis, CA, 2004]. The lower bound is based on a structural, combinatorial insight, using only the site conflicts and incompatibilities, and hence it is fundamental and applicable to many biological phenomena other than recombination, for example, when gene conversions or recurrent or back mutations or cross-species hybridizations cause the phylogenetic history to deviate from a tree structure. In addition to establishing the bound, we examine its use in more complex lower bound methods, and compare the bounds obtained to those obtained by other established lower bound methods.
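The abstract does not define site incompatibility. As a hedged illustration, the sketch below assumes binary sequences and the standard four-gamete test, under which two sites are incompatible when all four combinations 00, 01, 10 and 11 appear across the input sequences. It only lists incompatible site pairs; it does not compute the lower bound established in the paper. The function name `incompatible_pairs` is hypothetical.

```python
from itertools import combinations


def incompatible_pairs(sequences):
    """List site pairs (a, b) that fail the four-gamete test.

    Assumes every sequence is a string over {'0', '1'} of equal length.
    This is an illustrative assumption, not the paper's bound.
    """
    num_sites = len(sequences[0]) if sequences else 0
    result = []
    for a, b in combinations(range(num_sites), 2):
        # Collect the distinct (site a, site b) state pairs seen in the data.
        gametes = {(row[a], row[b]) for row in sequences}
        # With binary states, four distinct pairs means 00, 01, 10, 11 all occur.
        if len(gametes) == 4:
            result.append((a, b))
    return result


if __name__ == "__main__":
    print(incompatible_pairs(["00", "01", "10", "11"]))  # [(0, 1)]
    print(incompatible_pairs(["000", "011", "101"]))     # []
```

A pair reported here cannot be explained by a single tree with one mutation per site, which is why such pairs signal recombination or the other non-tree-like events the abstract mentions.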

Elsevier - Publisher Connector

A simple, practical and complete O(n^3/log n)-time Algorithm for RNA folding using the Four-Russians Speedup

Author: Dan Gusfield
Yelena Frid
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: The problem of computationally predicting the secondary structure (or folding) of RNA molecules was first introduced more than thirty years ago and yet continues to be an area of active research and development. The basic RNA-folding problem of finding a maximum cardinality, non-crossing matching of complementary nucleotides in an RNA sequence of length n has an O(n^3)-time dynamic programming solution that is widely applied. It is known that an o(n^3) worst-case time solution is possible, but the published and suggested methods are complex and have not been established to be practical. Significant practical improvements to the original dynamic programming method have been introduced, but they retain the O(n^3) worst-case time bound when n is the only problem parameter used in the bound. Surprisingly, the most widely used general technique to achieve a worst-case (and often practical) speed-up of dynamic programming, the Four-Russians technique, has not been previously applied to the RNA-folding problem. This is perhaps due to technical issues in adapting the technique to RNA folding.

Results: In this paper, we give a simple, complete, and practical Four-Russians algorithm for the basic RNA-folding problem, achieving a worst-case time bound of O(n^3/log(n)).

Conclusions: We show that this time bound can also be obtained for richer nucleotide matching scoring schemes, and that the method achieves consistent speed-ups in practice. The contribution is both theoretical and practical, since the basic RNA-folding problem is often solved multiple times in the inner loop of more complex algorithms, and for long RNA molecules in the study of RNA virus genomes.
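For orientation, the following is a minimal Python sketch of the basic cubic-time dynamic program that the abstract takes as its starting point. It is not the Four-Russians algorithm of the paper. The function name `max_pairs` and the restriction to Watson-Crick pairs (A-U, C-G) with no minimum loop length are assumptions made for illustration.

```python
def max_pairs(rna):
    """Maximum number of non-crossing complementary pairs in rna.

    Plain O(n^3) dynamic program; NOT the O(n^3/log n) Four-Russians
    method. Assumes only A-U and C-G pairs and no minimum loop length.
    """
    pairs = {("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}
    n = len(rna)
    if n == 0:
        return 0
    # best[i][j] = size of the best matching within rna[i..j] (inclusive).
    best = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            # Case 1: position j is left unpaired.
            value = best[i][j - 1]
            # Case 2: position j pairs with some k in [i, j-1], which
            # splits the interval into rna[i..k-1] and rna[k+1..j-1].
            for k in range(i, j):
                if (rna[k], rna[j]) in pairs:
                    left = best[i][k - 1] if k > i else 0
                    inside = best[k + 1][j - 1] if k + 1 <= j - 1 else 0
                    value = max(value, left + 1 + inside)
            best[i][j] = value
    return best[0][n - 1]


if __name__ == "__main__":
    print(max_pairs("ACGU"))       # 2
    print(max_pairs("GGGAAAUCC"))  # 3
```

The three nested loops over `span`, `i` and `k` are the source of the cubic bound; the inner loop over `k` is the part that a Four-Russians style speed-up would target.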

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Constructing perfect phylogenies and proper triangulations for three-state characters

Author: Dan Gusfield
Fumei Lam
Rob Gysel
Publication venue: Springer Science and Business Media LLC

Crossref

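Escândalos, marolas e finanças: para uma sociologia da transformação do ambiente econômico [Scandals, ripples and finance: toward a sociology of the transformation of the economic environment]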

Author: Roberto Grün
Publication venue: FapUNIFESP (SciELO)
Publication date: 01/01/2008

Crossref

Simple and flexible detection of contiguous repeats using a suffix tree

Author: Gusfield Dan
Stoye Jens
Publication venue: Elsevier BV
Publication date: 01/01/2002

Stoye J, Gusfield D. Simple and flexible detection of contiguous repeats using a suffix tree. Theoretical Computer Science. 2002;270(1-2):843-856.

We study the problem of detecting all occurrences of (primitive) tandem repeats and tandem arrays in a string. We first give a simple time- and space-optimal algorithm to find all tandem repeats, and then modify it to become a time- and space-optimal algorithm for finding only the primitive tandem repeats. Both of these algorithms are then extended to handle tandem arrays. The contribution of this paper is both pedagogical and practical, giving simple algorithms and implementations based on a suffix tree, using only standard tree traversal techniques.
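The abstract does not spell out what "primitive" means or how a tandem array is measured. The Python sketch below assumes the usual string-combinatorics reading: a string is primitive if it is not a repetition of a shorter string, and a tandem array is a run of consecutive copies of one string. Both helper names (`is_primitive`, `count_copies`) are hypothetical, and neither uses a suffix tree as the paper does.

```python
def is_primitive(alpha):
    """True if alpha is non-empty and is not beta repeated k >= 2 times."""
    # alpha is a proper power exactly when it reappears inside alpha + alpha
    # at an offset strictly between 0 and len(alpha).
    return bool(alpha) and (alpha + alpha).find(alpha, 1) == len(alpha)


def count_copies(s, start, period):
    """Count consecutive copies of s[start:start+period] beginning at start."""
    alpha = s[start:start + period]
    if period <= 0 or len(alpha) < period:
        return 0
    copies = 0
    pos = start
    # Advance one period at a time while the next block still equals alpha.
    while s[pos:pos + period] == alpha:
        copies += 1
        pos += period
    return copies


if __name__ == "__main__":
    print(is_primitive("ab"))             # True
    print(is_primitive("abab"))           # False
    print(count_copies("abababc", 0, 2))  # 3
```

Under these assumptions, a position where `count_copies` returns at least 2 for a primitive `alpha` marks a primitive tandem repeat, and larger counts describe longer runs of the same unit.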

Elsevier - Publisher Connector

Crossref

Publications at Bielefeld University