Neural CRF Parsing

Authors: Durrett, Greg; Klein, Dan
Publication date: 01/01/2015

This paper describes a parsing model that combines the exact dynamic programming of CRF parsing with the rich nonlinear featurization of neural net approaches. Our model is structurally a CRF that factors over anchored rule productions, but instead of linear potential functions based on sparse features, we use nonlinear potentials computed via a feedforward neural network. Because potentials are still local to anchored rules, structured inference (CKY) is unchanged from the sparse case. Computing gradients during learning involves backpropagating an error signal formed from standard CRF sufficient statistics (expected rule counts). Using only dense features, our neural CRF already exceeds a strong baseline CRF model (Hall et al., 2014). In combination with sparse features, our system achieves 91.1 F1 on section 23 of the Penn Treebank, and more generally outperforms the best prior single parser results on a range of languages.

Comment: Accepted for publication at ACL 201
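The point that inference is unchanged when potentials stay local can be illustrated with a minimal sketch. It is not the authors' model: it assumes unlabeled binary trees, treats each split `(i, k, j)` as the anchored production, and uses two made-up potential functions (`linear_potential` and `nonlinear_potential` are hypothetical stand-ins for a sparse linear scorer and a neural scorer). The inside pass of CKY only ever calls `potential(i, k, j)`, so swapping the scorer does not touch the dynamic program.

```python
import math


def log_partition(n, potential):
    """Inside pass of CKY over unlabeled binary trees on n words.

    potential(i, k, j) returns the log-potential of the anchored
    production that splits span (i, j) into (i, k) and (k, j).
    Returns the log of the sum over all trees of exp(total score).
    """
    # Width-1 spans (single words) carry no production, so log-score 0.
    inside = {(i, i + 1): 0.0 for i in range(n)}
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            terms = [
                potential(i, k, j) + inside[i, k] + inside[k, j]
                for k in range(i + 1, j)
            ]
            # Numerically stable log-sum-exp over split points.
            m = max(terms)
            inside[i, j] = m + math.log(sum(math.exp(t - m) for t in terms))
    return inside[0, n]


def linear_potential(i, k, j):
    # Hypothetical linear scorer: a weighted sum of two span features.
    return 0.1 * (j - i) - 0.2 * abs((k - i) - (j - k))


def nonlinear_potential(i, k, j):
    # Hypothetical nonlinear scorer: the same kind of features pushed
    # through a tanh, standing in for a feedforward network.
    return math.tanh(0.5 * (k - i) - 0.3 * (j - k))


if __name__ == "__main__":
    for name, fn in [("linear", linear_potential),
                     ("nonlinear", nonlinear_potential)]:
        print(name, log_partition(6, fn))
```

With every potential set to zero, the partition function simply counts binary trees, which gives a quick sanity check on the recursion.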

Coexistence in stochastic spatial models

Author: Durrett, Rick
Publication venue: Institute of Mathematical Statistics
Publication date: 12/06/2009

In this paper I will review twenty years of work on the question: When is there coexistence in stochastic spatial models? The answer, announced in Durrett and Levin [Theor. Pop. Biol. 46 (1994) 363--394] and explained in this paper, is that this can be determined by examining the mean-field ODE. There are a number of rigorous results in support of this picture, but we will state nine challenging and important open problems, most of which date from the 1990s.

Comment: Published at http://dx.doi.org/10.1214/08-AAP590 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org

Waiting for regulatory sequences to appear

Authors: Durrett, Richard; Schmidt, Deena
Publication venue: Institute of Mathematical Statistics
Publication date: 01/01/2007

One possible explanation for the substantial organismal differences between humans and chimpanzees is that there have been changes in gene regulation. Given what is known about transcription factor binding sites, this motivates the following probability question: given a 1000 nucleotide region in our genome, how long does it take for a specified six to nine letter word to appear in that region in some individual? Stone and Wray [Mol. Biol. Evol. 18 (2001) 1764--1770] computed 5,950 years as the answer for six letter words. Here, we will show that for words of length 6, the average waiting time is 100,000 years, while for words of length 8, the waiting time has mean 375,000 years when there is a 7 out of 8 letter match in the population consensus sequence (an event of probability roughly 5/16) and has mean 650 million years when there is not. Fortunately, in biological reality, the match to the target word does not have to be perfect for binding to occur. If we model this by saying that a 7 out of 8 letter match is good enough, the mean reduces to about 60,000 years.

Comment: Published at http://dx.doi.org/10.1214/105051606000000619 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org
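The "roughly 5/16" figure can be checked with a back-of-the-envelope calculation. The sketch below rests on assumptions that are not stated in the abstract: nucleotides are independent and uniform over four letters, a near match means exactly one mismatch, and overlapping positions are handled with a Poisson approximation. The function name `prob_near_match` and its parameters are hypothetical, and the paper's own calculation may differ.

```python
import math


def prob_near_match(region_len=1000, word_len=8, mismatches=1, alphabet=4):
    """Approximate chance that a random region contains at least one
    window matching a fixed word in all but `mismatches` positions.

    Assumes i.i.d. uniform letters and a Poisson approximation for the
    number of matching windows (both are simplifying assumptions).
    """
    # Number of windows of length word_len in the region.
    positions = region_len - word_len + 1
    # Probability that one window has exactly `mismatches` wrong letters.
    p_window = (
        math.comb(word_len, mismatches)
        * (alphabet - 1) ** mismatches
        / alphabet ** word_len
    )
    expected_hits = positions * p_window
    # Poisson approximation: P(at least one hit) = 1 - exp(-mean).
    return 1.0 - math.exp(-expected_hits)


if __name__ == "__main__":
    print(prob_near_match())  # compare with 5/16 = 0.3125
```

Under these assumptions the value comes out close to 0.3, which is in the neighborhood of the 5/16 quoted in the abstract.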

Asymptotic behavior of Aldous' gossip process

Authors: Chatterjee, Shirshendu; Durrett, Rick
Publication venue: Institute of Mathematical Statistics
Publication date: 01/01/2010

Aldous [(2007) Preprint] defined a gossip process in which space is a discrete $N\times N$ torus, and the state of the process at time $t$ is the set of individuals who know the information. Information spreads from a site to its nearest neighbors at rate 1/4 each and at rate $N^{-\alpha}$ to a site chosen at random from the torus. We will be interested in the case in which $\alpha<3$, where the long range transmission significantly accelerates the time at which everyone knows the information. We prove three results that precisely describe the spread of information in a slightly simplified model on the real torus. The time until everyone knows the information is asymptotically $T=(2-2\alpha/3)N^{\alpha/3}\log N$. If $\rho_s$ is the fraction of the population who know the information at time $s$ and $\varepsilon$ is small then, for large $N$, the time until $\rho_s$ reaches $\varepsilon$ is $T(\varepsilon)\approx T+N^{\alpha/3}\log (3\varepsilon /M)$, where $M$ is a random variable determined by the early spread of the information. The value of $\rho_s$ at time $s=T(1/3)+tN^{\alpha/3}$ is almost a deterministic function $h(t)$ which satisfies an odd looking integro-differential equation. The last result confirms a heuristic calculation of Aldous.

Comment: Published at http://dx.doi.org/10.1214/10-AAP750 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org
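To get a feel for the scales involved, the two asymptotic expressions in the abstract can be evaluated directly. This is only a sketch that plugs numbers into the stated formulas: the function names `cover_time` and `time_to_fraction` are hypothetical, the formulas are asymptotic in $N$, and $M$ is a random variable in the paper, so the fixed value `m` passed here is purely illustrative.

```python
import math


def cover_time(n, alpha):
    """Evaluate T = (2 - 2*alpha/3) * N^(alpha/3) * log N.

    The abstract states this asymptotic form for alpha < 3.
    """
    if alpha >= 3:
        raise ValueError("the abstract's formula is stated for alpha < 3")
    return (2 - 2 * alpha / 3) * n ** (alpha / 3) * math.log(n)


def time_to_fraction(n, alpha, eps, m):
    """Evaluate T(eps) ~ T + N^(alpha/3) * log(3*eps/M).

    m stands in for one realization of the random variable M; any
    particular value used here is an illustrative assumption.
    """
    return cover_time(n, alpha) + n ** (alpha / 3) * math.log(3 * eps / m)


if __name__ == "__main__":
    n, alpha = 10_000, 1.5
    print("T        =", cover_time(n, alpha))
    print("T(0.01)  =", time_to_fraction(n, alpha, eps=0.01, m=1.0))
```

For small $\varepsilon$ the logarithm is negative, so $T(\varepsilon)$ falls before $T$, consistent with a small fraction being informed earlier than the whole population.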
