16 research outputs found

Alignment as the Basis for Successful Communication

Author: A. Meltzoff
A.A. Cleland
A.G. Greenwald
A.J. Sanford
B. Hommel
B. Szmrecsanyi
C.A. Fowler
D. Tannen
D.K. Lewis
E.A. Isaacs
E.V. Clark
G. Jefferson
G. Nunberg
G. Rizzolatti
G.T.M. Altmann
H.H. Clark
H.P. Branigan
J.A. Fodor
K. Aijmer
K. Kuiper
K.A. Ericsson
L. Fadiga
M.F. Schober
M.J. Pickering
Martin J. Pickering
P. Brooks
P.N. Johnson-Laird
R. Jackendoff
R.A. Zwaan
R.J. Hartsuiker
S. Garrod
S.A. Goldinger
S.C. Levinson
S.E. Brennan
S.L. Haywood
S.T. Gries
Simon Garrod
T.L. Chartrand
W. Tabor
W.J.M. Levelt
W.S. Horton
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Polytrauma

Author: A. Gries
K.G. Kranz
S.T. Rose
T. Ziegenfuss
U. Kreimeier
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2007

Crossref

Analyzing word frequencies in large text corpora using inter-arrival times and bootstrapping

Author: A. Kilgarriff
A.-L. Barabási
B. Szmrecsanyi
B.V. North
D. Biber
G.K. Zipf
J. Kleinberg
P. Rayson
R. Baeza-Yates
R. Kumar
S.T. Gries
T. Dunning
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2011

Comparing frequency counts over texts or corpora is an important task in many applications and scientific disciplines. Given a text corpus, we want to test a hypothesis, such as “word X is frequent”, “word X has become more frequent over time”, or “word X is more frequent in male than in female speech”. For this purpose we need a null model of word frequencies. The commonly used bag-of-words model, which corresponds to a Bernoulli process with a fixed parameter, does not account for any structure present in natural languages. Using this model for word frequencies results in large numbers of words being reported as unexpectedly frequent. We address how to take into account the inherent occurrence patterns of words in significance testing of word frequencies. Based on studies of words in two large corpora, we propose two methods for modeling word frequencies that both take into account the occurrence patterns of words and go beyond the bag-of-words assumption. The first method models word frequencies based on the spatial distribution of individual words in the language. The second method is based on bootstrapping and takes into account only word frequency at the text level. The proposed methods are compared to the current gold standard in a series of experiments on both corpora. We find that words obey different spatial patterns in the language, ranging from bursty to non-bursty/uniform, independent of their frequency, showing that the traditional approach leads to many false positives.
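
The two ideas named in the abstract above, inter-arrival times and a text-level bootstrap, can be illustrated with a minimal Python sketch. This is not the authors' procedure, which the abstract does not spell out. The function names, the resampling scheme (texts drawn with replacement within each corpus), and the add-one smoothing of the p-value are all assumptions made for illustration.

```python
import random


def inter_arrival_times(tokens, word):
    """Gaps, in tokens, between consecutive occurrences of `word`."""
    positions = [i for i, tok in enumerate(tokens) if tok == word]
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


def relative_frequency(texts, word):
    """Occurrences of `word` divided by the total token count of `texts`."""
    total = sum(len(text) for text in texts)
    hits = sum(text.count(word) for text in texts)
    return hits / total if total else 0.0


def bootstrap_p_value(corpus_a, corpus_b, word, n_resamples=1000, seed=0):
    """Assumed one-sided test: is `word` more frequent in corpus_a than corpus_b?

    Each corpus is a list of texts, and each text is a list of tokens.
    Whole texts are resampled with replacement, so only text-level
    frequencies matter, not token positions inside a text.
    """
    rng = random.Random(seed)
    not_greater = 0
    for _ in range(n_resamples):
        sample_a = rng.choices(corpus_a, k=len(corpus_a))
        sample_b = rng.choices(corpus_b, k=len(corpus_b))
        if relative_frequency(sample_a, word) <= relative_frequency(sample_b, word):
            not_greater += 1
    # Add-one smoothing keeps the estimate strictly above zero (an assumption).
    return (not_greater + 1) / (n_resamples + 1)
```

Short inter-arrival times clustered together indicate a bursty word, and resampling whole texts lets that burstiness widen the null distribution instead of treating every token as an independent draw.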

Crossref

Ghent University Academic Bibliography

Birkbeck Institutional Research Online

Omissibility of a Preposition in the Omission of a Prepositional Object in English Prepositional Phrases

Author: Aarts B
Browne W
Dixon R.M.W
Elbourne P
Fraser B
Gries S.T
Liu D
Mittwoch A
Morgan J
Publication venue: 'Institute for the Study of Language and Information, Kyung Hee University'

Crossref

Size matters: finding the most informative set of window lengths

Author: A.J. Gentles
C. Bourgain
E.D. Demaine
E.F. Kirkness
G. Benson
H. Mannila
H. Toivonen
J. Lijffijt
M.K. Das
R. Tang
R.M. Karp
S. Evert
S.M. Katz
S.T. Gries
T. Calders
Y. Li
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Event sequences often contain continuous variability at different levels. In other words, their properties and characteristics change at different rates, concurrently. For example, the sales of a product may slowly become more frequent over a period of several weeks, but there may be interesting variation within a week at the same time. To provide an accurate and robust “view” of such multi-level structural behavior, one needs to determine the appropriate levels of granularity for analyzing the underlying sequence. We introduce the novel problem of finding the best set of window lengths for analyzing discrete event sequences. We define suitable criteria for choosing window lengths and propose an efficient method to solve the problem. We give examples of tasks that demonstrate the applicability of the problem and present extensive experiments on both synthetic data and real data from two domains: text and DNA. We find that the optimal sets of window lengths themselves can provide new insight into the data, e.g., the burstiness of events affects the optimal window lengths for measuring the event frequencies.
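
To make the role of window length concrete, the following Python sketch counts events in consecutive, non-overlapping windows of a chosen length. It does not implement the selection criteria or the efficient method mentioned in the abstract above, neither of which is described there. The function name `window_counts` and the use of non-overlapping windows are assumptions for illustration only.

```python
def window_counts(event_positions, sequence_length, window_length):
    """Count events in consecutive non-overlapping windows.

    `event_positions` are 0-based indices into a sequence of
    `sequence_length` items; the last window may be shorter.
    """
    n_windows = -(-sequence_length // window_length)  # ceiling division
    counts = [0] * n_windows
    for position in event_positions:
        counts[position // window_length] += 1
    return counts


# Hypothetical data: four events each, in a sequence of length 20.
bursty = [0, 1, 2, 3]
uniform = [0, 5, 10, 15]

for length in (5, 10):
    print(length, window_counts(bursty, 20, length), window_counts(uniform, 20, length))
```

With the same number of events, the bursty sequence gives very uneven counts at both window lengths while the uniform one gives flat counts, which is the kind of contrast that makes the choice of window length informative.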

CiteSeerX

Crossref

Ghent University Academic Bibliography

Birkbeck Institutional Research Online