HOW MANY WORDS ARE THERE?
The commonsensical assumption that any language has only finitely many words is shown to be false by a combination of formal and empirical arguments. Zipf's Law and related formulas are investigated and a more complex model is offered.
Developing academic vocabulary objectives: corpus analysis and word lists
This chapter is concerned with the development of a vocabulary syllabus to accompany and supplement a coursebook series. To further enhance the use of these books, we decided to initiate a project to develop an academic vocabulary syllabus. This chapter outlines the process of analyzing corpus data, deciding which lexis to cover and how to spread it across the course of study, creating support materials to facilitate learning, and incorporating the syllabus into the wider curriculum.
Log-log Convexity of Type-Token Growth in Zipf's Systems
It is traditionally assumed that Zipf's law implies the power-law growth of
the number of different elements with the total number of elements in a system
- the so-called Heaps' law. We show that a careful definition of Zipf's law
leads to the violation of Heaps' law in random systems, and obtain alternative
growth curves. These curves fulfill universal data collapses that only depend
on the value of the Zipf's exponent. We observe that real books behave very
much in the same way as random systems, despite the presence of burstiness in
word occurrence. We advance an explanation for this unexpected correspondence.
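The type-token growth described in this abstract can be explored with a small simulation. The following is a minimal sketch, not the authors' method: it assumes a finite vocabulary whose rank-r word is drawn independently with probability proportional to r^(-exponent), and it records how many distinct types have appeared after each token. The function name and all parameters (`vocab_size`, `exponent`, `n_tokens`, `seed`) are illustrative assumptions.

```python
import random
from itertools import accumulate


def zipf_type_token_curve(vocab_size, exponent, n_tokens, seed=0):
    """Return a list whose k-th entry is the number of distinct
    word types seen among the first k+1 randomly drawn tokens."""
    rng = random.Random(seed)
    # Assumed Zipf weights: rank r has weight r**(-exponent).
    weights = [r ** -exponent for r in range(1, vocab_size + 1)]
    cum_weights = list(accumulate(weights))
    # Draw tokens independently (a "random system" with no burstiness).
    tokens = rng.choices(range(1, vocab_size + 1),
                         cum_weights=cum_weights, k=n_tokens)
    seen = set()
    curve = []
    for token in tokens:
        seen.add(token)
        curve.append(len(seen))
    return curve


curve = zipf_type_token_curve(vocab_size=1000, exponent=1.0, n_tokens=5000)
```

Plotting `curve` against the token index on log-log axes is one way to check by eye whether the growth follows a straight line, as a pure power law would, or bends.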
A characterization of horizontal visibility graphs and combinatorics on words
A Horizontal Visibility Graph (HVG for short) is defined in association
with an ordered set of non-negative reals. HVGs realize a methodology in the
analysis of time series, their degree distribution being a good discriminator
between randomness and chaos [B. Luque, et al., Phys. Rev. E 80 (2009),
046103]. We prove that a graph is an HVG if and only if it is outerplanar and has a
Hamilton path. Therefore, an HVG is a noncrossing graph, as defined in
algebraic combinatorics [P. Flajolet and M. Noy, Discrete Math., 204 (1999)
203-229]. Our characterization of HVGs implies a linear time recognition
algorithm. Treating ordered sets as words, we characterize subfamilies of HVGs
highlighting various connections with combinatorial statistics and introducing
the notion of a visible pair. With this technique we determine asymptotically
the average number of edges of HVGs. Comment: 6 pages
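For readers unfamiliar with HVGs, the construction can be sketched in a few lines. This sketch uses the standard definition from the time-series literature: positions i < j are joined when every value strictly between them is lower than both endpoints. It is a simple quadratic-time illustration, not the linear-time recognition algorithm mentioned in the abstract, and the function name is an assumption.

```python
def horizontal_visibility_edges(series):
    """Return the HVG edges (i, j), i < j, of a sequence of reals.

    i and j are adjacent when every value strictly between them
    is smaller than min(series[i], series[j]).
    """
    edges = []
    n = len(series)
    for i in range(n - 1):
        highest_between = float("-inf")  # max of series[i+1 .. j-1]
        for j in range(i + 1, n):
            if highest_between < min(series[i], series[j]):
                edges.append((i, j))
            highest_between = max(highest_between, series[j])
            # Once a value reaches the height of series[i],
            # nothing further to the right can see position i.
            if highest_between >= series[i]:
                break
    return edges


edges = horizontal_visibility_edges([3, 1, 2, 4])
# edges == [(0, 1), (0, 2), (0, 3), (1, 2), (2, 3)]
```

Consecutive positions are always adjacent, so the path 0, 1, ..., n-1 is a Hamilton path, in line with the characterization stated in the abstract.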
Assessing direct contributions of morphological awareness and prosodic sensitivity to children’s word reading and reading comprehension
We examined the independent contributions of prosodic sensitivity and morphological awareness to word reading, text reading accuracy, and reading comprehension. We did so in a longitudinal study of English-speaking children (N = 70). At 5 to 7 years of age, children completed the metalinguistic measures along with control measures of phonological awareness and vocabulary. Children completed the reading measures two years later. Morphological awareness, but not prosodic sensitivity, made a significant independent contribution to word reading, text reading accuracy, and reading comprehension. The effects of morphological awareness on reading comprehension remained after controls for word reading. These results suggest that morphological awareness needs to be considered seriously in models of reading development and that prosodic sensitivity might have primarily indirect relations to reading outcomes.
Keywords: Morphological Awareness; Prosody; Word Reading; Reading Comprehension
CCBS – a method to maintain memorability, accuracy of password submission and the effective password space in click-based visual passwords
Text passwords are vulnerable to many security attacks for a number of reasons, such as the insecure practices of end
users who select weak passwords that are easy to remember over the long term. As such, visual password (VP) solutions were
developed to maintain the security and usability of user authentication in collaborative systems. This paper focuses on the
challenges facing click-based visual password systems and proposes a novel method in response to them. For instance,
hotspots reveal a serious vulnerability. They occur because users are attracted to specific parts of an image and neglect
other areas. Image analysis that identifies these high-probability areas can assist dictionary attacks.
Another concern is that click-based systems do not guide users towards the correct click-point they are aiming to
select. For instance, users might recall the correct spot or area but still fail to place their click within the tolerance
distance around the original click-point, which results in more incorrect password submissions.
Furthermore, the Passpoints study by Wiedenbeck et al. (2005) examined the retention of their VP in comparison with
text passwords over the long term. Despite being a cued-recall scheme, the success rate of VP submission was not superior
to that of text passwords: it decreased from 85% (instant retention on the day of registration) to 55% after 2 weeks. This
result was identical to that of the text password in the same experiment. The successful submission rates after 6 weeks
were also 55% for both VP and text passwords.
This paper addresses these issues, and then presents a novel method (CCBS) as a usable solution supported by an
empirical proof. A user study is conducted and the results are evaluated against a comparative study.
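The tolerance-distance problem described in this abstract can be illustrated with a minimal sketch. It assumes that click-points are (x, y) pixel coordinates, that tolerance is a Euclidean radius around each registered point, and that clicks must be entered in the registered order. These are illustrative assumptions, as are the function names; the sketch does not represent the CCBS method or any specific system.

```python
import math


def click_matches(original, attempt, tolerance):
    """True if the attempted click lies within `tolerance` pixels
    (Euclidean distance, an assumption) of the registered click-point."""
    return math.dist(original, attempt) <= tolerance


def password_accepted(original_points, attempted_points, tolerance):
    """Accept only if every click, in order, falls within tolerance."""
    if len(original_points) != len(attempted_points):
        return False
    return all(
        click_matches(original, attempt, tolerance)
        for original, attempt in zip(original_points, attempted_points)
    )


registered = [(10, 10), (50, 80)]
near_miss = [(12, 11), (60, 80)]  # second click is 10 pixels away
accepted = password_accepted(registered, near_miss, tolerance=5)  # False
```

The example shows the failure mode the abstract mentions: the user clicks in roughly the right area, but a single click outside the tolerance radius rejects the whole submission.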
Role of radical awareness in the character and word acquisition of Chinese children
Includes bibliographical references
On Hilberg's Law and Its Links with Guiraud's Law
Hilberg (1990) supposed that finite-order excess entropy of a random human
text is proportional to the square root of the text length. Assuming that
Hilberg's hypothesis is true, we derive Guiraud's law, which states that the
number of word types in a text is greater than proportional to the square root
of the text length. Our derivation is based on a mathematical conjecture in
coding theory and on several experiments suggesting that words can be defined
approximately as the nonterminals of the shortest context-free grammar for the
text. Such an operational definition of words can be applied even to texts
deprived of spaces, which do not allow for Mandelbrot's "intermittent
silence" explanation of Zipf's and Guiraud's laws. In contrast to
Mandelbrot's, our model assumes some probabilistic long-memory effects in human
narration and might be capable of explaining Menzerath's law. Comment: To appear in Journal of Quantitative Linguistics