Ordering the suggestions of a spellchecker without using context.
Having located a misspelling, a spellchecker generally offers some suggestions for the intended word. Even without using context, a spellchecker can draw on various types of information in ordering its suggestions. A series of experiments is described, beginning with a basic corrector that implements a well-known algorithm for reversing single simple errors, and making successive enhancements to take account of substring matches, pronunciation, known error patterns, syllable structure and word frequency. The improvement in the ordering produced by each enhancement is measured on a large corpus of misspellings. The final version is tested on other corpora against a widely used commercial spellchecker and a research prototype
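The basic corrector described above can be illustrated with a minimal sketch. It assumes that "single simple errors" means one omitted, inserted, wrong, or transposed letter, and it orders candidates by word frequency alone. The dictionary `WORD_FREQ`, its frequency values, and the function names are hypothetical, and none of the later enhancements (substring matches, pronunciation, error patterns, syllable structure) are modelled.

```python
import string

# Hypothetical toy dictionary: word -> frequency (illustrative values only).
WORD_FREQ = {"the": 1000, "they": 300, "then": 200, "than": 150, "hen": 5}


def single_error_candidates(misspelling, dictionary):
    """Return dictionary words reachable by undoing one simple error."""
    letters = string.ascii_lowercase
    edits = set()
    for i in range(len(misspelling) + 1):
        left, right = misspelling[:i], misspelling[i:]
        if right:
            # Undo an inserted letter by deleting it.
            edits.add(left + right[1:])
            # Undo a wrong letter by substituting each alternative.
            for c in letters:
                edits.add(left + c + right[1:])
        if len(right) > 1:
            # Undo a transposition of two adjacent letters.
            edits.add(left + right[1] + right[0] + right[2:])
        # Undo an omitted letter by inserting each alternative.
        for c in letters:
            edits.add(left + c + right)
    edits.discard(misspelling)
    return {w for w in edits if w in dictionary}


def rank_suggestions(misspelling, word_freq):
    """Order candidates by descending frequency, then alphabetically."""
    candidates = single_error_candidates(misspelling, word_freq)
    return sorted(candidates, key=lambda w: (-word_freq[w], w))


print(rank_suggestions("thn", WORD_FREQ))
```

With this toy dictionary, "they" and "hen" are not offered for "thn" because each is more than one simple error away.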
Devil in Deerskins: My Life with Grey Owl by Anahareo
Review of Anahareo's Devil in Deerskins: My Life with Grey Owl
A study of the teaching of plane geometry in the junior and senior high schools
Thesis (M.A.)--Boston Universit
Masculindians: Conversations About Indigenous Manhood by Sam McKegney
Review of Sam McKegney's Masculindians: Conversations About Indigenous Manhood
A large list of confusion sets for spellchecking assessed against a corpus of real-word errors
One of the methods that has been proposed for dealing with real-word errors (errors that occur when a correctly spelled word is substituted for the one intended) is the "confusion-set" approach - a confusion set being a small group of words that are likely to be confused with one another. Using a list of confusion sets drawn up in advance, a spellchecker, on finding one of these words in a text, can assess whether one of the other members of its set would be a better fit and, if it appears to be so, propose that word as a correction. Much of the research using this approach has suffered from two weaknesses. The first is the small number of confusion sets used. The second is that systems have largely been tested on artificial errors. In this paper we address these two weaknesses. We describe the creation of a realistically sized list of confusion sets, then the assembling of a corpus of real-word errors, and then we assess the potential of that list in relation to that corpus
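The confusion-set check described above might be sketched as follows. This is an illustration under stated assumptions, not the method of the paper: the confusion sets, the bigram counts, the neighbour-based `context_score`, and the `margin` threshold are all hypothetical stand-ins for whatever "better fit" measure a real system would use.

```python
# Hypothetical confusion sets and bigram counts (illustrative values only).
CONFUSION_SETS = [{"loose", "lose"}, {"their", "there"}]
BIGRAM_COUNTS = {
    ("you", "lose"): 50,
    ("lose", "your"): 40,
    ("you", "loose"): 2,
    ("loose", "your"): 1,
}


def context_score(prev_word, word, next_word, counts):
    """Score a word by how often it was seen next to its neighbours."""
    return counts.get((prev_word, word), 0) + counts.get((word, next_word), 0)


def propose_corrections(tokens, confusion_sets, counts, margin=2.0):
    """Return (index, found_word, proposed_word) for each likely real-word error."""
    proposals = []
    for i, word in enumerate(tokens):
        prev_word = tokens[i - 1] if i > 0 else None
        next_word = tokens[i + 1] if i + 1 < len(tokens) else None
        for cset in confusion_sets:
            if word not in cset:
                continue
            current = context_score(prev_word, word, next_word, counts)
            # Sort the alternatives so that ties are broken deterministically.
            best = max(
                sorted(cset - {word}),
                key=lambda w: context_score(prev_word, w, next_word, counts),
            )
            best_score = context_score(prev_word, best, next_word, counts)
            # Propose only when the alternative fits clearly better.
            if best_score > margin * max(current, 1):
                proposals.append((i, word, best))
    return proposals


tokens = "you loose your no-claims bonus".split()
print(propose_corrections(tokens, CONFUSION_SETS, BIGRAM_COUNTS))
```

The margin keeps the checker from flagging a correctly used word whose alternative scores only slightly higher in context.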
The adaptation of an English spellchecker for Japanese writers
It has been pointed out that the spelling errors made by second-language writers writing in English have features that are to some extent characteristic of their first language, and the suggestion has been made that a spellchecker could be adapted to take account of these features. In the work reported here, a corpus of spelling errors made by Japanese writers writing in English was compared with a corpus of errors made by native speakers. While the great majority of errors were common to the two corpora, some distinctively Japanese error patterns were evident against this common background, notably a difficulty in deciding between the letters b and v, and the letters l and r, and a tendency to add syllables. A spellchecker that had been developed for native speakers of English was adapted to cope with these errors. A brief account is given of the spellchecker's mode of operation to indicate how it lent itself to modifications of this kind. The native-speaker spellchecker and the Japanese-adapted version were run over the error corpora and the results show that these adaptations produced a modest but worthwhile improvement to the spellchecker's performance in correcting Japanese-made errors.
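One way such an adaptation could work is sketched below, purely as an illustration: extra candidates are generated by swapping b with v or l with r at a single position. The `CONFUSABLE` table, the function names, and the toy dictionary are assumptions; the abstract does not say how the actual spellchecker was modified, and the tendency to add syllables is not modelled here.

```python
# Letter pairs the abstract reports as hard to tell apart for Japanese writers.
CONFUSABLE = {"b": "v", "v": "b", "l": "r", "r": "l"}


def l1_variants(word):
    """Return every string made by swapping one confusable letter in word."""
    variants = set()
    for i, ch in enumerate(word):
        if ch in CONFUSABLE:
            variants.add(word[:i] + CONFUSABLE[ch] + word[i + 1:])
    return variants


def adapted_candidates(misspelling, dictionary):
    """Return dictionary words that differ by one b/v or l/r swap."""
    return sorted(v for v in l1_variants(misspelling) if v in dictionary)


# Hypothetical toy dictionary for demonstration.
DICTIONARY = {"global", "verb", "very", "river"}
print(adapted_candidates("grobal", DICTIONARY))
```

In a fuller system these candidates would presumably be merged with the ordinary suggestions and ranked with them, rather than returned on their own.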
A new perspective on steady-state cosmology: from Einstein to Hoyle
We recently reported the discovery of an unpublished manuscript by Albert Einstein in which he attempted a 'steady-state' model of the universe, i.e., a cosmic model in which the expanding universe remains essentially unchanged due to a continuous formation of matter from empty space. The manuscript was apparently written in early 1931, many years before the steady-state models of Fred Hoyle, Hermann Bondi and Thomas Gold. We compare Einstein's steady-state cosmology with that of Hoyle, Bondi and Gold and consider the reasons Einstein abandoned his model. The relevance of steady-state models for today's cosmology is briefly reviewed.
Comment: To be published in the 'Proceedings of the 2014 Institute of Physics International Conference on the History of Physics', Cambridge University Press. arXiv admin note: substantial text overlap with arXiv:1504.02873, arXiv:1402.013
A social forecast revisited
In 1971, the authors produced a 30-year forecast of leisure in the UK. In 2001 they obtained survey data for comparison with the forecasts. The paper presents the original forecasts and describes the methods used to produce them, assesses their accuracy in the light of the survey data, and concludes with some reflections on the underlying forecasting methodology and on changes in leisure patterns
BNC! Handle with care! Spelling and tagging errors in the BNC
"You loose your no-claims bonus," instead of "You lose your no-claims bonus," is an example of a real-word spelling error. One way to enable a spellchecker to detect such errors is to prime it with information about likely features of the context for "loose" (verb) as compared with "lose". To this end, we extracted all the examples of "loose" used as a verb from the BNC (World edition, text).
There were, apparently, 159 occurrences of "loose" (VVB or VVI). However, on inspection, well over half of these were not verbs at all (tagging errors) and over half of the rest were misspellings of "lose". Only about 15% were actual occurrences of "loose" as a verb.
This prompted us to undertake a small investigation into errors in the BNC. We report on some words that occur more often as misspellings than in their own right - only one of the 63 occurrences of "ail", for example, is correct (possibly OCR errors) - and some words that are always mistagged, such as "haulier" and "glazier" (never NN), and "hanker" and "loiter" (never VV). We note in particular that, if a rare word resembles a common word (in spelling), it is more likely to appear as a misspelling of the common word than as a correct spelling of the rare word. These cases require some modification of an earlier conclusion (Damerau and Mays, 1989) on misspellings of rare words.
We conclude with a discussion of the desirability, or otherwise, of correcting errors in corpora such as the BNC.
The results may be of interest to people who use the BNC as training data or for teaching
Investability and Firm Value
We study how investability, or openness to foreign equity investors, affects firm value in a sample of over 1,400 firms from 26 emerging markets. We find that, on average, investability is associated with a 9% valuation premium (as measured by Tobin's q). However, in firm-fixed effects regressions this valuation premium disappears, suggesting that investability does not have a causal effect on firm value. Analysis of the components of Tobin's q shows that firms that become investable experience significant increases in both market values and physical investment. These effects are strongest for firms that face country-level or firm-level financial constraints prior to becoming investable.
Keywords: Financial liberalization; Investability; Foreign investors; Tobin's q