
    Fitting Ranked English and Spanish Letter Frequency Distribution in U.S. and Mexican Presidential Speeches

    The limited range of the abscissa in ranked letter frequency distributions allows multiple functions to fit the observed distribution reasonably well. To compare candidate functions critically, we apply statistical model selection to ten functions, using the texts of U.S. and Mexican presidential speeches from the last one to two centuries. Despite minor switches in the rank order of certain letters over time in both datasets, letter usage is generally stable. The best-fitting function, judged either by least-square error or by AIC/BIC model selection, is the Cocho/Beta function. We also use a novel method to discover clusters of letters by their observed-over-expected frequency ratios. Comment: 7 figures
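
    As a rough illustration of the input to such a fit, the sketch below computes a ranked letter frequency distribution from a text. The function name `ranked_letter_frequencies` and the alphabetical tie-breaking rule are assumptions made for this example, not details taken from the paper.

```python
from collections import Counter


def ranked_letter_frequencies(text):
    """Return (letter, relative frequency) pairs, most frequent first.

    Hypothetical helper for illustration only; the paper's own
    preprocessing (e.g. handling of accented letters) is not specified here.
    """
    # Keep alphabetic characters only and ignore case.
    letters = [ch for ch in text.lower() if ch.isalpha()]
    counts = Counter(letters)
    total = sum(counts.values())
    # Sort by descending count; break ties alphabetically so the
    # ranking is deterministic (an assumption of this sketch).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(letter, n / total) for letter, n in ranked]
```

    The rank of a letter is its position in the returned list (starting at 1), and the frequencies are what a candidate function would be fitted against.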

    Edsger Wybe Dijkstra (1930 -- 2002): A Portrait of a Genius

    We discuss the scientific contributions of Edsger Wybe Dijkstra, his opinions and his legacy. Comment: 10 pages. To appear in Formal Aspects of Computing

    Beyond Stemming and Lemmatization: Ultra-stemming to Improve Automatic Text Summarization

    In Automatic Text Summarization, preprocessing is an important phase for reducing the space of textual representation. Classically, stemming and lemmatization have been widely used to normalize words. However, even when normalization is applied to large texts, the curse of dimensionality can degrade the performance of summarizers. This paper describes a new method for word normalization that further reduces the space of representation. We propose reducing each word to its initial letters, as a form of Ultra-stemming. The results show that Ultra-stemming not only preserves the content of summaries produced with this representation, but can often dramatically improve system performance. Summaries on trilingual corpora were evaluated automatically with Fresa. Results confirm an increase in performance, regardless of the summarizer system used. Comment: 22 pages, 12 figures, 9 tables
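
    The idea of reducing each word to its initial letters can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `ultra_stem`, the prefix-length parameter `n`, and the regex-based tokenization are all assumptions of this example.

```python
import re


def ultra_stem(text, n=1):
    """Truncate every word in `text` to its first `n` letters.

    Hypothetical sketch: `n` and the tokenization rule are assumptions,
    not values or steps taken from the paper.
    """
    # Match runs of Unicode letters (no digits, no underscores).
    words = re.findall(r"[^\W\d_]+", text.lower())
    return [word[:n] for word in words]
```

    For instance, with `n=4` the words "summarization", "summaries" and "summarizer" all map to the same token, which is how the representation space shrinks.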

    A Computer-Based Method to Improve the Spelling of Children with Dyslexia

    In this paper we present a method that aims to improve the spelling of children with dyslexia through playful and targeted exercises. In contrast to previous approaches, our method does not use correct words or positive examples to follow, but presents the child with a misspelled word as an exercise to solve. We created these training exercises on the basis of linguistic knowledge extracted from the errors found in texts written by children with dyslexia. To test the effectiveness of this method in Spanish, we integrated the exercises into a game for iPad, DysEggxia (Piruletras in Spanish), and carried out a within-subject experiment. Over eight weeks, 48 children played either DysEggxia or Word Search, another word game. We conducted tests and questionnaires at the beginning of the study, after four weeks when the games were switched, and at the end of the study. The children who played DysEggxia for four weeks in a row made significantly fewer writing errors in the tests than after playing Word Search for the same time. This provides evidence that error-based exercises presented on a tablet help children with dyslexia improve their spelling skills. Comment: 8 pages, ASSETS'14, October 20-22, 2014, Rochester, NY, USA

    Information Outlook, May 2004

    Volume 8, Issue 5

    Assessing candidate preference through web browsing history

    Predicting election outcomes is of considerable interest to candidates, political scientists, and the public at large. We propose the use of Web browsing history as a new indicator of candidate preference among the electorate, one that has the potential to overcome a number of the drawbacks of election polls. However, several challenges must be overcome to use Web browsing effectively for assessing candidate preference, including the lack of suitable ground truth data and the heterogeneity of user populations in time and space. We address these challenges and show that the resulting methods can shed considerable light on the dynamics of voters’ candidate preferences in ways that are difficult to achieve using polls. Accepted manuscript

    Improving a Strong Neural Parser with Conjunction-Specific Features

    While dependency parsers reach very high overall accuracy, some dependency relations are much harder than others. In particular, dependency parsers perform poorly on coordination constructions (i.e., correctly attaching the "conj" relation). We extend a state-of-the-art dependency parser with conjunction-specific features, focusing on the similarity between the conjuncts' head words. Training the extended parser yields an improvement in "conj" attachment as well as in overall dependency parsing accuracy on the Stanford dependency conversion of the Penn TreeBank.

    The Cowl - Vol LNVII - n.4 - Oct 15, 1992

    The Cowl - student newspaper of Providence College. Volume LNVII - Number 4 - October 15, 1992. 24 pages