
    Measuring text simplification with the crowd

    Text can often be complex and difficult to read, especially for people with cognitive impairments or low literacy skills. Text simplification is a process that reduces the complexity of both wording and structure in a sentence, while retaining its meaning. However, this is currently a challenging task for machines, and thus, providing effective on-demand text simplification to those who need it remains an unsolved problem. Even evaluating the simplicity of text remains a challenging problem for both computers, which cannot understand the meaning of text, and humans, who often struggle to agree on what constitutes a good simplification. This paper focuses on the evaluation of English text simplification using the crowd. We show that leveraging crowds can result in a collective decision that is accurate and converges to a consensus rating. Our results from 2,500 crowd annotations show that the crowd can effectively rate levels of simplicity. This may allow simplification systems and system builders to get better feedback about how well content is being simplified, as compared to standard measures which classify content into ‘simplified’ or ‘not simplified’ categories. Our study provides evidence that the crowd could be used to evaluate English text simplification, as well as to create simplified text in future work.
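
    The idea of a crowd rating that converges to a consensus can be sketched as follows. This is a minimal illustration only: the abstract does not say how ratings were aggregated or what scale was used, so the median and the 1 to 5 scale in the usage note are assumptions, and the function names are hypothetical.

```python
from statistics import median


def consensus_rating(ratings):
    """Return the median of the crowd's ratings for one sentence.

    Assumption: ratings are numbers on an ordinal simplicity scale.
    The median is used here because it is robust to a few outlying
    annotators; the paper may use a different aggregate.
    """
    if not ratings:
        raise ValueError("need at least one rating")
    return median(ratings)


def running_consensus(ratings):
    """Return the median after each successive annotation.

    Inspecting this list shows whether the aggregate settles down
    as more annotators contribute.
    """
    return [median(ratings[:i]) for i in range(1, len(ratings) + 1)]
```

    For example, with hypothetical ratings `[3, 1, 2, 2]` on a 1 to 5 scale, `running_consensus` shows the aggregate moving from 3 to 2 after the second annotation and staying there.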

    Social Sensing of Floods in the UK

    "Social sensing" is a form of crowd-sourcing that involves systematic analysis of digital communications to detect real-world events. Here we consider the use of social sensing for observing natural hazards. In particular, we present a case study that uses data from a popular social media platform (Twitter) to detect and locate flood events in the UK. In order to improve data quality we apply a number of filters (timezone, simple text filters and a naive Bayes 'relevance' filter) to the data. We then use place names in the user profile and message text to infer the location of the tweets. These two steps remove most of the irrelevant tweets and yield orders of magnitude more located tweets than we obtain by relying on geo-tagged data. We demonstrate that high resolution social sensing of floods is feasible and we can produce high-quality historical and real-time maps of floods using Twitter.
    Comment: 24 pages, 6 figures
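
    The two steps described in this abstract (a naive Bayes relevance filter, then place-name matching) might look roughly like the sketch below. It is not the authors' implementation: the tokenizer, the Laplace smoothing, the tiny gazetteer, and the choice to check the message text before the profile are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesRelevance:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.doc_counts = {True: 0, False: 0}
        self.vocab = set()

    def train(self, examples):
        """examples: iterable of (text, is_relevant) pairs."""
        for text, relevant in examples:
            self.doc_counts[relevant] += 1
            for tok in tokenize(text):
                self.word_counts[relevant][tok] += 1
                self.vocab.add(tok)

    def log_score(self, text, label):
        """Log prior plus summed log likelihoods for one class."""
        if min(self.doc_counts.values()) == 0:
            raise ValueError("train on examples of both classes first")
        total_docs = sum(self.doc_counts.values())
        score = math.log(self.doc_counts[label] / total_docs)
        total_words = sum(self.word_counts[label].values())
        for tok in tokenize(text):
            if tok not in self.vocab:
                continue  # ignore words never seen in training
            count = self.word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(self.vocab)))
        return score

    def is_relevant(self, text):
        return self.log_score(text, True) > self.log_score(text, False)


# Hypothetical gazetteer; a real one would cover many more places
# and handle multi-word names.
GAZETTEER = {"york", "carlisle", "leeds"}


def infer_location(profile_location, message):
    """Return the first gazetteer place name found, or None.

    Assumption: the message text is checked before the profile.
    """
    for source in (message, profile_location):
        for tok in tokenize(source):
            if tok in GAZETTEER:
                return tok
    return None
```

    A tweet would pass through `is_relevant` first and only then through `infer_location`, so that location matching is spent on messages already judged to be about flooding.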

    Syn-QG: Syntactic and Shallow Semantic Rules for Question Generation

    Question Generation (QG) is fundamentally a simple syntactic transformation; however, many aspects of semantics influence what questions are good to form. We implement this observation by developing Syn-QG, a set of transparent syntactic rules leveraging universal dependencies, shallow semantic parsing, lexical resources, and custom rules which transform declarative sentences into question-answer pairs. We utilize PropBank argument descriptions and VerbNet state predicates to incorporate shallow semantic content, which helps generate questions of a descriptive nature and produce inferential and semantically richer questions than existing systems. In order to improve syntactic fluency and eliminate grammatically incorrect questions, we employ back-translation over the output of these syntactic rules. A set of crowd-sourced evaluations shows that our system can generate a larger number of highly grammatical and relevant questions than previous QG systems and that back-translation drastically improves grammaticality at a slight cost of generating irrelevant questions.
    Comment: Some of the results in the paper were incorrect
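
    To make the notion of question generation as a syntactic transformation concrete, here is a toy sketch of a single rule, subject-auxiliary inversion, applied to plain strings. It is not Syn-QG: the system described above works on dependency parses and semantic resources, whereas the regular expression, the word lists, and the function name below are hypothetical simplifications.

```python
import re

# Small, hypothetical word lists for illustration only.
AUXILIARIES = ("is", "are", "was", "were", "can", "will", "should")
DETERMINERS = {"The", "A", "An", "This", "That", "These", "Those"}

# subject, then the first auxiliary as a whole word, then the rest.
PATTERN = re.compile(
    r"^(?P<subject>.+?)\s+(?P<aux>" + "|".join(AUXILIARIES) + r")\s+(?P<rest>.+?)\.?$"
)


def yes_no_question(sentence):
    """Turn 'SUBJECT AUX REST.' into 'AUX SUBJECT REST?'.

    Returns None when no auxiliary from the list is found.
    """
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    words = match.group("subject").split(None, 1)
    # Lowercase a sentence-initial determiner, but leave other first
    # words alone since they may be proper nouns.
    if words[0] in DETERMINERS:
        words[0] = words[0].lower()
    subject = " ".join(words)
    aux = match.group("aux").capitalize()
    return f"{aux} {subject} {match.group('rest')}?"
```

    For instance, `yes_no_question("The river is rising.")` gives `"Is the river rising?"`. A string-level rule like this breaks on anything beyond the simplest clauses, which is one reason a parse-based approach is needed in practice.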