How we do things with words: Analyzing text as social and cultural data
In this article we describe our experiences with computational text analysis.
We hope to achieve three primary goals. First, we aim to shed light on thorny
issues not always at the forefront of discussions about computational text
analysis methods. Second, we hope to provide a set of best practices for
working with thick social and cultural concepts. Our guidance is based on our
own experiences and is therefore inherently imperfect. Still, given our
diversity of disciplinary backgrounds and research practices, we hope to
capture a range of ideas and identify commonalities that will resonate for
many. And this leads to our final goal: to help promote interdisciplinary
collaborations. Interdisciplinary insights and partnerships are essential for
realizing the full potential of any computational text analysis that involves
social and cultural concepts, and the more we are able to bridge these divides,
the more fruitful we believe our work will be.
Incremental generation of plural descriptions: similarity and partitioning
Approaches to plural reference generation emphasise descriptive brevity, but
often lack empirical backing. This paper describes a corpus-based study of
plural descriptions, and proposes a psycholinguistically motivated algorithm
for plural reference generation. The descriptive strategy is based on
partitioning and incorporates corpus-derived heuristics. An exhaustive
evaluation shows that the output closely matches human data.
Readers and Reading in the First World War
This essay consists of three individually authored and interlinked sections. In ‘A Digital Humanities Approach’, Francesca Benatti looks at datasets and databases (including the UK Reading Experience Database) and shows how a systematic, macro-analytical use of digital humanities tools and resources might yield answers to some key questions about reading in the First World War. In ‘Reading behind the Wire in the First World War’, Edmund G. C. King scrutinizes the reading practices and preferences of Allied prisoners of war in Mainz, showing that reading circumscribed by the contingencies of a prison camp created a unique literary community, whose legacy can be traced through their literary output after the war. In ‘Book-hunger in Salonika’, Shafquat Towheed examines the record of a single reader in a specific and fairly static frontline, and argues that in the case of the Salonika campaign, reading communities emerged in close proximity to existing centres of print culture. The focus of this essay moves from the general to the particular, from the scoping of large datasets to the analyses of identified readers within a specific geographical and temporal space. The authors engage with the wider issues and problems of recovering, interpreting, visualizing, narrating, and representing readers in the First World War.
Natural language processing
Beginning with the basic issues of NLP, this chapter aims to chart the major research activities in this area since the last ARIST Chapter in 1996 (Haas, 1996), including: (i) natural language text processing systems (text summarization, information extraction, information retrieval, etc.), including domain-specific applications; (ii) natural language interfaces; (iii) NLP in the context of the WWW and digital libraries; and (iv) evaluation of NLP systems.
Text content and task performance in the evaluation of a natural language generation system
An important question in the evaluation of Natural Language Generation systems concerns the relationship between textual characteristics and task performance. If the results of task-based evaluation can be correlated to properties of the text, there are better prospects for improving the system. The present paper investigates this relationship by focusing on the outcomes of a task-based evaluation of a system that generates summaries of patient data, attempting to correlate these with the results of an analysis of the system’s texts, compared to a set of gold standard human-authored summaries.
Robust Processing of Natural Language
Previous approaches to robustness in natural language processing usually
treat deviant input by relaxing grammatical constraints whenever a successful
analysis cannot be provided by ``normal'' means. This schema implies that
error detection always comes prior to error handling, a behaviour which can
hardly compete with its human model, where many erroneous situations are treated
without even noticing them.
The paper analyses the necessary preconditions for achieving a higher degree
of robustness in natural language processing and suggests a quite different
approach based on a procedure for structural disambiguation. It not only offers
the possibility to cope with robustness issues in a more natural way but
eventually might be suited to accommodate quite different aspects of robust
behaviour within a single framework.
Comment: 16 pages, LaTeX, uses pstricks.sty, pstricks.tex, pstricks.pro,
pst-node.sty, pst-node.tex, pst-node.pro. To appear in: Proc. KI-95, 19th
German Conference on Artificial Intelligence, Bielefeld (Germany), Lecture
Notes in Computer Science, Springer 199