7 research outputs found

    Mapping the Bentham Corpus

    University College London (UCL) owns a large corpus of papers by the philosopher and social reformer Jeremy Bentham (1748-1832). Until recently, these papers were for the most part untranscribed, so very few people could access the corpus to evaluate its content and value. The corpus is now being digitized and transcribed by a large number of volunteers recruited through a crowdsourcing initiative called Transcribe Bentham (Causer and Terras, 2014a, 2014b). The problem researchers face with such a corpus is clear: how to access the content, how to structure these 30,000 files, and how to find what is relevant in this mass of data. Our goal has thus been to produce an automatic analysis procedure that provides a general characterization of the content of the corpus. More specifically, we are interested in identifying the main topics and their structure, so as to provide meaningful static and dynamic representations of their evolution over time.

    Knowledge Extraction for Art History: the Case of Vasari’s The Lives of The Artists (1568)

    Knowledge Extraction (KE) techniques are used to convert unstructured information in texts into Knowledge Graphs (KGs), which can be queried and explored. Despite their potential for cultural heritage domains such as Art History, these techniques often encounter limitations when applied to domain-specific data. In this paper we present the main challenges that KE faces on art-historical texts, using Giorgio Vasari’s The Lives of The Artists as a case study. The paper discusses the following NLP tasks for art-historical texts: entity recognition and linking, coreference resolution, time extraction, motif extraction, and artwork extraction. Several strategies for annotating art-historical data for these tasks and for evaluating NLP models are also proposed.

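    På sporet av aktørar som skriv: Ein studie av to digitale samskrivingskasus i ein ungdomsskoleklasse [On the Trail of Actors Who Write: A Study of Two Digital Collaborative Writing Cases in a Lower Secondary School Class]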

    On the Trail of Actors Who Write is a study of two digital cases of collaborative writing in a Norwegian lower secondary school class. The study maps, analyzes and discusses the writing process in two collaborative writing groups, consisting of six students – in close collaboration with software, texts from the Internet and other digital actors, during three double lessons in February 2020. The study applies socio-material theory to writing in school contexts. The conceptual framework is based on actor-network theory (ANT), theories of linguistic materiality, visual network analysis (VNA) and case study methodology. The collected material in the study consists of both quantitative and qualitative data: student texts and source texts, video and screen recordings, and also interviews with the teacher and students participating in the project. Central to the study is the question of how human and digital actors interact while writing, and which role technology plays in this process. The study reveals that the student texts are produced through a number of negotiations and trials of strength between students, search engines, digital source texts and writing software. Search engines greatly influence the planning processes in that they select, prioritize and promote other actors' texts, and indeed specific parts of these texts. The source texts affect the composition of student texts by circulating, replicating and, in some cases, mutating the linguistic material into their texts. Writing software affects students' spelling through writing suggestions and corrective interruptions in the digital environment. The production of the student texts can thus be seen as transformations of linguistic material originating in the digital actors that participate in the writing process, and to some extent originating in the students themselves. 
    In several of the situations observed in these two collaborative writing cases, it is the digital actors that seem to have the greatest power of negotiation and impact. A practical implication for writing education can therefore be to strengthen lower secondary school students' critical approach and ability to negotiate with digital actors, so that students can make more independent choices while writing, including when writing collaboratively with each other and through digital technology.

    Mapping the Bentham Corpus: Concept-based Navigation

    British philosopher and reformer Jeremy Bentham (1748-1832) left over 60,000 folios of unpublished manuscripts. The Bentham Project, at University College London, is creating a TEI version of the manuscripts via crowdsourced transcription verified by experts. We present here an interface for navigating these largely unedited manuscripts, together with the language technologies used to enrich the corpus and facilitate navigation, namely entity linking against the DBpedia knowledge base and keyphrase extraction. The challenges of tagging a historical, domain-specific corpus with a contemporary knowledge base are discussed. The extracted concepts were used to create interactive co-occurrence networks that serve as a map of the corpus and, along with a search index, help navigate it. These corpus representations were integrated in a user interface. The interface was evaluated by domain experts with satisfactory results; for example, they found the distributional semantics methods exploited here applicable for retrieving related passages for scholarly editing of the corpus.
