10 research outputs found

Entity-Centric Text Mining for Historical Documents

Author: Coll Ardanuy Maria
Publication date: 07/07/2017

So What Are You Going to Do with That? The Promises and Pitfalls of Massive Data Sets

Author: Cordell Sigrid Anderson
Gomis Melissa
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/01/2017

This article takes as its case study the challenge of data sets for text mining, sources that offer tremendous promise for digital humanities (DH) methodology but present specific challenges for humanities scholars. These text sets raise a range of issues: What skills do you train humanists to have? What is the library’s role in enabling and supporting use of those materials? How do you allocate staff? Who oversees sustainability and data management? By addressing these questions through a specific use case scenario, this article shows how these questions are central to mapping out future directions for a range of library services.

DigitalCommons@University of Nebraska

So What Are You Going to Do with That? The Promises and Pitfalls of Massive Data Sets

Author: Cordell Sigrid Anderson
Gomis Melissa
Publication venue: Informa UK Limited
Publication date: 01/01/2017

This article takes as its case study the challenge of data sets for text mining, sources that offer tremendous promise for DH methodology but present specific challenges for humanities scholars. These text sets raise a range of issues: What skills do you train humanists to have? What is the library's role in enabling and supporting use of those materials? How do you allocate staff? Who oversees sustainability and data management? By addressing these questions through a specific use case scenario, this article shows how these questions are central to mapping out future directions for a range of library services. Peer Reviewed. https://deepblue.lib.umich.edu/bitstream/2027.42/139596/1/So what are you going to do with that The promises and pitfalls of massive data sets.pdf

Crossref

DigitalCommons@University of Nebraska

Deep Blue Documents at the University of Michigan

Searching for Unknown Allusions: A Need to be Filled

Author: Combs Rose
Publication venue: University of North Carolina at Chapel Hill
Publication date: 01/12/2015

In the growing world of technology, tools have been developed to explore texts for those in the social sciences or humanities. These tools allow searching and analysis that previously had to be done manually. While the available tools meet many needs, one need is not being met: the ability to locate unknown allusions has not yet been addressed. This paper explains the benefits of having the ability to locate unknown allusions. In addition, it examines some of the tools that are available and what they are capable of producing. In conclusion, a description of the needed ability of a future tool is provided. Master of Science in Information Science

Carolina Digital Repository

Dansk betydningsinventar i et datalingvistisk perspektiv

Author: Nimb Sanni
Olsen Sussi
Pedersen Bolette Sandford
Publication venue: Universitets-Jubilæets Danske Samfund
Publication date: 01/01/2021

In this paper we investigate the Danish sense inventory from a paradigmatic and a syntagmatic perspective, respectively, and we present a collection of related lexical semantic resources that we have developed in collaboration between The Society for Danish Language and Literature and The University of Copenhagen. The resources comprise a Danish wordnet (DanNet), The Danish FrameNet Lexicon, and The Danish Sentiment Lexicon. All three resources are designed to enable semantic processing to be used in digital humanities research as well as more broadly in language-centric technology development. Finally, in order to illustrate the use of the resources when processing running text, we provide some annotation examples of each resource.

Copenhagen University Research Information System

Tidsskrift.dk (Det Kongelige Bibliotek)

Langzeitarchivierung von Forschungsdaten : eine Bestandsaufnahme

Author: Klump Jens
Ludwig Jens
Neuroth Heike
Oßwald Achim
Scheffel Regine
Strathmann Stefan
Publication date: 01/01/2012

The relevance of research data today and for the future is well documented and discussed, in Germany as well as internationally. Ensuring that research data are accessible, sharable, and re-usable over time is increasingly becoming an essential task for researchers and research infrastructure institutions. Some reasons for this development include the following:
- research data are documented and could therefore be validated
- research data could be the basis for new research questions
- research data could be re-analyzed by using innovative digital methods
- research data could be used by other disciplines
Therefore, it is essential that research data are curated, which means they are kept accessible and interpretable over time. In Germany, a baseline study was undertaken analyzing the situation in eleven research disciplines in 2012. The results were then published in a German-language edition. To address an international audience, the German-language edition of the study has been translated and abridged.

E-LIS

Hochschulschriftenserver - Universität Frankfurt am Main

Meaning construction in popular science : an investigation into cognitive, digital, and empirical approaches to discourse reification

Author: Alexander Marc Gabriel
Publication date: 01/01/2011

This thesis uses cognitive linguistics and digital humanities techniques to analyse abstract conceptualization in a corpus of popular science texts. Combining techniques from Conceptual Integration Theory, corpus linguistics, data-mining, cognitive pragmatics and computational linguistics, it presents a unified approach to understanding cross-domain mappings in this area, and through case studies of key extracts, describes how concept integration in these texts operates. In more detail, Part I of the thesis describes and implements a comprehensive procedure for semantically analysing large bodies of text using the recently-completed database of the Historical Thesaurus of English. Using log-likelihood statistical measures and semantic annotation techniques on a 600,000 word corpus of abstract popular science, this part establishes both the existence and the extent of significant analogical content in the corpus. Part II then identifies samples which are particularly high in analogical content from the corpus, and proposes an adaptation of empirical and corpus methods to support and enhance conceptual integration (sometimes called conceptual blending) analyses, informed by Part I’s methodologies for the study of analogy on a wider scale. Finally, the thesis closes with a detailed analysis, using this methodology, of examples taken from the example corpus. This analysis illustrates those conclusions which can be drawn from such work, completing the methodological chain of reasoning from wide-scale corpora to narrow-focus semantics, and providing data about the nature of highly-abstract popular science as a genre.
The thesis’ original contribution to knowledge is therefore twofold; while contributing to the understanding of the reification of abstractions in discourse, it also focuses on methodological enhancements to existing tools and approaches, aiming to contribute to the established tradition of both analytic and procedural work advancing the digital humanities in the area of language and discourse.

Glasgow Theses Service

Langzeitarchivierung von Forschungsdaten : eine Bestandsaufnahme

Author: Ludwig Jens
Publication date: 01/01/2012

Institutional Repository of the Freie Universität Berlin

Langzeitarchivierung von Forschungsdaten : eine Bestandsaufnahme

Author: Neuroth Heike
Publication date: 01/01/2012

No abstract available

Institut für Informationswissenschaft

Visualization for Text Mining in the Digital Humanities

Author: Hocker Julian
Publication venue: Humboldt-Universität zu Berlin
Publication date: 01/01/2017

In this PhD thesis, a visual interface for text analysis and text mining in the digital humanities (DH) will be developed. Text analysis is a crucial task in the DH, but advanced text mining technologies like topic modeling or clustering are difficult to use for most researchers. My work bridges this gap using visualizations. To ensure an adequate usability of visualizations for epistemological practices, the visualizations will be realized with researchers in an agile and participatory approach.

Dokumenten-Publikationsserver der Humboldt-Universität zu Berlin