10 research outputs found

    So What Are You Going to Do with That? The Promises and Pitfalls of Massive Data Sets

    This article takes as its case study the challenge of data sets for text mining, sources that offer tremendous promise for digital humanities (DH) methodology but present specific challenges for humanities scholars. These text sets raise a range of issues: What skills do you train humanists to have? What is the library’s role in enabling and supporting use of those materials? How do you allocate staff? Who oversees sustainability and data management? By addressing these questions through a specific use case scenario, this article shows how these questions are central to mapping out future directions for a range of library services.

    Peer reviewed. Full text: https://deepblue.lib.umich.edu/bitstream/2027.42/139596/1/So what are you going to do with that The promises and pitfalls of massive data sets.pdf

    Searching for Unknown Allusions: A Need to be Filled

    Tools have been developed that let researchers in the social sciences and humanities explore texts, enabling searching and analysis that previously had to be done manually. While the available tools meet many needs, one need is not being met: the ability to locate unknown allusions has not yet been addressed. This paper explains the benefits of being able to locate unknown allusions. In addition, it examines some of the tools that are available and what they are capable of producing. In conclusion, it describes the capability a future tool would need. Master of Science in Information Scienc

    Dansk betydningsinventar i et datalingvistisk perspektiv [The Danish sense inventory from a computational-linguistic perspective]

    In this paper we investigate the Danish sense inventory from both a paradigmatic and a syntagmatic perspective, and we present a collection of related lexical semantic resources developed in collaboration between the Society for Danish Language and Literature and the University of Copenhagen. The resources comprise a Danish wordnet (DanNet), the Danish FrameNet Lexicon, and the Danish Sentiment Lexicon. All three resources are designed to enable semantic processing in digital humanities research as well as more broadly in language-centric technology development. Finally, to illustrate the use of the resources when processing running text, we provide some annotation examples for each resource.

    Langzeitarchivierung von Forschungsdaten : eine Bestandsaufnahme [Long-term preservation of research data: a survey]

    The relevance of research data today and for the future is well documented and discussed, in Germany as well as internationally. Ensuring that research data are accessible, sharable, and re-usable over time is increasingly becoming an essential task for researchers and research infrastructure institutions. Some reasons for this development include the following: research data are documented and could therefore be validated; research data could be the basis for new research questions; research data could be re-analyzed by using innovative digital methods; and research data could be used by other disciplines. Therefore, it is essential that research data are curated, which means they are kept accessible and interpretable over time. In Germany, a baseline study was undertaken analyzing the situation in eleven research disciplines in 2012. The results were then published in a German-language edition. To address an international audience, the German-language edition of the study has been translated and abridged.

    Meaning construction in popular science : an investigation into cognitive, digital, and empirical approaches to discourse reification

    This thesis uses cognitive linguistics and digital humanities techniques to analyse abstract conceptualization in a corpus of popular science texts. Combining techniques from Conceptual Integration Theory, corpus linguistics, data-mining, cognitive pragmatics and computational linguistics, it presents a unified approach to understanding cross-domain mappings in this area, and through case studies of key extracts, describes how concept integration in these texts operates. In more detail, Part I of the thesis describes and implements a comprehensive procedure for semantically analysing large bodies of text using the recently completed database of the Historical Thesaurus of English. Using log-likelihood statistical measures and semantic annotation techniques on a 600,000-word corpus of abstract popular science, this part establishes both the existence and the extent of significant analogical content in the corpus. Part II then identifies samples from the corpus which are particularly high in analogical content, and proposes an adaptation of empirical and corpus methods to support and enhance conceptual integration (sometimes called conceptual blending) analyses, informed by Part I’s methodologies for the study of analogy on a wider scale. Finally, the thesis closes with a detailed analysis, using this methodology, of examples taken from the corpus. This analysis illustrates the conclusions which can be drawn from such work, completing the methodological chain of reasoning from wide-scale corpora to narrow-focus semantics, and providing data about the nature of highly abstract popular science as a genre.
    The thesis’ original contribution to knowledge is therefore twofold: while contributing to the understanding of the reification of abstractions in discourse, it also focuses on methodological enhancements to existing tools and approaches, aiming to contribute to the established tradition of both analytic and procedural work advancing the digital humanities in the area of language and discourse.

    Langzeitarchivierung von Forschungsdaten : eine Bestandsaufnahme [Long-term preservation of research data: a survey]

    No abstract available

    Visualization for Text Mining in the Digital Humanities

    In this PhD thesis, a visual interface for text analysis and text mining in the digital humanities (DH) will be developed. Text analysis is a crucial task in the DH, but advanced text mining technologies like topic modeling or clustering are difficult for most researchers to use. My work bridges this gap using visualizations. To ensure adequate usability of the visualizations for epistemological practices, they will be developed together with researchers in an agile and participatory approach.