
    TOME: Interactive TOpic Model and MEtadata Visualization

    As archives are being digitized at an increasing rate, scholars will require new tools to make sense of this expanding amount of material. We propose to build TOME, a tool to support the interactive exploration and visualization of text-based archives. Drawing upon the technique of topic modeling--a computational method for identifying themes that recur across a collection--TOME will visualize the topics that characterize each archive, as well as the relationships between specific topics and related metadata, such as publication date. An archive of 19th-century antislavery newspapers, characterized by diverse authors and shifting political alliances, will serve as our initial dataset; it promises to motivate new methods for visualizing topic models and extending their impact. In turn, by applying our new methods to these texts, we will illuminate how issues of gender and racial identity affect the development of political ideology in the nineteenth century, and into the present day.
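
    The topic-modeling step at the heart of this proposal can be illustrated with a minimal sketch. The toy corpus, publication years, topic count, and the use of scikit-learn's LatentDirichletAllocation below are illustrative assumptions, not TOME's data or implementation; the point is simply how recurring themes are inferred and then paired with a metadata field such as publication date.

    # Minimal sketch (not TOME itself): infer topics over a toy "archive" and
    # relate each document's dominant topic to its publication year.
    # All documents and years below are invented for illustration.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    docs = [
        "abolition convention speech slavery emancipation",
        "slavery petition congress abolition freedom",
        "women suffrage rights convention meeting",
        "rights women petition suffrage freedom",
    ]
    years = [1838, 1845, 1850, 1852]  # hypothetical publication dates

    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(docs)

    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    doc_topics = lda.fit_transform(X)  # document-by-topic proportions

    # Top words per topic: the "themes" a tool like TOME would visualize
    terms = vectorizer.get_feature_names_out()
    for k, weights in enumerate(lda.components_):
        top = [terms[i] for i in weights.argsort()[::-1][:3]]
        print(f"topic {k}: {', '.join(top)}")

    # Relate each document's dominant topic to its metadata (publication year)
    for year, props in zip(years, doc_topics):
        print(year, "-> dominant topic", int(props.argmax()))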

    Reading Thomas Jefferson with TopicViz: Towards a Thematic Method for Exploring Large Cultural Archives

    In spite of what Ed Folsom has called the “epic transformation of archives,” referring to the shift from print to digital archival form, methods for exploring these digitized collections remain underdeveloped. One method prompted by digitization is the application of automated text mining techniques such as topic modeling -- a computational method for identifying the themes that recur across an archive of documents. We review the nascent literature on topic modeling of literary archives, and present a case study, applying a topic model to the Papers of Thomas Jefferson. The lessons from this work suggest that the way forward is to provide scholars with more holistic support for visualization and exploration of topic model output, while integrating topic models with more traditional workflows oriented around assembling and refining sets of relevant documents. We describe our ongoing effort to develop a novel software system that implements these ideas.

    ‘Dying Irish’: eulogising the Irish in Scotland in Glasgow Observer obituaries

    The Glasgow Observer newspaper, founded in 1885 by and for the Irish community in Scotland, regularly published both lengthy and brief funereal and elegiac obituaries of the Irish in Scotland in the nineteenth and early twentieth centuries. They marshal an impressive, emotive and oftentimes contradictory body of evidence and anecdote of immigrant lives of the kind utilised, and as often passed over, by historians of the Irish in Britain. They contain, however, a unique perspective on the march of a migrant people, bespeaking their experiences and, perhaps more importantly, the perception of their experiences in passage, in the host society and ultimately in death. Moreover, the changing sense of Victorian sensibilities over the solemnity, purpose and ritual of death into the Edwardian era finds a moot reflection in the key staples of Irish immigrant obsequies with their stress on thrift, endeavour, piety, charity and gratitude. This article explores Glasgow Observer obituaries from the 1880s to the 1920s to see what they say about the immigrants, their lives, work and culture, the Scots, migration itself, the wider relations between Britain and Ireland, and the place where Irish and British attitudes to death meet in this period. It does so by drawing upon recent sociological perspectives on obituaries and their relationship with the formation and articulation of collective memory.

    Exploratory analysis of textual data streams

    In this paper, we address the exploratory analysis of textual data streams and propose a bootstrapping process based on a combination of keyword similarity and clustering techniques to: (i) classify documents into fine-grained similarity clusters, based on keyword commonalities; (ii) aggregate similar clusters into larger document collections sharing a richer, more user-prominent keyword set that we call a topic; (iii) assimilate newly extracted topics of the current bootstrapping cycle with existing topics resulting from previous bootstrapping cycles, by linking similar topics of different time periods, if any, to highlight topic trends and evolution. An analysis framework is also defined, enabling topic-based exploration of the underlying textual data stream according to a thematic perspective and a temporal perspective. The bootstrapping process is evaluated on a real data stream of about 330,000 newspaper articles about politics published by the New York Times from Jan 1st 1900 to Dec 31st 2015.
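
    The three bootstrapping stages can be sketched in simplified form. The sketch below is not the authors' implementation: it assumes per-document keyword sets have already been extracted, and it uses a plain Jaccard measure with illustrative thresholds to (i) cluster documents by keyword commonalities, (ii) merge each cluster into a topic, and (iii) link topics of consecutive time periods.

    # Minimal sketch of the bootstrapping idea described above, not the paper's
    # algorithm: the Jaccard measure and the 0.5 / 0.3 thresholds are assumptions.

    def jaccard(a, b):
        """Keyword-set similarity in [0, 1]."""
        return len(a & b) / len(a | b) if a | b else 0.0

    def cluster(docs, threshold=0.5):
        """(i) Greedily group documents whose keyword sets overlap strongly."""
        clusters = []  # each cluster is a list of keyword sets
        for kw in docs:
            for c in clusters:
                if jaccard(kw, set.union(*c)) >= threshold:
                    c.append(kw)
                    break
            else:
                clusters.append([kw])
        return clusters

    def to_topics(clusters):
        """(ii) Aggregate each cluster into a topic: the union of its keyword sets."""
        return [set.union(*c) for c in clusters]

    def link_topics(previous, current, threshold=0.3):
        """(iii) Link topics of the current period to similar earlier topics."""
        links = []
        for i, t_new in enumerate(current):
            for j, t_old in enumerate(previous):
                if jaccard(t_new, t_old) >= threshold:
                    links.append((j, i))  # earlier topic j continues as topic i
        return links

    # Toy stream: keyword sets from articles of two consecutive time periods
    period_1 = [{"election", "senate", "vote"}, {"senate", "vote", "campaign"},
                {"treaty", "war", "peace"}]
    period_2 = [{"election", "campaign", "president"}, {"war", "peace", "armistice"}]

    topics_1 = to_topics(cluster(period_1))
    topics_2 = to_topics(cluster(period_2))
    print(link_topics(topics_1, topics_2))  # -> [(0, 0), (1, 1)]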

    A Preliminary Survey of Chinook Jargon Lexical Item Use in the Pacific Northwest

    This paper sets out to explore the distribution and use of Chinook Jargon (CJ) lexical items in the Pacific Northwest. Examples of English loanwords originating in CJ were collected from general web and periodical database searches, as well as queries within the Corpus of Contemporary American English (COCA). Content analysis of the collected data in reference to the most frequently encountered CJ term (skookum) revealed that current usage fell primarily into three broad categories: commercial, geographic, and daily use. A range of examples of skookum from these categories is discussed to provide an overview of the use of this particular CJ lexical item in current Pacific Northwest English. These findings are also considered with respect to their geographical distribution and the role such CJ lexical items serve in specific communities.

    'It's like the space shuttle blows up every day': Digital television heritage as memory of European crises in the age of information overload

    Television is a public mediator of what constitutes 'crises' in Europe. Audio-visual archives and researchers are facing new complexities and 'information bubbles' when telling stories and reusing televised materials. I reflect on these practices, among others, via a comparative case analysis of the EUscreen portal, which offers access to thousands of items of European audio-visual heritage. I question how practices of selection and curation can support comparative interpretations of such representations. This approach aims to understand and support (1) interpretations of digitized/digital audio-visual sources in the era of information overload; (2) user interaction with digital search technologies - especially researchers as platform users; and (3) contextualization for reuse of audio-visual texts. Support for cultural memory research is crucial as television's audio-visual heritage can help us to recognize which cultural practices result in the production of specific texts in European societies, representing conditions of the multiple crises that European citizens are experiencing today.

    Digitization and the Changing Roles of Libraries in Support of Humanities Research: The Case of the Harrison Forman Collection

    Objective – This article examines the role of libraries in expanding access to primary sources through digitization and in providing support for humanities research. Research method – The author analyzes the literature on information behavior of humanist scholars in light of the increased use of digitized primary sources. Next, using the example of the digitized photographs and diaries from the Harrison Forman Collection, the author explores the emerging role of libraries in creating a new source of scholarly materials and supporting research in the humanities. Results and conclusion – Digitization increasingly matters not only for practical reasons of ease of use and access but also by offering a new potential for humanistic research. Digitization projects provide enhanced intellectual control of primary resources, offer an opportunity to uncover hidden collections, and bring together scattered materials. Digital collections in their present design demonstrate some limitations in supporting scholars’ browsing behavior and in providing contextual information. Creating digital collections in support of humanities research requires the transformation of library roles and collaboration with digital humanities scholars.

    Archives and AI: An Overview of Current Debates and Future Perspectives

    The digital transformation is turning archives, both old and new, into data. As a consequence, automation in the form of artificial intelligence techniques is increasingly applied both to scale traditional recordkeeping activities and to experiment with novel ways to capture, organise, and access records. We survey recent developments at the intersection of Artificial Intelligence and archival thinking and practice. Our overview of this growing body of literature is organised through the lens of the Records Continuum model. We find four broad themes in the literature on archives and artificial intelligence: theoretical and professional considerations, the automation of recordkeeping processes, organising and accessing archives, and novel forms of digital archives. We conclude by underlining emerging trends and directions for future work, which include the application of recordkeeping principles to the very data and processes that power modern artificial intelligence and a more structural - yet critically aware - integration of artificial intelligence into archival systems and practice.