6,483 research outputs found
Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers
The massive amounts of digitized historical documents acquired over the last
decades naturally lend themselves to automatic processing and exploration.
Research efforts seeking to automatically process facsimiles and extract
information from them are multiplying, with document layout analysis as an
essential first step. Although the identification and categorization of
segments of interest in document images have seen significant progress over
the last years thanks to deep learning techniques, many challenges remain,
including the use of finer-grained segmentation typologies and the
consideration of complex, heterogeneous documents such as historical newspapers. Besides, most
approaches consider visual features only, ignoring textual signal. In this
context, we introduce a multimodal approach for the semantic segmentation of
historical newspapers that combines visual and textual features. Based on a
series of experiments on diachronic Swiss and Luxembourgish newspapers, we
investigate, among other things, the predictive power of visual and textual features
and their capacity to generalize across time and sources. Results show
consistent improvement of multimodal models in comparison to a strong visual
baseline, as well as better robustness to high material variance.
A fragmentising interface to a large corpus of digitized text: (Post)humanism and non-consumptive reading via features
While the idea of distant reading does not rule out close reading of the individual components of the corpus of digitized text being distant-read, this ceases to be the case when parts of the corpus are, for reasons relating to intellectual property, not accessible for consumption through downloading followed by close reading. Copyright restrictions on material in collections of digitized text such as the HathiTrust Digital Library (HTDL) necessitate providing facilities for non-consumptive reading, one approach to which consists of providing users with features from the text in the form of small fragments of text, instead of the text itself. We argue that, contrary to expectation, the fragmentary quality of the features generated by the reading interface does not necessarily imply that the mode of reading enabled and mediated by these features points in an anti-humanist direction. We pose the fragmentariness of the features as paradigmatic of the fragmentation with which digital techniques tend, more generally, to trouble the humanities. We then generalize our argument to put our work on feature-based non-consumptive reading in dialogue with contemporary debates in philosophy and in cultural theory and criticism about posthumanism and agency. While the locus of agency in such a non-consumptive practice of reading does not coincide with the customary figure of the singular human subject as reader, it is possible to accommodate this fragmentising practice within the terms of an ampler notion of agency imagined as dispersed across an entire technosocial ensemble. When grasped in this way, such a practice of reading may be considered posthumanist but not necessarily antihumanist.
Crowds for Clouds: Recent Trends in Humanities Research Infrastructures
The humanities have convincingly argued that they need transnational research
opportunities and, through the digital transformation of their disciplines, now
also have the means to pursue them on a previously unknown scale. The digital
transformation of research and its resources means that many of the artifacts,
documents, materials, etc. that interest humanities research can now be
combined in new and innovative ways. Due to the digital transformations, (big)
data and information have become central to the study of culture and society.
Humanities research infrastructures manage, organise and distribute this kind
of information and many more data objects as they become relevant for social
and cultural research.
HIL: designing an exokernel for the data center
We propose a new Exokernel-like layer to allow mutually untrusting physically deployed services to efficiently share the resources of a data center. We believe that such a layer offers not only efficiency gains, but may also enable new economic models, new applications, and new security-sensitive uses. A prototype (currently in active use) demonstrates that the proposed layer is viable, and can support a variety of existing provisioning tools and use cases. Partial support for this work was provided by the MassTech Collaborative Research Matching Grant Program, National Science Foundation awards 1347525 and 1149232 as well as the several commercial partners of the Massachusetts Open Cloud who may be found at http://www.massopencloud.or
A framework to maximise the communicative power of knowledge visualisations
Knowledge visualisation, in the field of information systems, is both a process and a product, informed by the closely aligned fields of information visualisation and knowledge management. Knowledge visualisation has untapped potential within the purview of knowledge communication. Even so, knowledge visualisations are infrequently deployed due to a lack of evidence-based guidance. To improve this situation, we carried out a systematic literature review to derive a number of “lenses” that can be used to reveal the essential perspectives to feed into the visualisation production process. We propose a conceptual framework which incorporates these lenses to guide producers of knowledge visualisations. This framework uses the different lenses to reveal critical perspectives that need to be considered during the design process. We conclude by demonstrating how this framework could be used to produce an effective knowledge visualisation.