22,491 research outputs found
Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers
The massive amounts of digitized historical documents acquired over the last
decades naturally lend themselves to automatic processing and exploration.
Research work seeking to automatically process facsimiles and extract
information thereby are multiplying with, as a first essential step, document
layout analysis. If the identification and categorization of segments of
interest in document images have seen significant progress over the last years
thanks to deep learning techniques, many challenges remain with, among others,
the use of finer-grained segmentation typologies and the consideration of
complex, heterogeneous documents such as historical newspapers. Besides, most
approaches consider visual features only, ignoring textual signal. In this
context, we introduce a multimodal approach for the semantic segmentation of
historical newspapers that combines visual and textual features. Based on a
series of experiments on diachronic Swiss and Luxembourgish newspapers, we
investigate, among others, the predictive power of visual and textual features
and their capacity to generalize across time and sources. Results show
consistent improvement of multimodal models in comparison to a strong visual
baseline, as well as better robustness to high material variance
Sentiment Analysis for Words and Fiction Characters From The Perspective of Computational (Neuro-)Poetics
Two computational studies provide different sentiment analyses for text segments (e.g., "fearful" passages) and figures (e.g., "Voldemort") from the Harry Potter books (Rowling, 1997-2007) based on a novel simple tool called SentiArt. The tool uses vector space models together with theory-guided, empirically validated label lists to compute the valence of each word in a text by locating its position in a 2d emotion potential space spanned by the > 2 million words of the vector space model. After testing the tool's accuracy with empirical data from a neurocognitive study, it was applied to compute emotional figure profiles and personality figure profiles (inspired by the so-called "big five" personality theory) for main characters from the book series. The results of comparative analyses using different machine-learning classifiers (e.g., AdaBoost, Neural Net) show that SentiArt performs very well in predicting the emotion potential of text passages. It also produces plausible predictions regarding the emotional and personality profile of fiction characters which are correctly identified on the basis of eight character features, and it achieves a good cross-validation accuracy in classifying 100 figures into "good" vs. "bad" ones. The results are discussed with regard to potential applications of SentiArt in digital literary, applied reading and neurocognitive poetics studies such as the quantification of the hybrid hero potential of figures
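The idea of scoring a word's valence from its position relative to label lists in a vector space can be illustrated with a minimal sketch. It assumes valence is the mean cosine similarity to positive label words minus the mean similarity to negative ones; the abstract does not specify the exact formula. The two-dimensional vectors, label lists, and function names below are hypothetical toy values, not SentiArt's actual data or API.

```python
import math

# Hypothetical toy embeddings; a real vector space model has
# hundreds of dimensions and millions of words.
TOY_VECTORS = {
    "joy": (1.0, 0.1),
    "love": (0.9, 0.2),
    "fear": (0.1, 1.0),
    "hate": (0.2, 0.9),
    "friend": (0.8, 0.3),
    "curse": (0.2, 0.8),
}

# Hypothetical label lists standing in for the validated lists.
POSITIVE_LABELS = ["joy", "love"]
NEGATIVE_LABELS = ["fear", "hate"]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def valence(word, vectors=TOY_VECTORS,
            pos_labels=POSITIVE_LABELS, neg_labels=NEGATIVE_LABELS):
    """Assumed scoring rule: mean similarity to positive labels
    minus mean similarity to negative labels."""
    w = vectors[word]
    pos = sum(cosine(w, vectors[l]) for l in pos_labels) / len(pos_labels)
    neg = sum(cosine(w, vectors[l]) for l in neg_labels) / len(neg_labels)
    return pos - neg


if __name__ == "__main__":
    for word in ("friend", "curse"):
        print(word, round(valence(word), 3))
```

Under this rule, a word lying closer to the positive labels gets a positive score and a word closer to the negative labels gets a negative one; a passage-level score could then be taken as the average over its words.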
A Bibliography on the Application of GIS in Archaeology and Cultural Heritage
The application of Geographical Information Systems (GIS) to archaeological projects of different scales, chronological contexts and cultural milieux has by now accrued a long history and bibliography. Hopefully the phases of experimentation and almost blind testing are over, even if GIS applications are still sometimes labeled as "new technologies"
Hostile gatekeeping: The strategy of engaging with journalists in extremism reporting
This article broadly examines the relationship between strategic communications and journalism with specific reference to the issue of violent extremism. Using a case study of reporting on the Boko Haram conflict in Nigeria, it analyses the nature and consequences of engagement among the various communicators involved. The primary data were drawn from focus groups and individual interviews with thirty-two journalists and strategic communicators, and from analysis of Boko Haram videos and Nigerian security forces' press releases. The findings suggest that journalists have a tense but interdependent relationship with strategic communicators that is characterised by conflict and cooperation, harassment and intimidation. Strategic communicators' control of the conflict theatre and use of the Internet to reach audiences directly give them leverage in the relationship. They, however, rely on journalists to help enhance the reach and credibility of their narratives, while journalists depend significantly on their media releases
The use of digital tools for spatial analysis in population geography
Digital tools, and in particular GIS, have enormously increased the possibilities for analysis in historical geography. In this article, we shall explain how these tools can be used to study the evolution of population density over a significant period. The territorial units used will be municipalities, as they allow detailed territorial analysis. However, research projects that take municipalities as their points of reference tend to be complex because their territorial boundaries have often undergone numerous changes over the course of modern history. The same has occurred, to a greater or lesser degree, in all of the countries in Europe (Bennett, 1989). The countries that have had the most stable municipal boundaries over the past 150 years include France, Italy, and Spain, though the modifications to their boundaries have also been notable. However, like all relevant challenges, these changes also offer us new opportunities, if we are able to cope with them. In this particular case, the challenge will be to achieve the territorial homogenization of the historical municipal series. In other words, when the municipal limits have changed, it will be necessary to adapt the data from the old municipal territories to the new ones. This exercise will have a number of applications. In this article, we present just one of these: the possibility of detecting areas and periods in which, over the course of history, there has been population growth, decline, or stagnation. This will serve as a relevant indicator, or proxy, for organizing research in other fields. For example, in the case of economic history, it is clear that variations in the density of population provide clues for interpreting the territorial distribution of economic activity. We also understand that it will be possible to apply our research about Spain to other countries and that this will make it possible to evaluate the interest and results that we can expect from the homogenized work. 
We think that, despite its interest, this type of study has, until now, been very rare on account of the methodological difficulties involved. However, new digital tools in the field of historical GIS, such as spatial aggregation and Moran's I techniques, have helped to provide solutions to this challenge. Partial funding was provided by the Spanish Ministry of Education (CSO2015-65733-P), the EU (Jean Monnet 562390-EPP-1-2015-1-ES-EPPJMO), and ICREA-Academia
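Two steps named in the abstract above, adapting data from old municipal territories to new ones and measuring spatial clustering with Moran's I, can be illustrated with a minimal sketch. The homogenization step here assumes simple areal weighting (population is spread uniformly within each old unit), which is one common choice and not necessarily the authors' method. The unit names, overlap shares, and function names are hypothetical.

```python
def homogenize(old_population, overlap_share):
    """Reallocate counts from old units to new units.

    overlap_share[old][new] is the fraction of the old unit's area
    that lies inside the new unit (assumed areal weighting).
    """
    new_population = {}
    for old, pop in old_population.items():
        for new, share in overlap_share[old].items():
            new_population[new] = new_population.get(new, 0.0) + pop * share
    return new_population


def morans_i(values, neighbours):
    """Global Moran's I with binary weights.

    neighbours is a list of (i, j) index pairs; each pair is treated
    as symmetric (w_ij = w_ji = 1).
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Each undirected pair contributes twice to the weighted sum.
    cross = 2.0 * sum(dev[i] * dev[j] for i, j in neighbours)
    total_weight = 2.0 * len(neighbours)
    variance_sum = sum(d * d for d in dev)
    return (n / total_weight) * (cross / variance_sum)


if __name__ == "__main__":
    # Hypothetical example: old unit B was split between new units X and Y.
    old_pop = {"A": 1000, "B": 500}
    shares = {"A": {"X": 1.0}, "B": {"X": 0.4, "Y": 0.6}}
    print(homogenize(old_pop, shares))

    # Four units in a row; adjacent units are neighbours.
    chain = [(0, 1), (1, 2), (2, 3)]
    print(morans_i([1, 1, 5, 5], chain))  # similar values cluster
    print(morans_i([1, 5, 1, 5], chain))  # values alternate
```

Positive values of Moran's I indicate that neighbouring units tend to have similar densities, while negative values indicate alternation; in practice the weights and the significance test would come from a dedicated spatial statistics library.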
Geographic Availability of Assistance Dogs: Dogs Placed in 2013-2014 by ADI- or IGDF-Accredited or Candidate Facilities in the United States and Canada, and Non-accredited U.S. Facilities.
Assistance dogs' roles have diversified to support people with various disabilities, especially in the U.S. Data presented here are from the U.S. and Canada non-profit facilities (including both accredited and candidate members that fulfilled partial requirements: all here termed "accredited") of Assistance Dogs International (ADI) and the International Guide Dog Federation (IGDF), and from non-accredited U.S. assistance dog training facilities, on the numbers and types of dogs they placed in 2013 and 2014 with persons who have disabilities. ADI categories of assistance dogs are for guide, hearing, and service (including for assistance with mobility, autism, psychiatric, diabetes, seizure disabilities). Accredited facilities in 28 states and 3 provinces responded; accredited non-responding facilities were in 22 states and 1 province (some in states/provinces with responding accredited facilities). Non-accredited facilities in 16 states responded. U.S./Canada responding accredited facilities (55 of 96: 57%) placed 2,374 dogs; non-accredited U.S. facilities (22 of 133: 16.5%) placed 797 dogs. Accredited facilities placed similar numbers of dogs for guiding (n = 918) or mobility (n = 943), but many more facilities placed mobility service dogs than guide dogs. Autism service dogs were third most for accredited (n = 205 placements) and U.S. non-accredited (n = 72) facilities. Psychiatric service dogs were fourth most common in accredited placements (n = 119) and accounted for most placements (n = 526) in non-accredited facilities. Other accredited placements were for: hearing (n = 109); diabetic alert (n = 69), and seizure response (n = 11). Responding non-accredited facilities placed 17 hearing dogs, 30 diabetic alert dogs, and 18 seizure response dogs. Non-accredited facilities placed many dogs for psychiatric assistance, often for veterans, but ADI accreditation is required for veterans to have financial reimbursement. 
Twenty states and several provinces had no responding facilities; 17 of these states had no accredited facilities. In regions lacking facilities, some people with disabilities may find it inconvenient living far from any supportive facility, even if travel costs are provided. Despite accelerated U.S./Canada placements, access to well-trained assistance dogs continues to be limited and inconvenient for many people with disabilities, and the numerous sources of expensive, poorly trained dogs add confusion for potential handlers
The Gutenberg English Poetry Corpus: Exemplary Quantitative Narrative Analyses
This paper describes a corpus of about 3,000 English literary texts with about
250 million words extracted from the Gutenberg project that span a range of
genres from both fiction and non-fiction written by more than 130 authors
(e.g., Darwin, Dickens, Shakespeare). Quantitative narrative analysis (QNA) is
used to explore a cleaned subcorpus, the Gutenberg English Poetry Corpus
(GEPC), which comprises over 100 poetic texts with around two million words
from about 50 authors (e.g., Keats, Joyce, Wordsworth). Some exemplary QNA
studies show author similarities based on latent semantic analysis,
significant topics for each author or various text-analytic metrics for George
Eliot's poem "How Lisa Loved the King" and James Joyce's "Chamber Music,"
concerning, e.g., lexical diversity or sentiment analysis. The GEPC is
particularly suited for research in Digital Humanities, Computational
Stylistics, or Neurocognitive Poetics, e.g., as training and test corpus for
stimulus development and control in empirical studies
Archaeological practices, knowledge work and digitalisation
Defining what constitutes archaeological practices is a prerequisite for understanding where and how archaeological and archaeologically relevant information and knowledge are made, what counts as archaeological information, and where the limits are situated. The aim of this position paper, developed as a part of the COST action Archaeological practices and knowledge work in the digital environment (www.arkwork.eu), is to highlight the need for at least a relative consensus on the extents of archaeological practices in order to be able to understand and develop archaeological practices and knowledge work in the contemporary digital context. The text discusses approaches to study archaeological practices and knowledge work including Nicolini's notions of zooming in and zooming out, and proposes that a distinction between archaeological and archaeology-related practices could provide a way to negotiate the "archaeologicality" of diverse practices