80 research outputs found

    A diachronic study of historiography

    The humanities are often characterized by sociologists as having a low mutual dependence among scholars and high task uncertainty. According to Fuchs' theory of scientific change, this leads over time to intellectual and social fragmentation, as new scholarship accumulates in the absence of shared unifying theories. We consider here a set of specialisms in the discipline of history and measure the connectivity properties of their bibliographic coupling networks over time, in order to assess whether fragmentation is indeed occurring. We construct networks using both reference overlap and textual similarity. We show that the connectivity of reference overlap networks declines gradually and steadily over time, whilst that of textual similarity networks remains stable. Author bibliographic coupling networks also show signs of a decline in connectivity, in the absence of an increasing propensity for collaboration. We speculate that, despite the gradual weakening of ties among historians as mapped by references, new scholarship might be continually integrated through shared vocabularies and narratives. This would support our belief that citations are but one kind of bibliometric data to consider when studying the humanities, perhaps even of secondary importance, while text should play a more prominent role.
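
    As a rough illustration of the reference-overlap networks mentioned in this abstract, the sketch below builds a bibliographic coupling network from toy data and computes its edge density. The function names, the toy reference sets and the choice of density as the connectivity measure are assumptions made for illustration; the abstract does not specify which connectivity properties the authors measured.

```python
from itertools import combinations


def bibliographic_coupling(references):
    """Link two publications when they share at least one cited reference.

    `references` maps a publication id to the set of references it cites.
    Returns a dict {(pub_a, pub_b): number_of_shared_references}.
    """
    edges = {}
    for a, b in combinations(sorted(references), 2):
        shared = len(references[a] & references[b])
        if shared:
            edges[(a, b)] = shared
    return edges


def density(edges, n_nodes):
    """Fraction of possible publication pairs that are actually coupled."""
    possible = n_nodes * (n_nodes - 1) / 2
    return len(edges) / possible if possible else 0.0


# Hypothetical toy data: three publications and the references they cite.
refs = {
    "p1": {"r1", "r2", "r3"},
    "p2": {"r2", "r3"},
    "p3": {"r4"},
}
edges = bibliographic_coupling(refs)
print(edges, density(edges, len(refs)))
```

    Computing such a measure separately for each publication year or time window would give one way to track connectivity over time.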

    The structural role of the core literature in history

    The intellectual landscapes of the humanities are mostly uncharted territory. Little is known about the ways the published research of humanist scholars defines areas of intellectual activity. An open question relates to the structural role of the core literature: highly cited sources, which naturally play a disproportionate role in the definition of intellectual landscapes. We introduce four indicators in order to map the structural role played by core sources in connecting different areas of the intellectual landscape of citing publications (i.e. communities in the bibliographic coupling network). All indicators factor out the influence of degree distributions by internalizing a null configuration model. By considering several datasets focused on history, we show that two distinct structural actions are performed by the core literature: a global one, by connecting otherwise separated communities in the landscape, or a local one, by raising connectivity within communities. In our study, the global action is mainly performed by small sets of scholarly monographs, reference works and primary sources, while the rest of the core, and especially most journal articles, acts mostly locally.

    Seller-buyer networks in NFT art are driven by preferential ties

    Non-Fungible Tokens (NFTs) have recently surged to mainstream attention by allowing the exchange of digital assets via blockchains. NFTs have also been adopted by artists to sell digital art. One of the promises of NFTs is broadening participation in the art market, a traditionally closed and opaque system, to sustain a wider and more diverse set of artists and collectors. A key sign of this effect would be the disappearance, or at least a reduction in importance, of seller-buyer preferential ties, whereby the success of an artist is strongly dependent on the patronage of a single collector. We investigate NFT art seller-buyer networks considering several galleries and a large set of nearly 40,000 sales for over 230 M USD in total volume. We find that NFT art is a highly concentrated market driven by few successful sellers and even fewer systematic buyers. High concentration is present in both the number of sales and, even more strongly, in their priced volume. Furthermore, we show that, while a broader-participation market was present in the early phase of NFT art adoption, preferential ties have dominated during market growth, peak and recent decline. We consistently find that the top buyer accounts on average for over 80% of buys for a given seller. Similar trends apply to buyers and their top seller. We conclude that NFT art constitutes, at present, a highly concentrated market driven by preferential seller-buyer ties.
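
    The "top buyer" statistic quoted in this abstract can be illustrated with a minimal sketch: for each seller, take the share of sales going to that seller's most frequent buyer, then average across sellers. The function name and the toy sales list are hypothetical, and the sketch counts sales only; the authors' own computation (for example, any weighting by priced volume) may differ.

```python
from collections import Counter, defaultdict


def top_buyer_share(sales):
    """For each seller, return the fraction of sales going to the top buyer.

    `sales` is an iterable of (seller, buyer) pairs, one per sale.
    """
    buyers_per_seller = defaultdict(Counter)
    for seller, buyer in sales:
        buyers_per_seller[seller][buyer] += 1
    return {
        seller: max(counts.values()) / sum(counts.values())
        for seller, counts in buyers_per_seller.items()
    }


# Hypothetical toy data: seller s1 sells mostly to b1, seller s2 only to b3.
sales = [("s1", "b1"), ("s1", "b1"), ("s1", "b1"), ("s1", "b2"), ("s2", "b3")]
shares = top_buyer_share(sales)
mean_share = sum(shares.values()) / len(shares)
print(shares, mean_share)
```

    Swapping the roles of seller and buyer in the input pairs gives the analogous statistic for buyers and their top seller.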

    Index-Driven Digitization and Indexation of Historical Archives

    The promise of digitization of historical archives lies in their indexation at the level of contents. Unfortunately, this kind of indexation does not scale if done manually. In this article we present a method to bootstrap the deployment of a content-based information system for digitized historical archives, relying on historical indexing tools. Commonly prepared to search within homogeneous records when the archive was still current, such indexes were as widespread as they were disconnected, that is to say situated in the very records they were meant to index. We first present a conceptual model to describe and manipulate historical indexing tools. We then introduce a methodological framework for their use in order to guide digitization campaigns and index digitized historical records. Finally, we exemplify the approach with a case study on the indexation system of the X Savi alle Decime in Rialto, a Venetian magistracy in charge of the exaction (and related record keeping) of a tax on real estate in early modern Venice.

    Transfer learning for historical corpora: An assessment on post-OCR correction and named entity recognition

    Transfer learning in Natural Language Processing, mainly in the form of pre-trained language models, has recently delivered substantial gains across a range of tasks. Scholars and practitioners working with OCRed historical corpora are thus increasingly exploring the use of pre-trained language models. Nevertheless, the specific challenges posed by historical documents, including OCR quality and linguistic change, call for a critical assessment of the use of pre-trained language models in this setting. We consider two shared tasks, ICDAR2019 (post-OCR correction) and CLEF-HIPE-2020 (Named Entity Recognition, NER), and systematically assess the use of pre-trained language models with data in French, German and English. We find that using pre-trained language models helps with NER but less so with post-OCR correction. Pre-trained language models should therefore be used critically when working with OCRed historical corpora. We release our code base, in order to allow replicating our results and testing other pre-trained representations.

    TimeRank: A dynamic approach to rate scholars using citations

    Rating has become a common practice of modern science. No rating system can be considered final; instead, several approaches can be taken, which magnify different aspects of the fabric of science. We introduce an approach for rating scholars which uses citations in a dynamic fashion, allocating ratings by considering the relative position of two authors at the time of the citation between them. Our main goal is to introduce the notion of citation timing as a complement to the usual suspects of popularity and prestige. We aim to produce a rating able to account for a variety of interesting phenomena, such as positioning rising stars on a more even footing with established researchers. We apply our method to the bibliometrics community using data from the Web of Science from 2000 to 2016, showing how the dynamic method is more effective than alternatives in this respect.

    Clustering citation histories in the Physical Review

    We investigate publications through their citation histories: the events in a history are the citations given to the article by younger publications, and the time of each event is the date of publication of the citing article. We propose a methodology, based on spectral clustering, to group citation histories, and the corresponding publications, into communities, and apply multinomial logistic regression to provide the revealed communities with semantics in terms of publication features. We study the case of publications from the full Physical Review archive, covering 120 years of physics in all its domains. We discover two clear archetypes of publications, marathoners and sprinters, that deviate from the average middle-of-the-road behaviour, and discuss some publication features, like age of references and type of publication, that are correlated with the membership of a publication in a certain community.

    A multilayer exploration of the cognitive structure of publications in history

    Citation networks among journal articles are perhaps the most common object of investigation in bibliometrics. For example, citation networks are widely used for science mapping as a way to explore the cognitive structure of scientific fields. Within this framework, the disciplines traditionally part of the humanities fare differently. Their main trait is the interplay of a broader array of publication typologies – monographs, edited volumes, journal articles – with a richer set of cited objects, including primary evidence. Consequently, when considered from a science mapping perspective, a community, field or specialism in the humanities might be represented as a multilayer network. We consider here a specialism in history, the history of Venice, and represent it using a set of publications including both books (edited volumes and monographs) and journal articles. This set of publications is interconnected using three similarity measures: bibliographic coupling over references to books, bibliographic coupling over references to primary sources and textual similarity. The result is a multi-relation network with three distinct dimensions (that we will call layers), one per similarity measure, connecting the same publications. Given this representation, we proceed to analyse the different communities emerging from the three layers, to qualify them and consider to what extent they overlap or instead provide for orthogonal conceptual spaces.

    Assessing Simulations of Imperial Dynamics and Conflict in the Ancient World

    The development of models to capture large-scale dynamics in human history is one of the core contributions of cliodynamics. Most often, these models are assessed by their predictive capability on some macro-scale and aggregated measure and compared to manually curated historical data. In this report, we consider the model from Turchin et al. (2013), where the evaluation is done on the prediction of "imperial density": the relative frequency with which a geographical area belonged to large-scale polities over a certain time window. We implement the model and release both code and data for reproducibility. We then assess its behaviour against three historical data sets: the relative size of simulated polities vs historical ones; the spatial correlation of simulated imperial density with historical population density; the spatial correlation of simulated conflict vs historical conflict. At the global level, we show good agreement with population density ($R^2 < 0.75$), and some agreement with historical conflict in Europe ($R^2 < 0.42$). The model instead fails to reproduce the historical shape of individual polities. Finally, we tweak the model to behave greedily by having polities preferentially attack weaker neighbours. Results significantly degrade, suggesting that random attacks are a key trait of the original model. We conclude by proposing a way forward by matching the probabilistic imperial strength from simulations to inferred networked communities from real settlement data.
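
    The definition of "imperial density" given in this abstract can be sketched as follows: for each geographical cell, count the fraction of time steps in which it belonged to a large polity. The snapshot format, the function name and the `min_size` threshold for what counts as "large-scale" are assumptions made for illustration, not details taken from the report or from Turchin et al. (2013).

```python
from collections import Counter


def imperial_density(snapshots, min_size):
    """Fraction of snapshots in which each cell belongs to a large polity.

    `snapshots` is a non-empty list of dicts mapping a cell id to the polity
    controlling it at that time step (None if uncontrolled). A polity is
    treated as large when it controls at least `min_size` cells in that
    snapshot (an assumed threshold).
    """
    counts = {}
    for snap in snapshots:
        # Size of each polity in this snapshot, measured in cells.
        sizes = Counter(p for p in snap.values() if p is not None)
        for cell, polity in snap.items():
            counts.setdefault(cell, 0)
            if polity is not None and sizes[polity] >= min_size:
                counts[cell] += 1
    return {cell: c / len(snapshots) for cell, c in counts.items()}


# Hypothetical toy data: three cells observed at two time steps.
snapshots = [
    {"a": 1, "b": 1, "c": 2},     # polity 1 holds two cells
    {"a": 1, "b": 3, "c": None},  # no polity holds two cells
]
result = imperial_density(snapshots, min_size=2)
print(result)
```

    A per-cell map of this kind is what would then be correlated spatially with historical data such as population density.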

    A View on Venetian Apprenticeship through the Garzoni Database

    A sample of contracts of apprenticeship from three periods in the history of early modern Venice is analysed, as recorded in the archive of the Giustizia Vecchia, a Venetian magistracy. The periods are the end of the 16th century, the 1620s and the 1650s. A set of findings is discussed. First, the variety of professions represented in the dataset reduces over time, as the proportion of Venetian apprentices increases, in accordance with previous literature highlighting the decline of the Venetian economy during the 17th century. Secondly, apprenticeships are found to be divided into two broad groups: those that stipulated a payment to be given by the master to the apprentice (circa 80%), and those that did not. The first group is suggested to represent contracts used in part, sometimes exclusively, to hire a cheap workforce as well as to provide training. Lastly, professional profiles are introduced, as a combination of statistics which provide evidence of three typologies of professions with respect to apprenticeship market dynamics.