15 research outputs found

    Identifying Virtues and Values Through Obituary Data-Mining

    Because obituaries are succinct and explicitly intended to summarize their subjects’ lives, they may be expected not only to include the features that the author finds most salient but also to signal to others in the community the socially recognized aspects of the deceased’s character. We begin by reviewing studies 1 and 2, in which obituaries were carefully read and labeled. We then report study 3, which further develops these results with a semi-automated, large-scale semantic analysis of several thousand obituaries. Geography, gender, and elite status all turn out to be associated with the virtues and values attributed to the deceased.

    The Axiology of Necrologies: Using Natural Language Processing to Examine Values in Obituaries

    This dissertation is centrally concerned with exploring obituaries as repositories of values. Obituaries are a publicly-available natural language source that are variably written for members of communities that are wide (nation-level) and narrow (city-level, or at the level of specific groups therein). Because they are explicitly summative, limited in size, and written for consumption by a public audience, obituaries may be expected to express concisely the aspects of their subjects' lives that the authors (often family members living in the same communities) found most salient or worthy of featuring. 140,599 obituaries nested in 832 newspapers from across the USA were scraped with permission from Legacy.com, an obituaries publisher. Obituaries were coded for the age at death and gender (female/male) of the deceased using automated algorithms. For each publishing newspaper, county-level median income, educational achievement (operationalized as percent of the population with a Bachelor's degree or higher), and race and ethnicity were averaged across counties, weighting by population size. A Neo4J graph database was constructed using WordNet and the University of South Florida Free Association Norms datasets. Each word in each obituary in the corpus was lemmatized. The shortest path through the WordNet graph from each lemma to 30 Schwartz value prototype words published by Bardi, Calogero, and Mullen (2008) was then recorded. From these path lengths, a new measure, "word-by-hop," was calculated for each Schwartz value to reflect the relative lexical distance between each obituary and that Schwartz value. Of the Schwartz values, Power, Conformity, and Security were most indicated in the corpus, while Universalism, Hedonism, and Stimulation were least indicated. A series of nine two-level regression models suggested that, across Schwartz values, newspaper community accounted for the greatest amount of word-by-hop variability in the corpus.
The best-fitting model indicated a small, negative effect of female status across Schwartz values. Unexpectedly, Hedonism and Conformity, which had conceptually opposite prototype words, were highly correlated, possibly indicating that obituary authors "compensate" for describing the deceased in a hedonistic way by concurrently emphasizing restraint. Future research could usefully further expand word-by-hop and incorporate individual-level covariates that match the newspaper-level covariates used here.

    Sci-Hub provides access to nearly all scholarly literature

    The website Sci-Hub enables users to download PDF versions of scholarly articles, including many articles that are paywalled at their journal's site. Sci-Hub has grown rapidly since its creation in 2011, but the extent of its coverage was unclear. Here we report that, as of March 2017, Sci-Hub's database contains 68.9% of the 81.6 million scholarly articles registered with Crossref and 85.1% of articles published in toll access journals. We find that coverage varies by discipline and publisher, and that Sci-Hub preferentially covers popular, paywalled content. For toll access articles, we find that Sci-Hub provides greater coverage than the University of Pennsylvania, a major research university in the United States. Green open access to toll access articles via licit services, on the other hand, remains quite limited. Our interactive browser at https://greenelab.github.io/scihub allows users to explore these findings in more detail. For the first time, nearly all scholarly literature is available gratis to anyone with an Internet connection, suggesting the toll access business model may become unsustainable.

    The Axiology of Necrologies: Using Natural Language Processing to Examine Values in Obituaries (Dissertation Code and Limited Data)

    This dissertation is centrally concerned with exploring obituaries as repositories of values. Obituaries are a publicly-available natural language source that are variably written for members of communities that are wide (nation-level) and narrow (city-level, or at the level of specific groups therein). Because they are explicitly summative, limited in size, and written for consumption by a public audience, obituaries may be expected to express concisely the aspects of their subjects’ lives that the authors (often family members living in the same communities) found most salient or worthy of featuring. 140,599 obituaries nested in 832 newspapers from across the USA were scraped with permission from Legacy.com, an obituaries publisher. Obituaries were coded for the age at death and gender (female/male) of the deceased using automated algorithms. For each publishing newspaper, county-level median income, educational achievement (operationalized as percent of the population with a Bachelor’s degree or higher), and race and ethnicity were averaged across counties, weighting by population size. A Neo4J graph database was constructed using WordNet and the University of South Florida Free Association Norms datasets. Each word in each obituary in the corpus was lemmatized. The shortest path through the WordNet graph from each lemma to 30 Schwartz value prototype words published by Bardi, Calogero, and Mullen (2008) was then recorded. From these path lengths, a new measure, “word-by-hop,” was calculated for each Schwartz value to reflect the relative lexical distance between each obituary and that Schwartz value. Of the Schwartz values, Power, Conformity, and Security were most indicated in the corpus, while Universalism, Hedonism, and Stimulation were least indicated. A series of seven two-level regression models suggested that, across Schwartz values, newspaper community accounted for the greatest amount of word-by-hop variability in the corpus.
The best-fitting model indicated a small, negative effect of female status across Schwartz values. Unexpectedly, Hedonism and Conformity, which had conceptually opposite prototype words, were highly correlated, possibly indicating that obituary authors “compensate” for describing the deceased in a hedonistic way by concurrently emphasizing restraint. Future research could usefully further expand word-by-hop and incorporate individual-level covariates that match the newspaper-level covariates used here.

    Mapping Human Values: Enhancing Social Marketing through Obituary Data-Mining

    Obituaries are an especially rich resource for identifying people’s values. Because obituaries are succinct and explicitly intended to summarize their subjects’ lives, they may be expected to include only the features that the author(s) find most salient, not only for themselves as relatives or friends of the deceased, but also to signal to others in the community the socially recognized aspects of the deceased’s character. We report four approaches to the scientific study of virtue and value through obituaries. We begin by reviewing studies 1 and 2, in which obituaries were carefully read and labeled. We then report study 3, which further develops these results with a semi-automated, large-scale semantic analysis of several thousand obituaries. Finally, we present the results of study 4, in which individuals were asked to write prospective obituaries. Geography, gender, and elite status all turn out to influence the virtues and values associated with the deceased.

    publicus/r-veccompare: version 0.1.0

    This version is (as of this writing) listed on CRAN (https://CRAN.R-project.org/package=veccompare), and can be installed from there with

        install.packages('veccompare')

    or, to target this release specifically,

        # install.packages('devtools')
        devtools::install_version('veccompare', version = '0.1.0', repos = 'http://cran.us.r-project.org')

    It can also be installed directly from GitHub with

        # install.packages('devtools')
        devtools::install_github("publicus/[email protected]")

    Visualizing Values

    Digital humanities research has developed haphazardly, with substantive contributions in some disciplines and only superficial uses in others. It has made almost no inroads in philosophy; for example, of the nearly two million articles, chapters, and books housed at philpapers.org, only sixteen pop up when one searches for ‘digital humanities’. In order to make progress in this field, we demonstrate that a hypothesis-driven method, applied by experts in data-collection, -aggregation, -analysis, and -visualization, yields philosophical fruits. “Call no one happy until they are dead.” “De mortuis nil nisi bonum.” These ancient norms still inform how we speak of the dead. From their beginnings in the nineteenth century, modern obituaries have been both practical and honorary. They are written with two aims: to pay respect to the deceased and to inspire the living to follow in their footsteps. The obituary of William Custis of Virginia exemplifies these aims, saying that “it is due to his memory publicly to record his virtues,” and that “There is in the life a noble, independent and honest man, something so worthy of imitation, something that so strongly commends itself to the approbation of the virtuous mind, that his name should not be left to oblivion” (Daily National Intelligencer, 18 November 1838). Likewise, philosophers have emphasized that we can determine what counts as a virtue for a given type of person in a given community by analyzing what people say about the dead (Zagzebski, L. Virtues of the Mind. Cambridge University Press, 1996, p. 135). By adhering to norms of praise, admiration, and respect, obituaries reveal what counts as a virtue, value, or constituent of wellbeing (VVC) for a particular type of person in a particular community.
One way to explore this insight would be to close-read a small number of obituaries of people from intersecting demographics, crossing dimensions such as gender, age, race, class, educational attainment, veteran status, religion, and location. Another, which we demonstrate in this chapter, is to harness digital humanities resources to distance-read a much larger number of obituaries, extracting both demographic meta-data and first-order content expressive of VVC. We then analyze and visualize this information to establish the structures of VVC associated with different types of people in different communities, arguing that data-science measures such as eigenvector centrality can be used to identify cardinal virtues. We also explain the rudiments of our text- and meta-data mining, our method of aggregating and comparing texts, and our visualization techniques. In so doing, we pave the way for researchers in philosophy and related fields to generate similar analyses and visualizations.