
    Synopsizing “literature review” for scientific publications

    Because the number of scientific publications in most disciplines is expanding rapidly, traditional academic search engines can hardly satisfy scholars’ need to retrieve and assimilate the information they are looking for. In this study we investigate a new summarization problem: creating a synopsis “Literature Review” of a collection of candidate cited papers in response to a query, via different methods and indicators. In more detail, we compare the use of different methods and indicators for generating citation clusters and summarized reviews by analyzing publication abstracts, citation contexts, and co-cite relationships. We also validate the usefulness of a user’s query during this process by comparing query-dependent and query-independent clustering and summarization. One interesting outcome of this study is that citation contexts are more suitable for clustering related papers, whereas abstracts are more accurate for generating longer review-like summaries. The initial user query is also helpful for enhancing clustering and summarization performance.

    unarXive: a large scholarly data set with publications’ full-text, annotated in-text citations, and links to metadata

    In recent years, scholarly data sets have been used for various purposes, such as paper recommendation, citation recommendation, citation context analysis, and citation context-based document summarization. The evaluation of approaches to such tasks and their applicability in real-world scenarios depend heavily on the data set used. However, existing scholarly data sets are limited in several regards. Here, we propose a new data set based on all publications from all scientific disciplines available on arXiv.org. Apart from providing the papers' plain text, in-text citations were annotated via global identifiers. Furthermore, citing and cited publications were linked to the Microsoft Academic Graph, providing access to rich metadata. Our data set consists of over one million documents and 29.2 million citation contexts. The data set, which is made freely available for research purposes, can not only enhance the future evaluation of research paper-based and citation context-based approaches but also serve as a basis for new ways to analyze in-text citations. See https://github.com/IllDepence/unarXive for the source code used to create the data set. To cite the data set, and for further information, please refer to our journal article: Tarek Saier, Michael Färber: "unarXive: A Large Scholarly Data Set with Publications’ Full-Text, Annotated In-Text Citations, and Links to Metadata", Scientometrics, 2020, http://dx.doi.org/10.1007/s11192-020-03382-z

    Classified by Genre: Rhetorical Genrefication in Cinema

    This dissertation argues for a rethinking and expansion of film genre theory. As the variety of media exhibition platforms expands and as discourse about films permeates a greater number of communication media, the use of generic terms has never been more multiform or observable. Fundamental problems in the very conception of film genre have yet to be addressed adequately, and film genre study has carried on despite its untenable theoretical footing. Synthesizing pragmatic genre theory, constructivist film theory, Bourdieusian fan studies, and rhetorical genre studies, the dissertation aims to work through the radical implications of pragmatic genre theory and account for genre's role in interpretation, evaluation, and rhetorical framing as part of broader, recurring social activities. This model rejects textualist and realist foundations for film genre; only pragmatic genre use can serve as a foundation for understanding film genres. From this perspective, the concept of genre is reconstructed according to its interpretive and rhetorical functions rather than a priori assumptions about the text or transtextual structures. Genres are not independent structures or relations among texts but performative speech acts about textual relationships and are functions of the rhetorical conditions of their use. This use is not only denotative but also connotative, insofar as certain genre labels evoke aesthetic or moral judgments for certain users. This dissertation proposes the concept of meta-genres, or the sum total of textual and extra-textual attributes plus the evaluative valences a given user associates with a generic label. Meta-genres help guide interpretation and serve as a shorthand for evaluative judgments about certain kinds of films, and are thus central to the kinds of taste politics negotiated through film texts.
The rhetorical conditions of genre use can be typified, and this dissertation adapts concepts and methods from the field of rhetorical genre studies to show that film genre use is most readily observable through its uptake in rhetorical genres. These rhetorical genres, in turn, index the social groups and recurring situations that they are called upon to meet. By studying examples like academic writing, popular press reviews, filmmaker interviews, internet message board comments, and digital media recommendation systems, one can identify how specific deployments of generic terms serve as a nexus of text, user, group, and social activities, and can develop a methodology for studying genre as use relative to those dimensions.

    How and Why do Researchers Reference Data? A Study of Rhetorical Features and Functions of Data References in Academic Articles

    Data reuse is a common practice in the social sciences. While published data play an essential role in the production of social science research, they are not consistently cited, which makes it difficult to assess their full scholarly impact and give credit to the original data producers. Furthermore, it can be challenging to understand researchers' motivations for referencing data. Like references to academic literature, data references perform various rhetorical functions, such as paying homage, signaling disagreement, or drawing comparisons. This paper studies how and why researchers reference social science data in their academic writing. We develop a typology to model relationships between the entities that anchor data references, along with their features (access, actions, locations, styles, types) and functions (critique, describe, illustrate, interact, legitimize). We illustrate the use of the typology by coding multidisciplinary research articles (n=30) referencing social science data archived at the Inter-university Consortium for Political and Social Research (ICPSR). We show how our typology captures researchers' interactions with data and purposes for referencing data. Our typology provides a systematic way to document and analyze researchers' narratives about data use, extending our ability to give credit to data that support research. Comment: 35 pages, 2 appendices, 1 table

    A decade of in-text citation analysis based on natural language processing and machine learning techniques: an overview of empirical studies

    In-text citation analysis is one of the most frequently used methods in research evaluation. We are seeing significant growth in citation analysis through bibliometric metadata, primarily due to the availability of citation databases such as the Web of Science, Scopus, Google Scholar, Microsoft Academic, and Dimensions. Due to better access to full-text publication corpora in recent years, information scientists have gone far beyond traditional bibliometrics by tapping into advancements in full-text data processing techniques to measure the impact of scientific publications in contextual terms. This has led to technical developments in citation classifications, citation sentiment analysis, citation summarisation, and citation-based recommendation. This article aims to narratively review the studies on these developments. Its primary focus is on publications that have used natural language processing and machine learning techniques to analyse citations.