3,260 research outputs found

    Bringing order into the realm of Transformer-based language models for artificial intelligence and law

    Transformer-based language models (TLMs) have been widely recognized as a cutting-edge technology for the successful development of deep-learning-based solutions to problems and applications that require natural language processing and understanding. As in other textual domains, TLMs have pushed the state of the art of AI approaches for many tasks of interest in the legal domain. Although the first Transformer model was proposed about six years ago, the technology has progressed at an unprecedented rate, with BERT and related models serving as a major reference, also in the legal domain. This article provides the first systematic overview of TLM-based methods for AI-driven problems and tasks in the legal sphere. A major goal is to highlight research advances in this field so as to understand, on the one hand, how Transformers have contributed to the success of AI in supporting legal processes and, on the other hand, what the current limitations and opportunities for further research development are. Comment: Please refer to the published version: Greco, C.M., Tagarelli, A. (2023) Bringing order into the realm of Transformer-based language models for artificial intelligence and law. Artif Intell Law, Springer Nature. November 2023. https://doi.org/10.1007/s10506-023-09374-

    Synopsizing “literature review” for scientific publications

    Because the number of scientific publications in most disciplines is expanding rapidly, traditional academic search engines can hardly satisfy scholars’ need to retrieve and assimilate the information they are looking for. In this study we investigate a new summarization problem: creating a synopsis “Literature Review” of a collection of candidate cited papers in response to a query, via different methods and indicators. In more detail, we compare the use of different methods and indicators for generating citation clusters and summarized reviews by analyzing publication abstracts, citation contexts, and co-cite relationships. We also validate the usefulness of a user’s query during this process by comparing query-dependent and query-independent clustering and summarization. One interesting outcome of this study is that citation contexts are more suitable for clustering related papers, whereas abstracts are more accurate for generating longer review-like summaries. The initial user query is also helpful for enhancing clustering and summarization performance.

    unarXive: a large scholarly data set with publications’ full-text, annotated in-text citations, and links to metadata

    In recent years, scholarly data sets have been used for various purposes, such as paper recommendation, citation recommendation, citation context analysis, and citation context-based document summarization. The evaluation of approaches to such tasks and their applicability in real-world scenarios depend heavily on the data set used. However, existing scholarly data sets are limited in several regards. Here, we propose a new data set based on all publications from all scientific disciplines available on arXiv.org. Apart from providing the papers' plain text, in-text citations were annotated via global identifiers. Furthermore, citing and cited publications were linked to the Microsoft Academic Graph, providing access to rich metadata. Our data set consists of over one million documents and 29.2 million citation contexts. The data set, which is made freely available for research purposes, can not only enhance the future evaluation of research paper-based and citation context-based approaches but also serve as a basis for new ways to analyze in-text citations. See https://github.com/IllDepence/unarXive for the source code that was used to create the data set. To cite our data set, and for further information, please refer to our journal article: Tarek Saier, Michael Färber: "unarXive: A Large Scholarly Data Set with Publications’ Full-Text, Annotated In-Text Citations, and Links to Metadata", Scientometrics, 2020, http://dx.doi.org/10.1007/s11192-020-03382-z

    Classified by Genre: Rhetorical Genrefication in Cinema

    This dissertation argues for a rethinking and expansion of film genre theory. As the variety of media exhibition platforms expands and as discourse about films permeates a greater number of communication media, the use of generic terms has never been more multiform or observable. Fundamental problems in the very conception of film genre have yet to be addressed adequately, and film genre study has carried on despite its untenable theoretical footing. Synthesizing pragmatic genre theory, constructivist film theory, Bourdieusian fan studies, and rhetorical genre studies, the dissertation aims to work through the radical implications of pragmatic genre theory and account for genre's role in interpretation, evaluation, and rhetorical framing as part of broader, recurring social activities. This model rejects textualist and realist foundations for film genre; only pragmatic genre use can serve as a foundation for understanding film genres. From this perspective, the concept of genre is reconstructed according to its interpretive and rhetorical functions rather than a priori assumptions about the text or transtextual structures. Genres are not independent structures or relations among texts but performative speech acts about textual relationships and are functions of the rhetorical conditions of their use. This use is not only denotative but connotative as well, insofar as certain genre labels evoke aesthetic or moral judgments for certain users. This dissertation proposes the concept of meta-genres, or the sum total of textual and extra-textual attributes plus the evaluative valences a given user associates with a generic label. Meta-genres help guide interpretation and serve as a shorthand for evaluative judgments about certain kinds of films, and are thus central to the kinds of taste politics negotiated through film texts.
The rhetorical conditions of genre use can be typified, and this dissertation adapts concepts and methods from the field of rhetorical genre studies to show that film genre use is most readily observable through its uptake in rhetorical genres. These rhetorical genres, in turn, index the social groups and recurring situations that they are called upon to meet. By studying examples like academic writing, popular press reviews, filmmaker interviews, internet message board comments, and digital media recommendation systems, one can identify how specific deployments of generic terms serve as a nexus of text, user, group, and social activities, and can develop a methodology for studying genre as use relative to those dimensions.

    How and Why do Researchers Reference Data? A Study of Rhetorical Features and Functions of Data References in Academic Articles

    Data reuse is a common practice in the social sciences. While published data play an essential role in the production of social science research, they are not consistently cited, which makes it difficult to assess their full scholarly impact and give credit to the original data producers. Furthermore, it can be challenging to understand researchers' motivations for referencing data. Like references to academic literature, data references perform various rhetorical functions, such as paying homage, signaling disagreement, or drawing comparisons. This paper studies how and why researchers reference social science data in their academic writing. We develop a typology to model relationships between the entities that anchor data references, along with their features (access, actions, locations, styles, types) and functions (critique, describe, illustrate, interact, legitimize). We illustrate the use of the typology by coding multidisciplinary research articles (n=30) referencing social science data archived at the Inter-university Consortium for Political and Social Research (ICPSR). We show how our typology captures researchers' interactions with data and purposes for referencing data. Our typology provides a systematic way to document and analyze researchers' narratives about data use, extending our ability to give credit to data that support research. Comment: 35 pages, 2 appendices, 1 table

    A decade of in-text citation analysis based on natural language processing and machine learning techniques: an overview of empirical studies

    In-text citation analysis is one of the most frequently used methods in research evaluation. We are seeing significant growth in citation analysis through bibliometric metadata, primarily due to the availability of citation databases such as the Web of Science, Scopus, Google Scholar, Microsoft Academic, and Dimensions. Due to better access to full-text publication corpora in recent years, information scientists have gone far beyond traditional bibliometrics by tapping into advancements in full-text data processing techniques to measure the impact of scientific publications in contextual terms. This has led to technical developments in citation classifications, citation sentiment analysis, citation summarisation, and citation-based recommendation. This article aims to narratively review the studies on these developments. Its primary focus is on publications that have used natural language processing and machine learning techniques to analyse citations.