8,732 research outputs found

    LawBench: Benchmarking Legal Knowledge of Large Language Models

    Large language models (LLMs) have demonstrated strong capabilities in various aspects. However, when they are applied to the highly specialized, safety-critical legal domain, it is unclear how much legal knowledge they possess and whether they can reliably perform legal-related tasks. To address this gap, we propose a comprehensive evaluation benchmark, LawBench. LawBench has been meticulously crafted to precisely assess LLMs' legal capabilities at three cognitive levels: (1) Legal knowledge memorization: whether LLMs can memorize needed legal concepts, articles and facts; (2) Legal knowledge understanding: whether LLMs can comprehend entities, events and relationships within legal text; (3) Legal knowledge applying: whether LLMs can properly utilize their legal knowledge and make necessary reasoning steps to solve realistic legal tasks. LawBench contains 20 diverse tasks covering 5 task types: single-label classification (SLC), multi-label classification (MLC), regression, extraction and generation. We perform extensive evaluations of 51 LLMs on LawBench, including 20 multilingual LLMs, 22 Chinese-oriented LLMs and 9 legal-specific LLMs. The results show that GPT-4 remains the best-performing LLM in the legal domain, surpassing the others by a significant margin. While fine-tuning LLMs on legal-specific text brings certain improvements, we are still a long way from obtaining usable and reliable LLMs in legal tasks. All data, model predictions and evaluation code are released at https://github.com/open-compass/LawBench/. We hope this benchmark provides an in-depth understanding of LLMs' domain-specific capabilities and speeds up the development of LLMs in the legal domain.

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective. The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we take inventory of the impact and legal consequences of these technical advances and point out future directions of research.

    Predicting Outcomes in Investment Treaty Arbitration

    Crafting appropriate dispute settlement processes is challenging for any conflict-management system, particularly for politically sensitive international economic law disputes. As the United States negotiates investment treaties with Asian and European countries, the terms of dispute settlement have become contentious. There is a vigorous debate about whether investment treaty arbitration (ITA) is an appropriate dispute settlement mechanism. While some sing the praises of ITA, others offer a spirited critique. Some critics claim that ITA is biased against states, while others suggest ITA is predictable but unfair due to factors like arbitrator identity or venue. Using data from 159 final cases derived from 272 publicly available ITA awards, this Article examines outcomes of ITA cases to explore those concerns. Key descriptive findings demonstrate that states reliably won a greater proportion of cases than investors; and for the subset of cases investors won, the mean award was US$45.6 million with mean investor success rate of 35%. State success rates were roughly similar to respondent-favorable or state-favorable results in whistleblowing, qui tam, and medical-malpractice litigation in U.S. courts. The Article then explores whether ITA outcomes varied depending upon investor identity, state identity, the presence of repeat-player counsel, arbitrator-related, or venue variables. Models using case-based variables always predicted outcomes whereas arbitrator-venue models did not. The results provide initial evidence that the most critical variables for predicting outcomes involved some form of investor identity and the experience of parties’ lawyers. For investor identity, the most robust predictor was whether investors were human beings, with cases brought by people exhibiting greater success than corporations; and when at least one named investor or corporate parent was ranked in the Financial Times 500, investors sometimes secured more favorable outcomes. 
    Following Marc Galanter’s scholarship demonstrating that repeat-player lawyers are critical to litigation outcomes, attorney experience also affected ITA outcomes. Investors with experienced counsel were more likely to obtain a damage award against a state, whereas states retaining experienced counsel were only reliably associated with decreased levels of relative investor success. Although there was variation in outcomes, ultimately, the data did not support a conclusion that ITA was completely unpredictable; rather, the results called into question some critiques of ITA and did not prove that ITA is a wholly unacceptable form of dispute settlement. Instead, the results suggest the vital debate about ITA’s future would be well served by focusing on evidence-based insights and reliance on data rather than nonreplicable intuition.

    JURI SAYS: An Automatic Judgement Prediction System for the European Court of Human Rights

    In this paper we present the web platform JURI SAYS, which automatically predicts decisions of the European Court of Human Rights based on communicated cases, which are published by the court early in the proceedings and are often available many years before the final decision is made. Our system therefore predicts future judgements of the court. The platform is available at jurisays.com and shows the predictions compared to the actual decisions of the court. It is automatically updated every month with predictions for the new cases. Additionally, the system highlights the sentences and paragraphs that are most important for the prediction (i.e. violation vs. no violation of human rights).

    Institutional Purposes of Chinese Courts: Examining Judicial Guiding Cases in China Through a New Analytic Framework

    This Article seeks to answer the question of what Chinese courts, as institutions, are looking for, by empirically examining institutional purposes in judicial Guiding Cases published by the Supreme People’s Court (SPC) in China. The Article proposes a new analytic framework to interpret the institutional purposes of courts in the authoritarian context of the People’s Republic of China (PRC). Under this framework, we divide the institutional purposes of Chinese courts into self/institutional interests and preferred values/public policies. In contrast with hyper-political cases, where PRC courts focus on protecting the self-interest and institutional integrity of the courts as third-party dispute-resolving institutions, in the judicial guiding cases system we have validated our theoretical model regarding the institutional purposes of PRC courts. On the one hand, in a number of judicial guiding cases, we have identified vital self-interests of judges, and the courts’ institutional interest in increasing professionalism to attain more power and enhance socio-political status. On the other hand, some other guiding cases reflect a strong institutional tendency of Chinese courts, both the SPC and lower courts, to pursue traditional, activist and restraining values. In short, this Article not only seeks a new empirical way to examine PRC courts in the most sophisticated authoritarian environment in the world, but also aims to contribute to our understanding of the institutional character of judiciaries by testing general theory via judicial behaviors and judicial politics in China’s context.

    Legal Knowledge and Information Systems - JURIX 2017: The Thirtieth Annual Conference

    The proceedings of the 30th International Conference on Legal Knowledge and Information Systems – JURIX 2017. For three decades, the JURIX conferences have been held under the auspices of the Dutch Foundation for Legal Knowledge Based Systems (www.jurix.nl). Over that time, it has become a European conference in terms of the diverse venues throughout Europe and the nationalities of participants.

    Keyword Assisted Topic Models

    For a long time, many social scientists have conducted content analysis by using their substantive knowledge and manually coding documents. In recent years, however, fully automated content analysis based on probabilistic topic models has become increasingly popular because of its scalability. Unfortunately, applied researchers find that these models often fail to yield topics of their substantive interest by inadvertently creating multiple topics with similar content and combining different themes into a single topic. In this paper, we empirically demonstrate that providing topic models with a small number of keywords can substantially improve their performance. The proposed keyword assisted topic model (keyATM) offers the important advantage that the specification of keywords requires researchers to label topics prior to fitting a model to the data. This contrasts with the widespread practice of post-hoc topic interpretation and adjustment, which compromises the objectivity of empirical findings. In our applications, we find that the keyATM provides more interpretable results, has better document classification performance, and is less sensitive to the number of topics than the standard topic models. Finally, we show that the keyATM can also incorporate covariates and model time trends. An open-source software package is available for implementing the proposed methodology.

    Exploring Text Mining and Analytics for Applications in Public Security: An in-depth dive into a systematic literature review

    Text mining and related analytics have emerged as a technological approach to support human activities in extracting useful knowledge from texts in several formats. From a managerial point of view, it can help organizations in planning and decision-making processes, providing information that was not previously evident in textual materials produced internally or even externally. In this context, within the public/governmental scope, public security agencies are great beneficiaries of the tools associated with text mining in several respects, from applications in the criminal area to the collection of people's opinions and sentiments about the actions taken to promote their welfare. This article reports details of a systematic literature review focused on identifying the main areas of text mining application in public security, the most recurrent technological tools, and future research directions. The searches covered four major article databases (Scopus, Web of Science, IEEE Xplore, and ACM Digital Library), selecting 194 materials published between 2014 and the first half of 2021, among journals, conferences, and book chapters. There were several findings concerning the targets of the literature review, as presented in the results of this article.