11 research outputs found

    Refugee status determination: how cooperation with machine learning tools can lead to more justice

    Previous research on refugee status adjudications has shown that the outcome of an application can be predicted from very few features with satisfactory accuracy. Recent work has achieved between 70 and 90% accuracy using text analytics in various legal fields, including refugee status determination. Some studies report predictions derived from the judge's identity alone. Additionally, most features used for prediction are non-substantive, external features such as news reports, the date and time of the hearing, or the weather. On the other hand, the literature shows that noise is ubiquitous in human judgments and significantly affects the outcome of decisions, including legal decisions. We use the term "noise" in the sense described by D. Kahneman, as a measure of how human beings are unavoidably influenced by external factors when making a decision. In the context of refugee status determination, it means for instance that two judges would take different decisions when presented with the same application. This article explores ways that machine learning can help reduce noise in refugee law decision making. We are not suggesting that the proposed methodology should exclude other approaches to improving decisions, such as training of decision makers, skills acquisition or judgment aggregation, but rather that it is a path worth exploring. We investigate how artificial intelligence, and specifically data-driven applications, can be used to benefit all parties involved in refugee status adjudications. We specifically look at decisions taken in Canada and in the United States. Our research aims at reducing the arbitrariness and unfairness that derive from noisy decisions, based on the assumption that if two cases or applications are alike they should be treated in the same way and lead to the same outcome.
    Comment: Scottish Law and Innovation Network (SCOTLIN) 2022, Early Career Scholars Symposiu

    Empowering Refugee Claimants and their Lawyers: Using Machine Learning to Examine Decision-Making in Refugee Law

    Our project aims to support stakeholders in refugee status adjudications, such as lawyers, judges, governing bodies, and claimants, in making better decisions through data-driven intelligence, and to increase the understanding and transparency of the refugee application process for all involved parties. This PhD project has two primary objectives: (1) to retrieve past cases, and (2) to analyze legal decision-making processes on a dataset of Canadian cases. In this paper, we present the current state of our work, which includes a completed experiment on part (1) and ongoing efforts related to part (2). We believe that NLP-based solutions are well-suited to address these challenges, and we investigate the feasibility of automating all steps involved. In addition, we introduce a novel benchmark for future NLP research in refugee law. Our methodology aims to be inclusive of all end-users and stakeholders, with expected benefits including reduced time-to-decision, fairer and more transparent outcomes, and improved decision quality.
    Comment: 19th International Conference on Artificial Intelligence and Law - ICAIL 2023, Doctoral Consortium. arXiv admin note: substantial text overlap with arXiv:2305.1553

    Automated Refugee Case Analysis: An NLP Pipeline for Supporting Legal Practitioners

    In this paper, we introduce an end-to-end pipeline for retrieving, processing, and extracting targeted information from legal cases. We investigate an under-studied legal domain with a case study on refugee law in Canada. Searching case law for similar past cases is a key part of legal work for both lawyers and judges, the potential end-users of our prototype. While traditional named-entity recognition labels such as dates provide meaningful information in legal work, we propose to extend existing models and retrieve a total of 19 useful categories of items from refugee cases. After creating a novel data set of cases, we perform information extraction based on state-of-the-art neural named-entity recognition (NER). We test different architectures, including two transformer models, using contextual and non-contextual embeddings, and compare general-purpose versus domain-specific pre-training. The results demonstrate that models pre-trained on legal data perform best despite their smaller size, suggesting that domain matching had a larger effect than network architecture. We achieve an F1 score above 90% on five of the targeted categories and over 80% on four further categories.
    Comment: 9 pages, preprint of long paper accepted to Findings of the Annual Meeting of the Association for Computational Linguistics (ACL) 202
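
    The per-category F1 scores mentioned in this abstract can be illustrated with a minimal sketch. It assumes exact-match scoring, where a predicted entity counts as correct only if its document, span and label all match a gold annotation; the abstract does not state which matching scheme the authors used. The function name, the tuple layout and the example labels (DATE, GPE) are hypothetical and chosen only for illustration.

```python
from collections import Counter


def per_category_f1(gold, predicted):
    """Exact-match precision, recall and F1 per entity label.

    gold, predicted: sets of (doc_id, start, end, label) tuples.
    This tuple layout is an assumption made for this sketch.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for entity in predicted:
        label = entity[-1]
        if entity in gold:
            tp[label] += 1  # predicted and present in gold
        else:
            fp[label] += 1  # predicted but not in gold
    for entity in gold:
        if entity not in predicted:
            fn[entity[-1]] += 1  # in gold but missed by the model

    scores = {}
    for label in set(tp) | set(fp) | set(fn):
        n_pred = tp[label] + fp[label]
        n_gold = tp[label] + fn[label]
        precision = tp[label] / n_pred if n_pred else 0.0
        recall = tp[label] / n_gold if n_gold else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


# Toy annotations (illustrative only, not taken from the paper).
gold = {(0, 0, 2, "DATE"), (0, 5, 7, "DATE"), (0, 10, 12, "GPE")}
predicted = {(0, 0, 2, "DATE"), (0, 10, 12, "GPE"), (0, 20, 22, "GPE")}

for label, s in sorted(per_category_f1(gold, predicted).items()):
    print(label, round(s["f1"], 3))
```

    Reporting F1 per category rather than as a single average is what makes a statement such as "above 90% on five categories" possible: each label is scored on its own true positives, false positives and false negatives.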

    Do Language Models Learn about Legal Entity Types during Pretraining?

    Language Models (LMs) have proven their ability to acquire diverse linguistic knowledge during the pretraining phase, potentially serving as a valuable source of incidental supervision for downstream tasks. However, there has been limited research on the retrieval of domain-specific knowledge, and specifically legal knowledge. We propose to explore the task of Entity Typing, serving as a proxy for evaluating legal knowledge as an essential aspect of text comprehension, and a foundational task for numerous downstream legal NLP applications. Through systematic evaluation and analysis with two types of prompting (cloze sentences and QA-based templates), and to clarify the nature of these acquired cues, we compare diverse types and lengths of entities (both general and domain-specific), semantic and syntactic signals, and different LM pretraining corpora (generic and legal-oriented) and architectures (encoder BERT-based and decoder-only with Llama2). We show that (1) Llama2 performs well on certain entities and exhibits potential for substantial improvement with optimized prompt templates, (2) law-oriented LMs show inconsistent performance, possibly due to variations in their training corpus, (3) LMs demonstrate the ability to type entities even in the case of multi-token entities, (4) all models struggle with entities belonging to sub-domains of the law, and (5) Llama2 appears to frequently overlook syntactic cues, a shortcoming less present in BERT-based architectures. The code of the experiments is available at https://github.com/clairebarale/probing_legal_entity_types

    AsyLex: A Dataset for Legal Language Processing of Refugee Claims

    Advancements in natural language processing (NLP) and language models have demonstrated immense potential in the legal domain, enabling automated analysis and comprehension of legal texts. However, developing robust models in Legal NLP is significantly challenged by the scarcity of resources. This paper presents AsyLex, the first dataset specifically designed for Refugee Law applications to address this gap. The dataset introduces 59,112 documents on refugee status determination in Canada from 1996 to 2022, providing researchers and practitioners with essential material for training and evaluating NLP models for legal research and case review. Case review is defined as entity extraction and outcome prediction tasks. The dataset includes 19,115 gold-standard human-labeled annotations for 20 legally relevant entity types curated with the help of legal experts and 1,682 gold-standard labeled documents for the case outcome. Furthermore, we supply the corresponding trained entity extraction models and the resulting labeled entities generated through the inference process on AsyLex. Four supplementary features are obtained through rule-based extraction. We demonstrate the usefulness of our dataset on the legal judgment prediction task to predict the binary outcome and test a set of baselines using the text of the documents and our annotations. We observe that models pretrained on similar legal documents reach better scores, suggesting that acquiring more datasets for specialized domains such as law is crucial. The dataset is available at https://huggingface.co/datasets/clairebarale/AsyLex

    Choroidal and peripapillary changes in high myopic eyes with Stickler syndrome

    Background: To compare different clinical and Spectral-Domain Optical Coherence Tomography (SD-OCT) features of high myopic eyes with Stickler syndrome (STL) with matched controls. Methods: Patients with genetically confirmed STL with axial length ≥ 26 mm and controls matched for axial length were included. The following data were obtained from SD-OCT scans and fundus photography: choroidal and retinal thickness (respectively, CT and RT), peripapillary atrophy area (PAA), presence of posterior staphyloma (PS). Results: Twenty-six eyes of 17 patients with STL and 25 eyes of 19 controls were evaluated. Compared with controls, patients with STL showed a greater CT subfoveally, at 1000 μm from the fovea at both nasal and temporal location, and at 2000 and 3000 μm from the fovea in nasal location (respectively, 188.7±72.8 vs 126.0±88.7 μm, 172.5±77.7 vs 119.3±80.6 μm, 190.1±71.9 vs 134.9±79.7 μm, 141.3±56.0 vs 98.1±68.5 μm, and 110.9±51.0 vs 67.6±50.7 μm, always P < 0.05). Furthermore, patients with STL showed a lower prevalence of PS (11.5% vs 68%, P < 0.001) and a lower PAA (2.2±2.1 vs 5.4±5.8 mm², P = 0.03), compared with controls. Conclusions: This study shows that high myopic patients with STL show a greater CT, a lower PAA and a lower prevalence of PS, compared with controls matched for axial length. These findings could be relevant for the development and progression of myopic maculopathy in patients with STL

    Functional characterization of the propeptide of Plasmodium falciparum subtilisin-like protease-1.

    Erythrocyte invasion by the malaria merozoite is prevented by serine protease inhibitors. Various aspects of the biology of Plasmodium falciparum subtilisin-like protease-1 (PfSUB-1), including the timing of its expression and its apical location in the merozoite, suggest that this enzyme is involved in invasion. Recombinant PfSUB-1 expressed in a baculovirus system is secreted in the p54 form, noncovalently bound to its cognate propeptide, p31. To understand the role of p31 in PfSUB-1 maturation, we examined interactions between p31 and both recombinant and native enzymes. CD analyses revealed that recombinant p31 (rp31) possesses significant secondary structure on its own, comparable with that of folded propeptides of some bacterial subtilisins. Kinetic studies demonstrated that rp31 is a fast binding, high affinity inhibitor of PfSUB-1. Inhibition of two bacterial subtilisins by rp31 was much less effective, with inhibition constants 49-60-fold higher than that for PfSUB-1. Single (at the P4 or P1 position) or double (at P4 and P1 positions) point mutations of residues within the C-terminal region of rp31 had little effect on its inhibitory activity, and truncation of 11 residues from the rp31 C terminus substantially reduced, but did not abolish, inhibition. None of these modifications prevented binding to the PfSUB-1 catalytic domain or rendered the propeptide susceptible to proteolytic digestion by PfSUB-1. These studies provide new insights into the function of the propeptide in PfSUB-1 activation and shed light on the structural requirements for interaction with the catalytic domain

    A molecular marker of artemisinin-resistant Plasmodium falciparum malaria

    Plasmodium falciparum resistance to artemisinin derivatives in southeast Asia threatens malaria control and elimination activities worldwide. To monitor the spread of artemisinin resistance, a molecular marker is urgently needed. Here, using whole-genome sequencing of an artemisinin-resistant parasite line from Africa and clinical parasite isolates from Cambodia, we associate mutations in the PF3D7_1343700 kelch propeller domain ('K13-propeller') with artemisinin resistance in vitro and in vivo. Mutant K13-propeller alleles cluster in Cambodian provinces where resistance is prevalent, and the increasing frequency of a dominant mutant K13-propeller allele correlates with the recent spread of resistance in western Cambodia. Strong correlations between the presence of a mutant allele, in vitro parasite survival rates and in vivo parasite clearance rates indicate that K13-propeller mutations are important determinants of artemisinin resistance. K13-propeller polymorphism constitutes a useful molecular marker for large-scale surveillance efforts to contain artemisinin resistance in the Greater Mekong Subregion and prevent its global spread