
    Is linguistic information relevant for the classification of legal texts?

    Text classification is an important task in the legal domain. In fact, most legal information is stored as text in a fairly unstructured format, and it is important to be able to automatically classify these texts into a predefined set of concepts. Support Vector Machines (SVM), a machine learning algorithm, have been shown to be good classifiers for text collections [Joachims, 2002]. In this paper, SVMs are applied to the classification of European Portuguese legal texts – the Portuguese Attorney General’s Office Decisions – and the relevance of linguistic information in this domain, namely lemmatisation and part-of-speech tags, is evaluated. The results show that this linguistic information can be successfully used to improve the classification results and, simultaneously, to decrease the number of features needed by the learning algorithm.
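    The abstract gives no implementation details beyond the learning algorithm, but the experiment it describes (an SVM classifier trained on raw tokens versus lemmatised input) can be sketched as below. This is a minimal sketch assuming scikit-learn with TF-IDF weighting; `load_decisions()` and `lemmatise()` are hypothetical placeholders for the Attorney General's Office corpus and a Portuguese lemmatiser, not the authors' code.

```python
# Minimal sketch: compare an SVM text classifier on raw tokens vs. lemmatised
# input, in the spirit of the abstract. Assumes scikit-learn; load_decisions()
# and lemmatise() are hypothetical placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts, labels = load_decisions()            # hypothetical: decision texts and their concepts
lemmas = [lemmatise(t) for t in texts]      # hypothetical POS-aware lemmatiser

for name, corpus in [("raw tokens", texts), ("lemmas", lemmas)]:
    pipeline = make_pipeline(TfidfVectorizer(min_df=2), LinearSVC())
    scores = cross_val_score(pipeline, corpus, labels, cv=5, scoring="f1_macro")
    print(f"{name}: macro-F1 = {scores.mean():.3f}")
```

    Comparing the vocabulary sizes of the two fitted vectorizers would also show the reduction in feature count that the abstract reports for lemmatised input.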

    A deep learning framework for contingent liabilities risk management : predicting Brazilian labor court decisions

    Estimating the likely outcome of a litigation process is crucial for many organizations. A specific application is “Contingent Liabilities”, which refers to liabilities that may or may not occur depending on the result of a pending litigation process (lawsuit). The traditional methodology for estimating this likelihood relies on a lawyer’s opinion, based on a qualitative assessment drawn from experience. This dissertation presents a mathematical modeling framework based on a Deep Learning architecture that estimates the probability of the outcome of a litigation process (accepted or not accepted), with a particular application to Contingent Liabilities. The framework offers a degree of confidence by describing how likely an event is to occur in terms of probability, and provides results in seconds. Besides the primary outcome, it returns a sample of the cases most similar to the estimated lawsuit, which serve as support for devising litigation strategies. We tested our framework on two litigation-process databases: (1) the European Court of Human Rights (ECHR) and (2) the Brazilian 4th Regional Labor Court (4TRT). Our framework achieved, to our knowledge, the best published performance (precision = 0.906) on the ECHR database, a widely used collection of litigation processes, and it is the first to be applied to a Brazilian labor court.
Results show that the framework is a suitable alternative to the traditional method in which lawyers estimate the verdict of a pending litigation. Finally, we validated our results with experts, who confirmed the promising possibilities of the framework. We encourage academics to continue developing research on mathematical modeling in the legal area, as it is an emerging topic with a promising future, and practitioners to use tools such as the one proposed here, as they provide substantial advantages in terms of accuracy and speed over conventional methods.
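    The abstract describes two outputs (an outcome probability and the most similar past cases) without implementation detail. The sketch below illustrates the idea with a plain embedding-plus-classifier setup rather than the paper's Deep Learning architecture; `load_lawsuits()` and `embed()` are hypothetical placeholders for the case corpus and any dense text encoder.

```python
# Minimal sketch of the two outputs the abstract describes: a probability for
# the lawsuit's outcome and the most similar decided cases. Not the authors'
# architecture: a logistic classifier over generic embeddings stands in for it.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import NearestNeighbors

train_texts, train_outcomes = load_lawsuits()         # hypothetical decided cases + labels
X_train = np.vstack([embed(t) for t in train_texts])  # embed(): any dense text encoder

clf = LogisticRegression(max_iter=1000).fit(X_train, train_outcomes)
index = NearestNeighbors(n_neighbors=5, metric="cosine").fit(X_train)

def assess(new_case: str):
    x = np.asarray(embed(new_case)).reshape(1, -1)
    p_accepted = clf.predict_proba(x)[0, 1]           # probability the claim is accepted
    _, neighbours = index.kneighbors(x)               # indices of the most similar cases
    return p_accepted, [train_texts[i] for i in neighbours[0]]
```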

    Using attention methods to predict judicial outcomes

    Legal Judgment Prediction is one of the most acclaimed applications of the combined fields of NLP, AI, and Law. By legal prediction we mean intelligent systems capable of predicting specific judicial characteristics, such as the judicial outcome or the judicial class of a specific case. In this research, we have used AI classifiers to predict judicial outcomes in the Brazilian legal system. For this purpose, we developed a text crawler to extract data from the official Brazilian electronic legal systems. These texts formed a dataset of second-degree murder and active corruption cases. We applied different classifiers, such as Support Vector Machines and Neural Networks, to predict judicial outcomes by analyzing textual features from the dataset. Our research showed that Regression Trees, Gated Recurrent Units and Hierarchical Attention Networks presented higher metrics for different subsets. As a final goal, we explored the weights of one of the algorithms, the Hierarchical Attention Networks, to find a sample of the most important words used to acquit or convict defendants.
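    The abstract mentions inspecting the attention weights of a Hierarchical Attention Network to surface the most influential words. Below is a minimal sketch of the word-level attention block and of reading its weights back, in PyTorch; the dimensions and the toy GRU encoder are illustrative, not the authors' configuration.

```python
# Minimal sketch of word-level attention pooling (the building block of a
# Hierarchical Attention Network) and of reading the attention weights back
# to see which words the classifier relied on. Dimensions are illustrative.
import torch
import torch.nn as nn

class AttentionPooling(nn.Module):
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim)
        self.context = nn.Linear(hidden_dim, 1, bias=False)

    def forward(self, states):                  # states: (batch, words, hidden)
        u = torch.tanh(self.proj(states))       # per-word representations
        weights = torch.softmax(self.context(u).squeeze(-1), dim=-1)
        doc = (weights.unsqueeze(-1) * states).sum(dim=1)  # weighted document vector
        return doc, weights                     # weights indicate each word's importance

# Toy usage: a bidirectional GRU encoder over word embeddings, then attention pooling.
gru = nn.GRU(input_size=100, hidden_size=64, batch_first=True, bidirectional=True)
attn = AttentionPooling(hidden_dim=128)
embeddings = torch.randn(2, 30, 100)            # (batch, words, embedding dim)
states, _ = gru(embeddings)
doc_vec, word_weights = attn(states)            # inspect word_weights per document
```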

    International market selection: analysis of internationalization projects in a Portuguese SME

    Internationalization of companies is an increasingly common phenomenon, given the constant growth of global networks. The innumerable possibilities and options offered in international business make market selection a complex decision for managers. Although the selection of an international market is described as crucial in the literature, a large share of companies still do not give proper attention to the International Market Selection (IMS) process. This happens not only because of the peculiarities of the process, but also because of the high resources and expensive information required to analyze the massive amount of data available. Given that not all companies have access to a comprehensive IMS process, we analyze four internationalization projects of a Portuguese SME in a case study. In this report, the Market Potential Index (MPI) is consulted and appropriate variables are suggested to verify the market selection for each project. In addition, a few country clusters are suggested for future business abroad.

    Framing the frontier - Tracing issues related to soybean expansion in transnational public spheres

    Unidad de excelencia María de Maeztu CEX2019-000940-M. Other grants: Acord transformatiu CRUE-CSIC.
    Rapid soybean expansion in South America has been linked to numerous socio-environmental problems, including deforestation in sensitive biomes. Wider public awareness has also put pressure on the European Union, a major soybean-importing region. Different governance initiatives involving various groups of stakeholders have sought to address these issues. However, what is identified as a relevant problem, as a region of interest or which actors are mentioned in this context are all matters of claims-making processes between different groups and mediated through various channels of communication. This study uses a text-mining approach to trace the construction of socio-ecological problems related to soybean expansion and the actors and regions linked with these issues in public discourse. The focus lies on print media from the European Union, but several additional sources are included to investigate the similarities and differences between various communication channels and regions. These include newspaper articles from producing countries and international news agencies, scientific abstracts, corporate statements, and reports from advocacy groups gathered from the mid-1990s to 2020. The results show that European mass media have shifted their focus from consumer labeling, health, and concerns over genetically modified organisms towards more distant or abstract phenomena, such as deforestation and climate change. This has been accompanied by a broader view of different stakeholders, but also by a strong regional focus on the Amazon biome. There has also been much less attention on direct concerns for communities in producing regions, such as land conflicts or disputes over intellectual property rights. We conclude that while European public spheres appear to become more receptive to issues related to impacts in sourcing regions, there remains a narrow focus on specific problems and regions, which reflects a fundamental asymmetry in different stakeholders' ability to shape transnational deliberations and resulting governance processes.

    A model to improve the Evaluation and Selection of Public Contest´s Candidates (Police Officers) based on AI technologies

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics.
    The number of candidates applying to Public Contests is increasing compared to the number of Human Resources employees available to select them for Police Forces. This work intends to understand how those Public Institutions can evaluate and select their candidates efficiently during the different phases of the recruitment process, and to this end AI approaches are studied. This paper presents two research questions and introduces a corresponding systematic literature review, focusing on AI technologies, so the reader can understand which are the most used and the most appropriate to be applied to Police Forces as a complementary recruitment strategy of the national criminal investigation police agency of Portugal – Polícia Judiciária. Design Science Research (DSR) was the methodological approach chosen. The suggestion of a theoretical framework is the main contribution of this study, together with the segmentation of the candidates (future Criminal Inspectors). It also helped to identify the most important issues Public Institutions face regarding the use of AI technologies to make decisions about evaluating and selecting candidates. Following the PRISMA guidelines, a systematic literature review and meta-analysis was adopted to identify how the use of transparent AI can have a positive impact on the recruitment process of a Public Institution, resulting in an analysis of 34 papers published between 2017 and 2021. The AI-based theoretical framework, grounded in the analysed literature, addresses how Institutions can gain insights about their candidates while profiling them, how to obtain more accurate information from the interview phase, and how to reach a more rigorous assessment of their emotional intelligence, providing a better alignment of moral values. In this way, this work aims to support the decision making of a recruiter in a Police Force Institution, turning it into a more automated and evidence-based decision when it comes to recruiting the right candidate for the position.

    Portuguese patent classification: A use case of text classification using machine learning and transfer learning approaches

    Project Work presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics.
    Patent classification is one of the areas in Intellectual Property Analytics (IPA) and a growing use case, since the number of patent applications has been increasing worldwide over the years. Patents are used more than ever as financial protection by companies, which also use patent databases to support research and leverage product innovation. Instituto Nacional de Propriedade Industrial (INPI) is the government agency responsible for protecting Industrial Property rights in Portugal. INPI has promoted a competition to explore technologies to solve challenges related to Industrial Property, including the classification of patents, one of the critical phases of the patent granting process. In this work project, we used the dataset made available by INPI to explore traditional machine learning algorithms for classifying Portuguese patents and to evaluate the performance of transfer learning methodologies on this task. BERTimbau, a BERT architecture model pre-trained on a large Portuguese corpus, presented the best results for the task, although its performance was only 4% better than that of a LinearSVC model using TF-IDF feature engineering. In general, the model performs well, despite the low scores for classes with few training samples. However, the analysis of misclassified samples showed that the specificity of the context has more influence on learning than the number of samples itself. Patent classification is a challenging task not only because of (1) the hierarchical structure of the classification, but also because of (2) the way a patent is described, (3) the overlap of contexts, and (4) the underrepresentation of some classes. Nevertheless, it is an area of growing interest, one that can be leveraged by the new research that is revolutionizing machine learning applications, especially text mining.
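    The LinearSVC baseline with TF-IDF features that the abstract compares BERTimbau against can be sketched with scikit-learn as below; `load_patents()` is a hypothetical placeholder for the INPI dataset, and the metric choice is illustrative.

```python
# Minimal sketch of a TF-IDF + LinearSVC baseline for patent classification,
# in the spirit of the abstract. Assumes scikit-learn; load_patents() is a
# hypothetical placeholder for the INPI data, not the authors' code.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

texts, classes = load_patents()                       # hypothetical: patent texts and classes
X_tr, X_te, y_tr, y_te = train_test_split(texts, classes, test_size=0.2, stratify=classes)

baseline = make_pipeline(TfidfVectorizer(ngram_range=(1, 2), min_df=2), LinearSVC())
baseline.fit(X_tr, y_tr)
print("macro-F1:", f1_score(y_te, baseline.predict(X_te), average="macro"))
```

    Fine-tuning a pre-trained Portuguese BERT model for comparison would follow the standard sequence-classification recipe (tokenize, add a classification head, train), which is omitted here.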

    Using NLP to Model U.S. Supreme Court Cases

    The advantages of employing text analysis to uncover policy positions, generate legal predictions, and inform or evaluate reform practices are manifold. Given the far-reaching effects of legislation at all levels of society, these insights and their continued improvement are impactful. This research explores the use of natural language processing (NLP) and machine learning to predictively model U.S. Supreme Court case outcomes based on textual case facts. The final model achieved an F1-score of 0.324 and an AUC of 0.68. This suggests that the model can distinguish between the two target classes; however, further research is needed before machine learning models can be used in the Supreme Court.
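    The abstract reports an F1-score and an AUC for a binary outcome model. A minimal sketch of how such a model could be trained and scored on case facts is shown below; `load_scotus_cases()` and the TF-IDF/logistic-regression pipeline are illustrative stand-ins, not the authors' model.

```python
# Minimal sketch of scoring a binary case-outcome classifier with the two
# metrics reported in the abstract (F1 and ROC AUC). Assumes scikit-learn;
# load_scotus_cases() is a hypothetical placeholder for the case-facts data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

facts, outcomes = load_scotus_cases()        # hypothetical: case facts and binary outcomes
X_tr, X_te, y_tr, y_te = train_test_split(facts, outcomes, test_size=0.2, stratify=outcomes)

model = make_pipeline(TfidfVectorizer(min_df=2), LogisticRegression(max_iter=1000))
model.fit(X_tr, y_tr)
print("F1 :", f1_score(y_te, model.predict(X_te)))
print("AUC:", roc_auc_score(y_te, model.predict_proba(X_te)[:, 1]))
```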