Legal Document Classification: An Application to Law Area Prediction of Petitions to Public Prosecution Service
In recent years, there has been an increased interest in the application of
Natural Language Processing (NLP) to legal documents. Convolutional and
recurrent neural networks, combined with word embedding techniques, have
shown promising results on text classification problems,
such as sentiment analysis and topic segmentation of documents. This paper
proposes the use of NLP techniques for textual classification, with the purpose
of categorizing the descriptions of the services provided by the Public
Prosecutor's Office of the State of Paraná to the population into one of the
areas of law covered by the institution. Our main goal is to automate the
process of assigning petitions to their respective areas of law, reducing the
cost and time associated with this process while
allowing human resources to be allocated to more complex tasks. In this
paper, we compare different approaches to word representation for this task,
including document-term matrices and several word
embeddings. For the classification models, we evaluated three
families: linear models, boosted trees, and neural networks. The best
results were obtained with a combination of Word2Vec trained on a
domain-specific corpus and a Recurrent Neural Network (RNN) architecture
(specifically, an LSTM), reaching an accuracy of 90% and an F1-score of 85% in the
classification of eighteen categories (law areas).
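
To make the simplest of the compared representations concrete, the following is a minimal sketch of a document-term matrix built from raw term counts, using only the Python standard library. It is not the authors' pipeline: the function names `build_vocabulary` and `document_term_matrix`, the whitespace tokenization, and the three sample petition descriptions are all assumptions made for illustration.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct lowercase token to a column index (sorted order)."""
    tokens = {tok for doc in documents for tok in doc.lower().split()}
    return {tok: i for i, tok in enumerate(sorted(tokens))}


def document_term_matrix(documents, vocabulary):
    """Return one row of raw term counts per document."""
    matrix = []
    for doc in documents:
        counts = Counter(doc.lower().split())
        row = [0] * len(vocabulary)
        for tok, n in counts.items():
            # Tokens outside the vocabulary are ignored.
            if tok in vocabulary:
                row[vocabulary[tok]] = n
        matrix.append(row)
    return matrix


# Hypothetical petition descriptions, invented for illustration only.
petitions = [
    "complaint about hospital waiting time",
    "complaint about school transport",
    "request about hospital medication",
]

vocab = build_vocabulary(petitions)
dtm = document_term_matrix(petitions, vocab)

for text, row in zip(petitions, dtm):
    print(row, "<-", text)
```

Each row could then be passed to a linear model or boosted-tree classifier. Unlike word embeddings, this representation discards word order and treats every term as unrelated to every other term.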