Models based on bidirectional encoder representations from transformers
(BERT) produce state-of-the-art (SOTA) results on many natural language
processing (NLP) tasks such as named entity recognition (NER) and
part-of-speech (POS) tagging. BERT-based models, however, are difficult to
apply out of the box when classifying long documents such as US Supreme
Court decisions. In this paper, we experiment with several BERT-based
classification techniques for US Supreme Court decisions from the Supreme
Court Database (SCDB) and compare them with
previous SOTA results, as well as with SOTA models designed specifically for
long documents. We evaluate two classification tasks:
(1) a broad classification task with 15 categories and (2) a fine-grained
classification task with 279 categories. Our best results achieve an accuracy
of 80\% on the 15 broad categories and 60\% on the 279 fine-grained
categories, an improvement of 8\% and 28\%, respectively, over previously
reported SOTA results.