With the rapid proliferation of textual data, long-text prediction has
emerged as a significant challenge in natural language
processing. Traditional text prediction methods encounter substantial
difficulties when grappling with long texts, primarily due to the presence of
redundant and irrelevant information, which impedes the model's capacity to
capture pivotal insights from the text. To address this issue, we introduce a
novel approach to long-text classification and prediction. Initially, we employ
embedding techniques to condense the long texts, aiming to diminish the
redundancy therein. Subsequently,the Bidirectional Encoder Representations from
Transformers (BERT) embedding method is utilized for text classification
training. Experimental outcomes indicate that our method realizes considerable
performance enhancements in classifying long texts of Preferential Trade
Agreements. Furthermore, the condensation of text through embedding methods not
only augments prediction accuracy but also substantially reduces computational
complexity. Overall, this paper presents a strategy for long-text prediction,
offering a valuable reference for researchers and engineers in the natural
language processing field.
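As a rough illustration of the pipeline described above, the Python sketch below (using the Hugging Face transformers library) condenses a long document by keeping the sentences whose embeddings lie closest to the document centroid, then classifies the condensed text with a BERT sequence classifier. The checkpoint name, the centroid-based condensation heuristic, the period-based sentence splitting, and the keep and num_labels parameters are illustrative assumptions, not the paper's exact method.

    import torch
    from transformers import AutoTokenizer, AutoModel, AutoModelForSequenceClassification

    MODEL_NAME = "bert-base-uncased"  # assumed checkpoint; the paper does not name one
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    encoder = AutoModel.from_pretrained(MODEL_NAME)

    def embed(sentences):
        # Mean-pooled BERT embeddings for a list of sentences.
        batch = tokenizer(sentences, padding=True, truncation=True,
                          max_length=128, return_tensors="pt")
        with torch.no_grad():
            hidden = encoder(**batch).last_hidden_state        # (B, T, H)
        mask = batch["attention_mask"].unsqueeze(-1).float()   # (B, T, 1)
        return (hidden * mask).sum(dim=1) / mask.sum(dim=1)    # (B, H)

    def condense(document, keep=10):
        # Keep the `keep` sentences closest to the document centroid
        # (an illustrative redundancy-reduction heuristic).
        sentences = [s.strip() for s in document.split(".") if s.strip()]
        if len(sentences) <= keep:
            return document
        emb = embed(sentences)
        centroid = emb.mean(dim=0, keepdim=True)
        scores = torch.nn.functional.cosine_similarity(emb, centroid)
        top = scores.topk(keep).indices.sort().values          # preserve original order
        return ". ".join(sentences[i] for i in top.tolist())

    # Randomly initialized classification head; must be fine-tuned on labeled data.
    classifier = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=2)

    def predict(document):
        # Classify the condensed document with the BERT sequence classifier.
        short_text = condense(document)
        batch = tokenizer(short_text, truncation=True, max_length=512, return_tensors="pt")
        with torch.no_grad():
            logits = classifier(**batch).logits
        return logits.argmax(dim=-1).item()

In practice the classifier head would be fine-tuned on labeled Preferential Trade Agreement text before prediction; the condensation step serves mainly to keep the input within BERT's 512-token limit while discarding redundant sentences.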