2 research outputs found

    A Robust Deep Model for Improved Categorization of Legal Documents for Predictive Analytics

    Get PDF
    Predictive legal analytics is a technology used to predict the chances of successful and unsuccessful outcomes in a particular case. Predictive legal analytics is performed through automated document classification for facilitating legal experts in their classification of court documents to retrieve and understand the details of specific legal factors from legal judgments for accurate document analysis. However, extracting these factors from legal texts document is a time-consuming process. In order to facilitate the task of classifying documents, a robust method namely Distributed Stochastic Keyword Extraction based Ensemble Theil-Sen Regressive Deep Belief Reweight Boost Classification (DSKE-TRDBRBC) is proposed. The DSKE-TRDBRBC technique consists of two major processes namely Keyword Extraction and Classification. At first, the t-distributed stochastic neighbor embedding technique is applied to DSKE-TRDBRBC for keyword extraction. This in turn minimizes the time consumption for document classification. After that, the Ensemble Theil-Sen Regressive Deep Belief Reweight Boosting technique is applied for document classification. The Ensemble boosting algorithm initially constructs’ set of Theil-Sen Regressive Deep Belief neural networks to classify the input legal documents. Then the results of the Deep Belief neural network are combined to built a strong classifier by reducing the error. This aids in improving the classification accuracy. The proposed method is experimentally evaluated with various metrics such as F-measure , recall, accuracy, precision, , and computational time. The experimental results quantitatively confirm that the proposed DSKE-TRDBRBC technique achieves better accuracy with lowest computation time as compared to the conventional approaches

    Revisiting the challenges and surveys in text similarity matching and detection methods

    Get PDF
    The massive amount of information from the internet has revolutionized the field of natural language processing. One of the challenges was estimating the similarity between texts. This has been an open research problem although various studies have proposed new methods over the years. This paper surveyed and traced the primary studies in the field of text similarity. The aim was to give a broad overview of existing issues, applications, and methods of text similarity research. This paper identified four issues and several applications of text similarity matching. It classified current studies based on intrinsic, extrinsic, and hybrid approaches. Then, we identified the methods and classified them into lexical-similarity, syntactic-similarity, semantic-similarity, structural-similarity, and hybrid. Furthermore, this study also analyzed and discussed method improvement, current limitations, and open challenges on this topic for future research directions
    corecore