Application of Penalized Logistic Regression Models in Text Classification

Abstract

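With the rapid development of Internet technology, the information recorded by human society is growing exponentially. There is an urgent need for fast and accurate classification, retrieval, and personalized recommendation of massive amounts of information. To better address the classification of high-dimensional, sparse data in text classification tasks, this thesis carries out the following research. First, it comprehensively reviews the techniques involved in each stage of the text classification pipeline, in order to identify the difficulties that most urgently need to be overcome. Next, it gives an overview of the theoretical development of penalized logistic models, discusses their feasibility for text classification from the perspective of a literature review, and proposes a new text classification algorithm that combines word-vector theory with penalized logistic models. Then, to verify the ability of penalized logistic models in both feature selection and classification accuracy, a variety of penalized Logis...

Degree: Master of Applied Statistics
Department and major: School of Economics, Master of Applied Statistics
Student ID: 1542014115197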
