
some improvements on the training process. There are two possible ways to accomplish this: one is to find a more efficient convergence algorithm for SVM, as Joachims did in reference [9]; the other is to remove a large number of redundant samples so as to reduce the training time. In this paper we adopt the second way for our Chinese word segmentation task.

The rest of this paper is organized as follows. Section 2 gives a brief introduction to SVM. Section 3 describes our Chinese word segmentation task. Section 4 introduces our algorithm and gives a theoretical analysis. Section 5 presents experiments. Finally, we draw conclusions in Section 6.

2. Support Vector Machine

SVM is a classification method grounded in statistical learning theory and based on the structural risk minimization principle. Its classification rule is illustrated in Fig. 1: it seeks the separating hyperplane with maximal margin between the positive class and the negative class.

Fig. 1 Classification Rule of SVM

In SVM's basic form (the linearly separable case), the decision rule is given by the function h(x) = sign{w·x + b}, where w is a weight vector and b is a threshold. Given a training sample set (x_1, y_1), ..., (x_l, y_l), x_i ∈ R^n, y_i ∈ {+1, −1}, a SV
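As an illustrative sketch (not the authors' implementation), the linear decision rule h(x) = sign{w·x + b} above can be written directly in Python; the weight vector and threshold used here are arbitrary assumed values for a 2-D toy problem:

```python
import numpy as np

def svm_decide(w, b, x):
    """Linear SVM decision rule: h(x) = sign(w.x + b), returning +1 or -1."""
    return 1 if np.dot(w, x) + b >= 0 else -1

# Hypothetical weight vector and threshold (illustration only).
w = np.array([2.0, -1.0])
b = -0.5

print(svm_decide(w, b, np.array([1.0, 0.0])))  # → 1  (positive side)
print(svm_decide(w, b, np.array([0.0, 2.0])))  # → -1 (negative side)
```

In training, w and b would be chosen to maximize the margin between the two classes; the sketch only shows how a fitted model classifies new points.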

Year: 2009

OAI identifier:
oai:CiteSeerX.psu:10.1.1.134.7568

Provided by:
CiteSeerX
