Principal Components Analysis on a mixture of quantitative and qualitative data based on generalized correlation coefficients

Author: Kiers Henk A.L.
Publication venue: RION, Groningen
Publication date: 01/01/1988

Dissertations of the University of Groningen

Some alternatives to PLS

Author: Kiers Henk A.L.
Publication venue: RCE Edizioni
Publication date: 01/01/2003

ARTS repository - University of Groningen

Some alternatives to PLS

Author: Kiers Henk A.L.
Publication venue: RCE Edizioni
Publication date: 01/01/2003

Proceedings - University of Groningen

Principal Components Analysis on a mixture of quantitative and qualitative data based on generalized correlation coefficients

Author: Kiers Henk A.L.
Publication venue: RION, Groningen
Publication date: 01/01/1988

Proceedings - University of Groningen

Principal Components Analysis on a mixture of quantitative and qualitative data based on generalized correlation coefficients

Author: Kiers Henk A.L.
Publication venue: RION, Groningen
Publication date: 01/01/1988

ARTS repository - University of Groningen

Improving Classification of Documents by Semi-supervised Clustering in a Semantic Space

Author: Dobša Jasminka, Kiers Henk A.L.
Publication venue: Springer Science and Business Media Deutschland GmbH
Publication date: 01/01/2023

In this paper we propose a method for representing documents in a lower-dimensional semantic space, based on a modified Reduced k-means method that penalizes clusterings that are distant from the classification of training documents given by experts. Reduced k-means (RKM) enables simultaneous clustering of documents and extraction of factors. By projecting documents represented in the vector space model onto the extracted factors, documents are clustered in the semantic space in a semi-supervised way (using penalization): the clustering is guided by the classification given by experts, which improves the classification performance on test documents. Classification performance is tested with logistic regression and support vector machines (SVMs) on classes of the Reuters-21578 data set. Representing documents by the RKM method with penalization improves the average precision of classification by SVMs for the 25 largest classes of the Reuters collection by about 5.5%, at the same level of average recall, compared with the basic representation in the vector space model. For classification by logistic regression, representation by RKM with penalization improves average recall by about 1% compared with the basic representation.

Proceedings - University of Groningen