research

Feature analysis for web forum question post detection

Abstract

A web forum, also known as a discussion board or Internet forum, is an online community of users with a common interest. It is a problem-solving platform that engages experts across the globe. Both technical and non-technical problems are resolved on a daily basis within web forums. Research in this domain has concentrated on answer detection, under the assumption that the initial post of a thread is a question post. The quality of web forum question posts, however, varies from excellent to mediocre or even spam. Detecting good question posts requires the use of salient features. In this paper, we implement a bag-of-words (BoW) model to mine web forum question posts. We empirically address the following questions. Can a BoW model effectively detect web forum question posts? What feature selection method is most appropriate for a BoW model in this domain? Is the choice of classifier influenced by web forum genre? We used three publicly available datasets of varying technical degrees for the experiments. The experimental results revealed that BoW can perform better than complex techniques that implement higher-order N-grams with part-of-speech tagging.
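To make the bag-of-words representation concrete, the following is a minimal sketch of how a forum post could be turned into a term-count vector. It is an illustration only, not the implementation used in the paper: the tokenizer, the function names (`tokenize`, `build_vocabulary`, `bow_vector`), and the two example posts are all assumptions introduced here, and no feature selection or classifier is included.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and extract alphanumeric tokens (an assumed, simple tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_vocabulary(posts):
    """Map each distinct token to a column index, in sorted order."""
    tokens = {tok for post in posts for tok in tokenize(post)}
    return {tok: i for i, tok in enumerate(sorted(tokens))}


def bow_vector(post, vocabulary):
    """Return the term counts of one post; tokens outside the vocabulary are ignored."""
    counts = Counter(tokenize(post))
    return [counts.get(tok, 0) for tok in vocabulary]


# Hypothetical posts, invented for illustration (not taken from the paper's datasets).
posts = [
    "How do I install the driver?",
    "Thanks, that fixed the driver.",
]
vocabulary = build_vocabulary(posts)
vectors = [bow_vector(post, vocabulary) for post in posts]
```

Each post becomes a fixed-length vector of counts that ignores word order. In a pipeline like the one the abstract describes, a feature selection step and a classifier would then operate on these vectors.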
