In this thesis, we address the problem of opinion question generation. The motivation behind this task is to provide users with more question samples related to their query when using search engines. In our view, one of the datasets that is closest to peoples’ thoughts, informal and casual speech are Community Question Answering (CQA) forums, where one can post questions, and other users can answer them. Specifically, we perform experiments on the Amazon question/answer dataset.
Unlike the conventional approaches that have tackled the question generation problem with hand-crafted rules, our approach is entirely data-driven. We model our problem with the sequence to sequence approach using an encoder-decoder structure, which has shown significant improvement in different natural language processing research areas in recent years. Our model benefits from the attention mechanism, which assists the model in fo- cusing on a specific part of the input sentence. Furthermore, we provide solutions to the following problems: repetition of words and generating outside of vocabulary tokens. We provide a detailed explanation of the performance of the system. Experimental results show an improvement in automatic evaluation metrics such as the BLEU score over the state-of- the-art question generation system