Semantic parsing shines at analyzing complex natural language that involves
composition and computation over multiple pieces of evidence. However, datasets
for semantic parsing contain many factoid questions that can be answered from a
single web document. In this paper, we propose to evaluate semantic
parsing-based question answering models by comparing them to a question
answering baseline that queries the web and extracts the answer only from web
snippets, without access to the target knowledge-base. We investigate this
approach on COMPLEXQUESTIONS, a dataset designed to focus on compositional
language, and find that our model obtains reasonable performance (35 F1
compared to 41 F1 of state-of-the-art). We find in our analysis that our model
performs well on complex questions involving conjunctions, but struggles on
questions that involve relation composition and superlatives.Comment: *sem 201