Despite recent breakthroughs in Machine Learning for Natural Language
Processing, Natural Language Inference (NLI) problems still constitute a
challenge. To this end, we contribute a new dataset that focuses exclusively
on the factivity phenomenon; however, our task remains the same as other NLI
tasks, i.e. prediction of entailment, contradiction or neutral (ECN). The
dataset consists entirely of natural language utterances in Polish and comprises
2,432 verb-complement pairs and 309 unique verbs. The dataset is based on the
National Corpus of Polish (NKJP) and is a representative sample with regard to
the frequency of main verbs and other linguistic features (e.g. occurrence of
internal negation). We found that BERT-based transformer models working on
sentences obtained relatively good results (≈89% F1 score). Even
though better results were achieved using linguistic features (≈91% F1
score), this model requires more human labour (humans in the loop) because the
features were prepared manually by expert linguists. The results show that
BERT-based models consuming only the input sentences capture most of the
complexity of NLI/factivity. Complex cases of the phenomenon, e.g. cases with
entailment (E) and non-factive verbs, remain an open issue for further
research.