Additional file 12. The active molecules from AID 2559 and AID 2561 were used as the test set. Both are confirmatory bioassay datasets from high-throughput screening. AID 2559 comprises 58 active and 67 inactive molecules, whereas AID 2561 comprises 37 active and 148 inactive molecules. The actives from both assays were combined to form the test set, which is provided as an ARFF file.
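
The combination step described here can be illustrated with a minimal Python sketch. It is not the procedure actually used to build the file: the descriptor names (`mol_weight`, `logp`), the record layout, and the placeholder values are all hypothetical and serve only to show how actives from two assays might be filtered, merged, and written in ARFF syntax. With the real data, the merged set would hold 58 + 37 = 95 actives, assuming no molecule appears in both assays.

```python
# Hypothetical attribute schema; the real descriptors are not given in the text.
ATTRIBUTES = [
    ("mol_weight", "NUMERIC"),
    ("logp", "NUMERIC"),
    ("label", "{active,inactive}"),
]

def actives(records):
    """Keep only the records labelled as active."""
    return [r for r in records if r["label"] == "active"]

def to_arff(relation, attributes, rows):
    """Serialise rows (dicts keyed by attribute name) as ARFF text."""
    lines = [f"@RELATION {relation}", ""]
    for name, kind in attributes:
        lines.append(f"@ATTRIBUTE {name} {kind}")
    lines += ["", "@DATA"]
    for row in rows:
        lines.append(",".join(str(row[name]) for name, _ in attributes))
    return "\n".join(lines) + "\n"

# Placeholder records standing in for the two assays (values are made up).
aid_2559 = [
    {"mol_weight": 312.4, "logp": 2.1, "label": "active"},
    {"mol_weight": 198.2, "logp": 0.7, "label": "inactive"},
]
aid_2561 = [
    {"mol_weight": 421.5, "logp": 3.3, "label": "active"},
    {"mol_weight": 250.3, "logp": 1.9, "label": "inactive"},
]

# Combine the actives from both assays into one test set.
test_set = actives(aid_2559) + actives(aid_2561)
arff_text = to_arff("test_set", ATTRIBUTES, test_set)
```

Writing `arff_text` to a file with an `.arff` extension would yield a file that ARFF-aware tools can load, provided the attribute declarations match the real descriptor columns.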