Comparison of logistic regression, SVM, and random forest performance in the plasma training data set. Table S2. Pathway significance and relative log fold changes in our metabolomics data and in TCGA breast cancer RNA-Seq data. Table S3. Detected metabolites and their differential test results in the two models. (a) All-stage diagnosis model. (b) Early-stage diagnosis model. Table S4. Univariate logistic regression analysis of the metabolites or pathways selected as features in the metabolite-based or pathway-based early-stage diagnosis model. Table S5. Comparison of pathway features in the full-size (101 input pathways) and half-size (51 input pathways) pathway-based early-stage diagnosis models. (DOCX 34 kb