This paper introduces the MCML approach for empirically studying the
learnability of relational properties that can be expressed in the well-known
software design language Alloy. A key novelty of MCML is that it quantifies the
performance of, and the semantic differences among, trained machine learning
(ML) models, specifically decision trees, with respect to entire (bounded)
input spaces, not just given training and test datasets (as is the common
practice). MCML reduces these quantification problems to the classic complexity
theory problem of model counting, and employs state-of-the-art model counters.
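To make the idea concrete, here is a minimal sketch (not the paper's implementation) of whole-space evaluation of a decision tree: exhaustive enumeration over a small bounded boolean input space stands in for a propositional model counter, and the names `ground_truth` and `n_vars` are illustrative assumptions rather than anything from MCML.

```python
# Illustrative sketch only: exhaustive enumeration over a small bounded
# boolean input space stands in for a propositional model counter.
from itertools import product
from sklearn.tree import DecisionTreeClassifier

n_vars = 10  # bound on the input space: 2**10 inputs in total

def ground_truth(x):
    # Hypothetical stand-in for a relational property (e.g., an Alloy
    # predicate evaluated over a bounded universe); not from the paper.
    return int(sum(x[:5]) >= sum(x[5:]))

# Enumerate the entire bounded input space and label it with the property.
space = [list(bits) for bits in product([0, 1], repeat=n_vars)]
labels = [ground_truth(x) for x in space]

# Train on a small sample of the space, as in the common ML setting.
train_X, train_y = space[::37], labels[::37]
tree = DecisionTreeClassifier(random_state=0).fit(train_X, train_y)

# Whole-space agreement: for each input, does the tree match the property?
# MCML obtains such counts via model counting rather than enumeration.
agree = sum(int(p == y) for p, y in zip(tree.predict(space), labels))
print(f"whole-space accuracy: {agree / len(space):.3f}")
```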
The results show that relatively simple ML models can achieve surprisingly high
performance (accuracy and F1-score) when evaluated in the common setting of
using training and test datasets, even when the training dataset is much
smaller than the test dataset, indicating the seeming simplicity of learning
relational properties. However, MCML metrics based on model counting show that
this performance can degrade substantially when the models are tested against
the entire (bounded) input space, indicating the high complexity of precisely
learning these properties and the usefulness of model counting in quantifying
true performance.
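For concreteness, one consistent reading of such whole-space metrics (a sketch based on the standard metric definitions; the count notation |TP|, |FP|, |TN|, |FN| is assumed here, not quoted from the paper): if model counting yields the number of inputs in the bounded space falling into each agreement class between the learned model and the property, accuracy and F1-score follow directly:

```latex
% Sketch: whole-space metrics from model counts, where |TP|, |FP|, |TN|, |FN|
% denote counts over the entire bounded input space (notation assumed).
\[
  \mathit{Acc} = \frac{|TP| + |TN|}{|TP| + |FP| + |TN| + |FN|}, \qquad
  F_1 = \frac{2\,|TP|}{2\,|TP| + |FP| + |FN|}
\]
```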