This paper presents a method to explain how input information is discarded
by the intermediate layers of a neural network during forward propagation,
in order to quantify and diagnose the knowledge representations of pre-trained deep
neural networks. We define two types of entropy-based metrics, i.e., the strict
information discarding and the reconstruction uncertainty, which measure the input
information encoded by a specific layer from two complementary perspectives. We
develop a method for the efficient computation of these entropy-based metrics. Our
method can be broadly applied to various neural networks and enables comprehensive
comparisons between layers of different networks. Preliminary experiments have
demonstrated the effectiveness of our metrics in analyzing benchmark networks and
explaining existing deep-learning techniques.
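To make the idea of an entropy-based measure of layerwise information discarding concrete, the sketch below illustrates one common way such a quantity can be estimated; it is not the paper's exact algorithm. It learns per-pixel Gaussian perturbation scales that maximize the perturbation entropy while keeping an intermediate feature close to its original value, so a larger tolerated entropy indicates that the layer has discarded more input information. All names (perturbation_entropy, feature_extractor, lambda_feat, n_steps) are hypothetical placeholders introduced only for this illustration.

```python
# Illustrative sketch (an assumption, not the authors' exact method): estimate a proxy
# for the input information discarded by a layer as the entropy of Gaussian input
# perturbations that leave the layer's feature (almost) unchanged.
import torch
import torchvision.models as models


def perturbation_entropy(feature_extractor, x, lambda_feat=1.0, n_steps=200, lr=0.01):
    """Maximize the entropy of input perturbations that preserve the layer's feature."""
    x = x.detach()
    with torch.no_grad():
        f0 = feature_extractor(x)                     # reference feature of the clean input
    # Per-pixel log standard deviation of the Gaussian perturbation (the variable we optimize).
    log_sigma = torch.full_like(x, -3.0, requires_grad=True)
    opt = torch.optim.Adam([log_sigma], lr=lr)
    for _ in range(n_steps):
        noise = torch.randn_like(x) * log_sigma.exp()  # reparameterized Gaussian sample
        f = feature_extractor(x + noise)
        feat_loss = (f - f0).pow(2).mean()             # keep the intermediate feature stable
        entropy = log_sigma.sum()                      # Gaussian entropy = sum(log sigma) + const
        loss = lambda_feat * feat_loss - entropy / x.numel()
        opt.zero_grad()
        loss.backward()
        opt.step()
    # Larger entropy -> more input variation is tolerated -> more input information discarded.
    return log_sigma.sum().item()


if __name__ == "__main__":
    net = models.resnet18(weights=None).eval()
    for p in net.parameters():
        p.requires_grad_(False)
    # Use the output of layer3 as the "intermediate feature" in this toy example.
    extractor = torch.nn.Sequential(net.conv1, net.bn1, net.relu, net.maxpool,
                                    net.layer1, net.layer2, net.layer3)
    x = torch.rand(1, 3, 224, 224)
    print("entropy proxy:", perturbation_entropy(extractor, x))
```

Under this reading, comparing the returned entropy proxy across layers (or across networks) gives the kind of layer-wise comparison the abstract describes, with the feature-preservation weight lambda_feat controlling how strictly the layer's representation must be retained.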