This work presents two astonishing findings on neural networks learned for
large-scale image classification. 1) Given a well-trained model, the logits
predicted for some category can be directly obtained by linearly combining the
predictions of a few other categories, which we call \textbf{neural
dependency}. 2) Neural dependencies exist not only within a single model, but
even between two independently learned models, regardless of their
architectures. Towards a theoretical analysis of such phenomena, we demonstrate
that identifying neural dependencies is equivalent to solving the Covariance
Lasso (CovLasso) regression problem proposed in this paper. By investigating
the properties of its solution, we confirm that neural dependency is
guaranteed by a redundant logit covariance matrix, a condition easily met when
the number of categories is massive, and that neural dependencies are highly
sparse, implying that each category correlates with only a few others. We
further empirically show the potential of neural dependencies in understanding
internal data correlations, generalizing models to unseen categories, and
improving model robustness with a dependency-derived regularizer. Code for this
work will be made publicly available.
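
To make the notion concrete, a plausible way to write the dependency and the associated regression is sketched below; the exact objective, constraint set, and normalization of CovLasso are not spelled out in this abstract, so the formulation here (in particular the constraint $\beta_i = -1$ and the neglect of mean offsets) should be read as an assumption. Let $y \in \mathbb{R}^C$ denote the logits over $C$ categories and $\Sigma = \mathrm{Cov}(y)$ their covariance over the data:
\[
  y_i \,\approx\, \sum_{j \neq i} \beta_j\, y_j ,
  \qquad
  \min_{\beta \in \mathbb{R}^C} \; \tfrac{1}{2}\, \beta^\top \Sigma\, \beta \;+\; \lambda \,\|\beta\|_1
  \quad \text{s.t.} \quad \beta_i = -1 .
\]
Under this reading, the quadratic term is the variance of the residual $y_i - \sum_{j \neq i} \beta_j y_j$, so a near-zero objective certifies that category $i$'s logit is (up to a constant) a linear combination of other categories' logits, while the $\ell_1$ penalty keeps that combination sparse, i.e., dependent on only a few categories.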