This paper studies in particular an aspect of the estimation of conditional probability distributions by maximum likelihood that seems to have been overlooked in the literature on Bayesian networks: The information conveyed by the conditioning event should be included in the likelihood function as well