Automated methods have been widely used to identify and analyze mental health conditions (e.g., depression) from various sources of information, including social media. Yet, deployment of such models in real-world healthcare applications faces challenges including poor out-of-domain generalization and lack of trust in black-box models. In this work, we propose approaches for depression detection that are constrained to different degrees by the presence of symptoms described in PHQ9, a questionnaire used by clinicians in the depression screening process. In dataset-transfer experiments on three social media datasets, we find that grounding the model in PHQ9's symptoms substantially improves its ability to generalize to out-of-distribution data compared to a standard BERT-based approach. Furthermore, this approach can still perform competitively on in-domain data. These results and our qualitative analyses suggest that grounding model predictions in clinically relevant symptoms can improve generalizability while producing a model that is easier to inspect.