This paper describes a general architecture for an interactive model-based vision system. A human specifies a limited amount of information to establish a context for autonomous interpretation of images. Object models are described by constraints specifying the necessary geometric properties of objects and the relationships between them. The use of constraints allows flexible object instantiation. A user can indicate an object in a scene; this both directs perceptual processing routines and constrains future object instantiations. This interactive model-based concept has been applied to the domain of vehicle tracking, and the paper concludes with several processing examples from this domain.
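One way to picture the constraint-based object models mentioned above is as a set of named geometric predicates that a candidate detection must satisfy. This is only an illustrative sketch: the paper's actual constraint representation is not given here, and the class, field, and constraint names below are assumptions.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    # Hypothetical geometric measurements of a detected region (metres).
    length: float
    width: float
    height: float

# An object model expressed as named constraint predicates;
# ranges are invented for illustration, not taken from the paper.
vehicle_model = {
    "length_range": lambda c: 3.0 <= c.length <= 6.0,
    "width_range":  lambda c: 1.4 <= c.width <= 2.2,
    "aspect":       lambda c: c.length > c.width,
}

def satisfies(model, candidate):
    """Return the names of violated constraints (empty list = valid instantiation)."""
    return [name for name, pred in model.items() if not pred(candidate)]

print(satisfies(vehicle_model, Candidate(4.5, 1.8, 1.5)))  # []
print(satisfies(vehicle_model, Candidate(8.0, 1.8, 1.5)))  # ['length_range']
```

A user-indicated object could then tighten such a model (e.g. narrowing the permitted ranges), which is one reading of how an indication "constrains future object instantiations".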