Despite detailed knowledge about the anatomy and physiology of the primary visual cortex (V1), the immense number of feed-forward and recurrent connections onto a given V1 neuron makes it difficult to understand how the physiological details relate to that neuron’s functional properties. Here, we focus on a well-known functional property of many V1 complex cells: phase-invariant direction selectivity (DS). While the energy model explains its construction at the conceptual level, it remains unclear how the mathematical operations described in this model are implemented by cortical circuits. To understand how DS of complex cells is constructed in cortex, we apply a nonlinear modeling framework to extracellular data from macaque V1. We use a modification of spike-triggered covariance (STC) analysis to identify multiple biologically plausible "spatiotemporal features" that either excite or suppress a cell. We demonstrate that these features represent the true inputs to the neuron more accurately, and that the resulting nonlinear model compactly describes how these inputs are combined to produce the functional properties of the cell. In a population of 59 neurons, we find that both simple and complex V1 cells are selective for combinations of excitatory and suppressive motion features. Because the strength of DS and the simple/complex classification are well predicted by our models, we can use simulations with inputs matching thalamic and simple cells to assess how individual model components contribute to these measures. Our results unify experimental observations regarding the construction of DS from thalamic feed-forward inputs to V1: based on the differences between excitatory and inhibitory inputs, they suggest a connectivity diagram for simple and complex cells that sheds light on the mechanism underlying the DS of cortical cells.
More generally, they illustrate how stage-wise nonlinear combination of multiple features gives rise to the processing of more abstract visual information.