Abstract

Processing multiple complex features to create cohesive representations of objects is an essential aspect of both the visual and auditory systems. It is currently unclear whether these processes are entirely modality specific or whether there are amodal processes that contribute to complex object processing in both vision and audition. We investigated this using a dual-stream target detection task in which two concurrent streams of novel visual or auditory stimuli were presented. We manipulated the degree to which each stream taxed processing conjunctions of complex features. In two experiments, we found that concurrent visual tasks that both taxed conjunctive processing strongly interfered with each other but that concurrent auditory and visual tasks that both taxed conjunctive processing did not. These results suggest that resources for processing conjunctions of complex features within vision and audition are modality specific.