Voices are arguably among the most relevant sounds in everyday human life, and several studies have demonstrated the existence of voice-selective regions in the human brain. However, whether this preference is merely driven by physical (i.e., acoustic) properties specific to voices, or whether it reflects a higher-level categorical response, is still under debate. Here, we address this fundamental issue with Fast Periodic Auditory Stimulation combined with electroencephalography (EEG) to measure objective, direct, fast and automatic voice-selective responses in the human brain. Participants were tested with stimulation sequences containing heterogeneous non-vocal sounds from different categories presented at 4 Hz (i.e., 4 stimuli/second), with vocal sounds appearing as every third stimulus (4/3 ≈ 1.33 Hz). A few minutes of stimulation are sufficient to elicit robust, focal 1.33 Hz voice-selective brain responses over superior temporal regions in individual participants. This response is virtually absent for sequences using frequency-scrambled sounds, but is clearly observed when voices are presented among sounds from musical instruments matched in pitch and harmonicity-to-noise ratio. Overall, our Fast Periodic Auditory Stimulation paradigm demonstrates high-level categorization of human voices, and could be a powerful and versatile tool to understand human auditory categorization in general.
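
The timing arithmetic of the stimulation sequence can be illustrated with a minimal sketch. It assumes only what is stated above (a 4 Hz presentation rate and a voice as every third stimulus); the function names, the placement of the voice in the third slot of each cycle, and the listing of harmonics are illustrative assumptions and do not describe the analysis pipeline used in the study.

```python
from fractions import Fraction

BASE_RATE_HZ = Fraction(4)  # stimulus presentation rate (from the text)
VOICE_EVERY_N = 3           # a voice appears as every third stimulus (from the text)


def voice_rate_hz(base=BASE_RATE_HZ, every_n=VOICE_EVERY_N):
    """Rate at which voices recur: the base rate divided by the cycle length."""
    return base / every_n


def build_sequence(n_stimuli, every_n=VOICE_EVERY_N):
    """Label each slot; the voice is placed in the last slot of each cycle
    (an assumption, the text only says 'every 3 stimuli')."""
    return ["voice" if (i + 1) % every_n == 0 else "other"
            for i in range(n_stimuli)]


def voice_harmonics(n_harmonics, base=BASE_RATE_HZ, every_n=VOICE_EVERY_N):
    """Multiples of the voice rate that do not coincide with multiples of
    the base rate (hypothetical helper, for illustration only)."""
    f = voice_rate_hz(base, every_n)
    return [k * f for k in range(1, n_harmonics + 1) if (k * f) % base != 0]


if __name__ == "__main__":
    print(f"voice rate: {float(voice_rate_hz()):.2f} Hz")
    print(build_sequence(6))
    print([f"{float(h):.2f}" for h in voice_harmonics(6)])
```

Exact fractions are used so that the check for overlap with the 4 Hz base rate is not affected by floating-point rounding of 4/3.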