The distractive effects on cognitive processes ascribed to the nature of sound have been studied in the "irrelevant sound" paradigm, in which test participants perform cognitive tasks in the presence of background noise. By comparing test scores across different acoustic stimulus conditions in such experiments, the "irrelevant sound (speech) effect" (ISE) can be quantified. The ISE is often explained by the changing-state hypothesis: the distinctive segmentation of sound tokens, where tokens may be understood as sound segments that can be distinguished from each other by temporal and/or spectral characteristics. A sequence of sounds consisting of differing tokens produces much more disruption than a steady-state sound. The present work investigates the relationship between features from both the temporal and spectral domains and the ISE, predicting the magnitude of the effect separately with two estimators: the Average Modulation Transfer Function (AMTF) and the Frequency Domain Correlation Coefficient (FDCC). The first parameter measures temporal variations in a sound, whilst the latter measures spectral variability. Background stimuli were synthesized from a pulse train in which modified and unmodified pulses alternate. To manipulate the temporal and spectral features of the stimuli, a numerical optimization method was used to generate two sets of background stimuli in which one of the two descriptors was always kept constant while the other was varied systematically. The stimulus sets used in this study therefore allow the roles of the two estimators in cognitive performance to be assessed separately in tasks involving serial ordering of short-term memory content.
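The two descriptors can be illustrated with a minimal sketch. The abstract does not give the exact AMTF and FDCC formulas, so the following is an assumption-laden illustration only: the FDCC is approximated here as the Pearson correlation between the magnitude spectra of two successive tokens, and the AMTF is stood in for by a simple envelope-variation proxy (frame-wise RMS fluctuation, which is zero for a steady-state sound). All function names are hypothetical, not the authors' code.

```python
# Illustrative proxies for the two descriptors named in the abstract.
# FDCC here = Pearson correlation of magnitude spectra of two tokens
# (high correlation = low spectral variability between tokens).
# The AMTF proxy = relative fluctuation of the frame-wise RMS envelope
# (zero for a steady-state sound, large for a modulated one).
# These are assumed simplifications, not the paper's definitions.
import numpy as np

def fdcc(token_a, token_b):
    """Correlation between the magnitude spectra of two sound tokens."""
    spec_a = np.abs(np.fft.rfft(token_a))
    spec_b = np.abs(np.fft.rfft(token_b))
    return float(np.corrcoef(spec_a, spec_b)[0, 1])

def amtf_proxy(signal, frame_len=256):
    """Relative std of the frame-wise RMS envelope (0 = steady state)."""
    n = len(signal) // frame_len
    frames = signal[: n * frame_len].reshape(n, frame_len)
    env = np.sqrt(np.mean(frames ** 2, axis=1))
    return float(np.std(env) / (np.mean(env) + 1e-12))

# A steady tone: successive segments have identical spectra (FDCC near 1)
# and an almost flat envelope; a 4 Hz amplitude-modulated tone has the
# same spectrum shape but a strongly varying envelope.
fs = 16000
t = np.arange(fs) / fs
steady = np.sin(2 * np.pi * 440 * t)
modulated = (0.5 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 440 * t)
print(fdcc(steady[:4000], steady[4000:8000]))          # close to 1
print(amtf_proxy(steady), amtf_proxy(modulated))
```

Under these assumed definitions, holding one proxy fixed while varying the other mirrors the paper's strategy of generating stimulus sets in which only one descriptor changes systematically.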