The type III secretion system is an essential component for virulence in many Gram-negative bacteria. Though components of the secretion system apparatus are conserved, its substrates—effector proteins—are not. We have used a novel computational approach to confidently identify new secreted effectors by integrating protein sequence-based features, including evolutionary measures such as the pattern of homologs in a range of other organisms, G+C content, amino acid composition, and the N-terminal 30 residues of the protein sequence. The method was trained on known effectors from the plant pathogen Pseudomonas syringae and validated on a set of effectors from the animal pathogen Salmonella enterica serovar Typhimurium (S. Typhimurium) after eliminating effectors with detectable sequence similarity. We show that this approach can predict known secreted effectors with high specificity and sensitivity. Furthermore, by considering a large set of effectors from multiple organisms, we computationally identify a common putative secretion signal in the N-terminal 20 residues of secreted effectors. This signal can be used to discriminate 46 out of 68 total known effectors from both organisms, suggesting that it is a real, shared signal applicable to many type III secreted effectors. We use the method to make novel predictions of secreted effectors in S. Typhimurium, some of which have been experimentally validated. We also apply the method to predict secreted effectors in the genetically intractable human pathogen Chlamydia trachomatis, identifying the majority of known secreted proteins in addition to providing a number of novel predictions. This approach provides a new way to identify secreted effectors in a broad range of pathogenic bacteria for further experimental characterization and provides insight into the nature of the type III secretion signal
To submit an update or takedown request for this paper, please submit an Update/Correction/Removal Request.