Patterns of nucleotides that flank substitutions in human orthologous genes

Abstract

<p>Abstract</p> <p>Background</p> <p>Sequence context is an important aspect of base mutagenesis, and three-base periodicity is an intrinsic property of coding sequences. However, how three-base periodicity is influenced in the vicinity of substitutions is still unclear. The effect of context on mutagenesis should be revealed in the usage of nucleotides that flank substitutions. Relative entropy (also known as Kullback-Leibler divergence) is useful for finding unusual patterns in biological sequences.</p> <p>Results</p> <p>Using relative entropy, we visualized the periodic patterns in the context of substitutions in human orthologous genes. Neighbouring patterns differed both among substitution categories and within a category that occurred at three codon positions. Transition tended to occur in periodic sequences relative to transversion. Periodic signals were stronger in a set of flanking sequences of substitutions that occurred at the third-codon positions than in those that occurred at the first- or second-codon positions. To determine how the three-base periodicity was affected near the substitution sites, we fitted a sine model to the values of the relative entropy. A sine of period equal to 3 is a good approximation for the three-base periodicity at sites not in close vicinity to some substitutions. These periods were interrupted near the substitution site and then reappeared away from substitutions. A comparative analysis between the native and codon-shuffled datasets suggested that the codon usage frequency was not the sole origin of the three-base periodicity, implying that the native order of codons also played an important role in this periodicity. Synonymous codon shuffling revealed that synonymous codon usage bias was one of the factors responsible for the observed three-base periodicity.</p> <p>Conclusions</p> <p>Our results offer an efficient way to illustrate unusual periodic patterns in the context of substitutions and provide further insight into the origin of three-base periodicity. This periodicity is a result of the native codon order in the reading frame. The length of the period equal to 3 is caused by the usage bias of nucleotides in synonymous codons. The periodic features in nucleotides surrounding substitutions aid in further understanding genetic variation and nucleotide mutagenesis.</p

    Similar works