Gene expression analysis by means of microarrays is based on the
sequence-specific binding of mRNA to DNA oligonucleotide probes and its measurement
using fluorescent labels. The binding of RNA fragments with sequences other
than that of the intended target is problematic because it adds a "chemical
background" to the signal that is unrelated to the expression level of the
target gene. This paper presents a molecular signature of specific and
non-specific hybridization with potential consequences for gene expression
analysis. We analyzed the signal intensities of perfect match (PM) and mismatch
(MM) probes of GeneChip microarrays to characterize the effects of specific and
non-specific hybridization. We found that these events give rise to different
relations between the PM and MM intensities as a function of the middle base of
the PMs, namely a triplet-like (C>G=T>A>0) and a duplet-like (C=T>0>G=A) pattern of
the PM-MM log-intensity difference upon binding of specific and non-specific
RNA fragments, respectively. The systematic behaviour of the intensity
difference can be rationalized on the level of base pairings of DNA/RNA
oligonucleotide duplexes in the middle of the probe sequence. Non-specific
binding is characterized by the reversal of the central Watson-Crick (WC)
pairing for each PM/MM probe pair, whereas specific binding involves the
combination of a WC and a self-complementary (SC) pairing in PM and MM probes,
respectively. The intensity of complementary MM probes introduces a systematic
source of variation that decreases the precision of expression measures based
on the MM intensities.
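
The middle-base patterns described above could be checked on probe-level data along the following lines. This is only a minimal sketch, not the procedure used in the paper: it assumes 25-mer probes whose middle base sits at position 13, a base-10 logarithm, and a hypothetical input format of (PM sequence, PM intensity, MM intensity) tuples. The function names are illustrative.

```python
import math
from collections import defaultdict

# Assumption: 25-mer probes, so the middle base is position 13 (0-based index 12).
MIDDLE_INDEX = 12


def mean_log_difference_by_middle_base(probe_pairs):
    """Average log10(PM) - log10(MM) for each middle base of the PM sequence.

    probe_pairs: iterable of (pm_sequence, pm_intensity, mm_intensity) tuples
    (a hypothetical input format chosen for this sketch).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for pm_sequence, pm_intensity, mm_intensity in probe_pairs:
        if pm_intensity <= 0 or mm_intensity <= 0:
            continue  # the logarithm is undefined for non-positive intensities
        base = pm_sequence[MIDDLE_INDEX].upper()
        sums[base] += math.log10(pm_intensity) - math.log10(mm_intensity)
        counts[base] += 1
    return {base: sums[base] / counts[base] for base in sums}


def rank_bases(mean_by_base):
    """Order the middle bases from largest to smallest mean difference."""
    return sorted(mean_by_base, key=mean_by_base.get, reverse=True)
```

Applied separately to probe pairs dominated by specific and by non-specific binding, the ranking returned by `rank_bases` together with the signs of the mean differences could then be compared with the triplet-like and duplet-like patterns, respectively.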