Our goal is to identify the features that predict the occurrence and
placement of discourse cues in tutorial explanations in order to aid in the
automatic generation of explanations. Previous attempts to devise rules for
text generation were based on intuition or small numbers of constructed
examples. We apply a machine learning program, C4.5, to induce decision trees
for cue occurrence and placement from a corpus of data coded for a variety of
features previously thought to affect cue usage. Our experiments enable us to
identify the features with the most predictive power, and show that machine
learning can be used to induce decision trees useful for text generation.

Comment: 10 pages, 2 Postscript figures, uses aclap.sty, psfig.te