XML documents are described by a document type definition (DTD). An
XML-grammar is a formal grammar that captures the syntactic features of a DTD.
We investigate properties of this family of grammars. We show that every
XML-language has an essentially unique XML-grammar. We give two characterizations
of the languages generated by XML-grammars: one is set-theoretic, the other is by a
kind of saturation property. We investigate decidability problems and prove
that some properties that are undecidable for general context-free languages
become decidable for XML-languages. We also characterize those XML-grammars
that generate regular XML-languages.

Comment: 24 pages