Ensembl's human non-coding and protein-coding genes are used to automatically
find DNA pattern motifs. The Backus-Naur form (BNF) grammar for regular
expressions (RE) is used by genetic programming to ensure the generated strings
are legal. The evolved motif suggests that the presence of thymine followed by
one or more adenines, etc., early in a transcript indicates a non-protein-coding
gene.
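
As a rough illustration of the kind of motif described above, the sketch below tests whether a transcript contains a thymine followed by one or more adenines near its start. It is a deliberate simplification: the pattern `TA+` covers only the part of the evolved motif stated here (the "etc." is not reconstructed), and the window of 60 bases standing in for "early" and the function name `has_early_motif` are illustrative assumptions, not values taken from the work.

```python
import re

# Simplified stand-in for the evolved motif: one T followed by one or more A.
# The full evolved expression is not given in the text, so this is an assumption.
MOTIF = re.compile(r"TA+")

# Assumed meaning of "early in transcripts": the first 60 bases (illustrative only).
EARLY_WINDOW = 60


def has_early_motif(transcript: str, window: int = EARLY_WINDOW) -> bool:
    """Return True if the motif lies entirely within the first `window` bases."""
    prefix = transcript.upper()[:window]
    return MOTIF.search(prefix) is not None


if __name__ == "__main__":
    for seq in ("GGCTAAAGGC", "GGCCGGCCGG"):
        print(seq, has_early_motif(seq))
```

Under this sketch, a match would be read as weak evidence for a non-protein-coding gene; a real classifier would need the complete evolved expression and a window chosen from the data.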
Keywords: pseudogene, short and microRNAs, non-coding transcripts, systems
biology, machine learning, bioinformatics, motif, regular expression, strongly
typed genetic programming, context-free grammar.
Comment: 12 pages, 2 figures