An Iterative Learning Algorithm for Deciphering Stegoscripts: a Grammatical Approach for Motif Discovery

Wang, Guandong; Zhang, Weixiong

Authors: Guandong Wang, Weixiong Zhang
Publication date: 15 April 2005
Publisher: Washington University Open Scholarship

Abstract

Steganography, or information hiding, conceals the existence of messages in order to protect their confidentiality. We consider deciphering a stegoscript, a text with secret messages embedded within a covertext, and identifying the vocabularies used in the messages, with no knowledge of the vocabularies and grammar in which the script was written. Our research was motivated by the problem of identifying conserved non-coding functional elements (motifs) in regulatory regions of genome sequences, which we view as stegoscripts constructed by nature with a statistical model consisting of a dictionary and a grammar. We develop an iterative learning algorithm, WordSpy, to learn such a model from a stegoscript. The model can then be applied to identify the embedded secret messages, i.e., the functional motifs. Our algorithm can successfully recover the most probable text of the first ten chapters of a novel embedded in a stegoscript and identify the transcription factor binding motifs in the upstream regions of ~800 yeast genes.
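To make the stegoscript setting concrete, the following toy sketch embeds two secret words in random filler text and then ranks substrings by how often they occur. This is not WordSpy and does not reproduce the dictionary-and-grammar model described above; it is only a simple frequency count under an assumed uniform background, and the function names, the word list, and all parameters are hypothetical choices made for illustration.

```python
import random
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def make_stegoscript(words, n_words, filler_len, seed=0):
    """Build a toy stegoscript: runs of random letters (covertext)
    alternating with randomly chosen secret words. Hypothetical helper."""
    rng = random.Random(seed)
    parts = []
    for _ in range(n_words):
        parts.append("".join(rng.choice(ALPHABET) for _ in range(filler_len)))
        parts.append(rng.choice(words))
    return "".join(parts)

def overrepresented_kmers(text, k, top=3):
    """Count every length-k substring and return the most frequent ones
    with their ratio to the count expected under uniform random letters.
    This is plain counting, not the iterative learning in the paper."""
    n_positions = len(text) - k + 1
    counts = Counter(text[i:i + k] for i in range(n_positions))
    expected = n_positions / len(ALPHABET) ** k
    return [(kmer, c, c / expected) for kmer, c in counts.most_common(top)]

# Assumed toy input: two secret words hidden among random letters.
script = make_stegoscript(["motif", "yeast"], n_words=200, filler_len=10)
top_kmers = overrepresented_kmers(script, k=5, top=2)

for kmer, count, ratio in top_kmers:
    print(kmer, count, round(ratio))
```

In this toy case the word length is known in advance and the background is uniform, so counting is enough. The problem described in the abstract is harder because neither the vocabulary nor the grammar is known beforehand.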

Available Versions

Washington University St. Louis: Open Scholarship

oai:openscholarship.wustl.edu:...

Last time updated on 29/10/2019