RegExpBlasting (REB), a Regular Expression Blasting algorithm based on multiply aligned sequences
Background: One of the most frequent uses of bioinformatics tools
concerns functional characterization of a newly produced nucleotide
sequence (a query sequence) by applying Blast or FASTA against a set of
sequences (the subject sequences).
However, in some specific contexts, it is useful to compare the query
sequence against a cluster such as a MultiAlignment (MA). We present
here the RegExpBlasting (REB) algorithm, which compares an unclassified
sequence with a dataset of patterns defined by application of Regular
Expression rules to an input MA dataset.
The REB algorithm workflow consists of three steps:
i. the definition of a dataset of multialignments;
ii. the association of each MA with a pattern, defined by application of
regular expression rules;
iii. the automatic characterization of a submitted biosequence according to
the function of the sequences described by the pattern best matching the
query sequence.
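
The three steps above can be illustrated with a minimal Python sketch. The abstract does not state the actual regular expression rules or how the "best matching" pattern is scored, so everything specific here is an assumption made for illustration: a conserved column becomes a literal, a variable column becomes a character class, a column containing a gap becomes optional, and the best pattern is the one with the longest match in the query. The names `column_to_regex`, `alignment_to_pattern` and `characterize`, and the toy alignments, are hypothetical and are not taken from REB or PPNEMA.

```python
import re


def column_to_regex(column):
    """Turn one alignment column into a regex fragment (assumed rule)."""
    residues = sorted(set(column) - {"-"})
    if not residues:
        return ""  # column made only of gaps contributes nothing
    if len(residues) == 1:
        fragment = residues[0]  # conserved position -> literal
    else:
        fragment = "[" + "".join(residues) + "]"  # variable -> class
    # Assumption: a gap in any row makes the position optional.
    return fragment + "?" if "-" in column else fragment


def alignment_to_pattern(rows):
    """Step ii: derive one pattern from equal-length aligned rows."""
    return "".join(column_to_regex(col) for col in zip(*rows))


def longest_match(pattern, query):
    """Length of the longest substring of query matched by pattern."""
    return max((m.end() - m.start() for m in re.finditer(pattern, query)),
               default=0)


def characterize(query, alignments):
    """Step iii: label of the best-matching pattern, or None if no match."""
    patterns = {label: alignment_to_pattern(rows)
                for label, rows in alignments.items()}
    scores = {label: longest_match(p, query) for label, p in patterns.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


# Step i: a toy dataset of multialignments (hypothetical data).
alignments = {
    "GenusA": ["ACGTAC", "ACGAAC", "ACG-AC"],
    "GenusB": ["TTGCAA", "TTGCTA"],
}

print(alignment_to_pattern(alignments["GenusA"]))  # ACG[AT]?AC
print(characterize("GGACGTACGG", alignments))      # GenusA
```

In this sketch a query that matches none of the patterns is left unclassified (`None`). A real scoring scheme would probably also need to weigh pattern specificity, which the longest-match rule used here ignores.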
Results: An application of this algorithm is used in the "characterize
your sequence" tool available in the PPNEMA resource. PPNEMA is a
resource of Ribosomal Cistron sequences from various species, grouped
according to nematode genera. It allows the retrieval of plant nematode
multialigned sequences or the classification of new nematode rDNA
sequences by applying REB. The same algorithm also supports automatic
updating of the PPNEMA database. The present paper gives examples of the
use of REB within PPNEMA.
Conclusion: The use of REB in PPNEMA updating and in the PPNEMA
"characterize your sequence" option demonstrates the power of the method.
REB can also be applied to any other bioinformatics problem that requires
the rapid addition of a new sequence to a pre-existing cluster.
The statistical tests carried out here show the flexibility of
the method.