Effect of the Side Effect Machines in Edit Metric Decoding
The development of general edit metric decoders is a challenging problem, especially with
the inclusion of additional biological restrictions that can occur in DNA error correcting
codes. Side effect machines (SEMs), an extension of finite state machines, can provide
efficient decoding algorithms for such edit metric codes. However, finding a good machine
poses its own set of challenges and is itself considered an open problem with no general
solution. Previous studies that used evolutionary computation techniques, such as genetic
algorithms and evolutionary programming, to search for good SEMs have found success in
terms of decoding accuracy. However, they all worked with extremely constrained problem
spaces, i.e., a single code or codes of the same length. Therefore, a general approach that
works well across codes of different lengths has yet to be formalized.
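
As a rough illustration of the idea behind an SEM, the sketch below runs a DNA word through a small finite state machine and counts how often each state is entered; the resulting count vector is the "side effect" that a decoder could then classify. The transition table, the function names, and the choice not to count the start state before any symbol is read are all assumptions made for this example, not details taken from the study. A standard Levenshtein edit distance is included for reference.

```python
from typing import Dict, List

# Hypothetical 3-state machine over the DNA alphabet (illustration only):
# TRANSITIONS[state][symbol] gives the next state.
TRANSITIONS: List[Dict[str, int]] = [
    {"A": 1, "C": 0, "G": 2, "T": 0},
    {"A": 1, "C": 2, "G": 0, "T": 1},
    {"A": 0, "C": 2, "G": 2, "T": 1},
]


def sem_features(word: str,
                 transitions: List[Dict[str, int]] = TRANSITIONS,
                 start: int = 0) -> List[int]:
    """Run `word` through the machine and count how often each state is entered.

    Assumption: a counter is incremented each time a transition lands on a
    state, so the counts sum to len(word).
    """
    counts = [0] * len(transitions)
    state = start
    for symbol in word:
        state = transitions[state][symbol]
        counts[state] += 1
    return counts


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(sem_features("ACGT"), edit_distance("ACGT", "AGT"))
```

How the count vectors are mapped back to codewords (the direct and fuzzy classification methods) is not specified in this abstract, so that step is deliberately left out.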
In this research, several codes of varying lengths are used to study the effectiveness of
evolutionary programming (EP) as a general approach for finding efficient edit metric decoders.
Two classification methods, direct and fuzzy, are compared, and some of the EP settings
are varied to observe how the decoding accuracy is affected. The final
SEMs are verified against an additional dataset to test their general effectiveness. Regardless
of the code length, the best results are found using the fuzzy classification method. For
codes of length 10, a maximum accuracy of 99.4% is achieved for distance 1, whereas
distances 2 and 3 achieve up to 97.1% and 85.9%, respectively. Unsurprisingly, the accuracy
suffers for longer codes: the maximum accuracies achieved for codes of length 14 are
92.4%, 85.7%, and 69.2% for distances 1, 2, and 3, respectively. Additionally, the machines
are examined for potential bloat by comparing the number of visited states against the total
number of states. The study finds that some machines have at least one unvisited state.
Bloat is more common in larger machines than in smaller ones. Furthermore, the
results are analyzed to find potential trends and relationships among the parameters. The
most consistent trend is that, when allowed, longer codes generally
show a propensity for larger machines.
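
The bloat check described above, comparing visited states against total states, can be sketched as follows. This is a minimal illustration under stated assumptions: the 4-state machine, the helper names, and the decision to treat the start state as visited are hypothetical and are not taken from the study.

```python
from typing import Dict, Iterable, List, Set

# Hypothetical 4-state machine (illustration only); no transition leads to
# state 3, so it can never be visited.
MACHINE: List[Dict[str, int]] = [
    {"A": 1, "C": 0, "G": 2, "T": 0},
    {"A": 1, "C": 2, "G": 0, "T": 1},
    {"A": 0, "C": 2, "G": 2, "T": 1},
    {"A": 3, "C": 3, "G": 0, "T": 1},
]


def visited_states(transitions: List[Dict[str, int]],
                   words: Iterable[str],
                   start: int = 0) -> Set[int]:
    """Collect every state reached while running all `words` through the machine.

    Assumption: the start state counts as visited.
    """
    seen = {start}
    for word in words:
        state = start
        for symbol in word:
            state = transitions[state][symbol]
            seen.add(state)
    return seen


def bloat_report(transitions: List[Dict[str, int]],
                 words: Iterable[str],
                 start: int = 0) -> dict:
    """Summarize total, visited, and unvisited states for a set of words."""
    seen = visited_states(transitions, words, start)
    unvisited = sorted(set(range(len(transitions))) - seen)
    return {"total": len(transitions),
            "visited": len(seen),
            "unvisited": unvisited}


if __name__ == "__main__":
    print(bloat_report(MACHINE, ["ACGT", "TTGA"]))
```

A machine whose `unvisited` list is non-empty carries states that never contribute to decoding for the given words, which is the sense of bloat used here.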