    A computational framework for nucleic acid sub-sequence identification

    Identifying nucleic acid sub-sequences within larger background sequences is a fundamental need of the biology community, with applications in the search for homologous regions, in diagnostics and in many related activities. This paper details the approaches taken to sub-sequence identification using hidden Markov models and associated scoring optimisations. Techniques for locating conserved basal promoter elements bear directly on promoter identification and thus on gene identification. The case study centred on the TATA box basal promoter element: the background is a gene sequence and the TATA box is the target. The research yields generic algorithms for sub-sequence identification, which can be transposed to any case study where a target sequence must be identified. Work extending from this investigation has led to the development of a generic framework for the future application of hidden Markov models to computational biological sequence analysis.
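To make the idea of locating a target within a background sequence concrete, the following is a minimal sketch, not the paper's method. It assumes a heavily simplified model: a chain of match states only (a profile hidden Markov model without insert or delete states) scored as a log-odds ratio against a uniform background. The emission probabilities, the names `MOTIF`, `BACKGROUND`, `log_odds` and `best_hit`, and the example sequence are all hypothetical and chosen for illustration.

```python
import math

# Hypothetical per-position emission probabilities for a TATA-box-like
# target (consensus TATAAA). These values are illustrative assumptions,
# not parameters taken from the paper.
MOTIF = [
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.70, "C": 0.05, "G": 0.05, "T": 0.20},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
]

# Assumed uniform background (null) model over the four bases.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def log_odds(window):
    """Log-odds of a window under the target model versus the background."""
    return sum(
        math.log(MOTIF[i][base] / BACKGROUND[base])
        for i, base in enumerate(window)
    )


def best_hit(sequence):
    """Return (start, score) of the highest-scoring window, or None.

    Assumes the sequence contains only the upper-case letters A, C, G, T.
    """
    k = len(MOTIF)
    if len(sequence) < k:
        return None
    candidates = [
        (start, log_odds(sequence[start:start + k]))
        for start in range(len(sequence) - k + 1)
    ]
    # Pick the window with the largest log-odds score.
    return max(candidates, key=lambda pair: pair[1])


if __name__ == "__main__":
    hit = best_hit("GCGCGCTATAAAGGCGC")
    print(hit)
```

A positive score means the window is better explained by the target model than by the background. A fuller hidden Markov model would add transition probabilities and typically use Viterbi or forward scoring in place of this fixed-width scan.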