9 research outputs found

    Species motif extraction using LPBS

    This paper presents the use of the 'Linear-PSO with Binary Search' (LPBS) algorithm for discovering motifs, especially species-specific motifs. In this study, fragments of mitochondrial cytochrome C oxidase subunit I (COI/COX1) and the genome of COI were collected from the GenBank online database. For the first experiment, the genome of COI was used as a reference set and other DNA sequences were used as a comparison set. All the collected DNA sequences are from the same species. The results show that the LPBS algorithm is able to discover motifs. For the second experiment, all the discovered motifs were used as a reference set and the COI genomes of other species were used as a comparison set. The results show that the LPBS algorithm is able to identify correct motifs for species identification.
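
The second experiment in the abstract above checks discovered motifs against COI sequences from other species. The abstract does not say how that check is scored, so the following is only a minimal sketch under one assumption: a motif counts as species-specific when it occurs as an exact substring in none of the other species' sequences. The function name and the toy sequences are hypothetical and are not taken from the paper.

```python
def species_specific(motifs, other_species_sequences):
    """Keep only motifs that occur in none of the other species' sequences.

    Assumption: exact substring matching stands in for the comparison step;
    the paper's actual LPBS comparison is not described in the abstract.
    """
    return [
        motif
        for motif in motifs
        if not any(motif in seq for seq in other_species_sequences)
    ]


# Toy data (hypothetical, not real COI sequences).
candidate_motifs = ["TTGC", "ACGT", "GGAA"]
other_species = ["CCACGTCC", "TTTTGGAT"]
specific = species_specific(candidate_motifs, other_species)
```

Here `ACGT` is dropped because it appears in the first of the other species' sequences, while the remaining two candidates are kept.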

    DNA BARCODING DENGAN ALGORITMA PARTICLE SWARM OPTIMIZATION MENGGUNAKAN APACHE SPARK SQL (DNA barcoding with the particle swarm optimization algorithm using Apache Spark SQL)

    One stage of DNA barcoding, the similarity check, is still carried out manually, which demands care and takes a long time. DNA sequence data from living organisms is among the most voluminous data in biology. This research therefore builds a computational model to obtain DNA barcodes quickly and effectively by implementing the particle swarm optimization algorithm on the big data platforms Apache Hadoop and Apache Spark. The data used in this study is SARS-CoV-2 RNA data. The program outputs the DNA barcodes found in the available samples together with the time needed to complete the calculation. Two test scenarios were run: the first uses 4 cores with several worker nodes, and the second uses a cluster of 2 worker nodes with several cores. The results indicate a significant acceleration of the big data platform over standalone execution in both scenarios, and show that the computational model built on the big data platform adds features and runs faster compared with previous research.

    Identifying the definite base of COI for extraction of DNA sequences using LPBS

    This paper presents the use of the 'Linear-PSO with Binary Search' (LPBS) algorithm for discovering motifs, especially species-specific motifs. In this study, two samples from different fragments of 'mitochondrial cytochrome C oxidase subunit I' (COI/COX1) were collected from the GenBank online database. The DNA sequences in the first sample are a mix of different fragments of COI, and those in the second sample are from the same fragment of COI. The genome of COI was used as a reference set and other DNA sequences were used as a comparison set. All the collected DNA sequences are from the same species. The results show that the LPBS algorithm is able to discover motifs of greater length when using DNA sequences from the same fragment of COI. The experiment also found that 139 can be used as a starting base for COI DNA sequence extraction to discover species-specific motifs.

    A modified algorithm for species specific motif discovery

    Motif discovery can be used to categorize unknown DNA sequences into their corresponding families. For this study, PSO was modified for discovering motifs. The modified Linear-PSO was chosen even though it is slower, because linear search is not a choice but a necessary criterion for identifying the motif of the pig (Sus scrofa). Pig motif identification is critical for halal authentication. The modified Linear-PSO algorithm uses linear numbers for population initialization and next-position updating. In each cycle, only one particle, called the 'target motif', is selected and compared with the other DNA sequences for fitness calculation. The motif discovered can be used as a standard motif for species identification. Experimental results show that the modified algorithm is able to identify motifs as expected. This study shows that a slower algorithm is still needed and has value, depending on how critical the problem is.
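
The linear search described in the abstract above can be pictured with a short sketch. It is not the authors' modified Linear-PSO: it has no swarm state, and the fitness function is an assumption (the number of comparison sequences that contain the candidate as an exact substring), since the abstract does not define one. The function name and the toy sequences are hypothetical.

```python
def linear_motif_scan(reference, comparisons, length):
    """Scan the reference one base at a time and score every window.

    Each window plays the role of the 'target motif'. Its score here is the
    number of comparison sequences containing it exactly (an assumed fitness,
    not the one used in the paper).
    """
    best_motif, best_score = None, -1
    for start in range(len(reference) - length + 1):  # linear position update
        target = reference[start:start + length]
        score = sum(target in seq for seq in comparisons)
        if score > best_score:  # the earliest window wins ties
            best_motif, best_score = target, score
    return best_motif, best_score


# Toy data (hypothetical, not real pig DNA).
reference = "ACGTTGCAAC"
comparisons = ["TTGCA", "GGTTGCAT", "CCCTTGC"]
motif, score = linear_motif_scan(reference, comparisons, 4)
print(motif, score)  # prints "TTGC 3"
```

Visiting every start position is what makes the scan exhaustive, which is the trade-off the abstract accepts in exchange for speed.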

    rMotifGen: random motif generator for DNA and protein sequences

    Background: Detection of short, subtle conserved motif regions within a set of related DNA or amino acid sequences can lead to discoveries about important regulatory domains such as transcription factor and DNA binding sites as well as conserved protein domains. In order to help assess motif detection algorithms on motifs with varying properties and levels of conservation, we have developed a computational tool, rMotifGen, with the sole purpose of generating a number of random DNA or protein sequences containing short sequence motifs. Each motif consensus can be user-defined, randomly generated, or created from a position-specific scoring matrix (PSSM). Insertions and mutations within these motifs are created according to user-defined parameters and substitution matrices. The resulting sequences can be helpful in mutational simulations and in testing the limits of motif detection algorithms. Results: Two implementations of rMotifGen have been created, one providing a graphical user interface (GUI) for random motif construction, and the other serving as a command line interface. The second implementation has the added advantages of platform independence and being able to be called in a batch mode. rMotifGen was used to construct sample sets of sequences containing DNA motifs and amino acid motifs that were then tested against the Gibbs sampler and MEME packages. Conclusion: rMotifGen provides an efficient and convenient method for creating random DNA or amino acid sequences with a variable number of motifs, where the instance of each motif can be incorporated using a position-specific scoring matrix (PSSM) or by creating an instance mutated from its corresponding consensus using an evolutionary model based on substitution matrices. rMotifGen is freely available at: http://bioinformatics.louisville.edu/brg/rMotifGen/.
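
To make the idea of planting mutated motif instances in random sequences concrete, here is a minimal sketch. It does not reproduce rMotifGen or its interface: the function name and parameters are hypothetical, mutations are drawn uniformly from the DNA alphabet rather than from a substitution matrix, and insertions are not modeled.

```python
import random


def plant_motif(consensus, n_sequences, seq_length, mutation_rate, seed=0):
    """Return (sequence, start) pairs with one motif instance per sequence.

    Assumptions: uniform random background, and uniform resampling of a
    motif base with probability `mutation_rate` (the resampled base may
    equal the original one).
    """
    rng = random.Random(seed)
    alphabet = "ACGT"
    planted = []
    for _ in range(n_sequences):
        background = [rng.choice(alphabet) for _ in range(seq_length)]
        instance = [
            rng.choice(alphabet) if rng.random() < mutation_rate else base
            for base in consensus
        ]
        start = rng.randrange(seq_length - len(consensus) + 1)
        background[start:start + len(consensus)] = instance  # overwrite in place
        planted.append(("".join(background), start))
    return planted


# With mutation_rate=0.0 every planted instance equals the consensus.
exact = plant_motif("TATAAT", n_sequences=5, seq_length=30, mutation_rate=0.0, seed=1)
noisy = plant_motif("TATAAT", n_sequences=5, seq_length=30, mutation_rate=0.2, seed=1)
```

Returning the start position alongside each sequence gives a ground truth against which a motif finder's predictions can be checked.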

    DNA motif identification using LPBS

    In recent years, several deoxyribonucleic acid (DNA)-based approaches have been developed for species identification, including DNA sequencing. The search for motifs, or patterns, in DNA sequences is important in many fields, especially in biology. In this paper, a new particle swarm optimization (PSO) approach for discovering species-specific motifs is proposed. The new method, named Linear-PSO with Binary Search (LPBS), is developed to discover motifs of specific species from DNA sequences. This enhanced method integrates Linear-PSO and the binary search technique to minimize the execution time and to increase the correctness of motif identification. In this study, two samples of 'mitochondrial cytochrome C oxidase subunit I' (COI or COX1) were collected from the GenBank online database. The DNA sequences in the first sample are fragments of COI from one species, and the second sample is a complete COI from a different species. The genome of COI was used as a reference set and other DNA sequences were used as a comparison set. The results show that the LPBS algorithm is able to discover motifs of a species when using DNA sequences from the same fragment of COI.

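    ANALISIS KODE BATANG DNA DAERAH INTERNAL TRANSCRIBED SPACER (ITS) TANAMAN TIMUN APEL SECARA IN SILICO (In silico analysis of the DNA barcode of the internal transcribed spacer (ITS) region of the apple cucumber plant)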

    Apple cucumber (timun apel) is a local horticultural commodity that is widely cultivated in the northern part of Karawang, specifically in the Pakis Jaya area. Scientific information about apple cucumber is still very limited, especially regarding its taxonomy and genetic relationships. Molecular identification methods can be used to obtain information about the genetic relationships of apple cucumber. The purpose of this study was to analyze the DNA sequence motifs of apple cucumber so that they can be developed into a DNA barcode. The method is molecular and uses the DNA barcoding technique. Based on phenetic analysis, melon (Cucumis melo) is the closest relative of the apple cucumber plant, so apple cucumber is thought to be a new subspecies of melon (Cucumis melo). The consensus sequence of apple cucumber can be used as a DNA barcode, yielding the candidate primer DELTOP 17.

    A particle swarm optimization-based algorithm for finding gapped motifs

    Background: Identifying approximately repeated patterns, or motifs, in DNA sequences from a set of co-regulated genes is an important step towards deciphering the complex gene regulatory networks and understanding gene functions. Results: In this work, we develop a novel motif finding algorithm (PSO+) using a population-based stochastic optimization technique called Particle Swarm Optimization (PSO), which has been shown to be effective in optimizing difficult multidimensional problems in continuous domains. We propose a modification of the standard PSO algorithm to handle discrete values, such as characters in DNA sequences. The algorithm provides several features. First, we use both consensus and position-specific weight matrix representations in our algorithm, taking advantage of the efficiency of the former and the accuracy of the latter. Furthermore, many real motifs contain gaps, but the existing methods usually ignore them or assume that the user knows their exact locations and lengths, which is usually impractical for real applications. In comparison, our method models gaps explicitly, and provides an easy solution to find gapped motifs without any detailed knowledge of gaps. Our method allows the presence of input sequences containing zero or multiple binding sites. Conclusion: Experimental results on synthetic challenge problems as well as real biological sequences show that our method is both more efficient and more accurate than several existing algorithms, especially when gaps are present in the motifs.
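
The two motif representations named in the abstract above, a consensus string and a position-specific matrix, can be related with a small sketch. This is not the PSO+ algorithm: it only builds a per-position frequency matrix (the simplest form of a position-specific matrix, without log-odds weighting) from already aligned, ungapped instances and reads the consensus off it. The function names and the toy instances are hypothetical.

```python
from collections import Counter


def position_frequency_matrix(instances, alphabet="ACGT", pseudocount=0.0):
    """Return one {base: frequency} dict per motif position.

    Assumption: all instances are aligned and of equal length (no gaps).
    """
    length = len(instances[0])
    if any(len(instance) != length for instance in instances):
        raise ValueError("all motif instances must have the same length")
    matrix = []
    for column in zip(*instances):  # iterate over aligned positions
        counts = Counter(column)
        total = len(column) + pseudocount * len(alphabet)
        matrix.append({b: (counts[b] + pseudocount) / total for b in alphabet})
    return matrix


def consensus(matrix):
    """Pick the most frequent base at each position (first base wins ties)."""
    return "".join(max(column, key=column.get) for column in matrix)


# Toy aligned instances (hypothetical).
instances = ["ACGT", "ACGA", "TCGT", "ACCT"]
pfm = position_frequency_matrix(instances)
print(consensus(pfm))  # prints "ACGT"
```

The consensus is cheap to compare against a sequence, while the matrix keeps the per-position variation that the consensus discards, which matches the efficiency and accuracy trade-off the abstract mentions.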

    DNA motif detection using particle swarm optimization and expectation-maximization

    No full text