32 research outputs found

    An efficient parallel method for mining frequent closed sequential patterns

    Mining frequent closed sequential patterns (FCSPs) has attracted a great deal of research attention because it is an important task in sequence mining. Recently, many studies have focused on mining frequent closed sequential patterns because such patterns have proved to be more efficient and compact than frequent sequential patterns. Information can be fully extracted from frequent closed sequential patterns. In this paper, we propose an efficient parallel approach called parallel dynamic bit vector frequent closed sequential patterns (pDBV-FCSP), which uses a multi-core processor architecture to mine FCSPs from large databases. pDBV-FCSP divides the search space to reduce the required storage space and performs closure checking of prefix sequences early to reduce the execution time of mining. This approach overcomes the problems of parallel mining, such as communication overhead, synchronization, and data replication. It also solves load-balancing issues between processors with a dynamic mechanism that redistributes work when some processes run out of work, minimizing idle CPU time.
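
    To make the term "closed" in the abstract above concrete, the following is a minimal brute-force sketch in Python. It is not pDBV-FCSP: it uses no bit vectors, no parallelism, and no early closure checking, and it assumes each sequence is a plain list of single items. The function names and the toy database are hypothetical. A frequent pattern is reported as closed when no longer frequent pattern containing it has the same support.

```python
from itertools import combinations


def is_subsequence(pattern, sequence):
    """Return True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of sequences in the database that contain the pattern."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def frequent_patterns(database, min_support):
    """Enumerate every frequent subsequence by brute force (toy scale only)."""
    candidates = set()
    for seq in database:
        for length in range(1, len(seq) + 1):
            for idx in combinations(range(len(seq)), length):
                candidates.add(tuple(seq[i] for i in idx))
    frequent = {}
    for pattern in candidates:
        count = support(pattern, database)
        if count >= min_support:
            frequent[pattern] = count
    return frequent


def closed_patterns(frequent):
    """Keep patterns that have no longer super-pattern with equal support."""
    closed = {}
    for p, s in frequent.items():
        absorbed = any(
            len(q) > len(p) and t == s and is_subsequence(p, q)
            for q, t in frequent.items()
        )
        if not absorbed:
            closed[p] = s
    return closed


# Hypothetical toy database of three sequences.
database = [list("abc"), list("abd"), list("ab")]
frequent = frequent_patterns(database, min_support=2)
closed = closed_patterns(frequent)
```

    In this toy database the patterns (a), (b), and (a, b) all have support 3, so only (a, b) is closed: it carries the same information as the two shorter patterns, which illustrates why closed patterns form a more compact result set.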

    A Survey on Clustering Algorithm for Microarray Gene Expression Data

    DNA microarray data are huge and multidimensional: microarray chip technology captures the expression of many genes simultaneously, and handling these data is cumbersome. The microarray technique is used to measure the expression levels of tens of thousands of genes under different conditions, such as a time series during a biological process. Clustering is an unsupervised learning process that partitions a given data set into groups, so that items within a group are similar and items in different groups are dissimilar. The aim of this paper is to analyze the accuracy of different clustering algorithms on microarray data and to identify a suitable algorithm for further research.

    Application based technical Approaches of data mining in Pharmaceuticals, and Research approaches in biomedical and Bioinformatics

    Past studies show that progress in the pharmaceutical field was slow and simple, while the process of transforming information was so complex that it was out of reach of the technology of the time, and modern technology did not catch on in the pharmaceutical field. Later, technology became a compulsory part of business and contributed to business progress and development. Nowadays the pharmaceutical industry is technology-enabled: firms smoothly and easily manage their billing and inventories, develop new products and services, and maintain and merge drug details, such as cost and usage, with the records of patients for whom doctors in hospitals prescribed them. Data collection methods have improved, but data manipulation techniques have yet to keep pace with them. Data mining, which refers to pattern analysis on large data sets using techniques such as clustering, segmentation, and classification, helps manipulate the data better and hence helps pharmaceutical firms and industries. This paper describes the vital role of data mining in the pharmaceutical industry and how data mining improves the quality of decision-making services in pharmaceutical fields. It also gives a brief overview of data mining toolkits and their applications in biomedical research, in terms of relational approaches to data mining with an emphasis on propositionalisation and relational subgroup discovery, which have proved effective for data analysis in biomedicine and its applications and in bioinformatics. DOI: 10.17762/ijritcc2321-8169.15038

    Genetic Analysis of the KiSS1 Gene in Goats Based on GenBank DNA Sequences

    KiSS1 is a candidate gene that influences reproductive traits in goats. The aim of this study was to identify SNPs, amino acid changes, and the phylogenetic tree of goats based on KiSS1 gene DNA sequence data from GenBank. Twenty KiSS1 gene DNA sequences, 17 from goats and 3 from sheep, were taken from NCBI GenBank. The KiSS1 DNA sequences were aligned using BioEdit to determine the locations of SNPs and amino acid changes. The phylogenetic tree was built using a feature on NCBI. Based on the alignment of KiSS1 gene DNA sequences in 2 groups of sequence regions, there were 22 SNPs, consisting of 7 SNPs in the 5'UTR, 13 SNPs in intron 1, and 2 SNPs in exon 2. The two SNPs found in the exon were silent mutations. The phylogenetic tree showed that the goat KiSS1 genes clustered on one branch, followed by the sheep KiSS1 genes on a different branch. These results can be used as baseline information for further studies, namely association studies between KiSS1 gene SNPs and reproductive traits in goats.
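
    The SNP-location step described in the abstract above, comparing aligned sequences column by column, can be sketched as follows. This is only an illustration in Python, not the BioEdit workflow the study used. The function name `find_snps`, the choice to ignore gaps and `N`, and the toy sequences are assumptions, not data from the paper.

```python
def find_snps(aligned):
    """Return (position, alleles) for each alignment column with more than
    one distinct base. Positions are 1-based; '-' and 'N' are ignored."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to equal length")
    snps = []
    for pos in range(length):
        alleles = {seq[pos].upper() for seq in aligned} - {"-", "N"}
        if len(alleles) > 1:
            snps.append((pos + 1, sorted(alleles)))
    return snps


# Hypothetical toy alignment of three sequences, not real KiSS1 data.
toy_alignment = ["ATGCCA", "ATGTCA", "ACGCCA"]
snps = find_snps(toy_alignment)
```

    For the toy alignment, columns 2 and 4 are variable. Deciding whether an exonic SNP is a silent mutation would additionally require the reading frame and a codon table, which this sketch leaves out.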

    An Experimental Study on Microarray Expression Data from Plants under Salt Stress by using Clustering Methods

    Current genome-wide advances in gene chip technology give "omics" research (genomics, proteomics, and transcriptomics) an opportunity to analyze the expression levels of thousands of genes across multiple experiments. In this regard, many machine learning approaches have been proposed to deal with this deluge of information. Clustering methods are among these approaches. They group data (gene profiles) into homogeneous clusters using distance measurements. Various clustering techniques have been applied, but there is no consensus on the best one. In this context, seven clustering algorithms were compared and tested on the gene expression datasets of three model plants under salt stress. The techniques were evaluated by internal and relative validity measures. The AGNES algorithm appears to be the best on internal validity measures for the three plant datasets, while K-Means shows a trend for relative validity measures on these datasets.

    Bioinformatics: Basics, Development, and Future

    Bioinformatics is an interdisciplinary scientific field of the life sciences. Bioinformatics research and application include the analysis of molecular sequence and genomics data; genome annotation, gene/protein prediction, and expression profiling; molecular folding, modeling, and design; building biological networks; development of databases and data management systems; development of software and analysis tools; bioinformatics services and workflow; mining of biomedical literature and text; and bioinformatics education and training. The astronomical accumulation of genomics, proteomics, and metabolomics data, together with the need for their storage, analysis, annotation, organization, systematization, and integration into biological networks and database systems, was the main driving force for the emergence and development of bioinformatics. The current critical needs for bioinformatics, among others highlighted in this chapter, are to understand the basics and specifics of bioinformatics and to prepare a new generation of scientists and specialists with integrated, interdisciplinary, and multilingual knowledge who can use modern bioinformatics resources powered by sophisticated operating systems, software, and database/networking technologies. In this introductory chapter, I aim to give readers an overall picture of the basics and developments of the bioinformatics field, with some future perspectives, highlighting the chapters published in this book.

    On the application of the Methods of poly(A) site identification for Model Plant Sequence

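    With the launch of various genome sequencing projects and breakthroughs in molecular structure determination techniques, the biology community has accumulated large quantities of data on genes. This requires biologists to use new bioinformatics algorithms and tools to analyze and process the ever-growing data and to make full use of this information. Data mining technology therefore has great potential for predicting gene function and discovering new genes. This thesis studies the clustering of sequences through data mining, using clustering to identify sequences that contain poly(A) sites, as a first step in the study of gene expression data. The thesis proposes a method, based on the self-organizing map (SOM) network model, for identifying poly(A) sites in the model plant Arabidopsis thaliana. The self-organizing map is an unsupervised-learning neural network widely used in fuzzy cluster analysis; through self-organization it uses a large number of...
    Degree: Master of Engineering. Department and major: School of Information Science and Technology, Department of Automation, Systems Engineering. Student ID: 20043100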

    A brief account on enzyme mining using metagenomic approach

    Metagenomics is an approach for directly analyzing the genomes of microbial communities in the environment. The use of metagenomics to investigate novel enzymes is critical because it allows researchers to acquire data on microbial diversity, with a 99% success rate, and on the different kinds of genes that encode enzymes yet to be found. Basic metagenomic approaches have been developed and are widely used in numerous studies. To promote the success of advanced research, researchers, particularly young researchers, must have a fundamental understanding of metagenomics. This review was therefore conducted to provide a thorough grasp of metagenomics. It covers the application and fundamental methods of metagenomics in the discovery of novel enzymes, focusing on recent studies. It also discusses the significance of novel biocatalysts anticipated from varied microbial metagenomes and their relevance to future research for novel industrial applications, the ramifications of next-generation sequencing (NGS), sophisticated bioinformatic techniques, and the prospects of metagenomic approaches. The review additionally explores metagenomic research on enzyme exploration, specifically for key enzymes of microbial origin such as lipase, protease, and cellulase.