2,090 research outputs found

    Denver Groups Classification of Human Chromosomes Using Fuzzy C-Means Clustering

    Unbanded human chromosomes can be classified into seven Denver Groups (A-G) based on their lengths and on the ratio of the length of the shorter arm to the whole length of the chromosome, which is called the centromere index (CI). In this article, the fuzzy c-means method is used to perform the Denver Group classification of a given set of human chromosomes. The objective in clustering is to partition a given human chromosome set into homogeneous clusters; by homogeneous we mean that all points in the same cluster share similar attributes and do not share similar attributes with points in other clusters. However, the separation of clusters and the meaning of similarity are fuzzy notions and can be described as such. It is found that the convergence of the clustering iterations depends strongly on the initial partition matrix.
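
    The fuzzy c-means procedure named in this abstract can be sketched as follows. This is a minimal sketch of the standard textbook algorithm, not the authors' implementation: the function name, the fuzzifier m = 2, the stopping tolerance, and the sample (relative length, CI) values are all assumptions made for illustration. The seed parameter controls the random initial partition matrix, which is the quantity the abstract reports the iterations to be sensitive to.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, max_iter=300, tol=1e-6, seed=0):
    """Standard fuzzy c-means. Returns (centers, U) with U of shape (n, c).

    X    : (n, d) array of feature vectors, e.g. (length, centromere index)
    c    : number of clusters
    m    : fuzzifier (> 1); m = 2 is a common default, assumed here
    seed : seed for the random initial partition matrix
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Random initial partition matrix; each row sums to 1.
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(max_iter):
        # Cluster centers: membership-weighted means of the points.
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center, shape (n, c).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.fmax(d, 1e-12)  # guard against division by zero
        # Membership update: u_ik proportional to d_ik^(-2/(m-1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


if __name__ == "__main__":
    # Made-up (relative length, CI) pairs, NOT measurements from the paper.
    X = np.array([[8.4, 0.48], [8.0, 0.39], [6.8, 0.47],
                  [1.9, 0.31], [1.8, 0.30], [2.0, 0.28]])
    centers, U = fuzzy_c_means(X, c=2)
    print(U.argmax(axis=1))  # hard labels from the fuzzy memberships
```

    Running the function with several different seed values on the same data is a simple way to probe the dependence on the initial partition matrix. In practice the two features would likely need rescaling first, since raw length otherwise dominates the Euclidean distance.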

    A Novel Approach to Fuzzy Clustering based on a Dissimilarity Relation extracted from Data using a TS System

    Clustering refers to the process of unsupervised partitioning of a data set based on a dissimilarity measure, which determines the cluster shape. Considering that cluster shapes may change from one cluster to another, it would be of the utmost importance to extract the dissimilarity measure directly from the data by means of a data model. On the other hand, a model construction requires some kind of supervision of the data structure, which is exactly what we look for during clustering. So, the lower the supervision degree used to build the data model, the more it makes sense to resort to a data model for clustering purposes. Conscious of this, we propose to exploit very few pairs of patterns with known dissimilarity to build a TS system which models the dissimilarity relation. Among other things, the rules of the TS system provide an intuitive description of the dissimilarity relation itself. Then we use the TS system to build a dissimilarity matrix which is fed as input to an unsupervised fuzzy relational clustering algorithm, denoted any relation clustering algorithm (ARCA), which partitions the data set based on the proximity of the vectors containing the dissimilarity values between each pattern and all the other patterns in the data set. We show that combining the TS system and the ARCA algorithm allows us to achieve high classification performance on a synthetic data set and on two real data sets. Further, we discuss how the rules of the TS system represent a sort of linguistic description of the dissimilarity relation

    An intuitionistic approach to scoring DNA sequences against transcription factor binding site motifs

    Background: Transcription factors (TFs) control transcription by binding to specific regions of DNA called transcription factor binding sites (TFBSs). The identification of TFBSs is a crucial problem in computational biology and includes the subtask of predicting the location of known TFBS motifs in a given DNA sequence. It has previously been shown that, when scoring matches to known TFBS motifs, interdependencies between positions within a motif should be taken into account. However, this remains a challenging task owing to the fact that sequences similar to those of known TFBSs can occur by chance with a relatively high frequency. Here we present a new method for matching sequences to TFBS motifs based on intuitionistic fuzzy sets (IFS) theory, an approach that has been shown to be particularly appropriate for tackling problems that embody a high degree of uncertainty. Results: We propose SCintuit, a new scoring method for measuring sequence-motif affinity based on IFS theory. Unlike existing methods that consider dependencies between positions, SCintuit is designed to prevent overestimation of less conserved positions of TFBSs. For a given pair of bases, SCintuit is computed not only as a function of their combined probability of occurrence, but also taking into account the individual importance of each single base at its corresponding position. We used SCintuit to identify known TFBSs in DNA sequences. Our method provides excellent results when dealing with both synthetic and real data, outperforming the sensitivity and the specificity of two existing methods in all the experiments we performed. Conclusions: The results show that SCintuit improves the prediction quality for TFs of the existing approaches without compromising sensitivity. In addition, we show how SCintuit can be successfully applied to real research problems. In this study the reliability of the IFS theory for motif discovery tasks is proven

    Family relation and STR-DNA matching using fuzzy inference

    Deoxyribonucleic acid (DNA) is the basic element that makes up every part of an individual. It stores information that is unique to each individual and is passed down the generations. DNA also helps in identifying the father in paternity testing, in missing-person investigations, and in identifying victims of mass disasters. Identifying a victim is difficult when no comparison with the father and mother is possible, for instance because the victim's parents have died or are very far from where the victim is. It is therefore necessary to attempt identification by Short Tandem Repeat (STR) inference from living relatives such as siblings, grandparents, uncles and aunts, cousins and nephews. In this paper, we present a method to measure the similarity of human DNA profiles using fuzzy similarity. The input to this fuzzy system is DNA profile data, which stores a person's identity along with their DNA profile. The data entered is the result of polymerase chain reaction (PCR) identification, in the form of an electropherogram consisting of 16 loci with two alleles for each locus. The output of the fuzzy system is the similarity value between an individual and the reference, together with a similarity level: small, medium or high.

    Preparation and characterization of magnetite (Fe3O4) nanoparticles By Sol-Gel method

    Magnetite (Fe3O4) nanoparticles were successfully synthesized and annealed under vacuum at different temperatures. The Fe3O4 nanoparticles, prepared via a sol-gel assisted method and annealed at 200-400°C, were characterized by Fourier Transform Infrared Spectroscopy (FTIR), X-ray Diffraction (XRD), Field Emission Scanning Electron Microscopy (FESEM) and Atomic Force Microscopy (AFM). The XRD results indicate the presence of Fe3O4 nanoparticles, and the mean particle size calculated with Scherrer's formula is in the range of 2-25 nm. The FESEM results show that the particles annealed at 400°C are more spherical and partially agglomerated, while the EDS results indicate the presence of Fe3O4 by showing the elements Fe and O. AFM was used to analyze the three-dimensional profile and roughness of the sample; the Fe3O4 nanoparticles have a minimum diameter of 79.04 nm, which is in agreement with the FESEM results. According to some of the literature, the synthesis of Fe3O4 nanoparticles using FeCl3 and FeCl2 has in many cases not been achieved, but this research was able to obtain Fe3O4 nanoparticles, based on the characterization results.
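
    The size estimate mentioned in this abstract relies on Scherrer's formula, D = K·λ / (β·cos θ), where β is the peak width (FWHM) in radians and θ is the Bragg angle. The sketch below shows the arithmetic only. The shape factor K = 0.9, the Cu Kα wavelength of 0.15406 nm, and the example peak position and width are assumptions chosen for illustration; the abstract does not state which values the authors used.

```python
import math


def scherrer_size_nm(two_theta_deg, fwhm_deg, wavelength_nm=0.15406, k=0.9):
    """Crystallite size D = K * lambda / (beta * cos(theta)), in nm.

    two_theta_deg : peak position 2-theta in degrees
    fwhm_deg      : peak full width at half maximum in degrees (2-theta scale)
    wavelength_nm : X-ray wavelength; Cu K-alpha is assumed by default
    k             : dimensionless shape factor; 0.9 is a common assumption
    """
    theta = math.radians(two_theta_deg / 2.0)  # Bragg angle is half of 2-theta
    beta = math.radians(fwhm_deg)              # the width must be in radians
    return k * wavelength_nm / (beta * math.cos(theta))


if __name__ == "__main__":
    # Hypothetical peak, not taken from the paper.
    print(scherrer_size_nm(two_theta_deg=35.5, fwhm_deg=0.5))
```

    Because D is inversely proportional to β, broader peaks correspond to smaller crystallites. Instrumental broadening is ignored in this sketch and would normally be subtracted from the measured width first.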

    From approximative to descriptive fuzzy models

    CAESAR models for developmental toxicity

    Background: The new REACH legislation requires assessment of a large number of chemicals in the European market for several endpoints. Developmental toxicity is one of the most difficult endpoints to assess, on account of the complexity, length and costs of experiments. Following the encouragement of QSAR (in silico) methods provided in the REACH itself, the CAESAR project has developed several models. Results: Two QSAR models for developmental toxicity have been developed, using different statistical/mathematical methods. Both models performed well. The first makes a classification based on a random forest algorithm, while the second is based on an adaptive fuzzy partition algorithm. The first model has been implemented and inserted into the CAESAR on-line application, which is Java-based software that allows everyone to freely use the models. Conclusions: The CAESAR QSAR models have been developed with the aim to minimize false negatives in order to make them more usable for REACH. The CAESAR on-line application ensures that both industry and regulators can easily access and use the developmental toxicity model (as well as the models for the other four endpoints).

    Advances in Evolutionary Algorithms

    With the recent trends towards massive data sets and significant computational power, combined with evolutionary algorithmic advances, evolutionary computation is becoming much more relevant to practice. The aim of the book is to present recent improvements, innovative ideas and concepts in a part of the huge EA field.