11 research outputs found

    Empirical Relationship between Intra-Purine and Intra-Pyrimidine Differences in Conserved Gene Sequences

    Get PDF
    DNA sequences seen in the normal character-based representation appear to have a formidable mixing of the four nucleotides without any apparent order. Nucleotide frequencies and distributions in the sequences have been studied extensively, since the simple rule given by Chargaff almost a century ago that equates the total number of purines to the pyrimidines in a duplex DNA sequence. While it is difficult to trace any relationship between the bases from studies in the character representation of a DNA sequence, graphical representations may provide a clue. These novel representations of DNA sequences have been useful in providing an overview of base distribution and composition of the sequences and providing insights into many hidden structures. We report here our observation based on a graphical representation that the intra-purine and intra-pyrimidine differences in sequences of conserved genes generally follow a quadratic distribution relationship and show that this may have arisen from mutations in the sequences over evolutionary time scales. From this hitherto undescribed relationship for the gene sequences considered in this report we hypothesize that such relationships may be characteristic of these sequences and therefore could become a barrier to large scale sequence alterations that override such characteristics, perhaps through some monitoring process inbuilt in the DNA sequences. Such relationship also raises the possibility of intron sequences playing an important role in maintaining the characteristics and could be indicative of possible intron-late phenomena

    Modeling complex metabolic reactions, ecological systems, and financial and legal networks with MIANN models based on Markov-Wiener node descriptors

    Get PDF
    [Abstract] The use of numerical parameters in Complex Network analysis is expanding to new fields of application. At a molecular level, we can use them to describe the molecular structure of chemical entities, protein interactions, or metabolic networks. However, the applications are not restricted to the world of molecules and can be extended to the study of macroscopic nonliving systems, organisms, or even legal or social networks. On the other hand, the development of the field of Artificial Intelligence has led to the formulation of computational algorithms whose design is based on the structure and functioning of networks of biological neurons. These algorithms, called Artificial Neural Networks (ANNs), can be useful for the study of complex networks, since the numerical parameters that encode information of the network (for example centralities/node descriptors) can be used as inputs for the ANNs. The Wiener index (W) is a graph invariant widely used in chemoinformatics to quantify the molecular structure of drugs and to study complex networks. In this work, we explore for the first time the possibility of using Markov chains to calculate analogues of node distance numbers/W to describe complex networks from the point of view of their nodes. These parameters are called Markov-Wiener node descriptors of order kth (Wk). Please, note that these descriptors are not related to Markov-Wiener stochastic processes. Here, we calculated the Wk(i) values for a very high number of nodes (>100,000) in more than 100 different complex networks using the software MI-NODES. These networks were grouped according to the field of application. Molecular networks include the Metabolic Reaction Networks (MRNs) of 40 different organisms. In addition, we analyzed other biological and legal and social networks. These include the Interaction Web Database Biological Networks (IWDBNs), with 75 food webs or ecological systems and the Spanish Financial Law Network (SFLN). The calculated Wk(i) values were used as inputs for different ANNs in order to discriminate correct node connectivity patterns from incorrect random patterns. The MIANN models obtained present good values of Sensitivity/Specificity (%): MRNs (78/78), IWDBNs (90/88), and SFLN (86/84). These preliminary results are very promising from the point of view of a first exploratory study and suggest that the use of these models could be extended to the high-throughput re-evaluation of connectivity in known complex networks (collation)

    Graphical and numerical representations of DNA sequences: statistical aspects of similarity

    Full text link

    2D-RNA-coupling numbers: a new computational chemistry approach to link secondary structure topology with biological function

    No full text
    Methods for prediction of proteins, DNA, or RNA function and mapping it onto sequence often rely on bioinformatics alignment approach instead of chemical structure. Consequently, it is interesting to develop computational chemistry approaches based on molecular descriptors. In this sense, many researchers used sequence-coupling numbers and our group extended them to 2D proteins representations. However, no coupling numbers have been reported for 2D-RNA topology graphs, which are highly branched and contain useful information. Here, we use a computational chemistry scheme: (a) transforming sequences into RNA secondary structures, (b) defining and calculating new 2D-RNA-coupling numbers, (c) seek a structure-function model, and (d) map biological function onto the folded RNA. We studied as example 1-aminocyclopropane-1-carboxylic acid (ACC) oxidases known as ACO, which control fruit ripening having importance for biotechnology industry. First, we calculated k(2D-RNA) values to a set of 90-folded RNAs, including 28 transcripts of ACO and control sequences. Afterwards, we compared the classification performance of 10 different classifiers implemented in the software WEKA. In particular, the logistic equation ACO ¼ 23.8 1(2D-RNA) þ 41.4 predicts ACOs with 98.9%, 98.0%, and 97.8% of accuracy in training, leaveone- out and 10-fold cross-validation, respectively. Afterwards, with this equation we predict ACO function to a sequence isolated in this work from Coffea arabica (GenBank accession DQ218452). The 1(2D-RNA) also favorably compare with other descriptors. This equation allows us to map the codification of ACO activity on different mRNA topology features. The present computational-chemistry approach is general and could be extended to connect RNA secondary structure topology to other functions

    Modeling Complex Metabolic Reactions, Ecological Systems, and Financial and Legal Networks with MIANN Models Based on Markov-Wiener Node Descriptors

    No full text
    corecore