BOOL-AN: A method for comparative sequence analysis and phylogenetic reconstruction
A novel discrete mathematical approach is proposed as an additional tool for molecular systematics which does not require prior statistical assumptions concerning the evolutionary process. The method is based on algorithms generating mathematical representations directly from DNA/RNA or protein sequences, followed by the output of numerical (scalar or vector) and visual characteristics (graphs). The binary encoded sequence information is transformed into a compact analytical form, called the Iterative Canonical Form (or ICF) of Boolean functions, which can then be used as a generalized molecular descriptor. The method provides raw vector data for calculating different distance matrices, which in turn can be analyzed by neighbor-joining or UPGMA to derive a phylogenetic tree, or by principal coordinates analysis to get an ordination scattergram. The new method and the associated software for inferring phylogenetic trees are called Boolean analysis, or BOOL-AN.
Pattern vectors from algebraic graph theory
Graph structures have proven computationally cumbersome for pattern analysis. The reason for this is that, before graphs can be converted to pattern vectors, correspondences must be established between the nodes of structures which are potentially of different size. To overcome this problem, in this paper, we turn to the spectral decomposition of the Laplacian matrix. We show how the elements of the spectral matrix for the Laplacian can be used to construct symmetric polynomials that are permutation invariants. The coefficients of these polynomials can be used as graph features which can be encoded in a vectorial manner. We extend this representation to graphs in which there are unary attributes on the nodes and binary attributes on the edges by using the spectral decomposition of a Hermitian property matrix that can be viewed as a complex analogue of the Laplacian. To embed the graphs in a pattern space, we explore whether the vectors of invariants can be embedded in a low-dimensional space using a number of alternative strategies, including principal components analysis (PCA), multidimensional scaling (MDS), and locality preserving projection (LPP). Experimentally, we demonstrate that the embeddings result in well-defined graph clusters. Our experiments with the spectral representation involve both synthetic and real-world data. The experiments with synthetic data demonstrate that the distances between spectral feature vectors can be used to discriminate between graphs on the basis of their structure. The real-world experiments show that the method can be used to locate clusters of graphs.
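The idea of permutation-invariant spectral features can be illustrated with a much simpler relative of the construction described in this abstract. The sketch below is not the paper's method (which builds polynomials from the elements of the full spectral matrix); it only computes the elementary symmetric polynomials of the Laplacian eigenvalues, which equal the characteristic-polynomial coefficients up to sign. The function names and the use of the Faddeev-LeVerrier recursion are assumptions made for illustration.

```python
from fractions import Fraction


def laplacian(n, edges):
    """Combinatorial Laplacian L = D - A of an undirected simple graph."""
    L = [[0] * n for _ in range(n)]
    for u, v in edges:
        L[u][u] += 1
        L[v][v] += 1
        L[u][v] -= 1
        L[v][u] -= 1
    return L


def matmul(A, B):
    """Plain O(n^3) matrix product for small square matrices."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def spectral_invariants(n, edges):
    """Elementary symmetric polynomials e_1..e_n of the Laplacian eigenvalues.

    Uses the Faddeev-LeVerrier recursion, so no eigen-decomposition is
    needed and the arithmetic stays exact. Relabelling the nodes leaves
    the result unchanged (hypothetical helper, not the paper's features).
    """
    A = [[Fraction(x) for x in row] for row in laplacian(n, edges)]
    M = [[Fraction(0)] * n for _ in range(n)]
    c = Fraction(1)  # leading coefficient of det(xI - A)
    features = []
    for k in range(1, n + 1):
        M = matmul(A, M)
        for i in range(n):
            M[i][i] += c
        AM = matmul(A, M)
        c = -sum(AM[i][i] for i in range(n)) / k
        features.append(int((-1) ** k * c))  # e_k = (-1)^k * c_{n-k}
    return features


path = spectral_invariants(3, [(0, 1), (1, 2)])
relabelled_path = spectral_invariants(3, [(0, 2), (2, 1)])
triangle = spectral_invariants(3, [(0, 1), (1, 2), (0, 2)])
```

Here the two labellings of the three-node path yield the same feature vector, while the triangle yields a different one, so no node correspondence has to be established before comparing graphs. Eigenvalue-only invariants of this kind cannot separate cospectral graphs, which is one reason a richer construction may be preferred.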
Exploring the randomness of Directed Acyclic Networks
The feed-forward relationship naturally observed in time-dependent processes
and in a diverse number of real systems, such as some food webs and electronic
and neural wiring, can be described in terms of so-called directed acyclic
graphs (DAGs). An important ingredient of the analysis of such networks is a
proper comparison of their observed architecture against an ensemble of
randomized graphs, thereby quantifying the randomness of the real systems
with respect to suitable null models. This approximation is particularly
relevant when the finite size and/or large connectivity of real systems make
a comparison with the predictions obtained from the so-called configuration
model inadequate. In this paper we analyze four methods of DAG
randomization, defined by the combination of topological invariants
(directed and undirected degree sequence and component distributions) that
each is meant to preserve. A highly ordered DAG, called the snake graph, and an
Erdős-Rényi DAG were used to validate the performance of the algorithms.
Finally, three real case studies, namely the C. elegans cell lineage
network, a PhD student-advisor network and Milgram's citation network, were
analyzed using each randomization method. Results show how the interpretation
of degree-degree relations in DAGs with respect to their randomized ensembles
depends on the topological invariants imposed. In general, real DAGs provide
disordered values, lower than those expected by chance, when the directedness
of the links is not preserved in the randomization process. Conversely, if the
direction of the links is conserved throughout the randomization process,
disorder indicators are close to those obtained from the null-model ensemble,
although some deviations are observed.
Comment: 13 pages, 5 figures and 5 tables
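A direction-preserving randomization of the kind mentioned in this abstract can be sketched as a degree-preserving edge swap. This is a minimal illustration under stated assumptions, not one of the four methods analyzed in the paper: the function names are hypothetical, and acyclicity is kept by the simple (sufficient but not necessary) rule that every new edge must respect one fixed topological ordering of the original graph.

```python
import random


def topological_rank(nodes, edges):
    """Rank each node with Kahn's algorithm; raise if the graph has a cycle."""
    indeg = {v: 0 for v in nodes}
    succ = {v: [] for v in nodes}
    for u, v in edges:
        succ[u].append(v)
        indeg[v] += 1
    ready = [v for v in nodes if indeg[v] == 0]
    rank = {}
    while ready:
        u = ready.pop()
        rank[u] = len(rank)
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    if len(rank) != len(nodes):
        raise ValueError("input graph contains a cycle")
    return rank


def randomize_dag(nodes, edges, n_swaps, seed=0):
    """Swap edge targets while keeping in/out degrees and acyclicity.

    A swap turns (a->b, c->d) into (a->d, c->b). It is accepted only if
    both new edges point forward in the fixed topological ordering and
    neither of them already exists (assumed rule, chosen for simplicity).
    """
    rng = random.Random(seed)
    rank = topological_rank(nodes, edges)
    edge_list = list(edges)
    edge_set = set(edge_list)
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if rank[a] >= rank[d] or rank[c] >= rank[b]:
            continue  # a new edge would point backwards in the ordering
        if (a, d) in edge_set or (c, b) in edge_set:
            continue  # would create a duplicate edge
        edge_set -= {(a, b), (c, d)}
        edge_set |= {(a, d), (c, b)}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list


nodes = list(range(6))
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4), (2, 5), (1, 4)]
shuffled = randomize_dag(nodes, edges, n_swaps=200, seed=1)
```

Because every accepted swap only exchanges the targets of two edges, each node keeps its in-degree and out-degree. Restricting swaps to a single ordering samples a narrower ensemble than checking acyclicity directly after each swap, so a production null model would likely need a more careful acceptance rule.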