16 research outputs found

    Embeddability and rate identifiability of Kimura 2-parameter matrices

    Deciding whether a Markov matrix is embeddable (i.e. can be written as the exponential of a rate matrix) is an open problem even for 4×4 matrices. We study the embedding problem and rate identifiability for the K80 model of nucleotide substitution. For these 4×4 matrices, we fully characterize the set of embeddable K80 Markov matrices and the set of embeddable matrices for which rates are identifiable. In particular, we describe an open subset of embeddable matrices with non-identifiable rates. This set contains matrices with positive eigenvalues and also diagonal-largest-in-column matrices, which might lead to consequences in parameter estimation in phylogenetics. Finally, we compute the relative volumes of embeddable K80 matrices and of embeddable matrices with identifiable rates. This study concludes the embedding problem for the more general model K81 and its submodels, which had been initiated by the last two authors in a separate work. Comment: 20 pages; 10 figures

    Commission des Communautes Europeennes: Groupe du Porte-Parole. Reunion de la Commission du 18 et 19 juillet 1978 = Commission of European Communities: Spokesman Group. Meeting of the Committee of 18 and 19 July 1978. Press Spokesman Service Note to National Offices Bio No. (78) 272, 20 July 1978

    In this note, we characterize the embeddability of generic Kimura 3ST Markov matrices in terms of their eigenvalues. As a consequence, we are able to compute the volume of such matrices relative to the volume of all Markov matrices within the model. We also provide examples showing that, in general, mutation rates are not identifiable from substitution probabilities. These examples also illustrate that symmetries between mutation probabilities do not necessarily arise from symmetries between the corresponding mutation rates. Peer Reviewed. Preprint

    Embeddability of centrosymmetric matrices capturing the double-helix structure in natural and synthetic DNA

    In this paper, we discuss the embedding problem for centrosymmetric matrices, which are higher order generalizations of the matrices occurring in Strand Symmetric Models. These models capture the substitution symmetries arising from the double helix structure of the DNA. Deciding whether a transition matrix is embeddable or not enables us to know if the observed substitution probabilities are consistent with a homogeneous continuous time substitution model, such as the Kimura models, the Jukes-Cantor model or the general time-reversible model. On the other hand, the generalization to higher order matrices is motivated by the setting of synthetic biology, which works with different sizes of genetic alphabets. Comment: 34 pages, 9 tables

    An open set of 4×4 embeddable matrices whose principal logarithm is not a Markov generator

    A Markov matrix is embeddable if it can represent a homogeneous continuous-time Markov process. It is well known that if a Markov matrix has real and pairwise-different eigenvalues, then the embeddability can be determined by checking whether its principal logarithm is a rate matrix or not. The same holds for Markov matrices that are close enough to the identity matrix. In this paper we exhibit open sets of Markov matrices that are embeddable and whose principal logarithm is not a rate matrix, thus proving that the principal logarithm test above does not suffice generically. All authors are partially funded by AGAUR Project 2017 SGR-932 and MINECO/FEDER Projects MTM2015-69135 and MDM-2014-0445. J. Roca-Lacostena has also received funding from Secretaria d'Universitats i Recerca de la Generalitat de Catalunya (AGAUR 2018FI_B_00947) and European Social Funds. Postprint (author's final draft)
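The principal logarithm test mentioned in this abstract can be illustrated in closed form for a restricted family. The sketch below is not taken from the paper: it assumes a Kimura 3-parameter matrix with first row (a, b, c, d) and rows [a b c d], [b a d c], [c d a b], [d c b a], for which the eigenvalues and the logarithm are explicit. The function name `k3st_principal_log_test` and its tolerance are hypothetical. As the abstract stresses, a failed test does not by itself show that a matrix is non-embeddable.

```python
import math


def k3st_principal_log_test(a, b, c, d, tol=1e-12):
    """Principal-logarithm test for an (assumed) Kimura 3-parameter Markov matrix.

    Returns the off-diagonal rates (beta, gamma, delta) of the principal
    logarithm when it is a rate matrix, and None otherwise.
    """
    if min(a, b, c, d) < 0 or abs(a + b + c + d - 1.0) > 1e-9:
        raise ValueError("(a, b, c, d) is not a probability row")

    # Eigenvalues other than 1, read off from the matrix structure.
    x = a + b - c - d
    y = a - b + c - d
    z = a - b - c + d
    if min(x, y, z) <= 0:
        return None  # no real principal logarithm

    lx, ly, lz = math.log(x), math.log(y), math.log(z)
    # The logarithm has the same structure; these are its off-diagonal entries.
    beta = (lx - ly - lz) / 4
    gamma = (-lx + ly - lz) / 4
    delta = (-lx - ly + lz) / 4

    # A rate matrix needs non-negative off-diagonal entries.
    if min(beta, gamma, delta) < -tol:
        return None
    return beta, gamma, delta
```

In this family the test passes exactly when x ≥ yz, y ≥ xz and z ≥ xy. For example, the row (0.645, 0.005, 0.175, 0.175) has x = 0.3 and y = z = 0.64, so the test fails for it.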

    The embedding problem for Markov matrices

    Characterizing whether a Markov process of discrete random variables has a homogeneous continuous-time realization is a hard problem. In practice, this problem reduces to deciding when a given Markov matrix can be written as the exponential of some rate matrix (a Markov generator). This is an old question known in the literature as the embedding problem [11], which has been solved only for matrices of size 2 × 2 or 3 × 3. In this paper, we address this problem and related questions and obtain results along two different lines. First, for matrices of any size, we give a bound on the number of Markov generators in terms of the spectrum of the Markov matrix. Based on this, we establish a criterion for deciding whether a generic (distinct eigenvalues) Markov matrix is embeddable and propose an algorithm that lists all its Markov generators. Then, motivated and inspired by recent results on substitution models of DNA, we focus on the 4 × 4 case and completely solve the embedding problem for any Markov matrix. The solution in this case is more concise as the embeddability is given in terms of a single condition.
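The easy direction of the embedding problem is that the exponential of any rate matrix is a Markov matrix. The pure-Python sketch below, which is not part of the paper, illustrates this. The helper names, the truncated Taylor series and the example K80-style rates (0.3 for transitions, 0.1 for transversions) are all assumptions chosen for illustration.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(Q, terms=60):
    """Matrix exponential by a truncated Taylor series (fine for small norms)."""
    n = len(Q)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term becomes Q^k / k!
        term = [[v / k for v in row] for row in mat_mul(term, Q)]
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


def is_rate_matrix(Q, tol=1e-9):
    """Non-negative off-diagonal entries and rows summing to zero."""
    n = len(Q)
    off_diag_ok = all(Q[i][j] >= -tol
                      for i in range(n) for j in range(n) if i != j)
    return off_diag_ok and all(abs(sum(row)) <= tol for row in Q)


def is_markov_matrix(M, tol=1e-9):
    """Non-negative entries and rows summing to one."""
    return (all(v >= -tol for row in M for v in row)
            and all(abs(sum(row) - 1.0) <= tol for row in M))


# Hypothetical K80-style generator: transition rate 0.3, transversion rate 0.1.
Q = [[-0.5, 0.3, 0.1, 0.1],
     [0.3, -0.5, 0.1, 0.1],
     [0.1, 0.1, -0.5, 0.3],
     [0.1, 0.1, 0.3, -0.5]]
M = mat_exp(Q)
```

Here `is_rate_matrix(Q)` and `is_markov_matrix(M)` both hold. The hard direction, deciding whether a given Markov matrix arises this way, is what the abstract addresses.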

    Looking for transcription factor binding sites

    No full text
    IIIA CSI
