Embeddability and rate identifiability of Kimura 2-parameter matrices
Deciding whether a Markov matrix is embeddable (i.e. can be written as the
exponential of a rate matrix) is an open problem even for 4 × 4 matrices.
We study the embedding problem and rate identifiability for the K80 model of
nucleotide substitution. For these matrices, we fully characterize
the set of embeddable K80 Markov matrices and the set of embeddable matrices
for which rates are identifiable. In particular, we describe an open subset of
embeddable matrices with non-identifiable rates. This set contains matrices
with positive eigenvalues and also diagonal-largest-in-column matrices, which
might lead to consequences in parameter estimation in phylogenetics. Finally,
we compute the relative volumes of embeddable K80 matrices and of embeddable
matrices with identifiable rates. This study concludes the embedding problem
for the more general model K81 and its submodels, which had been initiated by
the last two authors in a separate work. Comment: 20 pages; 10 figures
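The link between K80 rates and substitution probabilities can be sketched as follows. This is a minimal illustration, assuming the usual K80 parametrization with one transition rate and one transversion rate. The names `alpha`, `beta`, `k80_matrix`, `k80_principal_rates` and the state order A, G, C, T are assumptions made for this example and are not taken from the paper. The second function only recovers the rates of the principal logarithm, so it does not address the non-identifiable cases described in the abstract.

```python
import math

def k80_matrix(alpha, beta, t=1.0):
    """Transition matrix exp(tQ) for a K80 rate matrix Q.

    alpha: transition rate; beta: rate of each of the two transversions
    (hypothetical parameter names). Assumed state order: A, G, C, T.
    """
    x = math.exp(-4.0 * beta * t)             # eigenvalue of multiplicity 1
    y = math.exp(-2.0 * (alpha + beta) * t)   # eigenvalue of multiplicity 2
    a = (1.0 + x + 2.0 * y) / 4.0   # probability of no change
    b = (1.0 + x - 2.0 * y) / 4.0   # transition probability
    c = (1.0 - x) / 4.0             # probability of each transversion
    return [[a, b, c, c],
            [b, a, c, c],
            [c, c, a, b],
            [c, c, b, a]]

def k80_principal_rates(M):
    """Rates (alpha, beta) of the principal logarithm of a K80 matrix M.

    Only defined here when both eigenvalues x and y are positive.
    """
    a, b, c = M[0][0], M[0][1], M[0][2]
    x = a + b - 2.0 * c
    y = a - b
    if x <= 0 or y <= 0:
        raise ValueError("principal logarithm is not real")
    alpha = (math.log(x) - 2.0 * math.log(y)) / 4.0
    beta = -math.log(x) / 4.0
    return alpha, beta

# Round trip: rates -> probabilities -> rates of the principal logarithm.
M = k80_matrix(0.3, 0.1)
alpha, beta = k80_principal_rates(M)
```

If `alpha` comes out negative, the principal logarithm is not a rate matrix; by itself that does not settle whether the matrix is embeddable.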
Commission des Communautes Europeennes: Groupe du Porte-Parole. Reunion de la Commission du 18 et 19 juillet 1978 = Commission of European Communities: Spokesman Group. Meeting of the Committee of 18 and 19 July 1978. Press Spokesman Service Note to National Offices Bio No. (78) 272, 20 July 1978
In this note, we characterize the embeddability of generic Kimura 3ST Markov matrices in terms of their eigenvalues. As a consequence, we are able to compute the volume of such matrices relative to the volume of all Markov matrices within the model. We also provide examples showing that, in general, mutation rates are not identifiable from substitution probabilities. These examples also illustrate that symmetries between mutation probabilities do not necessarily arise from symmetries between the corresponding mutation rates. Peer Reviewed. Preprint
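An eigenvalue-based check of this kind can be sketched for the principal logarithm. The example assumes the standard Kimura 3ST form, in which each row is a permutation of four probabilities (a, b, c, d) with a on the diagonal, and the matrix is diagonalized by a Hadamard matrix. The function names are hypothetical, and the inequalities below are derived from that diagonalization rather than quoted from the note. They test only whether the principal logarithm is a rate matrix.

```python
def k81_eigenvalues(a, b, c, d):
    """Non-trivial eigenvalues of a Kimura 3ST matrix with entries a, b, c, d."""
    x = a + b - c - d
    y = a - b + c - d
    z = a - b - c + d
    return x, y, z

def k81_principal_log_is_rate_matrix(a, b, c, d):
    """True if the principal logarithm is a real rate matrix.

    The off-diagonal entries of the principal logarithm are
    (log x - log y - log z) / 4 and its two analogues, so they are
    non-negative exactly when x >= y*z, y >= x*z and z >= x*y.
    """
    x, y, z = k81_eigenvalues(a, b, c, d)
    if min(x, y, z) <= 0:
        return False  # no real principal logarithm in this sketch
    return x >= y * z and y >= x * z and z >= x * y

# Jukes-Cantor-like matrix: all three eigenvalues equal 0.6.
jc_ok = k81_principal_log_is_rate_matrix(0.7, 0.1, 0.1, 0.1)

# Here x = 0.2 < y*z = 0.25, so one rate would be negative.
fails = k81_principal_log_is_rate_matrix(0.55, 0.05, 0.2, 0.2)
```

A `False` result only rules out the principal logarithm; the sketch does not examine other real logarithms.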
Embeddability of centrosymmetric matrices capturing the double-helix structure in natural and synthetic DNA
In this paper, we discuss the embedding problem for centrosymmetric matrices,
which are higher order generalizations of the matrices occurring in Strand
Symmetric Models. These models capture the substitution symmetries arising from
the double helix structure of the DNA. Deciding whether a transition matrix is
embeddable or not enables us to know if the observed substitution probabilities
are consistent with a homogeneous continuous time substitution model, such as
the Kimura models, the Jukes-Cantor model or the general time-reversible model.
On the other hand, the generalization to higher order matrices is motivated by
the setting of synthetic biology, which works with different sizes of genetic
alphabets. Comment: 34 pages, 9 tables
An open set of 4×4 embeddable matrices whose principal logarithm is not a Markov generator
A Markov matrix is embeddable if it can represent a homogeneous continuous-time Markov process. It is well known that if a Markov matrix has real and pairwise-different eigenvalues, then the embeddability can be determined by checking whether its principal logarithm is a rate matrix or not. The same holds for Markov matrices that are close enough to the identity matrix. In this paper we exhibit open sets of Markov matrices that are embeddable and whose principal logarithm is not a rate matrix, thus proving that the principal logarithm test above does not suffice generically. All authors are partially funded by AGAUR Project 2017 SGR-932 and MINECO/FEDER Projects MTM2015-69135 and MDM-2014-0445. J. Roca-Lacostena has also received funding from Secretaria d'Universitats i Recerca de la Generalitat de Catalunya (AGAUR 2018FI_B_00947) and European Social Funds. Postprint (author's final draft)
The embedding problem for Markov matrices
Characterizing whether a Markov process of discrete random variables has a homogeneous continuous-time realization is a hard problem. In practice, this problem reduces to deciding when a given Markov matrix can be written as the exponential of some rate matrix (a Markov generator). This is an old question known in the literature as the embedding problem [11], which has been solved only for matrices of size 2 × 2 or 3 × 3. In this paper, we address this problem and related questions and obtain results along two different lines. First, for matrices of any size, we give a bound on the number of Markov generators in terms of the spectrum of the Markov matrix. Based on this, we establish a criterion for deciding whether a generic (distinct eigenvalues) Markov matrix is embeddable and propose an algorithm that lists all its Markov generators. Then, motivated and inspired by recent results on substitution models of DNA, we focus on the 4 × 4 case and completely solve the embedding problem for any Markov matrix. The solution in this case is more concise as the embeddability is given in terms of a single condition.
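For the 2 × 2 case mentioned in this abstract, the classical criterion is simple to state: a Markov matrix [[1-p, p], [q, 1-q]] is embeddable exactly when its determinant 1 - p - q is positive. A minimal sketch of that criterion follows; the function name `embed_2x2` is hypothetical, and the code illustrates only this small case, not the algorithm proposed in the paper.

```python
import math

def embed_2x2(p, q):
    """Rate matrix Q with exp(Q) = [[1-p, p], [q, 1-q]], or None.

    Writing M = I + N with N = [[-p, p], [q, -q]] gives N*N = -(p+q)*N,
    so exp(k*N) = I + N * (1 - exp(-k*(p+q))) / (p+q). Setting this equal
    to M yields k = -log(1 - p - q) / (p + q), which needs p + q < 1.
    """
    if not (0.0 <= p <= 1.0 and 0.0 <= q <= 1.0):
        raise ValueError("p and q must be probabilities")
    s = p + q
    if s >= 1.0:
        return None          # determinant 1 - s is not positive
    if s == 0.0:
        return [[0.0, 0.0], [0.0, 0.0]]   # M is the identity
    k = -math.log(1.0 - s) / s
    return [[-k * p, k * p],
            [k * q, -k * q]]

Q = embed_2x2(0.2, 0.3)          # determinant 0.5 > 0: embeddable
not_embeddable = embed_2x2(0.6, 0.5)   # determinant -0.1: returns None
```

The rows of the returned `Q` sum to zero and its off-diagonal entries are non-negative, as required of a rate matrix.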