Managing and analyzing phylogenetic databases

Author: Deepak Akshay
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2013
The ever-growing availability of phylogenomic data makes it increasingly possible to study and analyze phylogenetic relationships across a wide range of species. Indeed, current phylogenetic analyses are now producing enormous collections of trees that vary greatly in size. Our proposed research addresses the challenges posed by storing, querying, and analyzing such phylogenetic databases. Our first contribution is the further development of STBase, a phylogenetic tree database consisting of a billion trees whose leaf sets range in size from four to 20,000. STBase applies techniques from different areas of computer science for efficient tree storage and retrieval. It also introduces new ideas that are specific to tree databases. STBase provides a unique opportunity to explore innovative ways to analyze the results of queries on large sets of phylogenetic trees. We propose new ways of extracting consensus information from a collection of phylogenetic trees. Specifically, this involves extending the maximum agreement subtree problem. We greatly improve upon an existing approach based on frequent subtrees and propose two new approaches, based on agreement subtrees and frequent subtrees respectively. The final part of our proposed work deals with the problem of simplifying multi-labeled trees and handling rogue taxa. We propose a novel technique to extract conflict-free information from multi-labeled trees as a much smaller single-labeled tree. We show that the inherent problem in identifying rogue taxa is NP-hard and give fixed-parameter tractable and integer linear programming solutions.

Digital Repository @ Iowa State University (ISU)

Bounds on graviton mass using weak lensing and SZ effect in galaxy clusters

Author: Jain Deepak
Mahajan Shobhit
Mukherjee Amitabha
Rana Akshay
Publication venue: Elsevier BV
Publication date: 07/04/2018
In General Relativity (GR), the graviton is massless. However, a common feature in several theoretical alternatives of GR is a non-zero mass for the graviton. These theories can be described as massive gravity theories. Despite many theoretical complexities in these theories, on phenomenological grounds, the implications of massive gravity have been widely used to put bounds on graviton mass. One of the generic implications of giving a mass to the graviton is that the gravitational potential will follow a Yukawa-like fall off. We use this feature of massive gravity theories to probe the mass of graviton by using the largest gravitationally bound objects, namely galaxy clusters. In this work, we use the mass estimates of galaxy clusters measured at various cosmologically defined radial distances measured via weak lensing (WL) and Sunyaev-Zel'dovich (SZ) effect. We also use the model independent values of Hubble parameter $H(z)$ smoothed by a non-parametric method, Gaussian process. Within $1\sigma$ confidence region, we obtain the mass of graviton $m_g < 5.9 \times 10^{-30}$ eV with the corresponding Compton length scale $\lambda_g > 6.82$ Mpc from weak lensing and $m_g < 8.31 \times 10^{-30}$ eV with $\lambda_g > 5.012$ Mpc from SZ effect. This analysis improves the upper bound on graviton mass obtained earlier from galaxy clusters. Comment: 9 Pages, 3 Figures, 2 Tables, Accepted for publication in Physics Letters
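
The link between a graviton mass bound and a Compton length scale can be checked with a few lines of arithmetic. The sketch below assumes the convention $\lambda_g = h c / (m_g c^2)$, which the abstract does not state; the function name `compton_length_mpc` and the constant names are illustrative only. Under that assumption the weak-lensing mass bound maps to roughly 6.8 Mpc, close to the quoted value. The SZ mass bound comes out somewhat below the quoted 5.012 Mpc, so the quoted figures may rest on a different rounding or convention.

```python
# Compton length scale for a given graviton mass (sketch).
# ASSUMPTION: lambda_g = h*c / (m_g c^2); the abstract does not say
# whether h or h-bar is used.

H_C_EV_M = 1.239841984e-6          # h*c in eV*m
M_PER_MPC = 3.0856775814913673e22  # metres per megaparsec


def compton_length_mpc(m_g_ev: float) -> float:
    """Return the Compton length in Mpc for a mass given in eV/c^2."""
    if m_g_ev <= 0:
        raise ValueError("mass must be positive")
    return H_C_EV_M / m_g_ev / M_PER_MPC


if __name__ == "__main__":
    for label, mass in [("weak lensing", 5.9e-30), ("SZ effect", 8.31e-30)]:
        print(f"{label}: m_g < {mass:.3g} eV  ->  lambda_g > "
              f"{compton_length_mpc(mass):.2f} Mpc")
```

Because the length scales as $1/m_g$, an upper bound on the mass translates directly into a lower bound on the length.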

arXiv.org e-Print Archive

Directory of Open Access Journals

Extracting Conflict-free Information from Multi-labeled Trees

Author: Deepak Akshay
Fernández-Baca David
McMahon Michelle M.
Publication date: 28/06/2012
A multi-labeled tree, or MUL-tree, is a phylogenetic tree where two or more leaves share a label, e.g., a species name. A MUL-tree can imply multiple conflicting phylogenetic relationships for the same set of taxa, but can also contain conflict-free information that is of interest and yet is not obvious. We define the information content of a MUL-tree T as the set of all conflict-free quartet topologies implied by T, and define the maximally reduced form of T as the smallest tree that can be obtained from T by pruning leaves and contracting edges while retaining the same information content. We show that any two MUL-trees with the same information content exhibit the same reduced form. This introduces an equivalence relation on MUL-trees with potential applications to comparing MUL-trees. We present an efficient algorithm to reduce a MUL-tree to its maximally reduced form and evaluate its performance on empirical datasets in terms of both the quality of the reduced tree and the degree of data reduction achieved. Comment: Submitted in Workshop on Algorithms in Bioinformatics 2012 (http://algo12.fri.uni-lj.si/?file=wabi
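
The defining property of a MUL-tree, two or more leaves sharing a label, is easy to test mechanically. The sketch below is only an illustration of that definition and not the reduction algorithm from the paper. It assumes a hypothetical representation in which a tree is a nested tuple and each leaf is a string label; the function names are made up for this example.

```python
# Detect whether a tree is multi-labeled (sketch).
# ASSUMPTION: a tree is a nested tuple, a leaf is a string label.
from collections import Counter


def leaf_labels(tree):
    """Return the list of leaf labels, with repeats, in left-to-right order."""
    if isinstance(tree, str):
        return [tree]
    labels = []
    for child in tree:
        labels.extend(leaf_labels(child))
    return labels


def repeated_labels(tree):
    """Return the set of labels carried by two or more leaves."""
    counts = Counter(leaf_labels(tree))
    return {label for label, n in counts.items() if n >= 2}


def is_mul_tree(tree):
    """A tree is a MUL-tree if at least one label is repeated."""
    return bool(repeated_labels(tree))


if __name__ == "__main__":
    mul = (("a", "b"), ("a", "c"))      # label "a" appears on two leaves
    single = (("a", "b"), ("c", "d"))   # every label appears once
    print(is_mul_tree(mul), repeated_labels(mul))
    print(is_mul_tree(single))
```

The repeated labels are the ones that can give rise to conflicting relationships, so listing them is a natural first step before any pruning.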

arXiv.org e-Print Archive

Crossref

PubMed Central

The University of Arizona

Enumerating All Maximal Frequent Subtrees

Author: Deepak Akshay
Fernández-Baca David
Publication venue: Iowa State University Digital Repository
Publication date: 01/01/2012
Given a collection of leaf-labeled trees on a common leaf set and a fraction f in (1/2, 1], a frequent subtree (FST) is a subtree isomorphically included in at least a fraction f of the input trees. The well-known maximum agreement subtree (MAST) problem identifies an FST with f = 1 that has the largest number of leaves. Apart from its intrinsic algorithmic interest, MAST has practical applications as a metric for tree similarity, for computing tree congruence, in detecting horizontal gene transfer events, and as a consensus approach. Enumerating FSTs extends the MAST problem by definition and reveals additional subtrees not displayed by MAST. This can happen in two ways: such a subtree is included in a majority but not all of the input trees, or such a subtree, though included in all the input trees, does not have the maximum number of leaves. Further, FSTs can be enumerated on collections of trees having partially overlapping leaf sets. MAST may not be useful here, especially if the common overlap among leaf sets is very low. Though very useful, FSTs suffer from combinatorial explosion in number, which motivates enumerating only the maximal frequent subtrees (MFSTs). An MFST is an FST that is not a subtree of any other FST. The set of MFSTs is a compact, non-redundant summary of all FSTs and is much smaller in size. Here we tackle the novel problem of enumerating all MFSTs in collections of phylogenetic trees. We demonstrate its utility in returning larger consensus trees in comparison to MAST. The current implementation is available on the web
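
The frequency condition in the FST definition can be illustrated with a small check. The sketch below does not enumerate MFSTs and is not the authors' implementation. It assumes rooted trees written as nested tuples with uniquely labeled string leaves, and it reads "included" as: restricting an input tree to the candidate's leaf set (pruning the other leaves and suppressing nodes left with a single child) yields the candidate's topology. All function names are hypothetical.

```python
# Test whether a candidate subtree is frequent in a collection (sketch).
# ASSUMPTIONS: rooted trees as nested tuples, unique string leaf labels,
# inclusion means "restriction to the candidate's leaves matches it".
from fractions import Fraction


def leaves(tree):
    """Return the set of leaf labels of a tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def restrict(tree, keep):
    """Prune leaves outside `keep` and suppress single-child nodes."""
    if isinstance(tree, str):
        return tree if tree in keep else None
    kids = [r for r in (restrict(c, keep) for c in tree) if r is not None]
    if not kids:
        return None
    return kids[0] if len(kids) == 1 else tuple(kids)


def canon(tree):
    """Order-independent form, so the order of children does not matter."""
    if isinstance(tree, str):
        return tree
    return frozenset(canon(child) for child in tree)


def displays(tree, candidate):
    """True if `tree` restricted to the candidate's leaves equals it."""
    restricted = restrict(tree, leaves(candidate))
    return restricted is not None and canon(restricted) == canon(candidate)


def is_frequent(candidate, trees, f):
    """True if at least a fraction f of `trees` display `candidate`."""
    hits = sum(1 for t in trees if displays(t, candidate))
    return Fraction(hits, len(trees)) >= f


if __name__ == "__main__":
    trees = [
        ((("a", "b"), "c"), "d"),
        ((("a", "b"), "d"), "c"),
        ((("a", "c"), "b"), "d"),
    ]
    candidate = (("a", "b"), "c")
    print(is_frequent(candidate, trees, Fraction(2, 3)))  # in 2 of 3 trees
    print(is_frequent(candidate, trees, Fraction(1)))     # not in all trees
```

In this toy collection the candidate is frequent at f = 2/3 but is not an agreement subtree (f = 1), which is the first of the two ways in which FSTs reveal subtrees that MAST does not.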

Digital Repository @ Iowa State University (ISU)