
152 research outputs found

Quantum Analog of Shannon's Lower Bound Theorem

Author: Basu Saugata
Parida Laxmi
Publication date: 24/08/2023

Shannon proved that almost all Boolean functions require a circuit of size $\Theta(2^n/n)$. We prove a quantum analog of this classical result. Unlike in the classical case, the number of quantum circuits of any fixed size that we allow is uncountably infinite. Our main tool is a classical result in real algebraic geometry bounding the number of realizable sign conditions of any finite set of real polynomials in many variables.

arXiv.org e-Print Archive

Detection of subtle variations as consensus motifs

Author: Comin Matteo
Parida Laxmi
Publication venue: Elsevier Ltd.

Abstract: We address the problem of detecting consensus motifs that occur with subtle variations across multiple sequences. These are usually functional domains in DNA sequences, such as transcriptional binding factors or other regulatory sites. The problem in its generality has been considered difficult, and various benchmark data serve as the litmus test for different computational methods. We present a method centered around unsupervised combinatorial pattern discovery. The parameters are chosen using a careful statistical analysis of consensus motifs. This method works well on the benchmark data and is general enough to be extended to a scenario where the variation in the consensus motif includes indels (along with mutations). We also present some results on detection of transcription binding factors in human DNA sequences.

Elsevier - Publisher Connector

10231 Abstracts Collection -- Structure Discovery in Biology: Motifs, Networks & Phylogenies

Author: Apostolico Alberto
Dress Andreas
Parida Laxmi
Publication venue: Dagstuhl Seminar Proceedings. 10231 - Structure Discovery in Biology: Motifs, Networks & Phylogenies
Publication date: 01/01/2010

From 06.06. to 11.06.2010, the Dagstuhl Seminar 10231 "Structure Discovery in Biology: Motifs, Networks & Phylogenies" was held in Schloss Dagstuhl – Leibniz Center for Informatics. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

Dagstuhl Research Online Publication Server

Essential Simplices in Persistent Homology and Subtle Admixture Detection

Author: Basu Saugata
Parida Laxmi
Utro Filippo
Publication venue: LIPIcs - Leibniz International Proceedings in Informatics. 18th International Workshop on Algorithms in Bioinformatics (WABI 2018)
Publication date: 01/01/2018

We introduce a robust mathematical definition of the notion of essential elements in a basis of the homology space and prove that these elements are unique. Next we give a novel visualization of the essential elements of the basis of the homology space through a rainfall-like plot (RFL). This plot is data-centric, i.e., it is associated with the individual samples of the data, as opposed to the structure-centric barcodes of persistent homology. The proof of concept was tested on data generated by SimRA, which simulates different admixture scenarios. We show that the barcode analysis can be used not just to detect the presence of admixture but also to estimate the number of admixed populations. We also demonstrate that data-centric RFL plots have the potential to further disentangle the common history into admixture events and relative timing of the events, even in very complex scenarios.

Dagstuhl Research Online Publication Server

Sampling ARG of multiple populations under complex configurations of subdivision and admixture.

Author: Anna Paola Carrieri
Filippo Utro
Laxmi Parida
Publication date: 07/12/2015

Abstract. Motivation: Simulating complex evolution scenarios of multiple populations is an important task for answering many basic questions relating to population genomics. Apart from the population samples, the underlying Ancestral Recombinations Graph (ARG) is an additional important means in hypothesis checking and reconstruction studies. Furthermore, complex simulations require a plethora of interdependent parameters, making even the scenario specification highly non-trivial. Results: We present an algorithm, SimRA, that simulates a generic multiple-population evolution model with admixture. It is based on random graphs that improve dramatically on the time and space requirements of the classical algorithm for single populations. Using the underlying random graphs model, we also derive closed forms of expected values of the ARG characteristics, i.e., height of the graph, number of recombinations, number of mutations and population diversity, in terms of its defining parameters. This is crucial in aiding the user to specify meaningful parameters for the complex scenario simulations, not through trial and error based on raw compute power but through intelligent parameter estimation. To the best of our knowledge, this is the first time closed-form expressions have been computed for the ARG properties. We show that the expected values closely match the empirical values through simulations. Finally, we demonstrate that SimRA produces the ARG in compact forms without compromising any accuracy. We demonstrate the compactness and accuracy through extensive experiments. Availability and implementation: SimRA (Simulation based on Random graph Algorithms) source, executable, user manual and sample input-output sets are available for downloading at: https://github.com/ComputationalGenomics/SimRA Contact: [email protected] Supplementary information: Supplementary data are available at Bioinformatics online.

Crossref

Open Access Repository


Characterizing redescriptions using persistent homology to isolate genetic pathways contributing to pathogenesis

Author: Basu Saugata
Parida Laxmi
Platt Daniel E.
Zalloua Pierre A.
Publication venue: Springer Science and Business Media LLC
Publication date: 14/07/2016

Background: Complex diseases may have multiple pathways leading to disease. For example, coronary artery disease evolves from arterial damage to their epithelial layers, but has multiple causal pathways. More challenging, those pathways are highly correlated within metabolic syndrome. The challenge is to identify specific clusters of phenotype characteristics (composite phenotypes) that may reflect these different etiologies. Further, GWAS seeking to identify SNPs satisfying multiple composite phenotype descriptions allows for lower false positive rates at lower α thresholds, allowing for the possibility of reducing false negatives. This may provide a window into the missing heritability problem. Methods: We identify significant phenotype patterns, and identify fuzzy redescriptions among those patterns using Jaccard distances. Further, we construct Vietoris-Rips complexes from the Jaccard distances and compute the persistent homology associated with those. The patterns comprising these topological features are identified as composite phenotypes, whose genetic associations are explored with logistic regression applied to pathways and to GWAS. Results: We identified several phenotypes that tended to be dominated by metabolic syndrome descriptions, and which were distinct among the combinations of metabolic syndrome conditions. Among SNPs marking the RAAS complex, various SNPs associated specifically with different groups of composite phenotypes, as well as distinguishing between the composite phenotypes and simple phenotypes. Each of these showed different genetic associations, namely rs6693954, rs762551, rs1378942, and rs1133323. GWAS-identified SNPs that associated with composite phenotypes included rs12365545, rs6847235, and rs701319. Eighteen GWAS-identified SNPs appeared in combinations supported in composite combinations with greater power than for any individual phenotype.
Conclusions: We do find systematic associations among metabolic syndrome variates that show distinctive genetic association profiles. Further, the systematic characterization involves composite phenotype descriptions that allow for combined power of individual phenotype GWAS tests, yielding more significance for lower individual thresholds, permitting the exploration of SNPs that would otherwise show as false negatives.
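The Methods step above (Jaccard distances between patterns, then a Vietoris-Rips complex built from those distances) can be illustrated with a minimal sketch. It is not the authors' implementation: the pattern names, the sample IDs, the `radius` value and the helper `rips_simplices` are all hypothetical, and the persistent homology computation itself is not shown.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a ∩ b| / |a ∪ b|; two empty sets count as identical."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def rips_simplices(points, radius, max_dim=2):
    """List Vietoris-Rips simplices up to dimension max_dim.

    A subset of points forms a simplex when every pairwise
    Jaccard distance within it is at most `radius`.
    """
    names = sorted(points)
    simplices = []
    for size in range(1, max_dim + 2):  # size = dimension + 1
        for subset in combinations(names, size):
            if all(jaccard_distance(points[u], points[v]) <= radius
                   for u, v in combinations(subset, 2)):
                simplices.append(subset)
    return simplices


# Hypothetical data: each pattern maps to the set of samples it covers.
patterns = {
    "p1": {1, 2, 3, 4},
    "p2": {2, 3, 4, 5},
    "p3": {3, 4, 5, 6},
    "p4": {10, 11},
}

complex_at_half = rips_simplices(patterns, radius=0.5)
for simplex in complex_at_half:
    print(simplex)
```

Sweeping `radius` from 0 to 1 and recording when simplices appear gives the filtration that a persistent homology library would take as input.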

Harvard University - DASH

Springer - Publisher Connector

Minimizing recombinations in consensus networks for phylogeographic studies

Author: Asif Javed
BME Moret
C Semple
D Gusfield
DH Huson
EO Wilson
Francesc Calafell
J Hein
Jaume Bertranpetit
L Parida
Laxmi Parida
MA Jobling
Marta Melé
S Arora
TH Cormen
V Vazirani
Publication venue: BioMed Central
Publication date: 01/01/2009

Abstract. Background: We address the problem of studying recombinational variations in (human) populations. In this paper, our focus is on one computational aspect of the general task: given two networks G1 and G2, with both mutation and recombination events, defined on overlapping sets of extant units, the objective is to compute a consensus network G3 with a minimum number of additional recombinations. We describe a polynomial time algorithm with a guarantee that the number of computed new recombination events is within ϵ = sz(G1, G2) (function sz is a well-behaved function of the sizes and topologies of G1 and G2) of the optimal number of recombinations. To date, this is the best known result for a network consensus problem. Results: Although the network consensus problem can be applied to a variety of domains, here we focus on the structure of human populations. With our preliminary analysis on a segment of the human Chromosome X data we are able to infer ancient recombinations, population-specific recombinations and more, which also support the widely accepted 'Out of Africa' model. These results have been verified independently using traditional manual procedures. To the best of our knowledge, this is the first recombinations-based characterization of human populations. Conclusion: We show that our mathematical model identifies recombination spots in the individual haplotypes; the aggregate of these spots over a set of haplotypes defines a recombinational landscape that has enough signal to detect continental as well as population divide based on a short segment of Chromosome X. In particular, we are able to infer ancient recombinations, population-specific recombinations and more, which also support the widely accepted 'Out of Africa' model. The agreement with mutation-based analysis can be viewed as an indirect validation of our results and the model.
Since the model in principle gives us more information embedded in the networks, in our future work, we plan to investigate more non-traditional questions via these structures computed by our methodology.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

UPF Digital Repository

ScholarlyCommons@Penn

A minimal descriptor of an ancestral recombinations graph

Author: Asif Javed
B Padhukasahasram
C Wiuf
GAT McVean
GK Chen
J Hein
L L Liang
L Parida
Laxmi Parida
M Arenas
M Jobling
P Marjoram
Pier Francesco Palamara
R Bürger
RC Griffiths
RR Hudson
S Schaffner
Publication venue: BioMed Central
Publication date: 01/01/2011

Abstract. Background: An Ancestral Recombinations Graph (ARG) is a phylogenetic structure that encodes both duplication events, such as mutations, and genetic exchange events, such as recombinations: this captures the (genetic) dynamics of a population evolving over generations. Results: In this paper, we identify a structure-preserving and samples-preserving core of an ARG G and call it the minimal descriptor ARG of G. Its structure-preserving characteristic ensures that all the branch lengths of the marginal trees of the minimal descriptor ARG are identical to those of G, and the samples-preserving property asserts that the patterns of genetic variation in the samples of the minimal descriptor ARG are exactly the same as those of G. We also prove that even an unbounded G has a finite minimal descriptor that continues to preserve certain (graph-theoretic) properties of G, and for an appropriate class of ARGs, our estimate (Eqn 8) as well as empirical observation is that the expected reduction in the number of vertices is exponential. Conclusions: Based on the definition of this lossless and bounded structure, we derive local properties of the vertices of a minimal descriptor ARG, which lend themselves very naturally to the design of efficient sampling algorithms. We further show that a class of minimal descriptors, that of binary ARGs, models the standard coalescent exactly (Thm 6).

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central