    Mining Pure, Strict Epistatic Interactions from High-Dimensional Datasets: Ameliorating the Curse of Dimensionality

    Background: The interaction between loci to affect phenotype is called epistasis. It is strict epistasis if no proper subset of the interacting loci exhibits a marginal effect. For many diseases, it is likely that unknown epistatic interactions affect disease susceptibility. A difficulty when mining epistatic interactions from high-dimensional datasets concerns the curse of dimensionality. There are too many combinations of SNPs to perform an exhaustive search. A method that could locate strict epistasis without an exhaustive search can be considered the brass ring of methods for analyzing high-dimensional datasets. Methodology/Findings: A SNP pattern is a Bayesian network representing SNP-disease relationships. The Bayesian score for a SNP pattern is the probability of the data given the pattern, and has been used to learn SNP patterns. We identified a bound for the score of a SNP pattern. The bound provides an upper limit on the Bayesian score of any pattern that could be obtained by expanding a given pattern. We felt that the bound might enable the data to say something about the promise of expanding a 1-SNP pattern even when there are no marginal effects. We tested the bound using simulated datasets and semi-synthetic high-dimensional datasets obtained from GWAS datasets. We found that the bound was able to dramatically reduce the search time for strict epistasis. Using an Alzheimer's dataset, we showed that it is possible to discover an interaction involving the APOE gene based on its score because of its large marginal effect, but that the bound is most effective at discovering interactions without marginal effects. Conclusions/Significance: We conclude that the bound appears to ameliorate the curse of dimensionality in high-dimensional datasets. This is a very consequential result and could be pivotal in our efforts to reveal the dark matter of genetic disease risk from high-dimensional datasets. © 2012 Jiang, Neapolitan

    A Pattern Decomposition Algorithm for Data Mining of Frequent Patterns

    Radio jet emission from GeV-emitting narrow-line Seyfert 1 galaxies

    We studied the radio emission from four radio-loud and gamma-ray-loud narrow-line Seyfert 1 galaxies. The goal was to investigate whether a relativistic jet is operating at the source, and to quantify its characteristics. We relied on the most systematic monitoring of such systems in the cm and mm radio bands, which is conducted with the Effelsberg 100 m and IRAM 30 m telescopes and covers the longest time-baselines and the most radio frequencies to date. We extracted variability parameters and computed variability brightness temperatures and Doppler factors. The jet powers were computed from the light curves to estimate the energy output. The dynamics of radio spectral energy distributions were examined to understand the mechanism causing the variability. All the sources display intensive variability that occurs at a pace faster than what is commonly seen in blazars. The flaring events show intensive spectral evolution indicative of shock evolution. The brightness temperatures and Doppler factors are moderate, implying a mildly relativistic jet. The computed jet powers show very energetic flows. The radio polarisation in one case clearly implies a quiescent jet underlying the recursive flaring activity. Despite the generally lower flux densities, the sources appear to show all typical characteristics seen in blazars that are powered by relativistic jets. Comment: Accepted for publication in section 4 (Extragalactic astronomy) of Astronomy and Astrophysics

    Modeling Graphs with Vertex Replacement Grammars

    One of the principal goals of graph modeling is to capture the building blocks of network data in order to study various physical and natural phenomena. Recent work at the intersection of formal language theory and graph theory has explored the use of graph grammars for graph modeling. However, existing graph grammar formalisms, like Hyperedge Replacement Grammars, can only operate on small tree-like graphs. The present work relaxes this restriction by revising a different graph grammar formalism called Vertex Replacement Grammars (VRGs). We show that a variant of the VRG called Clustering-based Node Replacement Grammar (CNRG) can be efficiently extracted from many hierarchical clusterings of a graph. We show that CNRGs encode a succinct model of the graph, yet faithfully preserve the structure of the original graph. In experiments on large real-world datasets, we show that graphs generated from the CNRG model exhibit a diverse range of properties that are similar to those found in the original networks. Comment: Accepted as a regular paper at IEEE ICDM 2019. 15 pages, 9 figures