    F-measures for simulation results.

    <p>The median value (black) and quantile ranges (in 5% steps) of the micro- (top) and macro-averaged (bottom) F-measures (<i>F</i><sub>mi</sub>, <i>F</i><sub>ma</sub>) for uncompressed (left) and compressed (right) FBG inference, on the same 129,600 simulated data sets, using automatic priors. The x-axis shows the number of iterations only and does not reflect the additional speedup obtained through compression. Notice that the compressed HMM converges after at most 50 iterations (inset figures, right).</p>
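As a reminder of how the two averages differ, here is a minimal sketch that uses the standard definitions of micro- and macro-averaged F1. The caption does not spell out the exact formulas used for the figure, so the function name `f_measures` and the toy labels below are illustrative assumptions.

```python
from collections import Counter


def f_measures(true_labels, pred_labels):
    """Return (micro-averaged F1, macro-averaged F1) using the standard definitions."""
    classes = sorted(set(true_labels) | set(pred_labels))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(true_labels, pred_labels):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # p was predicted but is wrong
            fn[t] += 1  # t was the truth but was missed

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in classes) / len(classes)
    return micro, macro


# Toy example with three hypothetical copy-number states.
truth = [0, 0, 0, 0, 1, 1, 2, 2]
calls = [0, 0, 0, 1, 1, 1, 2, 0]
micro, macro = f_measures(truth, calls)
```

Micro-averaging weights every position equally, so large classes dominate; macro-averaging weights every class equally, so errors on rare states count for more.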

    Mapping of wavelets <i>ψ</i><sub><i>j</i>, <i>k</i></sub> and data points <i>y</i><sub><i>t</i></sub> to tree nodes <i>N</i><sub><i>ℓ</i>, <i>t</i></sub>.

    <p>Each node is the root of a subtree with <i>n</i> = 2<sup><i>ℓ</i></sup> leaves; pruning that subtree yields a block of size <i>n</i>, starting at position <i>t</i>. For instance, the node <i>N</i><sub>1,6</sub> is located at position 13 of the DFS array (solid line), and corresponds to the wavelet <i>ψ</i><sub>3,3</sub>. A block of size <i>n</i> = 2 can be created by pruning the subtree, which amounts to advancing by 2<i>n</i> − 1 = 3 positions (dashed line), yielding <i>N</i><sub>3,8</sub> at position 16, which is the wavelet <i>ψ</i><sub>1,1</sub>. The number of steps for creating blocks per iteration is therefore at most the number of nodes in the tree, and hence strictly smaller than 2<i>T</i>.</p>
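The index arithmetic in this caption can be checked with a small sketch. It assumes a zero-based pre-order (DFS) layout of a complete binary tree over T = 16 data points, and a wavelet index mapping j = log2(T) − ℓ, k = t / 2^ℓ inferred from the two examples in the caption. Both are assumptions, and the helper names are hypothetical, not taken from the HaMMLET code.

```python
def dfs_layout(T):
    """Pre-order list of (level, start) pairs for a complete binary tree with T leaves."""
    if T < 1 or T & (T - 1):
        raise ValueError("T must be a power of two")
    nodes = []

    def visit(level, start):
        nodes.append((level, start))
        if level > 0:
            half = 1 << (level - 1)
            visit(level - 1, start)         # left child
            visit(level - 1, start + half)  # right child

    visit(T.bit_length() - 1, 0)
    return nodes


def skip_subtree(pos, level):
    """Array position reached after pruning the subtree rooted at pos."""
    n = 1 << level          # number of leaves below the node
    return pos + 2 * n - 1  # a subtree with n leaves has 2n - 1 nodes


def wavelet_index(level, start, T):
    """Assumed mapping from an inner node N_{level,start} to psi_{j,k} (level >= 1)."""
    depth = T.bit_length() - 1
    return depth - level, start >> level


layout = dfs_layout(16)
pos = layout.index((1, 6))       # 13, as in the caption
nxt = skip_subtree(pos, 1)       # 13 + 3 = 16
print(pos, nxt, layout[nxt])     # 13 16 (3, 8)
print(wavelet_index(1, 6, 16))   # (3, 3)
print(wavelet_index(3, 8, 16))   # (1, 1)
```

Under these assumptions the sketch reproduces the positions and wavelet indices quoted above. Each step either descends one position or skips a whole subtree, so a left-to-right pass never visits more entries than the array holds.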

    HaMMLET’s inference of copy-number segments on T47D breast ductal carcinoma.

    <p>Notice that the data is much more complex than the simple structure of a diploid majority class with some small aberrations typically observed for Coriell data.</p>

    F-measures of CBS (light) and HaMMLET (dark) for calling aberrant copy numbers on simulated aCGH data [66].

    <p>Boxes represent the interquartile range (IQR = Q3−Q1), with a horizontal line showing the median (Q2), whiskers representing the range ( beyond Q1 and Q3), and the bullet representing the mean. HaMMLET has the same or better F-measures in most cases, and on the SRS simulation converges to 1 for larger segments, whereas CBS plateaus for aberrations greater than 10.</p>

    Example of dynamic block creation.

    <p>The data is of size T = 256, so the wavelet tree contains 512 nodes. Here, only 37 entries had to be checked against the threshold (dark line), 19 of which (round markers) yielded a block (vertical lines on the bottom). Sampling is hence done on a short array of 19 blocks instead of 256 individual values, so the compression ratio is 256/19 ≈ 13.5. The horizontal lines in the bottom subplot are the block means derived from the sufficient statistics in the nodes. Notice how the algorithm creates small blocks around the breakpoints, e.g. at t ≈ 125, which requires traversing to lower levels and thus induces some additional blocks in other parts of the tree (left subtree), since all block sizes are powers of 2. This somewhat reduces the compression ratio, which is unproblematic as it increases the degrees of freedom in the sampler.</p>
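The behaviour described in this caption can be imitated with a simplified sketch. It assumes that a subtree is pruned into a block whenever the largest absolute orthonormal Haar detail coefficient inside it falls below the threshold. The function names are hypothetical, and the coefficients are recomputed on every call for clarity, whereas a practical implementation would store them in the tree nodes. This is not the HaMMLET implementation itself.

```python
import math


def max_abs_detail(y, start, n):
    """Largest |Haar detail coefficient| in the subtree covering y[start:start+n]."""
    if n == 1:
        return 0.0  # a single data point carries no detail coefficient
    half = n // 2
    left = sum(y[start:start + half])
    right = sum(y[start + half:start + n])
    own = abs(left - right) / math.sqrt(n)  # orthonormal Haar scaling
    return max(own,
               max_abs_detail(y, start, half),
               max_abs_detail(y, start + half, half))


def create_blocks(y, threshold):
    """Return (blocks, checks); blocks are (start, size, mean) triples.

    len(y) is assumed to be a power of two.
    """
    blocks, checks = [], 0

    def visit(start, n):
        nonlocal checks
        checks += 1  # one node compared against the threshold
        if n == 1 or max_abs_detail(y, start, n) < threshold:
            seg = y[start:start + n]
            blocks.append((start, n, sum(seg) / n))  # prune: one block
        else:
            visit(start, n // 2)           # descend to finer resolution
            visit(start + n // 2, n // 2)

    visit(0, len(y))
    return blocks, checks


# Noise-free toy signal with one breakpoint at t = 6.
y = [0.0] * 6 + [5.0] * 10
blocks, checks = create_blocks(y, threshold=1.0)
print(blocks)  # [(0, 4, 0.0), (4, 2, 0.0), (6, 2, 5.0), (8, 8, 5.0)]
print(checks, len(y) / len(blocks))  # 7 4.0
```

The toy run shows the effect noted in the caption: reaching the breakpoint at t = 6 forces a descent to blocks of size 2, which splits the flat stretch before it into two blocks (sizes 4 and 2) even though both have the same mean.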

    Overview of HaMMLET.

    <p>Instead of individual computations per observation (panel a), Forward-Backward Gibbs Sampling is performed on a compressed version of the data, using sufficient statistics for block-wise computations (panel b) to accelerate inference in Bayesian Hidden Markov Models. During sampling (panel c), parameters and copy number sequences are sampled iteratively. In each iteration, the sampled emission variances determine which coefficients of the data’s Haar wavelet transform are dynamically set to zero. This controls potential break points at finer or coarser resolution or, equivalently, defines blocks of variable number and size (panel c, bottom). Our approach thus yields a dynamic, adaptive compression scheme which greatly improves speed of convergence, accuracy and running times.</p>
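To illustrate why sufficient statistics allow block-wise computation, the following sketch assumes Gaussian emissions, which is an assumption made here for illustration (the caption mentions only emission variances). Under that assumption, the log-likelihood of a whole block depends on the data only through its size, sum, and sum of squares. The function names are hypothetical.

```python
import math


def block_stats(block):
    """Sufficient statistics of a block: (size, sum, sum of squares)."""
    return len(block), sum(block), sum(v * v for v in block)


def block_loglik(stats, mu, var):
    """Gaussian log-likelihood of a block from its sufficient statistics alone."""
    n, s, ss = stats
    # sum over the block of (y - mu)^2 expands to ss - 2*mu*s + n*mu^2
    return (-0.5 * n * math.log(2 * math.pi * var)
            - (ss - 2 * mu * s + n * mu * mu) / (2 * var))


def pointwise_loglik(block, mu, var):
    """Reference computation: one term per observation."""
    return sum(-0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
               for v in block)


block = [1.0, 2.0, 3.5]
stats = block_stats(block)  # computed once, reused for every state and iteration
diff = block_loglik(stats, 2.0, 0.5) - pointwise_loglik(block, 2.0, 0.5)
```

Both routes give the same value up to floating-point rounding, but the block-wise version costs a constant amount of work per block and state regardless of how many observations the block contains.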