
6 research outputs found

F-measures for simulation results.

Author: Alexander Schliep (40907)
Eric Brugel (3170319)
John Wiedenhoeft (3170316)
Publication venue
Publication date
Field of study

The median value (black) and quantile ranges (in 5% steps) of the micro- (top) and macro-averaged (bottom) F-measures (Fmi, Fma) for uncompressed (left) and compressed (right) FBG inference, on the same 129,600 simulated data sets, using automatic priors. The x-axis represents the number of iterations alone, and does not reflect the additional speedup obtained through compression. Notice that the compressed HMM converges no later than 50 iterations (inset figures, right).
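The difference between the two averages can be illustrated with a minimal sketch. It assumes the common textbook definitions (micro: pool true positives, false positives and false negatives over all classes; macro: unweighted mean of the per-class F-measures); the caption does not spell out its exact definitions, and the function name and toy labels are hypothetical.

```python
from collections import Counter


def f_measures(true_labels, pred_labels):
    """Return (micro_f, macro_f) for two equal-length label sequences.

    Assumption: standard single-label definitions, not necessarily the
    exact ones used for the figure.
    """
    classes = sorted(set(true_labels) | set(pred_labels))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(true_labels, pred_labels):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro: pool the counts first, then compute one F-measure.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute one F-measure per class, then average.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in classes) / len(classes)
    return micro, macro


# Toy example: a majority class (0) and a rare class (1).
micro, macro = f_measures([0, 0, 0, 0, 1, 1], [0, 0, 0, 0, 1, 0])
```

Because the macro average weights every class equally, errors on rare classes lower it more than they lower the micro average.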

FigShare

Mapping of wavelets ψ_{j,k} and data points y_t to tree nodes N_{ℓ,t}.

Author: Alexander Schliep (40907)
Eric Brugel (3170319)
John Wiedenhoeft (3170316)

Each node N_{ℓ,t} is the root of a subtree with n = 2^ℓ leaves; pruning that subtree yields a block of size n, starting at position t. For instance, the node N_{1,6} is located at position 13 of the DFS array (solid line), and corresponds to the wavelet ψ_{3,3}. A block of size n = 2 can be created by pruning the subtree, which amounts to advancing by 2n − 1 = 3 positions (dashed line), yielding N_{3,8} at position 16, which is the wavelet ψ_{1,1}. The number of steps for creating blocks per iteration is therefore at most the number of nodes in the tree, and thus strictly smaller than 2T.
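The indexing in this caption can be checked with a small sketch. It assumes a complete binary tree stored in pre-order (DFS) with T = 16 leaves; T is inferred from the indices in the caption rather than stated there, and all function names are hypothetical.

```python
def dfs_layout(T):
    """Pre-order list of (level, start) nodes for a tree with T = 2**L leaves."""
    assert T > 0 and T & (T - 1) == 0, "T must be a power of two"
    order = []

    def visit(level, start):
        order.append((level, start))
        if level > 0:
            half = 1 << (level - 1)
            visit(level - 1, start)          # left child
            visit(level - 1, start + half)   # right child

    visit(T.bit_length() - 1, 0)
    return order


def wavelet_index(level, start, T):
    """Map an inner node N_{level,start} to Haar indices (j, k).

    Assumption: j = 0 is the coarsest scale. Leaves (level 0) correspond
    to data points y_t rather than wavelets.
    """
    j = (T.bit_length() - 1) - level
    k = start >> level
    return j, k


def prune(pos, level):
    """Skip the subtree rooted at DFS position pos (2n - 1 nodes, n = 2**level)."""
    return pos + 2 * (1 << level) - 1


order = dfs_layout(16)
pos = order.index((1, 6))        # position of N_{1,6} in the DFS array
next_node = order[prune(pos, 1)] # node reached after pruning that subtree
```

Under these assumptions, N_{1,6} sits at position 13 and maps to ψ_{3,3}, and pruning it lands on N_{3,8} at position 16, which maps to ψ_{1,1}. The array holds 2T − 1 = 31 nodes, consistent with the bound of strictly fewer than 2T steps.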

FigShare

HaMMLET’s inference of copy-number segments on T47D breast ductal carcinoma.

Author: Alexander Schliep (40907)
Eric Brugel (3170319)
John Wiedenhoeft (3170316)

Notice that the data is much more complex than the simple structure of a diploid majority class with some small aberrations typically observed for Coriell data.

FigShare

F-measures of CBS (light) and HaMMLET (dark) for calling aberrant copy numbers on simulated aCGH data [66].

Author: Alexander Schliep (40907)
Eric Brugel (3170319)
John Wiedenhoeft (3170316)

Boxes represent the interquartile range (IQR = Q3 − Q1), with a horizontal line showing the median (Q2), whiskers representing the range (beyond Q1 and Q3), and the bullet representing the mean. HaMMLET has the same or better F-measures in most cases, and on the SRS simulation converges to 1 for larger segments, whereas CBS plateaus for aberrations greater than 10.

FigShare

Example of dynamic block creation.

Author: Alexander Schliep (40907)
Eric Brugel (3170319)
John Wiedenhoeft (3170316)

The data is of size T = 256, so the wavelet tree contains 512 nodes. Here, only 37 entries had to be checked against the threshold (dark line), 19 of which (round markers) yielded a block (vertical lines on the bottom). Sampling is hence done on a short array of 19 blocks instead of 256 individual values, giving a compression ratio of 256/19 ≈ 13.5. The horizontal lines in the bottom subplot are the block means derived from the sufficient statistics in the nodes. Notice how the algorithm creates small blocks around the breakpoints, e.g. at t ≈ 125, which requires traversing to lower levels and thus induces some additional blocks in other parts of the tree (left subtree), since all block sizes are powers of 2. This somewhat reduces the compression ratio, which is unproblematic as it increases the degrees of freedom in the sampler.
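The traversal described in this caption might look roughly like the following sketch. It is not HaMMLET's implementation: it assumes an orthonormal Haar transform, stores the largest absolute detail coefficient of each subtree in a dictionary instead of a DFS array, and takes the threshold as a plain parameter (how the threshold is derived is not given in this caption). All names are hypothetical.

```python
import math


def make_blocks(y, threshold):
    """Return (blocks, checks); blocks are (start, size, mean) triples covering y."""
    T = len(y)
    assert T > 0 and T & (T - 1) == 0, "length must be a power of two"

    # Prefix sums give every range sum (a sufficient statistic) in O(1).
    prefix = [0.0]
    for v in y:
        prefix.append(prefix[-1] + v)

    def range_sum(a, b):
        return prefix[b] - prefix[a]

    # max_coeff[(start, size)]: largest |Haar detail coefficient| in that subtree.
    max_coeff = {}

    def fill(start, size):
        if size == 1:
            max_coeff[(start, size)] = 0.0
            return 0.0
        half = size // 2
        left = range_sum(start, start + half)
        right = range_sum(start + half, start + size)
        d = abs(left - right) / math.sqrt(size)
        m = max(d, fill(start, half), fill(start + half, half))
        max_coeff[(start, size)] = m
        return m

    fill(0, T)

    blocks, checks = [], 0
    stack = [(0, T)]
    while stack:
        start, size = stack.pop()
        checks += 1  # one entry compared against the threshold
        if size == 1 or max_coeff[(start, size)] < threshold:
            # Prune the subtree: it becomes a single block.
            blocks.append((start, size, range_sum(start, start + size) / size))
        else:
            half = size // 2
            stack.append((start + half, half))  # right child, visited second
            stack.append((start, half))         # left child, visited first
    return blocks, checks


# Toy data with one breakpoint at t = 7, not aligned with a power of two.
blocks, checks = make_blocks([0] * 7 + [5] * 9, threshold=1.0)
```

On this toy input the sketch yields five blocks of sizes 4, 2, 1, 1 and 8 after 9 threshold checks: small blocks appear only around the breakpoint, and the constant right half collapses into a single block.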

FigShare

Overview of HaMMLET.

Author: Alexander Schliep (40907)
Eric Brugel (3170319)
John Wiedenhoeft (3170316)

Instead of individual computations per observation (panel a), Forward-Backward Gibbs Sampling is performed on a compressed version of the data, using sufficient statistics for block-wise computations (panel b) to accelerate inference in Bayesian Hidden Markov Models. During the sampling (panel c), parameters and copy number sequences are sampled iteratively. During each iteration, the sampled emission variances determine which coefficients of the data's Haar wavelet transform are dynamically set to zero. This controls potential break points at finer or coarser resolution or, equivalently, defines blocks of variable number and size (panel c, bottom). Our approach thus yields a dynamic, adaptive compression scheme which greatly improves speed of convergence, accuracy and running times.
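To see why sufficient statistics allow block-wise computation, consider the following sketch. It assumes Gaussian emissions (the caption mentions emission variances but does not name the distribution), and the function names are hypothetical. The log-likelihood of a whole block under one state then depends only on the block's count, sum and sum of squares.

```python
import math


def block_stats(values):
    """Sufficient statistics of a block: (n, sum, sum of squares)."""
    return len(values), sum(values), sum(v * v for v in values)


def block_log_likelihood(n, s1, s2, mu, sigma2):
    """Gaussian log-likelihood of a block from its sufficient statistics.

    Uses sum((y - mu)**2) = s2 - 2*mu*s1 + n*mu**2, so the cost does not
    depend on the block length once the statistics are stored.
    """
    sq_dev = s2 - 2.0 * mu * s1 + n * mu * mu
    return -0.5 * n * math.log(2.0 * math.pi * sigma2) - sq_dev / (2.0 * sigma2)


def pointwise_log_likelihood(values, mu, sigma2):
    """Reference computation: one term per observation."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * sigma2) - (v - mu) ** 2 / (2.0 * sigma2)
        for v in values
    )


block = [1.9, 2.1, 2.0, 2.2, 1.8]
n, s1, s2 = block_stats(block)
fast = block_log_likelihood(n, s1, s2, mu=2.0, sigma2=0.04)
slow = pointwise_log_likelihood(block, mu=2.0, sigma2=0.04)
```

Both computations agree up to floating-point rounding, but the block-wise version needs a constant number of operations per block and state once the statistics are available.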

FigShare