
    F-measures for simulation results.

<p>The median value (black) and quantile ranges (in 5% steps) of the micro- (top) and macro-averaged (bottom) F-measures (<i>F</i><sub>mi</sub>, <i>F</i><sub>ma</sub>) for uncompressed (left) and compressed (right) FBG inference, on the same 129,600 simulated data sets, using automatic priors. The x-axis represents the number of iterations alone, and does not reflect the additional speedup obtained through compression. Notice that the compressed HMM converges no later than 50 iterations (inset figures, right).</p>
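The distinction between the micro- and macro-averaged F-measures plotted above can be made concrete with a short sketch (a hypothetical evaluation helper, not the code used for the figures):

```python
from collections import Counter

def f_measures(true_labels, pred_labels):
    """Compute micro- and macro-averaged F-measures over all classes.

    Micro-averaging pools per-class true/false positives and false
    negatives before computing F; macro-averaging computes F per class
    and takes the unweighted mean, so rare classes weigh equally."""
    classes = set(true_labels) | set(pred_labels)
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(true_labels, pred_labels):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1
            fn[t] += 1
    # micro: pool the counts over all classes
    TP, FP, FN = sum(tp.values()), sum(fp.values()), sum(fn.values())
    f_micro = 2 * TP / (2 * TP + FP + FN) if TP else 0.0
    # macro: average the per-class F-measures
    per_class = []
    for c in classes:
        denom = 2 * tp[c] + fp[c] + fn[c]
        per_class.append(2 * tp[c] / denom if denom else 0.0)
    f_macro = sum(per_class) / len(per_class)
    return f_micro, f_macro
```

A predictor that always outputs the majority class scores well micro-averaged but poorly macro-averaged, which is why both measures are reported.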

    F-measures of CBS (light) and HaMMLET (dark) for calling aberrant copy numbers on simulated aCGH data [66].

<p>Boxes represent the interquartile range (IQR = Q3−Q1), with a horizontal line showing the median (Q2), whiskers representing the range (beyond Q1 and Q3), and the bullet representing the mean. HaMMLET has the same or better F-measures in most cases, and on the SRS simulation converges to 1 for larger segments, whereas CBS plateaus for aberrations greater than 10.</p>

    Overview of HaMMLET.

<p>Instead of individual computations per observation (panel a), Forward-Backward Gibbs Sampling is performed on a compressed version of the data, using sufficient statistics for block-wise computations (panel b) to accelerate inference in Bayesian Hidden Markov Models. During the sampling (panel c), parameters and copy number sequences are sampled iteratively. During each iteration, the sampled emission variances determine which coefficients of the data’s Haar wavelet transform are dynamically set to zero. This controls potential break points at finer or coarser resolution or, equivalently, defines blocks of variable number and size (panel c, bottom). Our approach thus yields a dynamic, adaptive compression scheme which greatly improves convergence speed and accuracy and reduces running times.</p>
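The block-wise computation in panel b relies on sufficient statistics. A minimal sketch, assuming Gaussian emissions and using hypothetical helper names, shows why a whole block's likelihood can then be evaluated in constant time:

```python
import math

def block_stats(y, start, end):
    """Sufficient statistics (count, sum, sum of squares) of y[start:end]."""
    block = y[start:end]
    return len(block), sum(block), sum(v * v for v in block)

def block_gauss_loglik(stats, mu, var):
    """Gaussian log-likelihood of an entire block from its sufficient
    statistics alone: O(1) per block instead of O(block length),
    using sum((y - mu)^2) = ss - 2*mu*s + n*mu^2."""
    n, s, ss = stats
    return (-0.5 * n * math.log(2 * math.pi * var)
            - (ss - 2 * mu * s + n * mu * mu) / (2 * var))
```

Because the statistics are precomputed once per block, each Gibbs iteration only touches one entry per block rather than one per observation.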

    Example of dynamic block creation.

<p>The data is of size T = 256, so the wavelet tree contains 512 nodes. Here, only 37 entries had to be checked against the threshold (dark line), 19 of which (round markers) yielded a block (vertical lines on the bottom). Sampling is hence done on a short array of 19 blocks instead of 256 individual values, so the compression ratio is 13.5. The horizontal lines in the bottom subplot are the block means derived from the sufficient statistics in the nodes. Notice how the algorithm creates small blocks around the breakpoints, e.g. at t ≈ 125, which requires traversing to lower levels and thus induces some additional blocks in other parts of the tree (left subtree), since all block sizes are powers of 2. This somewhat reduces the compression ratio, which is unproblematic as it increases the degrees of freedom in the sampler.</p>
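The thresholding idea behind this figure can be sketched as a simplified recursion (a toy stand-in for the actual tree traversal; <code>haar_blocks</code> and its unnormalised detail coefficient are illustrative assumptions, not HaMMLET's implementation):

```python
def haar_blocks(y, threshold):
    """Partition dyadic data into blocks: if every Haar detail
    coefficient inside an interval falls below the threshold, the whole
    interval becomes one block; otherwise split it in half and recurse.
    len(y) must be a power of two. Returns a list of (start, length)."""
    def detail(seg):
        # unnormalised Haar detail coefficient of a segment:
        # absolute difference of the two half-means
        h = len(seg) // 2
        return abs(sum(seg[:h]) - sum(seg[h:])) / len(seg)

    def max_detail(seg):
        if len(seg) == 1:
            return 0.0
        h = len(seg) // 2
        return max(detail(seg), max_detail(seg[:h]), max_detail(seg[h:]))

    def rec(start, length):
        if length == 1 or max_detail(y[start:start + length]) < threshold:
            return [(start, length)]
        h = length // 2
        return rec(start, h) + rec(start + h, h)

    return rec(0, len(y))
```

A step in the data forces a large detail coefficient at the breakpoint, so small blocks appear there while flat regions collapse into a few large blocks, mirroring the figure.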

    HaMMLET’s inference of copy-number segments on T47D breast ductal carcinoma.

<p>Notice that the data is much more complex than the simple structure of a diploid majority class with some small aberrations typically observed for Coriell data.</p>

    Mapping of wavelets <i>ψ</i><sub><i>j</i>, <i>k</i></sub> and data points <i>y</i><sub><i>t</i></sub> to tree nodes <i>N</i><sub><i>ℓ</i>, <i>t</i></sub>.

<p>Each node is the root of a subtree with <i>n</i> = 2<sup><i>ℓ</i></sup> leaves; pruning that subtree yields a block of size <i>n</i>, starting at position <i>t</i>. For instance, the node <i>N</i><sub>1,6</sub> is located at position 13 of the DFS array (solid line), and corresponds to the wavelet <i>ψ</i><sub>3,3</sub>. A block of size <i>n</i> = 2 can be created by pruning the subtree, which amounts to advancing by 2<i>n</i> − 1 = 3 positions (dashed line), yielding <i>N</i><sub>3,8</sub> at position 16, which is the wavelet <i>ψ</i><sub>1,1</sub>. Thus the number of steps for creating blocks per iteration is at most the number of nodes in the tree, and hence strictly smaller than 2<i>T</i>.</p>
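The DFS arithmetic in the caption (pruning a subtree with <i>n</i> leaves skips its 2<i>n</i> − 1 nodes) can be sketched as follows; <code>create_blocks</code> and the <code>prune</code> predicate are hypothetical names for a simplified stand-in, not the actual implementation:

```python
def create_blocks(T, prune):
    """Walk a full binary tree over T leaves (T a power of two) in DFS
    order. At each node covering n positions starting at t, the predicate
    prune(n, t) either cuts the subtree -- emitting one block and skipping
    its 2n-1 nodes -- or descends into the two children. Each node is
    visited at most once, so the walk takes fewer than 2T steps."""
    blocks = []
    stack = [(T, 0)]                 # (leaves under node, start position)
    while stack:
        n, t = stack.pop()
        if n == 1 or prune(n, t):
            blocks.append((t, n))    # cut: one block covers the subtree
        else:
            h = n // 2
            stack.append((h, t + h)) # right child (visited second)
            stack.append((h, t))     # left child (visited first)
    return blocks
```

Pruning everywhere at size 2, for example, turns 8 data points into 4 blocks while only touching 7 of the 15 tree nodes.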

    Identifying protein complexes directly from high-throughput TAP data with Markov random fields-0

<p><b>Copyright information:</b></p><p>Taken from "Identifying protein complexes directly from high-throughput TAP data with Markov random fields"</p><p>http://www.biomedcentral.com/1471-2105/8/482</p><p>BMC Bioinformatics 2007;8:482.</p><p>Published online 19 Dec 2007</p><p>PMCID: PMC2222659.</p><p>The false positive rate is set to 0.005 and the false negative rate to 0.2 or 0.5. With a false negative rate of 0.2 (2(a), 2(b)), MRF can recover the true clustering with the minimum negative log-likelihood, attained at 11 clusters. Notice that adding more clusters does not reduce the cost any further; additional clusters simply remain empty. For a false negative rate of 0.5, the accuracy is worse and more empty clusters are needed to reach convergence. In 2(c) and 2(d) the convergence rate fluctuates more.</p>

    Identifying protein complexes directly from high-throughput TAP data with Markov random fields-3

<p><b>Copyright information:</b></p><p>Taken from "Identifying protein complexes directly from high-throughput TAP data with Markov random fields"</p><p>http://www.biomedcentral.com/1471-2105/8/482</p><p>BMC Bioinformatics 2007;8:482.</p><p>Published online 19 Dec 2007</p><p>PMCID: PMC2222659.</p><p>The false positive rate is set to 0.005 and the false negative rate to 0.2 or 0.5. With a false negative rate of 0.2 (2(a), 2(b)), MRF can recover the true clustering with the minimum negative log-likelihood, attained at 11 clusters. Notice that adding more clusters does not reduce the cost any further; additional clusters simply remain empty. For a false negative rate of 0.5, the accuracy is worse and more empty clusters are needed to reach convergence. In 2(c) and 2(d) the convergence rate fluctuates more.</p>

    Semi-supervised learning for the identification of syn-expressed genes from fused microarray and image data-6

<p><b>Copyright information:</b></p><p>Taken from "Semi-supervised learning for the identification of syn-expressed genes from fused microarray and image data"</p><p>http://www.biomedcentral.com/1471-2105/8/S10/S3</p><p>BMC Bioinformatics 2007;8(Suppl 10):S3.</p><p>Published online 21 Dec 2007</p><p>PMCID: PMC2230504.</p><p>The major phenomena are depletion of maternal mRNA (maternal genes) and the start of the embryonic transcriptional machinery during embryogenesis at time point 3 hours (zygotically expressed genes). In the clusters with zygotically expressed genes, we observe two main periods of activation: 3–4 hours for clusters U1 to U5, and 7–8 h for clusters U8 to U11. In the clusters with maternal genes, we observe under-expression of genes at several time periods: 3–4 h in clusters U21 to U28; 4–5 h for clusters U17 to U20; 6–7 h for cluster U16; 7–8 h for clusters U12 and U13; and 9–10 h for cluster U15.</p>

    Semi-supervised learning for the identification of syn-expressed genes from fused microarray and image data-3

<p><b>Copyright information:</b></p><p>Taken from "Semi-supervised learning for the identification of syn-expressed genes from fused microarray and image data"</p><p>http://www.biomedcentral.com/1471-2105/8/S10/S3</p><p>BMC Bioinformatics 2007;8(Suppl 10):S3.</p><p>Published online 21 Dec 2007</p><p>PMCID: PMC2230504.</p><p>…in cMoG than in MoG. The threshold discards ImaGO terms where the difference in the log of the p-value between cMoG and MoG is smaller than the threshold. As can be observed, the proportion is higher than 0.5 for all values, which indicates an advantage of cMoG. Furthermore, the proportion tends to increase for higher values.</p>