
    Analysis of dictionary components learned by the SSMCA algorithm on natural image patches.

    (A) Example dictionary elements W_h after learning. (B) Fraction of globular fields estimated from in vivo measurements, compared to ours (after fitting with Gabor wavelets and DoGs; globular percentages taken from [19], who analyzed data provided by [18] and estimated percentages of globular fields from data in two further papers [43, 44]). (C) Learned prior. (D) Actual activations of the diverse dictionary elements s_h (posterior averaged over data points).
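    A rough, hypothetical sketch of the kind of field classification summarized in panel B (not the paper's fitting code): each learned field is fit with both a Gabor wavelet and a difference of Gaussians (DoG) and counted as globular when the DoG fit has the lower residual. The function names, initial guesses, and the use of scipy.optimize.curve_fit are illustrative assumptions.

```python
# Hypothetical sketch: classify a learned field as Gabor-like or globular (DoG-like)
# by fitting both models and comparing residuals. Assumes the field is a D x D array.
import numpy as np
from scipy.optimize import curve_fit

def gabor(coords, x0, y0, sigma, theta, freq, phase, amp):
    x, y = coords
    xr = (x - x0) * np.cos(theta) + (y - y0) * np.sin(theta)
    yr = -(x - x0) * np.sin(theta) + (y - y0) * np.cos(theta)
    env = np.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
    return (amp * env * np.cos(2 * np.pi * freq * xr + phase)).ravel()

def dog(coords, x0, y0, sigma_c, sigma_s, amp_c, amp_s):
    x, y = coords
    r2 = (x - x0) ** 2 + (y - y0) ** 2
    return (amp_c * np.exp(-r2 / (2 * sigma_c ** 2))
            - amp_s * np.exp(-r2 / (2 * sigma_s ** 2))).ravel()

def classify_field(W_h):
    """Return 'globular' if a DoG fits the field better than a Gabor wavelet."""
    D = W_h.shape[0]
    y, x = np.mgrid[0:D, 0:D].astype(float)
    coords, target = (x, y), W_h.ravel()
    c = D / 2.0

    def fit_err(model, p0):
        try:
            p, _ = curve_fit(model, coords, target, p0=p0, maxfev=5000)
            return np.mean((model(coords, *p) - target) ** 2)
        except RuntimeError:
            return np.inf  # fit did not converge

    err_gab = fit_err(gabor, [c, c, D / 4, 0.0, 1.0 / D, 0.0, target.std() + 1e-6])
    err_dog = fit_err(dog, [c, c, D / 6, D / 3, target.max() + 1e-6, target.max() / 2 + 1e-6])
    return 'globular' if err_dog < err_gab else 'gabor'
```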

    Nonlinear spike-and-slab sparse coding for interpretable image encoding: 4) Natural image patches. PLOS ONE

    4) Natural image patches. This file explains the natural image patches data used in the final experiment of the corresponding publication.

    Synthetic occlusion dataset and cut-out original and noisy patches.

    Examples taken from the occlusion dataset. (A) shows an original noise-free image of generated occluding strokes with random width, pixel intensity, and start/end points. (B) shows a handful of overlapping image patches cut from the original, noise-free data. (C) shows examples of the noisy training data, with independent Gaussian noise of σ = 25 added to (B).
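    For concreteness, here is a minimal, hypothetical sketch of how such a dataset could be generated: random occluding strokes, overlapping patches, and additive Gaussian noise with σ = 25 as stated in the caption. The image size, stroke widths, intensity range, patch size, and stride are assumptions for illustration only.

```python
# Hypothetical sketch of generating a strokes/occlusion image (panel A) and noisy
# overlapping patches (panel C). Only sigma = 25 is taken from the caption.
import numpy as np

rng = np.random.default_rng(0)

def draw_stroke(img, rng, max_width=4):
    """Draw one straight stroke with random endpoints, width, and intensity."""
    h, w = img.shape
    r0, c0 = rng.integers(0, h), rng.integers(0, w)
    r1, c1 = rng.integers(0, h), rng.integers(0, w)
    width = rng.integers(1, max_width + 1)
    intensity = rng.uniform(50, 255)
    n = 2 * max(abs(r1 - r0), abs(c1 - c0)) + 1
    rr, cc = np.linspace(r0, r1, n), np.linspace(c0, c1, n)
    yy, xx = np.mgrid[0:h, 0:w]
    for r, c in zip(rr, cc):
        mask = (yy - r) ** 2 + (xx - c) ** 2 <= (width / 2.0) ** 2
        img[mask] = intensity  # later strokes occlude earlier ones
    return img

def make_dataset(n_images=100, size=64, patch=16, sigma=25.0, max_strokes=5):
    """Return clean and noisy overlapping patches cut from stroke images."""
    clean_patches = []
    for _ in range(n_images):
        img = np.zeros((size, size))
        for _ in range(rng.integers(1, max_strokes + 1)):
            draw_stroke(img, rng)
        # cut overlapping patches with a stride of half the patch size
        for r in range(0, size - patch + 1, patch // 2):
            for c in range(0, size - patch + 1, patch // 2):
                clean_patches.append(img[r:r + patch, c:c + patch].copy())
    clean = np.stack(clean_patches)
    noisy = clean + rng.normal(0.0, sigma, clean.shape)  # independent Gaussian noise
    return clean, noisy
```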

    Parameter recovery on synthetic data.

    Results of three differently parameterized sets of experiments, each with 10 experimental runs of 30 EM iterations on identical artificial ground-truth data generated according to the SSMCA model: (A) N = 2,000, D = 5 × 5. The three experimental settings shown are (B) H′ = H = 10, (C) H′ = 5, and (D) H′ = 4, although the same results were obtained over the entire range of parameters H′ ∈ [4, 10]. Importantly, the figure shows accurate recovery of the ground-truth parameters, which are plotted with dotted lines. (B), (C), and (D) show in each column the parameter convergence for each of the three experiments, where the rows contain the following: data noise σ, sparsity H × π, prior standard deviation σ_pr, and prior mean μ_pr. Finally, (E) shows the set of learned generative fields/components W_h corresponding to each experimental setting: (B) H′ = H = 10, (C) H′ = 5, and (D) H′ = 4.
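    To make the setup concrete, the following is a minimal sketch of sampling artificial ground-truth data roughly in the spirit of the generative model described here: spike-and-slab latents (a Bernoulli spike times a Gaussian slab with mean μ_pr and standard deviation σ_pr), a pointwise max combination of the weighted components W_h, and additive Gaussian observation noise σ. N = 2,000, D = 5 × 5, and H = 10 follow the caption; all other values and names are assumptions.

```python
# Minimal sketch (assumptions marked): sample ground-truth-style data with
# spike-and-slab latents and a pointwise max combination plus Gaussian noise.
import numpy as np

rng = np.random.default_rng(1)

N, H = 2000, 10                           # data points, latent components (from caption)
D = 5 * 5                                 # observed dimensionality: 5 x 5 patches, flattened
pi, mu_pr, sigma_pr = 2.0 / H, 2.0, 1.0   # assumed sparsity and slab parameters
sigma = 1.0                               # assumed observation noise

W = rng.normal(0.0, 1.0, size=(H, D))     # ground-truth generative fields W_h

# Spike-and-slab latents: s_h = b_h * z_h, b_h ~ Bernoulli(pi), z_h ~ N(mu_pr, sigma_pr^2)
B = rng.random((N, H)) < pi
Z = rng.normal(mu_pr, sigma_pr, size=(N, H))
S = B * Z

# Nonlinear (max) combination: each pixel takes the maximum weighted contribution,
# then independent Gaussian observation noise is added.
Y_clean = np.max(S[:, :, None] * W[None, :, :], axis=1)
Y = Y_clean + rng.normal(0.0, sigma, size=(N, D))
```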

    Results of comparative experiments with linear and nonlinear sparse coding methods on component learning and image reconstruction for natural image patches.

    (A) shows the original natural image data, bridge.jpg [39], from which we cut an occlusion-rich underbrush region. (B) shows the section taken from (A), scaled up to 256 × 256 pixels, which was then cut into overlapping patches and given independent Gaussian noise with σ = 5 to compose the considered dataset. (C) shows the mean squared error (MSE) of the compared nonlinear and linear methods' reconstructions, averaged over the entire dataset, with the standard deviation indicated by error bars. The trend is the same as in the artificial occlusion experiments: the nonlinear method maintains a reasonably low MSE while learning a sparse set of interpretable components, whereas the linear method achieves a very low MSE only when it does not learn a sparse (and never an interpretable) set of components.
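    A simple, hypothetical sketch of the dataset preparation and the summary statistic in panel C: cut overlapping patches from a 256 × 256 grayscale section, add independent Gaussian noise with σ = 5, and compute the mean and standard deviation of per-patch MSE for a given set of reconstructions. Patch size, stride, and the placeholder image are assumptions.

```python
# Hypothetical sketch of patch extraction, sigma = 5 noise, and MSE statistics.
import numpy as np

def cut_patches(image, patch=16, stride=8):
    """Cut overlapping patches (flattened) from a 2-D grayscale image."""
    h, w = image.shape
    return np.stack([image[r:r + patch, c:c + patch].ravel()
                     for r in range(0, h - patch + 1, stride)
                     for c in range(0, w - patch + 1, stride)])

def mse_stats(reconstructions, targets):
    """Per-patch MSE, summarized by mean and standard deviation over the dataset."""
    per_patch = np.mean((reconstructions - targets) ** 2, axis=1)
    return per_patch.mean(), per_patch.std()

# Usage (an arbitrary placeholder image standing in for the bridge section):
rng = np.random.default_rng(0)
section = rng.uniform(0, 255, size=(256, 256))
clean = cut_patches(section)
noisy = clean + rng.normal(0.0, 5.0, size=clean.shape)  # sigma = 5 noise
```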

    Toy example illustrating the problem setting: approximating occlusions in images.

    Given an image patch with occlusions (A), assume both the linear and the nonlinear sparse coding models are given the true generating dictionary elements (B), and the task for each model is to use a sparse set of these to reconstruct the patch (C). (A) Example natural image with one patch to be reconstructed by the models. (B) 10 ground-truth dictionary elements, assumed to be known, of which only 2 generated the image patch. (C) Image reconstruction using the sparse dictionary set of the two models: the standard linear sparse coding model and the nonlinear spike-and-slab SC model. The linear sum leads to inaccurate pixel estimates where components overlap, whereas the nonlinear max aims to approximate this type of data more realistically. Furthermore, the spike-and-slab prior (shown here for the nonlinear model) allows the model to adapt the intensity of each image component to match what it observes in the data.
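    The contrast between the two combination rules can be stated in a few lines of code. The sketch below (an illustration, not the paper's implementation) reconstructs a patch once as a linear sum of the active weighted components and once as a pointwise max, which mimics occlusion because the front-most component alone determines each overlapping pixel. The toy dictionary and activations are assumptions.

```python
# Illustrative sketch: linear sum versus nonlinear (max) reconstruction from a
# sparse set of active dictionary components.
import numpy as np

def reconstruct_linear(W, s):
    """Standard linear sparse coding reconstruction: sum_h s_h * W_h."""
    return s @ W

def reconstruct_max(W, s):
    """Nonlinear (max) reconstruction: each pixel takes max_h s_h * W_h."""
    return np.max(s[:, None] * W, axis=0)

# Toy example: two overlapping 'bars' on a 1-D pixel grid
W = np.array([[1.0, 1.0, 1.0, 0.0, 0.0],   # component 1 covers pixels 0-2
              [0.0, 0.0, 1.0, 1.0, 1.0]])  # component 2 covers pixels 2-4
s = np.array([2.0, 3.0])                   # both components active

print(reconstruct_linear(W, s))  # overlap pixel 2 becomes 5.0 (sum), too bright
print(reconstruct_max(W, s))     # overlap pixel 2 becomes 3.0 (occluding component wins)
```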

    Comparative experiments of linear and nonlinear sparse coding on dictionary learning and image reconstruction.

    With H = 100 learned dictionary components, we evaluate how many are actually used for reconstruction. (A) shows the relationship between sparsity (number of components used for reconstruction) and data complexity (number of strokes in the data). Interestingly, the SSMCA plot (blue curves) shows a nearly linear relationship between the number of components used for reconstruction and the number of components (strokes) actually in the data, suggesting that the reconstruction complexity of the nonlinear model closely follows the actual complexity of the data. In contrast, the linear parameterization that yields good reconstruction results (a = 1, shown in green) does not adapt to the data complexity at all: it consistently uses nearly 80 of the learned 100 components per reconstruction, regardless of the data point's actual complexity (note the change in scale of the y-axis around 30 components, needed to fit the green curve on the plot). (B) shows the mean squared error (MSE) of all reconstructions versus the corresponding data complexity (number of strokes in the data). When the reconstruction complexity (sparsity) is far from the actual complexity of the data (the linear methods: red, a = 50, and green, a = 1), the MSE improves. However, when the sparsity is more closely matched to the data, SSMCA and the more strongly regularized linear method yield a poorer MSE. SSMCA nevertheless achieves a better MSE in this case, even though it and linSC with a = 100 have very similarly sparse solutions and use the same number of components. Note that the error of the least sparse linSC approach (a = 1) is so low (mean MSE = 1.81) that it does not even appear on this graph. Error bars are scaled to 10% of the standard deviation for all methods in all stroke-complexity cases. The mean MSE (averaged over the entire dataset) is shown in the legend next to the respective algorithm.
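    As a concrete reading of panels A and B, the sketch below shows how the two plotted quantities could be computed: the number of components a model actually uses per reconstruction (its sparsity) and the reconstruction MSE, each grouped by the number of strokes in the data point. The activation threshold and all names are illustrative assumptions, not the paper's code.

```python
# Illustrative sketch: per-datapoint sparsity and MSE, grouped by stroke count.
import numpy as np

def components_used(activations, threshold=1e-3):
    """Count components whose activation magnitude exceeds a small threshold."""
    return np.sum(np.abs(activations) > threshold, axis=1)

def group_by_strokes(values, stroke_counts):
    """Mean and std of `values` for each stroke count k (e.g. k = 1..5)."""
    stats = {}
    for k in np.unique(stroke_counts):
        v = values[stroke_counts == k]
        stats[int(k)] = (v.mean(), v.std())
    return stats

# Usage: given per-datapoint activations S (N x H), reconstructions R and targets Y
# (both N x D), and the known stroke count per image:
# sparsity_by_k = group_by_strokes(components_used(S), stroke_counts)
# mse_by_k      = group_by_strokes(np.mean((R - Y) ** 2, axis=1), stroke_counts)
```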

    Comparison of linear and nonlinear sparse coding on image reconstruction.

    Shown are a handful of real data points of varying complexity in terms of the number of strokes k in each image (k ∈ {1, …, 5} strokes per image), the components/fields learned by the various algorithms, the corresponding reconstruction of the given data point, and the mean squared error (MSE) of each reconstruction. (A) image with k = 1 stroke, (B) k = 2 strokes, (C) k = 3 strokes, (D) k = 4 strokes, and (E) k = 5 strokes. Regardless of image complexity (how many causes/strokes are in an image), the components used by the nonlinear method (SSMCA) resemble the true causes of the image: each component contains a single, interpretable stroke. On the other hand, none of the a parameterizations of the linear method yield stroke-like components, even when the solution is regularized to be as sparse as SSMCA's (a = 100). Note: all images in the a = 1 case appear brighter than they actually are, due to visualization with a Python toolbox, but are in reality on the same brightness scale as the original data point (and all other shown cases); hence the reconstruction error (MSE) is very low.

    Illustration of choice of prior distribution and multimodality in the latent space.

    H = 2-dimensional spike-and-slab and Laplace priors over latent variables, and the multimodal posterior distributions induced by these priors for both linear and nonlinear data likelihoods.
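    For illustration, the following sketch draws samples from the two H = 2 priors compared in the figure: an independent spike-and-slab prior (Bernoulli spike times Gaussian slab) and a Laplace prior. Scatter-plotting the two sample sets reproduces the qualitative contrast; the parameter values are assumptions.

```python
# Illustrative sketch: samples from H = 2 spike-and-slab and Laplace priors.
import numpy as np

rng = np.random.default_rng(0)
n, H = 5000, 2

# Spike-and-slab: s_h = b_h * z_h, b_h ~ Bernoulli(pi), z_h ~ N(mu_pr, sigma_pr^2)
pi, mu_pr, sigma_pr = 0.4, 1.5, 0.5
spike_slab = (rng.random((n, H)) < pi) * rng.normal(mu_pr, sigma_pr, (n, H))

# Laplace prior: s_h ~ Laplace(0, b), the standard sparse coding prior
laplace = rng.laplace(0.0, 1.0, (n, H))

# e.g. scatter-plot spike_slab[:, 0] vs. spike_slab[:, 1] and likewise for laplace
# to compare the mass at exactly zero with the heavy but continuous Laplace tails.
```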