17 research outputs found

    HPerf: A Lightweight Profiler for Task Distribution on CPU+GPU Platforms

    Research areas: Computer architecture, Programming analysis
    Heterogeneous computing has emerged as one of the major computing platforms in many domains. Although there have been several proposals to aid programming for heterogeneous computing platforms, optimizing applications on these platforms is not an easy task. Identifying which parallel regions (or tasks) should run on GPUs or CPUs is one of the critical decisions for improving performance. In this paper, we propose a profiler, HPerf, to identify an efficient task distribution on CPU+GPU systems with low profiling overhead. HPerf is a hierarchical profiler. First, it performs lightweight profiling, and then, if necessary, it performs detailed profiling to measure caching and data transfer cost. Compared to a brute-force approach, HPerf reduces the profiling overhead significantly, and compared to a naive decision, HPerf improves the performance of OpenCL applications by up to 25%.

    MTL: A VAX/VMS compiler for a multi-tasking and message passing language

    Thesis (M.Sc.) -- University of Adelaide, Dept. of Computer Science, 1983

    Comments on “the cost of selective recompilation and environment processing”


    Massively parallel unsupervised single-particle cryo-EM data clustering via statistical manifold learning

    Structural heterogeneity in single-particle cryo-electron microscopy (cryo-EM) data represents a major challenge for high-resolution structure determination. Unsupervised classification may serve as the first step in the assessment of structural heterogeneity. However, traditional algorithms for unsupervised classification, such as K-means clustering and maximum likelihood optimization, may classify images into wrong classes as the signal-to-noise ratio (SNR) of the image data decreases, yet demand increased computational costs. Overcoming these limitations requires further development of clustering algorithms for high-performance cryo-EM data processing. Here we introduce an unsupervised single-particle clustering algorithm derived from a statistical manifold learning framework called generative topographic mapping (GTM). We show that unsupervised GTM clustering improves classification accuracy by about 40% in the absence of input references for data with lower SNRs. Applications to several experimental datasets suggest that our algorithm can detect subtle structural differences among classes via a hierarchical clustering strategy. After code optimization in a high-performance computing (HPC) environment, our software implementation was able to generate thousands of reference-free class averages within hours in a massively parallel fashion, which allows a significant improvement in ab initio 3D reconstruction and assists in the computational purification of homogeneous datasets for high-resolution visualization.

    Strategy for unsupervised single-particle clustering via statistical manifold learning.

    (A) The fundamental principle of GTM is to establish a numerical relationship between variables in the latent space and a non-Euclidean manifold composed of the Fourier-transformed image data in the data space. The manifold embedding can be determined by a set of nonlinear basis functions and a weighted parametric matrix. The likelihood function for the nonlinear mapping is solved by the expectation-maximization algorithm. (B) The workflow of implementing the unsupervised clustering strategies in ROME is as follows: (I) All images are aligned using MAP2D in a reference-free manner, and are subsequently classified into many groups by unsupervised GTM. (II) The unsupervised classes obtained in step (I) are further classified into many sub-classes by unsupervised GTM in a hierarchical fashion.
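The expectation step mentioned in this caption can be illustrated with a minimal sketch. It assumes the textbook GTM formulation, in which each latent point is mapped to a center in data space and the data are modeled as an equal-weight mixture of isotropic Gaussians sharing one inverse variance `beta`. The function names, the toy vectors, and the rule of assigning each image to its most probable latent point are assumptions made for illustration; they are not taken from ROME, which operates on Fourier-transformed images and may differ in detail.

```python
import math

def responsibilities(data, centers, beta):
    """E-step: posterior probability of each latent point for each data vector.

    data:    list of N data vectors (lists of floats)
    centers: list of K latent points already mapped into data space
    beta:    shared inverse noise variance (assumed isotropic)
    Returns an N x K list of rows, each of which sums to 1.
    """
    result = []
    for x in data:
        # Log of the unnormalized Gaussian density for every center.
        log_p = [-0.5 * beta * sum((xi - yi) ** 2 for xi, yi in zip(x, y))
                 for y in centers]
        # Subtract the maximum before exponentiating for numerical stability.
        m = max(log_p)
        weights = [math.exp(lp - m) for lp in log_p]
        total = sum(weights)
        result.append([w / total for w in weights])
    return result

def assign_classes(resp):
    """Assign each data vector to the latent point with the highest posterior."""
    return [max(range(len(row)), key=row.__getitem__) for row in resp]

# Toy example: two centers and two data vectors (hypothetical values).
toy_centers = [[0.0, 0.0], [1.0, 1.0]]
toy_data = [[0.1, -0.1], [0.9, 1.2]]
toy_resp = responsibilities(toy_data, toy_centers, beta=10.0)
toy_labels = assign_classes(toy_resp)
```

In a full EM loop, these responsibilities would then be used in the maximization step to update the weight matrix and `beta`; that step is omitted here.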

    Initial 3D reconstruction from the reference-free class averages of ROME and EMAN2.

    (A) The initial reconstruction calculated from the ROME-generated class averages is superimposed on the atomic model of free RP shown in a ribbon representation, suggesting that they are highly compatible with each other. (B) The initial reconstruction calculated from the EMAN2-generated class averages is superimposed on the atomic model of free RP shown in a ribbon representation. A substantial part of the atomic model is outside of the density of the initial reconstruction, suggesting poor map quality and a large reconstruction error. (C) FSC curves between the RP atomic model and the initial reconstructions generated by ROME- and EMAN2-based class averages.

    Benchmarking the performance of unsupervised clustering using simulated data.

    (A) A projection of the 70S ribosome model. (B and C) Examples of the simulated images of the 70S ribosome with SNRs of 1/100 (B) and 1/200 (C). The right panel in (B) and (C) shows the low-pass filtered version of each simulated image. (D and F) The normalized histograms show the distributions of angular distances resulting from the five classification methods that were applied to the simulated images with SNRs of 1/100 (panel D) and 1/200 (panel F). (E and G) The sizes of classes were ranked for the five classification methods with SNRs of 1/100 (panel E) and 1/200 (panel G).

    Classification accuracy with one-, two- and three-dimensional latent space in our GTM algorithm.

    (A) Normalized histograms show the angular distances for the one- and two-dimensional latent space under different SNRs. (B) The sizes of classes for different latent space dimensions with varying SNRs. The label ‘GTM_D’ in (A) and (B) represents the number of dimensions. GTM_1D denotes that 500 points in a one-dimensional latent space were sampled in the GTM algorithm. GTM_2D denotes that 100 points in one dimension and 5 points in the other dimension, a total of 500 points, were sampled by the GTM algorithm. GTM_3D denotes that 20 points in the first dimension and 5 points in each of the other two dimensions, giving a total of 500 points, were sampled in the GTM algorithm.
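The three sampling layouts in this caption can be sketched as regular grids. The snippet below is a minimal illustration, assuming evenly spaced points in [0, 1] along each latent dimension; the function name `latent_grid` and the spacing are hypothetical and are not taken from the ROME implementation. It only demonstrates that each layout yields the same total of 500 latent points.

```python
from itertools import product

def latent_grid(points_per_dim):
    """Return a regular grid of latent points as a list of tuples.

    points_per_dim: number of samples along each latent dimension,
    e.g. (100, 5) for the GTM_2D layout. Coordinates are evenly spaced
    in [0, 1]; this spacing is an assumption made for illustration.
    """
    axes = []
    for n in points_per_dim:
        if n == 1:
            axes.append([0.5])  # a single point sits at the midpoint
        else:
            axes.append([i / (n - 1) for i in range(n)])
    # Cartesian product of the per-dimension axes gives the full grid.
    return list(product(*axes))

gtm_1d = latent_grid((500,))      # 500 points in one dimension
gtm_2d = latent_grid((100, 5))    # 100 x 5 = 500 points
gtm_3d = latent_grid((20, 5, 5))  # 20 x 5 x 5 = 500 points
```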

    Performance evaluation of unsupervised clustering with ROME.

    (A) Performance of unsupervised single-particle clustering in ROME versus RELION using different datasets. Unsupervised 2D classification into 300 classes using both software programs was performed on four experimental datasets: dataset1 refers to the 16,306-particle dataset of the inflammasome with 250×250 box size; dataset2 refers to the 35,407-particle dataset of the free RP complex with 160×160 box size; dataset3 refers to the 96,488-particle dataset of the RP-CP complex with 160×160 box size; dataset4 refers to the 57,001-particle dataset of the free RP complex with 180×180 box size. MAP2D alignment in ROME and GTM clustering for 300 classes were also performed. The blue, green, and red histograms represent the running time of RELION, MAP2D in ROME, and GTM in ROME, respectively. For more comparison, see S9 Fig and S1 Table. (B) The 96,488-particle dataset of the RP-CP subcomplex was used to test the performance of GTM in ROME (blue dots). The green dots represent the total running time including both the MAP2D alignment and GTM clustering in ROME. The running time was polynomially related to the number of classes.

    Unsupervised clustering by GTM.

    (A) Typical class averages of inflammasome particles generated by unsupervised GTM clustering in ROME. Red, yellow and green boxes indicate the top views (first row) and the side views (second row) of the 10-, 11-, and 12-fold inflammasome complex, respectively. The side views of the complex structure differ in length. In addition, the purple box denotes the class average of an incomplete inflammasome complex. (B) Typical class averages of RP-CP subcomplexes generated by unsupervised GTM in ROME. The red or yellow boxes indicate a pair of class averages showing differences in local features corresponding to the local movement of the Rpn5 subunit of the RP-CP subcomplex [7]. The green box indicates a pair of class averages showing the movement of the Rpn1 subunit of the RP-CP subcomplex [7]. The purple box labels the class average of the incomplete RP-CP subcomplex. (C) Typical side-view class averages of the inflammasome were initially classified using the MAP2D classifier in a reference-free manner. Two classes among 50 classes visually resemble the 11-fold inflammasome complex particles. (D) The class average highlighted by the red box in panel (C) was further classified by GTM. The red boxes indicate the 11-fold inflammasome particles. The green boxes indicate the 10-fold inflammasome particles that were misclassified by MAP2D into the same class as the rest of the 11-fold structures. The yellow boxes indicate the 12-fold inflammasome particles that were misclassified by MAP2D into the same class as the rest of the 11-fold structures. (E) A 57,001-particle dataset of free RP was initially classified using the MAP2D classifier in a reference-free manner. (F) The class marked by the red box in panel (E) was further classified by GTM in ROME. Several classes of RP-CP subcomplex particles (red boxes) were found to be misclassified into this free RP class.