    Massively Parallel Algorithms for the Stochastic Block Model

    Learning the community structure of a large-scale graph is a fundamental problem in machine learning, computer science and statistics. We study the problem of exactly recovering the communities in a graph generated from the Stochastic Block Model (SBM) in the Massively Parallel Computation (MPC) model. Specifically, given $kn$ vertices that are partitioned into $k$ equal-sized clusters (i.e., each has size $n$), a graph on these $kn$ vertices is randomly generated such that each pair of vertices is connected with probability $p$ if they are in the same cluster and with probability $q$ if not, where $p > q > 0$. We give MPC algorithms for the SBM in the (very general) $s$-space MPC model, where each machine has memory $s = \Omega(\log n)$. Under the condition that $\frac{p-q}{\sqrt{p}} \geq \tilde{\Omega}(k^{1/2} n^{-1/2 + 1/(2(r-1))})$ for any integer $r \in [3, O(\log n)]$, our first algorithm exactly recovers all the $k$ clusters in $O(kr \log_s n)$ rounds using $\tilde{O}(m)$ total space, or in $O(r \log_s n)$ rounds using $\tilde{O}(km)$ total space. If $\frac{p-q}{\sqrt{p}} \geq \tilde{\Omega}(k^{3/4} n^{-1/4})$, our second algorithm achieves $O(\log_s n)$ rounds and $\tilde{O}(m)$ total space complexity. Both algorithms significantly improve upon a recent result of Cohen-Addad et al. [PODC'22], who gave algorithms that only work in the sublinear space MPC model, where each machine has local memory $s = O(n^{\delta})$ for some constant $\delta > 0$, with a much stronger condition on $p, q, k$. Our algorithms are based on collecting the $r$-step neighborhood of each vertex and comparing the difference of some statistical information generated from the local neighborhoods for each pair of vertices. To implement the clustering algorithms in parallel, we present efficient approaches for implementing some basic graph operations in the $s$-space MPC model.
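
    To make the neighborhood-comparison idea concrete, the following minimal single-machine Python sketch generates an SBM graph, collects each vertex's $r$-step neighborhood by BFS, and greedily groups vertices whose neighborhoods overlap strongly. It only illustrates why the local-neighborhood statistic is discriminative; the Jaccard-overlap test, the threshold, and the toy parameters are our assumptions, and nothing here reproduces the paper's MPC round or space guarantees.

    import random
    from collections import deque

    def sbm_graph(k, n, p, q):
        """Adjacency lists for an SBM with k equal clusters of size n:
        intra-cluster edge probability p, inter-cluster probability q."""
        N = k * n
        adj = [[] for _ in range(N)]
        for u in range(N):
            for v in range(u + 1, N):
                if random.random() < (p if u // n == v // n else q):
                    adj[u].append(v)
                    adj[v].append(u)
        return adj

    def r_ball(adj, src, r):
        """Vertices within distance r of src (plain BFS); this set is the
        'local neighborhood' whose statistics are compared across vertices."""
        seen, queue = {src}, deque([(src, 0)])
        while queue:
            u, d = queue.popleft()
            if d < r:
                for v in adj[u]:
                    if v not in seen:
                        seen.add(v)
                        queue.append((v, d + 1))
        return seen

    def cluster_by_overlap(adj, r, thresh):
        """Greedy clustering: a vertex joins the first representative whose
        r-ball overlaps its own by more than thresh (Jaccard similarity)."""
        balls = [r_ball(adj, u, r) for u in range(len(adj))]
        reps, label = [], [-1] * len(adj)
        for u in range(len(adj)):
            for c, rep in enumerate(reps):
                if len(balls[u] & balls[rep]) / len(balls[u] | balls[rep]) > thresh:
                    label[u] = c
                    break
            if label[u] == -1:
                reps.append(u)
                label[u] = len(reps) - 1
        return label

    random.seed(0)
    k, n = 3, 80
    labels = cluster_by_overlap(sbm_graph(k, n, p=0.6, q=0.05), r=1, thresh=0.15)
    # Ground truth for vertex u is u // n; at this dense toy gap, r = 1 already
    # separates the clusters (the paper's sparse regime is what needs larger r).
    print(len(set(labels)) == k and
          all(labels[u] == labels[(u // n) * n] for u in range(k * n)))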

    SAM-PARSER: Fine-tuning SAM Efficiently by Parameter Space Reconstruction

    Segment Anything Model (SAM) has received remarkable attention as it offers a powerful and versatile solution for object segmentation in images. However, fine-tuning SAM for downstream segmentation tasks under different scenarios remains a challenge, as the varied characteristics of different scenarios naturally require diverse model parameter spaces. Most existing fine-tuning methods attempt to bridge the gaps among different scenarios by introducing a set of new parameters to modify SAM's original parameter space. Unlike these works, in this paper, we propose fine-tuning SAM efficiently by parameter space reconstruction (SAM-PARSER), which introduces nearly zero trainable parameters during fine-tuning. In SAM-PARSER, we assume that SAM's original parameter space is relatively complete, so that its bases are able to reconstruct the parameter space of a new scenario. We obtain the bases by matrix decomposition, and fine-tune the coefficients to reconstruct the parameter space tailored to the new scenario by an optimal linear combination of the bases. Experimental results show that SAM-PARSER exhibits superior segmentation performance across various scenarios, while reducing the number of trainable parameters by $\approx 290$ times compared with current parameter-efficient fine-tuning methods.
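
    The mechanism described above reduces to a compact recipe: decompose a pretrained weight matrix into bases, freeze the bases, and train only the coefficients of their linear combination. The PyTorch module below is a hedged sketch of that recipe, assuming SVD as the matrix decomposition and a plain linear layer as the target; the class name and shapes are illustrative, not the authors' released code.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SVDReconstructedLinear(nn.Module):
        """Linear layer whose weight is rebuilt as U @ diag(s) @ Vh, with the
        bases U and Vh frozen and only the coefficient vector s trainable."""
        def __init__(self, pretrained_weight: torch.Tensor):
            super().__init__()
            U, s, Vh = torch.linalg.svd(pretrained_weight, full_matrices=False)
            self.register_buffer("U", U)      # frozen basis (left)
            self.register_buffer("Vh", Vh)    # frozen basis (right)
            self.s = nn.Parameter(s.clone())  # the only trainable parameters

        def forward(self, x):
            W = self.U @ torch.diag(self.s) @ self.Vh  # reconstructed weight
            return F.linear(x, W)

    # A 768x768 projection has 589,824 weights, but only 768 coefficients are
    # trained -- the rough source of the parameter savings the abstract reports.
    layer = SVDReconstructedLinear(torch.randn(768, 768))
    print(sum(p.numel() for p in layer.parameters() if p.requires_grad))  # 768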

    USAGE: A Unified Seed Area Generation Paradigm for Weakly Supervised Semantic Segmentation

    Seed area generation is usually the starting point of weakly supervised semantic segmentation (WSSS). Computing the Class Activation Map (CAM) from a multi-label classification network is the de facto paradigm for seed area generation, but CAMs generated from Convolutional Neural Networks (CNNs) and Transformers are prone to be under- and over-activated, respectively, which makes the strategies to refine CAMs for CNNs usually inappropriate for Transformers, and vice versa. In this paper, we propose a Unified optimization paradigm for Seed Area GEneration (USAGE) for both types of networks, in which the objective function to be optimized consists of two terms: one is a generation loss, which controls the shape of seed areas by a temperature parameter following a deterministic principle for different types of networks; the other is a regularization loss, which ensures the consistency between the seed areas generated by self-adaptive network adjustment from different views, to overturn false activation in seed areas. Experimental results show that USAGE consistently improves seed area generation for both CNNs and Transformers by large margins, e.g., outperforming state-of-the-art methods by 4.1% mIoU on PASCAL VOC. Moreover, based on the USAGE-generated seed areas on Transformers, we achieve state-of-the-art WSSS results on both PASCAL VOC and MS COCO.
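
    As a rough illustration of the two-term objective described above, the sketch below combines a generation loss whose temperature scales the class scores (and hence the sharpness of the seed areas) with a regularization loss that penalizes disagreement between seed areas computed from two views of the same image. USAGE's exact loss forms, temperature schedule, and self-adaptive network adjustment are not specified in the abstract, so every concrete choice here (sigmoid CAMs, L1 consistency) is an assumption.

    import torch
    import torch.nn.functional as F

    def two_term_seed_loss(class_logits, cam_view1, cam_view2, labels, temperature):
        """Hypothetical USAGE-style objective: generation + regularization."""
        # Generation term: multi-label classification on temperature-scaled
        # scores; a lower temperature sharpens (shrinks) the seed areas.
        gen = F.binary_cross_entropy_with_logits(class_logits / temperature, labels)
        # Regularization term: seed areas from the two views should agree,
        # suppressing activations that only one view produces.
        reg = (cam_view1.sigmoid() - cam_view2.sigmoid()).abs().mean()
        return gen + reg

    # Toy shapes: batch of 2 images, 20 classes, 32x32 activation maps.
    logits = torch.randn(2, 20)
    cam_a, cam_b = torch.randn(2, 20, 32, 32), torch.randn(2, 20, 32, 32)
    labels = torch.randint(0, 2, (2, 20)).float()
    print(two_term_seed_loss(logits, cam_a, cam_b, labels, temperature=0.5))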

    Absolute Wrong Makes Better: Boosting Weakly Supervised Object Detection via Negative Deterministic Information

    Weakly supervised object detection (WSOD) is a challenging task, in which image-level labels (e.g., categories of the instances in the whole image) are used to train an object detector. Many existing methods follow the standard multiple instance learning (MIL) paradigm and have achieved promising performance. However, the lack of deterministic information leads to part domination and missing instances. To address these issues, this paper focuses on identifying and fully exploiting the deterministic information in WSOD. We discover that negative instances (i.e., absolutely wrong instances), ignored in most previous studies, normally contain valuable deterministic information. Based on this observation, we propose a negative deterministic information (NDI) based method for improving WSOD, namely NDI-WSOD. Specifically, our method consists of two stages: NDI collecting and exploiting. In the collecting stage, we design several processes to identify and distill the NDI from negative instances online. In the exploiting stage, we utilize the extracted NDI to construct a novel negative contrastive learning mechanism and a negative-guided instance selection strategy for dealing with the issues of part domination and missing instances, respectively. Experimental results on several public benchmarks, including VOC 2007, VOC 2012 and MS COCO, show that our method achieves satisfactory performance. Comment: 7 pages, 5 figures.
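
    The negative contrastive mechanism sketched in the abstract can be illustrated in a few lines: keep a running prototype of features from instances known to be absolutely wrong, and penalize proposal features for resembling it. The EMA update, cosine similarity, and softplus penalty below are our assumptions, not the paper's exact formulation.

    import torch
    import torch.nn.functional as F

    def update_ndi_prototype(prototype, neg_feats, momentum=0.99):
        """EMA over features of confidently wrong (negative) instances."""
        return momentum * prototype + (1 - momentum) * neg_feats.mean(dim=0)

    def negative_contrastive_loss(proposal_feats, ndi_prototype, tau=0.1):
        """Push proposal features away from the 'known wrong' prototype by
        penalizing high cosine similarity to it."""
        sim = F.cosine_similarity(proposal_feats, ndi_prototype.unsqueeze(0), dim=1)
        return F.softplus(sim / tau).mean()

    feats = F.normalize(torch.randn(8, 128), dim=1)   # 8 proposal features
    proto = F.normalize(torch.randn(128), dim=0)      # current NDI prototype
    proto = update_ndi_prototype(proto, feats[:2])    # fold in two negatives
    print(negative_contrastive_loss(feats, proto))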

    A Survey on Label-efficient Deep Image Segmentation: Bridging the Gap between Weak Supervision and Dense Prediction

    The rapid development of deep learning has driven great progress in image segmentation, one of the fundamental tasks of computer vision. However, current segmentation algorithms mostly rely on the availability of pixel-level annotations, which are often expensive, tedious, and laborious. To alleviate this burden, the past years have witnessed increasing attention to building label-efficient, deep-learning-based image segmentation algorithms. This paper offers a comprehensive review of label-efficient image segmentation methods. To this end, we first develop a taxonomy to organize these methods according to the supervision provided by different types of weak labels (including no supervision, inexact supervision, incomplete supervision and inaccurate supervision), supplemented by the types of segmentation problems (including semantic segmentation, instance segmentation and panoptic segmentation). Next, we summarize the existing label-efficient image segmentation methods from a unified perspective that discusses an important question: how to bridge the gap between weak supervision and dense prediction; the current methods are mostly based on heuristic priors, such as cross-pixel similarity, cross-label constraint, cross-view consistency, and cross-image relation. Finally, we share our opinions about future research directions for label-efficient deep image segmentation. Comment: Accepted to IEEE TPAMI.

    Effects of deformation temperature on edge crack characteristics and mechanical properties of as-cast aluminum alloy

    In this study, the rolling of aluminum alloy was investigated, and the effects of deformation temperature on the edge cracks and mechanical properties of a high-magnesium aluminum alloy were studied through hot compression experiments. Based on these tests, DEFORM-3D software was used to optimize the selection of the experimental conditions. The results suggest that the crack length of the as-cast aluminum alloy samples decreased with increasing temperature when the deformation temperature was between 300 °C and 450 °C, and that the tensile strength and elongation after fracture increased with increasing temperature when the deformation temperature was between 300 °C and 500 °C. It is therefore concluded that cracking in high-magnesium aluminum alloy can be reduced by controlling the deformation temperature, which provides a route for optimizing aluminum alloy rolling.

    Phenolic compounds weaken the impact of drought on soil enzyme activity in global wetlands

    Soil enzymes play a central role in carbon and nutrient cycling, and their activities can be affected by drought-induced oxygen exposure. However, a systematic global estimate of enzyme sensitivity to drought in wetlands is still lacking. Through a meta-analysis of 55 studies comprising 761 paired observations, this study found that phosphorus-related enzyme activity increased by 38% as a result of drought in wetlands, while the majority of other soil enzyme activities remained stable. The expansion of vascular plants under long-term drought significantly promoted the accumulation of phenolic compounds. Using a 2-week incubation experiment with phenol supplementation, we found that phosphorus-related enzymes could tolerate a higher biotoxicity of phenolic compounds than other enzymes. Moreover, a long-term (35-year) drainage experiment in a northern peatland in China confirmed that the increased phenolic concentration in the surface layer, resulting from a shift in vegetation composition, inhibited the increase in enzyme activities caused by rising oxygen availability, except for phosphorus-related enzymes. Overall, these results demonstrate the complex and resilient nature of wetland ecosystems, with soil enzymes showing a high degree of adaptation to drought conditions. These new insights could help evaluate the impact of drought on future wetland ecosystem services and provide a theoretical foundation for the remediation of degraded wetlands.