21 research outputs found

    GPUSCAN++: Efficient Structural Graph Clustering on GPUs

    Structural clustering is one of the most popular graph clustering methods, and its performance has improved greatly through the use of GPUs. Even so, the state-of-the-art GPU-based structural clustering algorithm, GPUSCAN, still suffers from efficiency issues because parallelization introduces substantial extra costs. Moreover, GPUSCAN assumes that the graph is resident in GPU memory. However, GPU memory capacity is currently limited, while many real-world graphs are too large to fit in it, which makes GPUSCAN unable to handle large graphs. Motivated by this, we present a new GPU-based structural clustering algorithm, GPUSCAN++, in this paper. To address the efficiency issue, we propose a new progressive clustering method tailored for GPUs that not only avoids high parallelization costs but also fully exploits the computing resources of GPUs. To address the GPU memory limitation, we propose a partition-based algorithm for structural clustering that can process large graphs with limited GPU memory. We conduct experiments on real graphs, and the experimental results demonstrate that our algorithm achieves up to a 168-fold speedup over the state-of-the-art GPU-based algorithm when the graph can be resident in GPU memory. Moreover, our algorithm scales to large graphs. As an example, it can finish structural clustering on a graph with 1.8 billion edges using less than 2 GB of GPU memory.
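
    The abstract does not define structural clustering itself. As background, the following is a minimal CPU sketch of the structural similarity and core-vertex test used by the classical SCAN formulation. It is an assumption that GPUSCAN++ builds on the same definitions; the names `adj`, `eps`, and `mu` are illustrative and do not come from the paper, and nothing here reflects its GPU or partitioning techniques.

```python
import math

def structural_similarity(adj, u, v):
    """Similarity of u and v over their closed neighborhoods (SCAN-style)."""
    nu = adj[u] | {u}
    nv = adj[v] | {v}
    return len(nu & nv) / math.sqrt(len(nu) * len(nv))

def core_vertices(adj, eps, mu):
    """Vertices with at least mu eps-similar vertices in their closed neighborhood."""
    cores = set()
    for u in adj:
        # The closed neighborhood includes u itself, whose similarity is 1.
        similar = sum(
            1 for v in adj[u] | {u}
            if structural_similarity(adj, u, v) >= eps
        )
        if similar >= mu:
            cores.add(u)
    return cores

# Hypothetical toy graph: two triangles joined by the edge 2-3.
adj = {
    0: {1, 2},
    1: {0, 2},
    2: {0, 1, 3},
    3: {2, 4, 5},
    4: {3, 5},
    5: {3, 4},
}

print(structural_similarity(adj, 0, 1))  # identical closed neighborhoods
print(structural_similarity(adj, 2, 3))  # bridge edge, lower similarity
print(sorted(core_vertices(adj, eps=0.7, mu=3)))
```

    Computing this similarity for every edge is the dominant cost in SCAN-style methods, which is why the choice of parallelization strategy matters.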

    Nonparametric Dynamic Curve Monitoring

    Rapid sequential comparison between the longitudinal pattern of a given subject and a target pattern has become increasingly important in modern scientific research for detecting abnormal activities in many data-rich applications. This article focuses on this problem when observations are collected sequentially with uncorrelated or correlated noise involved. A dynamic monitoring procedure is developed after connecting the curve monitoring problem to curve comparison. Under the framework of generalized likelihood ratio testing, we suggest a new exponentially weighted moving average (EWMA) control chart that can accommodate unequally spaced design points. An adaptive parameter selection feature is built in the proposed control chart so that the chart can detect a wide range of longitudinal pattern shifts effectively. To furnish fast computation, recursive formulas are derived for computing the charting statistic. Numerical studies show that the proposed method can deliver a satisfactory performance, and it outperforms existing methods in various cases. An example from the semiconductor manufacturing industry is used for the illustration of its implementation. Supplementary materials for this article are available online.
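
    For readers unfamiliar with EWMA charts, the sketch below shows the classical scalar EWMA recursion with time-varying control limits. It is not the generalized-likelihood-ratio curve chart proposed in the article; the function name and the parameters `lam` (smoothing weight) and `L` (limit multiplier) are illustrative assumptions, and the in-control mean and standard deviation are assumed known.

```python
import math

def ewma_chart(xs, target, sigma, lam=0.2, L=3.0):
    """Classical EWMA chart: returns the charting statistics and alarm times."""
    z = target            # the chart starts at the in-control mean
    stats, alarms = [], []
    for t, x in enumerate(xs, start=1):
        # Recursive update: only the previous statistic is needed.
        z = lam * x + (1 - lam) * z
        # Exact variance of z at time t for independent observations.
        var = sigma ** 2 * lam / (2 - lam) * (1 - (1 - lam) ** (2 * t))
        stats.append(z)
        if abs(z - target) > L * math.sqrt(var):
            alarms.append(t)
    return stats, alarms

# Hypothetical data: in control for 5 points, then a mean shift of 3 sigma.
xs = [0.0] * 5 + [3.0] * 5
stats, alarms = ewma_chart(xs, target=0.0, sigma=1.0)
print(alarms)
```

    The recursion is what makes sequential monitoring cheap: each new observation updates the statistic in constant time, which is the same motivation the abstract gives for deriving recursive formulas.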

    Fully Dynamic Contraction Hierarchies with Label Restrictions on Road Networks

    In the real world, road networks with weights and labels on edges arise in several application domains. The shortest path query with label restrictions has been receiving increasing attention recently. To answer such queries efficiently, a novel index, namely Contraction Hierarchies with Label Restrictions (CHLR), has been proposed in the literature. However, existing studies mainly focus on static road networks and do not support CHLR maintenance when the road network changes dynamically. Motivated by this, in this paper, we investigate the CHLR maintenance problem in dynamic road networks. We first devise a baseline approach that updates CHLR by recomputing the potentially affected shortcuts. However, many of the shortcuts recomputed by the baseline do not actually change, which leads to unnecessary overhead. To overcome this drawback, we further propose a novel CHLR maintenance algorithm that visits only a few shortcuts along an update propagation chain, with an accuracy guarantee. Moreover, an optimization strategy is presented to further improve the efficiency of index maintenance. Considering the frequency of edge changes, we also propose a batch index maintenance algorithm that can process a large number of edge changes at once. Furthermore, a parallel method is proposed to further accelerate the computation. Extensive and comprehensive experiments are conducted on real road networks. The experimental results demonstrate the efficiency and effectiveness of our proposed algorithms.
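
    To make the query type concrete, here is a small index-free sketch of a label-restricted shortest path query: plain Dijkstra that skips edges whose label is not allowed. It does not implement CHLR or its maintenance; the graph encoding, the label names, and the function name are illustrative assumptions.

```python
import heapq
import math

def label_restricted_distance(graph, source, target, allowed):
    """Shortest distance from source to target using only edges with allowed labels."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            return d
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w, label in graph.get(u, ()):
            if label not in allowed:
                continue  # this edge violates the label restriction
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return math.inf  # target unreachable under the restriction

# Hypothetical road network: adjacency lists of (neighbor, weight, label).
graph = {
    "a": [("b", 1.0, "highway"), ("c", 5.0, "local")],
    "b": [("c", 1.0, "toll")],
}

print(label_restricted_distance(graph, "a", "c", {"highway", "toll", "local"}))
print(label_restricted_distance(graph, "a", "c", {"highway", "local"}))
```

    An index such as CHLR exists to avoid running this kind of search from scratch for every query, at the price of having to keep the index consistent when edges change.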

    A Distribution-Free Multivariate Control Chart

    Monitoring multivariate quality variables or data streams remains an important and challenging problem in statistical process control (SPC). Although multivariate SPC has been extensively studied in the literature, designing distribution-free control schemes is still challenging and has yet to be addressed well. This paper develops a new nonparametric methodology for monitoring location parameters when only a small reference dataset is available. The key idea is to construct a series of conditionally distribution-free test statistics in the sense that their distributions are free of the underlying distribution given the empirical distribution functions. The conditional probability that the charting statistic exceeds the control limit at present, given that there has been no alarm before the current time point, can be guaranteed to attain a specified false alarm rate. The success of the proposed method lies in the use of data-dependent control limits, which are determined online from the observations rather than fixed before monitoring. Our theoretical and numerical studies show that the proposed control chart is able to deliver satisfactory in-control run-length performance for any distribution of any dimension. It is also very efficient in detecting multivariate process shifts when the process distribution is heavy-tailed or skewed. Supplementary materials for this article are available online.

    Finding the Maximum k-Balanced Biclique on Weighted Bipartite Graphs

    Bipartite graphs are widely used to capture the relationships between two types of entities. In bipartite graph analysis, finding the maximum balanced biclique (MBB) is an important problem with numerous applications. A biclique is balanced if its two disjoint vertex sets are of equal size. However, in real-world scenarios, each vertex is associated with a weight that denotes a property such as influence, which yields a weighted bipartite graph. For weighted bipartite graphs, previous studies on MBB are no longer applicable because they ignore weights. To fill the gap, in this paper, we propose a reasonable definition of “balance” by restricting the weight difference between the two sides of a biclique to within k. Given a weighted bipartite graph G and a constraint k, we aim to find the maximum k-balanced biclique (Max k BB) with the maximum weight. To address the problem, we first propose an approach based on biclique enumeration on a single side of G following the Branch-and-Bound framework. To improve the performance, we further devise three optimization strategies to prune invalid search branches. Moreover, we utilize a graph reduction strategy to reduce the redundant search space. Extensive experiments are conducted on 12 real bipartite datasets to demonstrate the efficiency, effectiveness and scalability of our proposed algorithms. The experimental results show that our algorithms can address the MBB detection problem efficiently, and the case study demonstrates the effectiveness of our model compared with the MBB model.
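
    The definition above can be illustrated with a tiny exhaustive search. This is only a sketch of the problem statement, not the Branch-and-Bound algorithm or the pruning strategies of the paper; it assumes that the weight of a biclique is the sum of its vertex weights on both sides, and the data and function names are hypothetical.

```python
from itertools import combinations

def is_k_balanced_biclique(adj, weight, left, right, k):
    """True if left x right is a biclique whose side weights differ by at most k."""
    if not left or not right:
        return False
    complete = all(r in adj[l] for l in left for r in right)
    diff = abs(sum(weight[l] for l in left) - sum(weight[r] for r in right))
    return complete and diff <= k

def brute_force_max_k_balanced(adj, weight, k):
    """Exhaustively find a k-balanced biclique of maximum total weight (tiny graphs only)."""
    lefts = sorted(adj)
    rights = sorted({r for ns in adj.values() for r in ns})
    best, best_w = None, float("-inf")
    for i in range(1, len(lefts) + 1):
        for left in combinations(lefts, i):
            for j in range(1, len(rights) + 1):
                for right in combinations(rights, j):
                    if is_k_balanced_biclique(adj, weight, left, right, k):
                        w = sum(weight[x] for x in left + right)
                        if w > best_w:
                            best, best_w = (left, right), w
    return best, best_w

# Hypothetical weighted bipartite graph: left vertices map to right neighbors.
adj = {"u1": {"v1", "v2"}, "u2": {"v1", "v2", "v3"}}
weight = {"u1": 3, "u2": 4, "v1": 2, "v2": 4, "v3": 5}

print(brute_force_max_k_balanced(adj, weight, k=1))
print(brute_force_max_k_balanced(adj, weight, k=0))
```

    The search space grows exponentially with the number of vertices, which is why enumeration on a single side combined with pruning and graph reduction is needed in practice.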
