18 research outputs found

    Fast Computation Techniques for Personalized PageRank on Large Graphs

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2020. 8. 이상구.
    Computation of Personalized PageRank (PPR) in graphs is an important function that is widely utilized in myriad application domains such as search, recommendation, and knowledge discovery. Because the computation of PPR is an expensive process, a good number of innovative and efficient algorithms for computing PPR have been developed. However, efficient computation of PPR within very large graphs with millions of nodes or more is still an open problem. Moreover, previously proposed algorithms cannot handle updates efficiently, severely limiting their capability of handling dynamic graphs. In this paper, we present a fast converging algorithm that guarantees high and controlled precision. We improve the convergence rate of the traditional Power Iteration method by adopting successive over-relaxation and initial guess revision, a vector reuse strategy. The proposed method vastly improves on the traditional Power Iteration in terms of convergence rate and computation time, while retaining its simplicity and strictness. Since it can reuse previously computed vectors for refreshing PPR vectors, its update performance is also greatly enhanced. Also, since the algorithm halts as soon as it reaches a given error threshold, we can flexibly control the trade-off between accuracy and time, a feature lacking in both sampling-based approximation methods and fully exact methods. Experiments show that the proposed algorithm is at least 20 times faster than the Power Iteration and outperforms other state-of-the-art algorithms.
    1 Introduction
    2 Preliminaries: Personalized PageRank
        2.1 Random Walk, PageRank, and Personalized PageRank
            2.1.1 Basics on Random Walk
            2.1.2 PageRank
            2.1.3 Personalized PageRank
        2.2 Characteristics of Personalized PageRank
        2.3 Applications of Personalized PageRank
        2.4 Previous Work on Personalized PageRank Computation
            2.4.1 Basic Algorithms
            2.4.2 Enhanced Power Iteration
            2.4.3 Bookmark Coloring Algorithm
            2.4.4 Dynamic Programming
            2.4.5 Monte-Carlo Sampling
            2.4.6 Enhanced Direct Solving
        2.5 Summary
    3 Personalized PageRank Computation with Initial Guess Revision
        3.1 Initial Guess Revision and Relaxation
        3.2 Finding Optimal Weight of Successive Over Relaxation for PPR
        3.3 Initial Guess Construction Algorithm for Personalized PageRank
    4 Fully Personalized PageRank Algorithm with Initial Guess Revision
        4.1 FPPR with IGR
        4.2 Optimization
        4.3 Experiments
    5 Personalized PageRank Query Processing with Initial Guess Revision
        5.1 PPR Query Processing with IGR
        5.2 Optimization
        5.3 Experiments
    6 Conclusion
    Bibliography
    Appendix
    Abstract (In Korean)
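The abstract above describes Power Iteration that halts at an error threshold and can start from a previously computed vector. The following is a minimal sketch of that general idea only, not the thesis's algorithm: it omits successive over-relaxation, and the function name `personalized_pagerank`, its parameters, the dangling-node rule, and the toy graph are all assumptions made for illustration.

```python
def personalized_pagerank(out_links, seed, alpha=0.15, tol=1e-10,
                          init=None, max_iter=10_000):
    """Power iteration for a single-seed PPR vector (illustrative sketch).

    out_links: dict mapping every node to a list of its successors.
    seed:      node the random walk restarts at with probability alpha.
    init:      optional starting vector, for example a PPR vector that
               was computed before the graph changed.
    Returns (ppr_vector, iterations_used).
    """
    nodes = list(out_links)
    if init is None:
        x = {v: 1.0 if v == seed else 0.0 for v in nodes}
    else:
        x = {v: init.get(v, 0.0) for v in nodes}
    for iteration in range(1, max_iter + 1):
        nxt = {v: 0.0 for v in nodes}
        nxt[seed] = alpha  # mass contributed by restarting at the seed
        for u in nodes:
            walk_mass = (1.0 - alpha) * x[u]
            successors = out_links[u]
            if successors:
                share = walk_mass / len(successors)
                for v in successors:
                    nxt[v] += share
            else:
                # Assumed rule for dangling nodes: jump back to the seed.
                nxt[seed] += walk_mass
        # L1 change between successive iterates is the stopping criterion.
        change = sum(abs(nxt[v] - x[v]) for v in nodes)
        x = nxt
        if change < tol:
            return x, iteration
    return x, max_iter


graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a"]}
old_vector, _ = personalized_pagerank(graph, "a")

graph["b"] = ["c", "a"]  # the graph is updated with a new edge b -> a
cold, cold_iters = personalized_pagerank(graph, "a")
warm, warm_iters = personalized_pagerank(graph, "a", init=old_vector)
```

Both runs converge to the same vector because the update is a contraction with factor `1 - alpha`. Comparing `cold_iters` with `warm_iters` shows how much a reused vector helps on a given update; a looser `tol` trades accuracy for time.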

    Graph Spectral Image Processing

    The recent advent of graph signal processing (GSP) has spurred intensive studies of signals that live naturally on irregular data kernels described by graphs (e.g., social networks, wireless sensor networks). Though a digital image contains pixels that reside on a regularly sampled 2D grid, if one can design an appropriate underlying graph connecting pixels with weights that reflect the image structure, then one can interpret the image (or image patch) as a signal on a graph, and apply GSP tools for processing and analysis of the signal in the graph spectral domain. In this article, we survey recent graph spectral techniques in GSP specifically for image and video processing. The topics covered include image compression, image restoration, image filtering, and image segmentation.
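To make the idea of a pixel graph with structure-aware weights concrete, here is a small sketch under stated assumptions. It treats one row of pixels as a path graph, uses a Gaussian weight on intensity differences (an assumed but common choice), and applies a simple polynomial low-pass graph filter. The function name `graph_lowpass` and all parameter values are hypothetical and are not taken from the article.

```python
import math


def graph_lowpass(signal, sigma=0.3, step=0.25, iterations=50):
    """Smooth a 1-D row of pixels with a low-pass filter on a pixel graph.

    Neighbouring pixels i and i+1 are joined by an edge whose weight
    exp(-(difference / sigma)^2) is close to 1 inside smooth regions and
    close to 0 across strong intensity jumps.
    """
    n = len(signal)
    weights = [math.exp(-((signal[i] - signal[i + 1]) / sigma) ** 2)
               for i in range(n - 1)]
    x = list(signal)
    for _ in range(iterations):
        # Apply the graph Laplacian L: (L x)[i] = sum_j w_ij * (x[i] - x[j]).
        lx = [0.0] * n
        for i, w in enumerate(weights):
            diff = w * (x[i] - x[i + 1])
            lx[i] += diff
            lx[i + 1] -= diff
        # One step of x <- (I - step * L) x damps high graph frequencies.
        x = [xi - step * li for xi, li in zip(x, lx)]
    return x


row = [0.0, 0.1, -0.1, 0.05, 1.0, 0.9, 1.1, 0.95]  # noisy step edge
smoothed = graph_lowpass(row)
```

Because the edge between the fourth and fifth pixels has a near-zero weight, the filter flattens the noise on each side while leaving the step largely intact. With weights at most 1 and at most two neighbours per pixel, `step=0.25` keeps the iteration stable.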

    A brief survey of visual saliency detection


    Rethinking auto-colourisation of natural Images in the context of deep learning

    Auto-colourisation is the ill-posed problem of creating a plausible full-colour image from a grey-scale prior. The current state of the art utilises image-to-image Generative Adversarial Networks (GANs). The standard method for training colourisation is to reformulate RGB images into a luminance prior and a two-channel chrominance supervisory signal. However, progress in auto-colourisation is inherently limited by multiple prerequisite dilemmas, where unsolved problems are mutual prerequisites. This thesis advances the field of colourisation on three fronts: architecture, measures, and data. Two changes are recommended to common GAN colourisation architectures. First, batch normalisation should be removed from the discriminator so that it can learn the primary statistics of plausible colour images. Second, the direct L1 loss on the generator should be eliminated, as L1 limits the discovery of the plausible colour manifold. The lack of an objective measure of plausible colourisation necessitates resource-intensive human evaluation and repurposed objective measures from other fields. There is no consensus on the best objective measure due to a knowledge gap regarding how well objective measures model the mean human opinion of plausible colourisation. An extensible data set of human-evaluated colourisations, the Human Evaluated Colourisation Dataset (HECD), is presented. The results from this dataset are compared to the commonly used objective measures and uncover a poor correlation between the objective measures and mean human opinion. The HECD can be used to assess the appropriateness of objective measures proposed in the future. An interactive tool supplied with the HECD allows for a first exploration of the space of plausible colourisation. Finally, it is shown that the luminance channel is not representative of the legacy black-and-white images that will be presented to models when deployed; this leads to out-of-distribution errors in all three channels of the final colour image. A novel technique is proposed to simulate priors that match any black-and-white media for which the spectral response is known.

    A picture is worth a thousand words : content-based image retrieval techniques

    In my dissertation I investigate techniques for improving the state of the art in content-based image retrieval. To place my work into context, I highlight the current trends and challenges in my field by analyzing over 200 recent articles. Next, I propose a novel paradigm called "artificial imagination", which gives the retrieval system the power to imagine and think along with the user in terms of what she is looking for. I then introduce a new user interface for visualizing and exploring image collections, empowering the user to navigate large collections based on her own needs and preferences, while simultaneously providing her with an accurate sense of what the database has to offer. In the later chapters I present work dealing with millions of images and focus in particular on high-performance techniques that minimize memory and computational use for both near-duplicate image detection and web search. Finally, I show early work on a scene completion-based image retrieval engine, which synthesizes realistic imagery that matches what the user has in mind.
    LEI Universiteit Leiden; NWO; Imagin

    Edge-preserving colorization using data-driven random walks with restart

    In this paper, we consider the colorization problem of grayscale images in which some color scribbles are initially given. Our proposed method is based on the weighted color blending of the scribbles. Unlike previous works, which utilize the shortest distance as the blending weights, we employ a new intrinsic distance measure based on Random Walks with Restart (RWR), known as a very successful technique for defining the relevance between two nodes in a graph. In our work, we devise a new modified data-driven RWR framework that can incorporate locally adaptive and data-driven restarting probabilities. In this new framework, the restarting probability of each pixel becomes dependent on its edgeness, generated by the Canny detector. Since this data-driven RWR enforces color consistency in the areas bounded by the edges, it produces more reliable edge-preserving colorization results that are less sensitive to the size and position of each scribble. Moreover, if additional information about the scribbles that indicate the foreground object is available, our method can be readily applied to object segmentation and matting. Experiments on several synthetic, cartoon, and natural images demonstrate that our method achieves much higher-quality colorization results than the state-of-the-art methods. Index Terms: Data-Driven Random Walks with Restart, color blending, edge-preserving colorization
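The role of an edge-dependent restart probability can be illustrated with a toy example. This is a rough sketch under assumptions, not the paper's formulation: pixels form a 1-D chain, the restart probability is an assumed linear function of a given edgeness value (the paper derives edgeness from the Canny detector), and the names `rwr_scores`, `blend`, `base_restart`, and `edge_gain` are hypothetical.

```python
def rwr_scores(edgeness, seed, base_restart=0.01, edge_gain=0.9,
               iterations=2000):
    """Stationary visiting probabilities of a walker that restarts at seed.

    At pixel j the walker jumps back to the seed with probability
    restart[j], which grows with edgeness[j]; otherwise it moves to a
    uniformly chosen neighbour on the chain.
    """
    n = len(edgeness)
    restart = [min(0.99, base_restart + edge_gain * e) for e in edgeness]
    r = [0.0] * n
    r[seed] = 1.0
    for _ in range(iterations):
        nxt = [0.0] * n
        for j in range(n):
            nxt[seed] += restart[j] * r[j]
            neighbours = [k for k in (j - 1, j + 1) if 0 <= k < n]
            move = (1.0 - restart[j]) * r[j] / len(neighbours)
            for k in neighbours:
                nxt[k] += move
        r = nxt
    return r


def blend(edgeness, scribbles):
    """Blend scribble colours per pixel, weighted by normalised RWR scores.

    scribbles: dict mapping a pixel index to its (r, g, b) scribble colour.
    """
    scores = {s: rwr_scores(edgeness, s) for s in scribbles}
    colours = []
    for i in range(len(edgeness)):
        total = sum(scores[s][i] for s in scribbles)
        colours.append(tuple(
            sum(scores[s][i] * scribbles[s][c] for s in scribbles) / total
            for c in range(3)))
    return colours


edgeness = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # strong edge at pixel 3
scribbles = {0: (1.0, 0.0, 0.0), 6: (0.0, 0.0, 1.0)}  # red left, blue right
colours = blend(edgeness, scribbles)
```

A walker that reaches the edge pixel almost always restarts, so little relevance leaks across the edge: pixels left of it come out mostly red and pixels right of it mostly blue, which is the edge-preserving behaviour the abstract describes.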

    Analyzing Granger causality in climate data with time series classification methods

    Attribution studies in climate science aim to scientifically ascertain the influence of climatic variations on natural or anthropogenic factors. Many of those studies adopt the concept of Granger causality to infer statistical cause-effect relationships, while utilizing traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences are observed among the methods that were tested.

    19th SC@RUG 2022 proceedings 2021-2022

