1,379 research outputs found

    Arithmetic and Boolean Operations on Recursively Run-Length Compressed Natural Numbers

    We study arithmetic properties of a new tree-based canonical number representation, recursively run-length compressed natural numbers, defined by recursively applying a run-length encoding to their binary digits. We design arithmetic and boolean operations on recursively run-length compressed natural numbers that work one block of digits at a time and are limited only by the representation complexity of their operands rather than by their bitsizes. As a result, operations on very large numbers exhibiting a regular structure become tractable. In addition, we ensure that the average complexity of our operations stays within constant factors of the usual arithmetic operations on binary numbers. Arithmetic operations on our recursively run-length compressed numbers are specified as pattern-directed recursive equations, made executable using a purely declarative subset of the functional language Haskell.
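
    As a rough illustration of the idea (a minimal Python sketch, not the paper's Haskell constructors; all names are invented here), the code below run-length encodes the binary digits of a natural number and compresses each run length recursively, so numbers with a regular digit structure such as 2^100 - 1 collapse to very small trees:

        def to_rrl(n):
            """Recursively run-length compress n's binary digits (None encodes 0)."""
            if n == 0:
                return None
            bits = bin(n)[2:]                        # most-significant bit first; starts with '1'
            runs, i = [], 0
            while i < len(bits):
                j = i
                while j < len(bits) and bits[j] == bits[i]:
                    j += 1
                runs.append(j - i)                   # length of this maximal run of equal digits
                i = j
            return [to_rrl(r - 1) for r in runs]     # compress length-1 so the recursion terminates

        def from_rrl(t):
            """Inverse of to_rrl, used only to check that the round trip is lossless."""
            if t is None:
                return 0
            s, bit = '', '1'                         # runs alternate, starting from the leading '1'
            for r in t:
                s += bit * (from_rrl(r) + 1)
                bit = '0' if bit == '1' else '1'
            return int(s, 2)

        for n in [0, 1, 6, 2**100 - 1, 2**64 + 1]:
            assert from_rrl(to_rrl(n)) == n
        print(to_rrl(2**100 - 1))                    # a single run of 100 ones collapses to a tiny tree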

    Visualizing and Understanding Sum-Product Networks

    Sum-Product Networks (SPNs) are recently introduced deep tractable probabilistic models in which several kinds of inference queries can be answered exactly in tractable time. Up to now, they have largely been used as black-box density estimators, assessed only by comparing their likelihood scores. In this paper we explore and exploit the inner representations learned by SPNs. We do this with a threefold aim: first, we want to gain a better understanding of the inner workings of SPNs; second, we seek additional ways to evaluate one SPN model and compare it against other probabilistic models, providing diagnostic tools to practitioners; last, we want to empirically evaluate how good and meaningful the extracted representations are, as in a classic Representation Learning framework. To do so, we revisit their interpretation as deep neural networks and propose to exploit several visualization techniques on their node activations and network outputs under different types of inference queries. To investigate these models as feature extractors, we plug some SPNs, learned in a greedy unsupervised fashion on image datasets, into supervised classification tasks. We extract several embedding types from node activations by filtering nodes by their type, by their associated feature abstraction level, and by their scope. In a thorough empirical comparison we show them to be competitive against those generated by popular feature extractors such as Restricted Boltzmann Machines. Finally, we investigate embeddings generated from random probabilistic marginal queries as a means to compare other tractable probabilistic models on common ground, extending our experiments to Mixtures of Trees. Comment: Machine Learning Journal paper (First Online), 24 pages.
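
    To make the feature-extraction step concrete, here is a minimal hand-built illustration (not the paper's pipeline or any SPN library; the structure, weights and names are invented for this example): a toy SPN over two binary variables is evaluated bottom-up and the activations of all its nodes are collected into an embedding vector:

        import numpy as np

        class Leaf:                                   # Bernoulli leaf over one binary variable
            def __init__(self, var, p):
                self.var, self.p = var, p
            def value(self, x, trace):
                v = self.p if x[self.var] == 1 else 1.0 - self.p
                trace.append(v)
                return v

        class Product:
            def __init__(self, children):
                self.children = children
            def value(self, x, trace):
                v = float(np.prod([c.value(x, trace) for c in self.children]))
                trace.append(v)
                return v

        class Sum:
            def __init__(self, children, weights):
                self.children, self.weights = children, np.asarray(weights)
            def value(self, x, trace):
                v = float(self.weights @ [c.value(x, trace) for c in self.children])
                trace.append(v)
                return v

        # toy SPN over two binary variables: a mixture of two product distributions
        spn = Sum([Product([Leaf(0, 0.9), Leaf(1, 0.2)]),
                   Product([Leaf(0, 0.1), Leaf(1, 0.7)])],
                  weights=[0.6, 0.4])

        def embed(x):
            """Evaluate the SPN bottom-up on x and return all node activations as a feature vector."""
            trace = []
            spn.value(x, trace)
            return np.array(trace)

        print(embed([1, 0]))                          # one activation per node; usable as classifier features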

    Memory Layout and Computing Techniques for High-Performance Artificial Neural Networks

    Thesis (Ph.D.) -- Seoul National University Graduate School : College of Engineering, Department of Electrical and Computer Engineering, 2021. 2. Taewhan Kim.
    Although the demand for exploiting neural networks is steadily increasing, there are many design challenges, since deep neural networks (DNNs) entail excessive memory and computation cost. This dissertation studies a number of new techniques for effectively processing DNN inference operations. Firstly, we attempt to overcome the limitation that the maximal computation speedup is bounded by the total number of non-zero bits of the weights. Precisely, this work, based on signed-digit encoding, (1) proposes a transformation technique which converts the two's complement representation of every weight into a set of signed-digit representations with the minimal number of essential bits, (2) formulates the problem of selecting signed-digit representations of weights that maximize the parallelism of bit-level multiplication on the weights, i.e., that achieve maximal digit-index-by-digit-index (column-wise) compression of the weights, as a multi-objective shortest path problem and solves it efficiently using an approximation algorithm, and (3) proposes a novel supporting acceleration architecture (DWP) that requires no additional non-trivial hardware. In addition, we (4) propose a variant of DWP to support bit-level parallel multiplication with the capability of predicting a tight worst-case latency of the parallel processing. Through experiments on several representative models using the ImageNet dataset, it is shown that our proposed approach is able to reduce the number of essential bits by 69% on AlexNet, 74% on VGG-16, and 68% on ResNet-152, by which our accelerator is able to reduce the inference computation time by up to 3.57x over the conventional bit-level weight pruning. Secondly, a new algorithm is presented for extracting common kernels and convolutions to maximally eliminate the redundant operations among the convolutions in binary- and ternary-weight convolutional neural networks. Specifically, we propose (1) a new common kernel extraction algorithm that overcomes the local and limited exploration of common kernel candidates by the existing method, and subsequently apply (2) a new concept of common convolution extraction to maximally eliminate the redundancy in the convolution operations. In addition, our algorithm is able to (3) minimize the number of resulting kernels for the convolutions, thereby saving the total memory access latency for kernels.
    Experimental results on the ternary-weight VGG-16 demonstrate that our convolution optimization algorithm is very effective: it reduces the total number of operations for all convolutions by 25.8-26.3%, thereby reducing the total number of execution cycles on the hardware platform by 22.4%, while using 2.7-3.8% fewer kernels than the convolutions utilizing the common kernels extracted by the state-of-the-art algorithm. Finally, to maintain accuracy, we propose solutions for DNNs with unfitted compression, in which all the distinct weights of the compressed DNN cannot be entirely contained in on-chip memory. Precisely, given an access sequence of weights, (1) the first problem is to arrange the weights in off-chip memory so that the number of memory accesses to the off-chip memory (equivalently, the energy consumed by the accesses) is minimized, and (2) the second problem is to devise a strategy of selecting the weight block in on-chip memory to be replaced when a block miss occurs, with the objective of minimizing the total energy consumed by the off-chip memory accesses and by the overhead of scanning indexes for block replacement. Through experiments with a compressed AlexNet model, it is shown that our solutions are able to reduce the total energy consumption of the off-chip memory accesses, including the scanning overhead, by 34.2% on average over the use of an unoptimized memory layout and the LRU replacement scheme.
    Table of contents
    1 Introduction
      1.1 Deep Neural Networks and Its Challenges
      1.2 Redundant Weight Elimination Methods in DNN
      1.3 Redundant Representation Elimination Methods in DNN
      1.4 Contributions of This Dissertation
    2 Bit-level Weight Pruning Techniques for High-Performance Neural Networks
      2.1 Preliminary
        2.1.1 Bit-level Weight Pruning in Binary Representation
        2.1.2 Bit-level Weight Pruning in Signed-digit Representation
        2.1.3 CSD Representation Conversion
      2.2 Motivations
        2.2.1 Inefficiency in Two's Complement Representation
        2.2.2 Inability to Exploit Signed-digit Representation
      2.3 Signed-digit Representation-based Deeper Weight Pruning
        2.3.1 Generating Signed-digit Representations
        2.3.2 Selecting Signed-digit Representations for Maximal Parallelism
        2.3.3 Extension to the Low-precision Weights
      2.4 Supporting Hardware Architecture
        2.4.1 Technique for Using a Single Bit to Encode Ternary Value
        2.4.2 Structure of Supporting Architecture
        2.4.3 Memory Analysis
        2.4.4 Full Utilization of Accumulation Adders
        2.4.5 Modification for Hybrid Approach
      2.5 Bit-level Intra-weight Pruning
        2.5.1 Signed-digit Representation Conversion
        2.5.2 Encoding Technique
        2.5.3 Supporting Hardware Architecture
      2.6 Experimental Results
        2.6.1 Essential Bits
        2.6.2 Memory Usage
        2.6.3 Performance
        2.6.4 Area
        2.6.5 Energy Efficiency
    3 Convolution Computation Techniques for High-Performance Neural Networks
      3.1 Motivations
        3.1.1 Limited Space Exploration for Common Kernels
        3.1.2 Inability to Exploit Common Expressions of Convolution Values
      3.2 The Proposed Algorithm
        3.2.1 Common Kernel Extraction
        3.2.2 Common Convolution Extraction
        3.2.3 Memory Access Minimization
      3.3 Hardware Implementation
      3.4 Experimental Results
        3.4.1 Experimental Setup
        3.4.2 Assessing Effectiveness of ConvOpt-op and ConvOpt-mem
        3.4.3 Measuring Performance through Hardware Implementation
        3.4.4 Running Time of ConvOpt
    4 Memory Layout and Block Replacement Techniques for High-Performance Neural Networks
      4.1 Motivation
      4.2 Algorithms for Off-chip Memory Access Optimization for DNNs with Unfitted Compression
        4.2.1 Algorithm for Off-chip Memory Layout
        4.2.2 Algorithm for On-chip Memory Block Replacement
        4.2.3 Exploitation of Parallel Computing
      4.3 Experimental Results
        4.3.1 Experimental Setup
        4.3.2 Assessing the Effectiveness of Mem-layout
        4.3.3 Assessing the Effectiveness of MIN-k Combined with Mem-layout
    5 Conclusions
      5.1 Bit-level Weight Pruning Techniques for High-Performance Neural Networks
      5.2 Convolution Computation Techniques for High-Performance Neural Networks
      5.3 Memory Layout and Block Replacement Techniques for High-Performance Neural Networks
    Abstract (In Korean)
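
    To make the signed-digit encoding behind the dissertation's first contribution concrete, the sketch below converts an integer weight into canonical signed-digit (CSD) form, the standard encoding over digits {-1, 0, +1} with a minimal number of non-zero digits; this is a textbook routine given purely for illustration, not the dissertation's selection algorithm or its DWP architecture:

        def to_csd(n):
            """Canonical signed-digit form of integer n, least-significant digit first."""
            digits = []
            while n != 0:
                if n & 1:
                    d = 2 - (n % 4)                  # +1 or -1, chosen so the next digit becomes 0
                    n -= d
                else:
                    d = 0
                digits.append(d)
                n //= 2
            return digits or [0]

        for w in [7, 0b0111011011, -121]:
            csd = to_csd(w)
            assert sum(d * (1 << i) for i, d in enumerate(csd)) == w      # value is preserved
            print(w,
                  'binary non-zeros:', bin(abs(w)).count('1'),
                  'CSD non-zeros:', sum(d != 0 for d in csd))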

    The RDF-3X Engine for Scalable Management of RDF Data

    RDF is a data model for schema-free structured information that is gaining momentum in the context of Semantic-Web data, life sciences, and also Web 2.0 platforms. The "pay-as-you-go" nature of RDF and the flexible pattern-matching capabilities of its query language SPARQL entail efficiency and scalability challenges for complex queries including long join paths. This paper presents the RDF-3X engine, an implementation of SPARQL that achieves excellent performance by pursuing a RISC-style architecture with streamlined indexing and query processing. The physical design is identical for all RDF-3X databases regardless of their workloads, and completely eliminates the need for index tuning by maintaining exhaustive indexes for all permutations of subject-property-object triples and their binary and unary projections. These indexes are highly compressed, and the query processor can aggressively leverage fast merge joins that make excellent use of processor caches. The query optimizer is able to choose optimal join orders even for complex queries, with a cost model that includes statistical synopses for entire join paths. Although RDF-3X is optimized for queries, it also provides good support for efficient online updates by means of a staging architecture: direct updates to the main database indexes are deferred and instead applied to compact differential indexes, which are later merged into the main indexes in a batched manner. Experimental studies with several large-scale datasets of more than 50 million RDF triples and benchmark queries that include pattern matching, many-way star joins, and long path joins demonstrate that RDF-3X can outperform the previously best alternatives by one or two orders of magnitude.
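
    The exhaustive-indexing idea can be pictured with a short sketch (independent of the actual RDF-3X code, which works over compressed clustered indexes; class and variable names here are invented): the triple set is kept sorted under all six subject-property-object orderings, and a triple pattern is answered by a range scan on whichever ordering has the bound components as a prefix:

        from bisect import bisect_left, bisect_right
        from itertools import permutations

        class TripleStore:
            def __init__(self, triples):
                self.indexes = {}
                for order in permutations((0, 1, 2)):            # SPO, SOP, PSO, POS, OSP, OPS
                    rows = sorted(tuple(t[i] for i in order) for t in triples)
                    self.indexes[order] = rows

            def match(self, s=None, p=None, o=None):
                pattern = (s, p, o)
                bound = [i for i in range(3) if pattern[i] is not None]
                for order, rows in self.indexes.items():
                    if set(order[:len(bound)]) == set(bound):    # bound components form the prefix
                        prefix = tuple(pattern[i] for i in order[:len(bound)])
                        lo = bisect_left(rows, prefix)
                        # chr(0x10FFFF) acts as a crude upper bound for the unbound positions
                        hi = bisect_right(rows, prefix + (chr(0x10FFFF),) * (3 - len(bound)))
                        for row in rows[lo:hi]:                  # a single sorted range scan
                            t = [None] * 3
                            for pos, i in enumerate(order):
                                t[i] = row[pos]
                            yield tuple(t)
                        return

        store = TripleStore([('alice', 'knows', 'bob'),
                             ('alice', 'knows', 'carol'),
                             ('bob', 'age', '42')])
        print(list(store.match(s='alice', p='knows')))           # served from the index ordered S, P, O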

    Community detection and stochastic block models: recent developments

    The stochastic block model (SBM) is a random graph model with planted clusters. It is widely employed as a canonical model to study clustering and community detection, and generally provides fertile ground to study the statistical and computational tradeoffs that arise in network and data sciences. This note surveys the recent developments that establish the fundamental limits for community detection in the SBM, both with respect to information-theoretic and computational thresholds, and for various recovery requirements such as exact, partial and weak recovery (a.k.a. detection). The main results discussed are the phase transitions for exact recovery at the Chernoff-Hellinger threshold, the phase transition for weak recovery at the Kesten-Stigum threshold, the optimal distortion-SNR tradeoff for partial recovery, the learning of the SBM parameters, and the gap between information-theoretic and computational thresholds. The note also covers some of the algorithms developed in the quest to achieve these limits, in particular two-round algorithms via graph splitting, semidefinite programming, linearized belief propagation, and classical and nonbacktracking spectral methods. A few open problems are also discussed.
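
    As a small companion illustration (not one of the algorithms analyzed in the note; the parameters n, a and b are chosen arbitrarily), the sketch below samples a two-community SBM and recovers the planted partition with a plain adjacency spectral method, in a regime far above the Kesten-Stigum threshold where even this naive approach works:

        import numpy as np

        rng = np.random.default_rng(0)
        n, a, b = 1000, 50.0, 10.0                    # n nodes; within/across connection intensities
        labels = rng.integers(0, 2, size=n)           # planted community of each node

        same = labels[:, None] == labels[None, :]
        probs = np.where(same, a / n, b / n)          # edge probability a/n inside, b/n across
        upper = np.triu(rng.random((n, n)) < probs, 1)
        A = (upper | upper.T).astype(float)           # symmetric adjacency matrix, no self-loops

        # the eigenvector for the second largest eigenvalue separates the two blocks;
        # weak recovery is possible exactly when (a - b)^2 > 2(a + b), the
        # Kesten-Stigum threshold discussed in the note
        vals, vecs = np.linalg.eigh(A)
        guess = (vecs[:, -2] > 0).astype(int)

        agree = (guess == labels).mean()
        print('overlap with planted partition:', max(agree, 1 - agree))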
    • โ€ฆ
    corecore