502 research outputs found

    A Pattern Classification Based approach for Blur Classification

    Get PDF
    Blur type identification is one of the most crucial steps of image restoration. In blind restoration of blurred images, it is generally assumed that the blur type is known prior to restoration; however, this assumption is not practical in real applications. Blur type identification is therefore highly desirable before a blind restoration technique is applied to a blurred image. This paper presents an approach to categorize blur into three classes, namely motion, defocus, and combined blur. Curvelet transform based energy features are used to characterize blur patterns, and a neural network is designed for classification. Simulation results show the preciseness of the proposed approach
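As a rough illustration of the energy-feature idea, the sketch below computes radial frequency-band energies with NumPy as a simplified stand-in for the curvelet subband energies used in the paper; the band count and the FFT-ring construction are illustrative assumptions, not the authors' actual features:

```python
import numpy as np

def band_energy_features(img, n_bands=4):
    """Radial frequency-band energies of an image.

    A simplified stand-in for curvelet subband energies: split the
    Fourier spectrum into concentric rings and use the energy in each
    ring as one feature.
    """
    spec = np.abs(np.fft.fftshift(np.fft.fft2(img))) ** 2
    h, w = spec.shape
    yy, xx = np.mgrid[0:h, 0:w]
    r = np.hypot(yy - h / 2, xx - w / 2)
    r_max = r.max()
    feats = []
    for b in range(n_bands):
        mask = (r >= r_max * b / n_bands) & (r < r_max * (b + 1) / n_bands)
        feats.append(spec[mask].sum())
    feats = np.array(feats)
    return feats / feats.sum()  # normalise so the features are scale-free

# Blurred images concentrate spectral energy in the low bands, so such
# features can separate blur types when fed to a small neural classifier.
img = np.random.default_rng(0).random((64, 64))
f = band_energy_features(img)
print(f.shape, float(f.sum()))
```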

    A novel neural network approach to cDNA microarray image segmentation

    Get PDF
    This is the post-print version of the article. The official published version can be accessed from the link below. Copyright @ 2013 Elsevier. Microarray technology has become a great source of information for biologists seeking to understand the workings of DNA, one of the most complex codes in nature. Microarray images typically contain several thousand small spots, each of which represents a different gene in the experiment. One of the key steps in extracting information from a microarray image is segmentation, whose aim is to identify which pixels within an image represent which gene. This task is greatly complicated by noise within the image and a wide degree of variation in the values of the pixels belonging to a typical spot. In the past, many methods have been proposed for the segmentation of microarray images. In this paper, a new method utilizing a series of artificial neural networks, based on multi-layer perceptron (MLP) and Kohonen networks, is proposed. The proposed method is applied to a set of real-world cDNA images. Quantitative comparisons between the proposed method and the commercial software GenePix® are carried out in terms of the peak signal-to-noise ratio (PSNR). The method is shown not only to deliver results comparable and even superior to existing techniques but also to have a faster run time. This work was funded in part by the National Natural Science Foundation of China under Grants 61174136 and 61104041, the Natural Science Foundation of Jiangsu Province of China under Grant BK2011598, the International Science and Technology Cooperation Project of China under Grant No. 2011DFA12910, the Engineering and Physical Sciences Research Council (EPSRC) of the U.K. under Grant GR/S27658/01, the Royal Society of the U.K., and the Alexander von Humboldt Foundation of Germany
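The abstract compares segmentation quality in terms of PSNR; the standard formula is easy to reproduce (the 8-bit peak value of 255 is an assumption about the image depth):

```python
import numpy as np

def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio between two images (higher is better)."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    mse = np.mean((a - b) ** 2)
    if mse == 0:
        return float("inf")       # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.zeros((8, 8))
worst = np.full((8, 8), 255.0)
print(psnr(ref, worst))   # maximally different 8-bit images -> 0.0 dB
```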

    Deep learning systems as complex networks

    Full text link
    Thanks to the availability of large-scale digital datasets and massive amounts of computational power, deep learning algorithms can learn representations of data by exploiting multiple levels of abstraction. These machine learning methods have greatly improved the state of the art in many challenging cognitive tasks, such as visual object recognition, speech processing, natural language understanding and automatic translation. In particular, one class of deep learning models, known as deep belief networks, can discover intricate statistical structure in large data sets in a completely unsupervised fashion, by learning a generative model of the data using Hebbian-like learning mechanisms. Although these self-organizing systems can be conveniently formalized within the framework of statistical mechanics, their internal functioning remains opaque, because their emergent dynamics cannot be solved analytically. In this article we propose to study deep belief networks using techniques commonly employed in the study of complex networks, in order to gain some insights into the structural and functional properties of the computational graph resulting from the learning process. Comment: 20 pages, 9 figures
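A minimal sketch of the kind of complex-network analysis the article proposes: view a trained network's weight matrices as a weighted multipartite graph and compute per-unit node strengths. The layer sizes and random weights below are placeholders, not the paper's actual trained models:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "trained" network: a stack of weight matrices, one per layer pair.
# In the article the graph comes from learned weights; these random
# weights only demonstrate the graph bookkeeping.
weights = [rng.normal(size=(784, 500)), rng.normal(size=(500, 250))]

def node_strengths(weights):
    """Strength (sum of absolute incident weights) of every unit when
    the network is viewed as a weighted multipartite graph."""
    strengths = []
    for i in range(len(weights) + 1):
        s = 0.0
        if i > 0:                      # edges from the layer below
            s = np.abs(weights[i - 1]).sum(axis=0)
        if i < len(weights):           # edges to the layer above
            s = s + np.abs(weights[i]).sum(axis=1)
        strengths.append(np.asarray(s, dtype=float).ravel())
    return strengths

S = node_strengths(weights)
print([len(s) for s in S])   # one strength value per unit in each layer
```

From such per-unit strengths one can study degree/strength distributions and other structural properties of the computational graph, as the article suggests.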

    ์—ญ์—ฐ์‚ฐ์— ๊ธฐ๋ฐ˜ํ•œ ํ•ฉ์„ฑ๊ณฑ์‹ ๊ฒฝ๋ง์˜ ์„ค๋ช… ๋ฐ ์‹œ๊ฐํ™”

    Get PDF
    Doctoral dissertation -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Information Engineering, August 2021. Advisor: ๊ถŒํ˜์ง„. Interpretability and explainability of machine learning systems have received ever-increasing attention, especially for convolutional neural networks (CNNs). Although there are various interpretation techniques for learning algorithms, post-hoc local explanation methods (e.g., attribution methods that visualize the pixel-level contribution of an input to its corresponding result) are of particular interest because they can deal with the high-dimensional parameters and nonlinear operations of CNNs. Therefore, this dissertation presents three new post-hoc local explanation methods to visualize and understand the working mechanisms of CNNs. First, this dissertation presents a new method called guided nonlinearity (GNL) that improves the performance of attribution by backpropagating only positive gradients through nonlinear operations. GNL is inspired by the mechanism of action potential (AP) generation in the postsynaptic neuron, which depends on the sum of excitatory (EPSP) and inhibitory (IPSP) postsynaptic potentials. This dissertation assumes that paths consisting of excitatory synapses faithfully reflect the contributions of inputs to the output. This assumption is applied to CNNs by allowing only positive gradients to backpropagate through nonlinear operations. Experimental results show that GNL outperforms existing methods for computing attributions in terms of deletion metrics and yields fine-grained, human-interpretable attributions. However, the attributions from existing methods, including GNL, lack a common theoretical background and sometimes give contradictory results.
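A minimal NumPy sketch of the GNL rule described above, assuming (as an illustration, not the dissertation's exact formulation) that the usual ReLU input gate is kept and the additional GNL condition zeroes out negative upstream gradients:

```python
import numpy as np

def relu_forward(x):
    return np.maximum(x, 0.0)

def relu_backward_gnl(x, grad_out):
    """Guided-nonlinearity-style backward pass: the usual ReLU gate
    (gradient flows only where the input was positive) combined with
    the GNL rule that only POSITIVE gradients are propagated."""
    return np.where((x > 0) & (grad_out > 0), grad_out, 0.0)

x = np.array([-1.0, 0.5, 2.0, 3.0])   # pre-activation values
g = np.array([ 0.7, -0.2, 0.4, 0.9])  # upstream gradient
gnl = relu_backward_gnl(x, g)         # zeros except where x>0 and g>0
print(gnl)
```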
To address this problem, this dissertation develops the operation-wise inverse method, which computes the inverse of a prediction in an operation-wise manner, exploiting the fact that CNNs can be decomposed into four fundamental operations (convolution, max-pooling, ReLU, and fully-connected). The operation-wise inverse process assumes that the forward pass of a CNN is a sequential propagation of physical quantities that indicate the magnitude of specific image features. The inverses of the fundamental operations are formulated as constrained optimization problems requiring that the inverse results generate output features consistent with the forward pass. The inverse of a prediction is then computed by sequentially applying the inverses of the fundamental operations of the CNN. Experimental results show that the proposed operation-wise approach can serve as a reference tool for computing attributions: it provides visualization results equivalent to several conventional methods, and its attributions achieve state-of-the-art performance in terms of deletion score. Although the operation-wise method provides a reference framework for computing attributions, applying the attribution concept to CNNs with multiple-valued predictions had not yet been addressed, because the computation of attribution requires a single scalar value that represents the prediction. To address this problem, this dissertation proposes the layer-wise inverse-based approach, which decomposes a CNN into a set of layers that process only positive values, interpretable as neural activations. The inverses of the layers are formulated as constrained optimization problems that identify activations-of-interest in lower layers, and the inverse of a prediction is computed by sequentially applying the inverses of the layers, as in the operation-wise method. Experimental results show that the proposed layer-wise inverse-based method can analyze CNNs for classification and regression within the same framework.
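The constrained-optimization inverses of the fundamental operations admit simple closed forms for ReLU and max-pooling; the sketch below illustrates the consistency idea (the argmax bookkeeping is an illustrative implementation choice, not the dissertation's exact formulation):

```python
import numpy as np

def inverse_relu(y):
    """Consistency constraint: re-applying ReLU to the inverse must
    reproduce y. The minimal consistent pre-image is y itself."""
    return y.copy()

def inverse_maxpool(y, argmax, in_shape):
    """Route each pooled value back to the position that produced it
    (the forward-pass argmax); all other positions get zero."""
    x = np.zeros(in_shape)
    x.flat[argmax.ravel()] = y.ravel()
    return x

# Forward 2x2 max-pool on a 2x4 input, remembering the argmax ...
x = np.array([[1.0, 3.0, 0.0, 2.0],
              [4.0, 0.0, 5.0, 1.0]])
y = np.array([[4.0, 5.0]])
argmax = np.array([[4, 6]])           # flat indices of the maxima
# ... then invert; re-pooling the result reproduces y exactly.
x_inv = inverse_maxpool(y, argmax, x.shape)
print(x_inv)
```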
Especially in the case of regression, the layer-wise approach showed that conventional CNNs for single image super-resolution overlook a portion of the frequency bands, which may result in performance degradation.
๊ทธ๋Ÿฌ๋‚˜ ์œ ๋„๋œ๋น„์„ ํ˜•๋ฒ•์„ ํฌํ•จํ•œ ๊ธฐ์กด์˜ ๊ท€์ธ ๋ฐฉ๋ฒ•๋“ค์€ ์„œ๋กœ ๋‹ค๋ฅธ ์ด๋ก ์„ ๊ธฐ๋ฐ˜์œผ๋กœ ์„ค๊ณ„๋˜์—ˆ์œผ๋ฉฐ, ์ด๋กœ ์ธํ•˜์—ฌ ์„œ๋กœ ๋ชจ์ˆœ๋˜๋Š” ๊ท€์ธ๋“ค์„ ๊ณ„์‚ฐํ•˜๋Š” ๋•Œ๋„ ์žˆ๋‹ค. ์ด ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•ด ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” CNN์ด ํ•ฉ์„ฑ๊ณฑ (convolution), ์ตœ๋Œ€ํ’€๋ง (max-pooling), ReLU, ์ „์—ฐ๊ฒฐ (full-connected)์˜ 4๊ฐ€์ง€ ๊ธฐ๋ณธ ์—ฐ์‚ฐ๋“ค์˜ ํ•ฉ์„ฑํ•จ์ˆ˜๋กœ ํ‘œํ˜„๋  ์ˆ˜ ์žˆ๋‹ค๋Š” ์ ์— ๊ธฐ๋ฐ˜ํ•˜์—ฌ, CNN์„ ํ†ตํ•œ ์˜ˆ์ธก์˜ ์—ญ์ƒ (inverse image)์„ ๊ธฐ๋ณธ ์—ฐ์‚ฐ๋“ค์˜ ์—ญ์—ฐ์‚ฐ์„ ํ†ตํ•ด ๊ณ„์‚ฐํ•˜๋Š” ์—ฐ์‚ฐ๋ณ„์—ญ์—ฐ์‚ฐ๋ฒ• (operation-wise inverse-based method)์„ ์ œ์•ˆํ•œ๋‹ค. ์—ฐ์‚ฐ๋ณ„์—ญ์—ฐ์‚ฐ๋ฒ•์€ CNN์˜ ์ •๋ฐฉํ–ฅ์ง„ํ–‰ (forward-pass)์„ ํŠน์ • ์ด๋ฏธ์ง€ํŠน์ง• (image feature)์˜ ํฌ๊ธฐ๋ฅผ ์˜๋ฏธํ•˜๋Š” ๋ฌผ๋ฆฌ๋Ÿ‰์˜ ์ˆœ์ฐจ์  ์ „ํŒŒ๋กœ ๊ฐ€์ •ํ•œ๋‹ค. ์ด ๊ฐ€์ •ํ•˜์— ์—ฐ์‚ฐ๋ณ„์—ญ์—ฐ์‚ฐ๋ฒ•์€ ๊ณ„์‚ฐ๋œ ์—ญ์ƒ์ด ๊ธฐ์กด์˜ ์ •๋ฐฉํ–ฅ์ง„ํ–‰ ๊ฒฐ๊ณผ์™€ ๋ชจ์ˆœ๋˜์ง€ ์•Š๋„๋ก ์„ค๊ณ„๋œ ์ œํ•œ๋œ ์ตœ์ ํ™” ๋ฌธ์ œ (constrained optimization problem)๋ฅผ ํ†ตํ•ด ๊ธฐ๋ณธ ์—ฐ์‚ฐ์˜ ์—ญ์—ฐ์‚ฐ์„ ๊ณ„์‚ฐํ•œ๋‹ค. ๋ณธ ๋…ผ๋ฌธ์€ ์‹คํ—˜์„ ํ†ตํ•ด ์—ฐ์‚ฐ๋ณ„์—ญ์—ฐ์‚ฐ๋ฒ•์ด ๊ธฐ์กด์˜ ์—ฌ๋Ÿฌ ๊ท€์ธ ๋ฐฉ๋ฒ•๋“ค๋ณด๋‹ค ์‚ญ์ œ์ฒ™๋„ ์ธก๋ฉด์—์„œ ํ–ฅ์ƒ๋˜์—ˆ์œผ๋ฉด์„œ๋„ ์งˆ์  ์ธก๋ฉด์—์„œ ์œ ์‚ฌํ•œ ์‹œ๊ฐํ™” ๊ฒฐ๊ณผ๋ฅผ ์ œ๊ณตํ•˜๋Š” ๊ฒƒ์„ ๋ณด์ž„์œผ๋กœ์จ ์—ฐ์‚ฐ๋ณ„์—ญ์—ฐ์‚ฐ๋ฒ•์ด ๊ท€์ธ๊ณ„์‚ฐ์˜ ๊ณตํ†ต ํ”„๋ ˆ์ž„ ์›Œํฌ (reference framework)๋กœ ์‚ฌ์šฉ๋  ์ˆ˜ ์žˆ์Œ์„ ๋ณด์˜€๋‹ค. ํ•œํŽธ, ์˜์ƒ ๋ถ„๋ฅ˜ ๋ฌธ์ œ์™€ ๊ฐ™์ด ๋‹จ์ผ ์˜ˆ์ธก์„ ๋Œ€์ƒ์œผ๋กœ ํ•œ CNN๊ณผ๋Š” ๋‹ฌ๋ฆฌ ๋ณต์ˆ˜์˜ ์˜ˆ์ธก๊ฐ’์„ ๊ฐ€์ง€๋Š” CNN์— ๋Œ€ํ•˜์—ฌ ๊ท€์ธ๊ณ„์‚ฐ์„ ์‹œ๋„ํ•œ ์—ฐ๊ตฌ๋Š” ํ˜„์žฌ๊นŒ์ง€ ๋ณด๊ณ ๋˜์ง€ ์•Š์•˜๋‹ค. ์ด๋Š” ๊ธฐ์กด์˜ ๊ท€์ธ ๊ณ„์‚ฐ๋ฐฉ๋ฒ•๋“ค์€ CNN์— ๋Œ€ํ•˜์—ฌ ๋‹จ์ผ ์Šค์นผ๋ผ (scalar) ๊ฐ’์„ ์ถœ๋ ฅํ•˜๋„๋ก ์š”๊ตฌํ•˜๊ธฐ ๋•Œ๋ฌธ์ด๋‹ค. ์ด ๋ฌธ์ œ๋ฅผ ํ•ด๊ฒฐํ•˜๊ธฐ ์œ„ํ•ด ๋ณธ ๋…ผ๋ฌธ์—์„œ๋Š” ๊ณ„์ธต๋ณ„์—ญ์—ฐ์‚ฐ๋ฒ• (layer-wise inverse-based method)์„ ์ œ์•ˆํ•œ๋‹ค. 
๊ณ„์ธต๋ณ„์—ญ์—ฐ์‚ฐ๋ฒ•์€ CNN์„ ์ธ๊ณต ๋‰ด๋Ÿฐ์˜ ํ™œ์„ฑ๊ฐ’ (neural activation)์œผ๋กœ ํ•ด์„ํ•  ์ˆ˜ ์žˆ๋Š” ์–‘์˜ ์‹ค์ˆ˜๋“ค์„ ์ž…์ถœ๋ ฅ์œผ๋กœ ํ•˜๋Š” ๊ณ„์ธต (layer)์œผ๋กœ ๋ถ„ํ•ดํ•˜๊ณ , ์ œํ•œ๋œ ์ตœ์ ํ™” ๋ฌธ์ œ๋กœ ์ •์˜๋˜๋Š” ๊ฐ ๊ณ„์ธต์˜ ์—ญ์—ฐ์‚ฐ์„ ์ •๋ฐฉํ–ฅ์ง„ํ–‰ ๊ฒฐ๊ณผ์— ์ˆœ์ฐจ์ ์œผ๋กœ ์ ์šฉํ•จ์œผ๋กœ์จ CNN์„ ํ†ตํ•œ ์˜ˆ์ธก์˜ ์—ญ์ƒ์„ ๊ณ„์‚ฐํ•œ๋‹ค. ๋ณธ ๋…ผ๋ฌธ์€ ์‹คํ—˜์„ ํ†ตํ•ด, ์ œ์•ˆ๋œ ๊ณ„์ธต๋ณ„์—ญ์—ฐ์‚ฐ๋ฒ•์ด ์˜์ƒ ๋ถ„๋ฅ˜ ๋ฐ ํšŒ๊ธฐ๋ฅผ ๋Œ€์ƒ์œผ๋กœ ํ•œ CNN๋“ค์˜ ์„ค๋ช… ๋ฐ ์‹œ๊ฐํ™”๋ฅผ ๋™์ผํ•œ ํ”„๋ ˆ์ž„ ์›Œํฌ (common framework)์—์„œ ์ฒ˜๋ฆฌํ•  ์ˆ˜ ์žˆ์Œ์„ ํ™•์ธํ•˜์˜€๋‹ค. ๋˜ํ•œ, ๋ณธ ๋…ผ๋ฌธ์€ ๊ณ„์ธต๋ณ„์—ญ์—ฐ์‚ฐ๋ฒ•์„ ํ†ตํ•ด ๋‹จ์ผ ์˜์ƒ ๊ณ ํ•ด์ƒํ™” (single image super-resolution)๋ฅผ ๋Œ€์ƒ์œผ๋กœ ํ•œ CNN์ธ VDSR์ด ์ž…๋ ฅ ์˜์ƒ์˜ ์ฃผํŒŒ์ˆ˜ ๋Œ€์—ญ์˜ ์ผ๋ถ€๋ฅผ ๊ฐ„๊ณผํ•˜๊ณ  ์žˆ๊ณ  ์ด๋Š” VDSR์„ ํ†ตํ•œ ๊ณ ํ•ด์ƒํ™”์‹œ ํŠน์ • ์ฃผํŒŒ์ˆ˜ ๋Œ€์—ญ์—์„œ ์˜์ƒ ํ’ˆ์งˆ์˜ ํ•˜๋ฝ์„ ์œ ๋ฐœํ•  ์ˆ˜ ์žˆ์Œ์„ ๋ณด์˜€๋‹ค.Contents 1 List of Tables 4 List of Figures 5 1 Introduction 7 1.1 Guided Nonlinearity 8 1.2 Inverse-based approach 9 1.2.1 Operation-wise method 10 1.2.2 Layer-wise method 11 1.3 Outline 14 2 RelatedWork 15 2.1 Activation-based approach 15 2.2 Perturbation-based approach 16 2.3 Backpropagation-based approach 17 2.4 Inverse-based approach 18 3 Guided Nonlinearity 19 3.1 Motivation and Overview 19 3.2 Proposed Guided Non-linearity 23 3.2.1 Integrated Gradients 23 3.2.2 Postulations 23 3.2.3 Proposed method 24 3.3 Experimental Results 27 3.3.1 Evaluation Metrics 29 3.3.2 Experiment details 29 3.3.3 Results and Discussions 30 3.4 Summary 30 4 Operation-wise Approach 32 4.1 Motivation and Overview 32 4.2 Proposed Method 35 4.2.1 Problem statement 36 4.2.2 Proposed constraints 36 4.2.3 Mathematical formulation 37 4.3 Implementation details 38 4.3.1 Inverse of ReLU and Max Pooling 38 4.3.2 Inverse of Fully Connected and Convolution Layers 39 4.4 Experimental Settings 40 4.4.1 Qualitative results 40 4.4.2 Quantitative Results 46 4.5 Summary 50 
5 Layer-wise Approach 51 5.1 Motivation and Overview 51 5.2 Formulation of the Proposed Inverse Approach 55 5.2.1 Activation range 56 5.2.2 Minimal activation 56 5.2.3 Linear approximation 57 5.2.4 Layer-wise inverse 57 5.3 Details of inverse computation 59 5.3.1 Convolution block (linear part) + ReLU 59 5.3.2 Max-pooling layer 60 5.3.3 Fully connected block (linear part) + ReLU 61 5.3.4 Fully connected block (linear part) + Softmax 61 5.4 Application to the ImageNet classification task 62 5.4.1 Evaluation of output-reconstruction in terms of input-simplicity 62 5.4.2 Deletion and insertion scores 63 5.4.3 Selection of the regularization term weight 64 5.4.4 Comparison to Existing Methods 67 5.4.5 Output-reconstruction versus input-simplicity plot 68 5.4.6 Ablation study of the activation regularization 72 5.5 The inverse of single image super-resolution network 72 5.5.1 Experimental setting 72 5.5.2 Selection of the regularization term weight 74 5.5.3 Evaluation of the proposed inverse process 77 5.5.4 Frequency domain analysis of attribution 77 5.6 Summary 81 6 Conclusions 82 Bibliography 84 Abstract (In Korean) 95๋ฐ•

    Visual pattern recognition using neural networks

    Get PDF
    Neural networks have been widely studied in a number of fields, such as neural architectures, neurobiology, statistics of neural networks and pattern classification. In the field of pattern classification, neural network models are applied to numerous applications, for instance, character recognition, speech recognition, and object recognition. Among these, character recognition is commonly used to illustrate the feature and classification characteristics of neural networks. In this dissertation, the theoretical foundations of artificial neural networks are first reviewed and existing neural models are studied. The Adaptive Resonance Theory (ART) model is improved to achieve more reasonable classification results. Experiments in applying the improved model to image enhancement and printed character recognition are discussed and analyzed. We also study the theoretical foundation of the Neocognitron in terms of feature extraction, convergence in training, and shift invariance. We investigate the use of multilayered perceptrons with recurrent connections as general-purpose modules for image operations in parallel architectures. The networks are trained to carry out classification rules in image transformation. The training patterns can be derived from user-defined transformations or from loading the pair of a sample image and its target image when prior knowledge of the transformations is unavailable. Applications of our model include image smoothing, enhancement, edge detection, noise removal, morphological operations, image filtering, etc. With a number of stages stacked up together, we are able to apply a series of operations on the image. That is, by providing various sets of training patterns, the system can adapt itself to the concatenated transformation. We also discuss and experiment with applying existing neural models, such as the multilayered perceptron, to realize morphological operations and other commonly used imaging operations.
Some new neural architectures and training algorithms for the implementation of morphological operations are designed and analyzed. The algorithms are proven correct and efficient. The proposed morphological neural architectures are applied to construct the feature extraction module of a personal handwritten character recognition system. The system was trained and tested with scanned images of handwritten characters. The feasibility and efficiency are discussed along with the experimental results
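One classic construction behind neural implementations of morphological operations: binary dilation is a windowed weighted sum followed by a hard threshold, i.e., one threshold neuron per pixel. The sketch below is a generic illustration of that idea, not the dissertation's specific architecture:

```python
import numpy as np

def dilate(img, se):
    """Binary dilation realised as one threshold neuron per pixel:
    a weighted sum over the structuring-element window, then a
    threshold at 1 (erosion would instead threshold at se.sum())."""
    H, W = img.shape
    h, w = se.shape
    pad = np.pad(img, ((h // 2,), (w // 2,)))
    out = np.zeros_like(img)
    for i in range(H):
        for j in range(W):
            s = (pad[i:i + h, j:j + w] * se).sum()  # the neuron's net input
            out[i, j] = 1 if s >= 1 else 0          # hard threshold
    return out

img = np.zeros((5, 5), dtype=int)
img[2, 2] = 1                      # a single foreground pixel
se = np.ones((3, 3), dtype=int)    # 3x3 structuring element
out = dilate(img, se)              # the pixel grows into a 3x3 square
print(out)
```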

    Dense semantic labeling of sub-decimeter resolution images with convolutional neural networks

    Full text link
    Semantic labeling (or pixel-level land-cover classification) in ultra-high resolution imagery (< 10 cm) requires statistical models able to learn high-level concepts from spatial data with large appearance variations. Convolutional Neural Networks (CNNs) achieve this goal by discriminatively learning a hierarchy of representations of increasing abstraction. In this paper we present a CNN-based system relying on a downsample-then-upsample architecture. Specifically, it first learns a rough spatial map of high-level representations by means of convolutions and then learns to upsample them back to the original resolution by deconvolutions. By doing so, the CNN learns to densely label every pixel at the original resolution of the image. This results in many advantages, including i) state-of-the-art numerical accuracy, ii) improved geometric accuracy of predictions and iii) high efficiency at inference time. We test the proposed system on the Vaihingen and Potsdam sub-decimeter resolution datasets, involving semantic labeling of aerial images of 9 cm and 5 cm resolution, respectively. These datasets are composed of many large and fully annotated tiles, allowing an unbiased evaluation of models making use of spatial information. We do so by comparing two standard CNN architectures to the proposed one: standard patch classification, prediction of local label patches by employing only convolutions, and full patch labeling by employing deconvolutions. All the systems compare favorably to or outperform a state-of-the-art baseline relying on superpixels and powerful appearance descriptors. The proposed full patch labeling CNN outperforms these models by a large margin, also showing a very appealing inference time. Comment: Accepted in IEEE Transactions on Geoscience and Remote Sensing, 201
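The downsample-then-upsample idea can be illustrated with plain NumPy, using block averaging in place of the learned strided convolutions and nearest-neighbour expansion in place of the learned deconvolutions; both are stand-ins, and only the resolution bookkeeping matches the paper:

```python
import numpy as np

def downsample(x, f):
    """Rough spatial map: f-by-f block averaging (a stand-in for the
    encoder's strided convolutions)."""
    H, W = x.shape
    return x[:H - H % f, :W - W % f].reshape(H // f, f, W // f, f).mean(axis=(1, 3))

def upsample(x, f):
    """Learned-deconvolution stand-in: nearest-neighbour expansion
    back to the original resolution."""
    return x.repeat(f, axis=0).repeat(f, axis=1)

img = np.random.default_rng(0).random((64, 64))
coarse = downsample(img, 4)   # 16x16 map of "high-level" features
dense = upsample(coarse, 4)   # dense 64x64 per-pixel prediction
print(coarse.shape, dense.shape)
```

The point of the architecture is exactly this shape round-trip: abstraction happens at coarse resolution, but every pixel of the original image still receives a label.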

    Automotive Object Detection via Learning Sparse Events by Temporal Dynamics of Spiking Neurons

    Full text link
    Event-based sensors, with their high temporal resolution (1 µs) and dynamic range (120 dB), have the potential to be deployed in high-speed platforms such as vehicles and drones. However, the highly sparse and fluctuating nature of events poses challenges for conventional object detection techniques based on Artificial Neural Networks (ANNs). In contrast, Spiking Neural Networks (SNNs) are well-suited for representing event-based data due to their inherent temporal dynamics. In particular, we demonstrate that the membrane potential dynamics can modulate network activity upon fluctuating events and strengthen features of sparse input. In addition, the spike-triggered adaptive threshold can stabilize training, which further improves network performance. Based on this, we develop an efficient spiking feature pyramid network for event-based object detection. Our proposed SNN outperforms previous SNNs and sophisticated ANNs with attention mechanisms, achieving a mean average precision (mAP50) of 47.7% on the Gen1 benchmark dataset. This result significantly surpasses the previous best SNN by 9.7% and demonstrates the potential of SNNs for event-based vision. Our model has a concise architecture while maintaining high accuracy and a much lower computation cost as a result of sparse computation. Our code will be publicly available
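A toy leaky integrate-and-fire neuron with a spike-triggered adaptive threshold illustrates the two mechanisms the abstract highlights; all time constants and the threshold increment below are illustrative values, not the paper's parameters:

```python
import numpy as np

def lif_adaptive(inputs, tau=10.0, v_th0=1.0, beta=0.5, tau_th=20.0):
    """Leaky integrate-and-fire neuron with a spike-triggered adaptive
    threshold: the membrane potential integrates the (possibly sparse)
    input, and each spike raises the threshold, which then decays back."""
    v, th = 0.0, v_th0
    spikes, thresholds = [], []
    for x in inputs:
        v += (-v + x) / tau             # leaky integration of input current
        if v >= th:
            spikes.append(1)
            v = 0.0                     # reset after the spike
            th += beta                  # spike-triggered threshold increase
        else:
            spikes.append(0)
        th = v_th0 + (th - v_th0) * np.exp(-1.0 / tau_th)  # threshold decay
        thresholds.append(th)
    return np.array(spikes), np.array(thresholds)

# Constant drive: the adaptive threshold spreads the spikes out in time,
# which is the stabilising effect the abstract refers to.
spk, th = lif_adaptive(np.full(200, 2.0))
print(int(spk.sum()), float(th.max()))
```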

    Advanced Information Processing Methods and Their Applications

    Get PDF
    This Special Issue has collected and presented breakthrough research on information processing methods and their applications. Particular attention is paid to the study of the mathematical foundations of information processing methods, quantum computing, artificial intelligence, digital image processing, and the use of information technologies in medicine

    Fish species classification in unconstrained underwater environments based on deep learning

    Get PDF
    Underwater video and digital still cameras are rapidly being adopted by marine scientists and managers as tools for non-destructively quantifying and measuring the relative abundance, cover and size of marine fauna and flora. Recorded imagery of fish can be time-consuming and costly to process and analyze manually. For this reason, there is great interest in the automatic classification, counting, and measurement of fish. Unconstrained underwater scenes are highly variable due to changes in light intensity, changes in fish orientation due to movement, a variety of background habitats which sometimes also move, and, most importantly, similarity in shape and patterns among fish of different species. This poses a great challenge for image/video processing techniques seeking to accurately differentiate between classes or species of fish to perform automatic classification. We present a machine learning approach suitable for solving this challenge. We demonstrate the use of a convolutional neural network model in a hierarchical feature combination setup to learn species-dependent visual features of fish that are unique, yet abstract and robust against environmental and intra- and inter-species variability. This approach avoids the need to explicitly extract features from raw images of the fish using several fragmented image processing techniques. As a result, we achieve a single, generic trained architecture with favorable performance even for sample images of fish species that were not used in training. Using the LifeCLEF14 and LifeCLEF15 benchmark fish datasets, we have demonstrated results with a correct classification rate of more than 90%

    Low-Quality Fingerprint Classification

    Get PDF
    Fingerprint recognition systems mainly use minutiae point information. As shown in many previous research works, fingerprint images do not always have sufficient quality to be used by automatic fingerprint recognition systems. To tackle this challenge, this thesis focuses on very low-quality fingerprint images, which contain several well-known distortions such as dryness, wetness, physical damage, presence of dots, and blurriness. We develop an efficient, highly accurate deep neural network algorithm that recognizes such low-quality fingerprints. The experiments were conducted on a real low-quality fingerprint database, and the achieved results show the high performance and robustness of the introduced deep network technique. The VGG16-based deep network achieves the highest performance of 93% for the dry class and the lowest of 84% for the blurred fingerprint class