916 research outputs found

    Physics-Informed Computer Vision: A Review and Perspectives

    The incorporation of physical information into machine learning frameworks is opening up and transforming many application domains. Here the learning process is augmented with fundamental knowledge and governing physical laws. In this work we explore their utility for computer vision tasks in interpreting and understanding visual data. We present a systematic literature review of formulations of, and approaches to, computer vision tasks guided by physical laws. We begin by decomposing the popular computer vision pipeline into a taxonomy of stages and investigate approaches for incorporating governing physical equations in each stage. Existing approaches in each task are analyzed with regard to which governing physical processes are modeled, how they are formulated, and how they are incorporated, i.e., by modifying data (observation bias), modifying networks (inductive bias), or modifying losses (learning bias). The taxonomy offers a unified view of the application of physics-informed capabilities, highlighting where physics-informed learning has been conducted and where the gaps and opportunities lie. Finally, we highlight open problems and challenges to inform future research. While still in its early days, the study of physics-informed computer vision promises better computer vision models with improved physical plausibility, accuracy, data efficiency, and generalization in increasingly realistic applications.
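The "learning bias" mentioned in this abstract (modifying losses) can be illustrated with a minimal sketch: a data-fit loss augmented by the residual of a governing equation. The decay law du/dt = -k*u below is a hypothetical stand-in for whatever physical prior a given task supplies, not an example from the review itself.

```python
import numpy as np

def physics_informed_loss(u_pred, t, u_obs, k, weight=1.0):
    """Data-fit MSE plus mean-squared residual of du/dt + k*u = 0 (learning bias)."""
    data_loss = np.mean((u_pred - u_obs) ** 2)
    du_dt = np.gradient(u_pred, t)          # finite-difference derivative
    residual = du_dt + k * u_pred           # zero for an exact solution of the ODE
    physics_loss = np.mean(residual ** 2)
    return data_loss + weight * physics_loss

t = np.linspace(0.0, 1.0, 50)
u_true = np.exp(-2.0 * t)                   # exact solution for k = 2
wiggly = u_true + 0.01 * np.sin(20 * t)     # a physically implausible candidate fit

loss_exact = physics_informed_loss(u_true, t, u_true, k=2.0)
loss_wiggly = physics_informed_loss(wiggly, t, u_true, k=2.0)
assert loss_exact < loss_wiggly             # physics term penalizes the implausible fit
```

The same pattern scales to PDE residuals in imaging tasks: the physics term steers the model toward solutions consistent with the governing equations even where observations are noisy or sparse.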

    Liver and Vessel Segmentation Techniques in Abdominal CT

    Ph.D. dissertation, Department of Computer Science and Engineering, College of Engineering, Seoul National University, February 2020 (advisor: Yeong-Gil Shin). Accurate liver and vessel segmentation on abdominal computed tomography (CT) images is one of the most important prerequisites for computer-aided diagnosis (CAD) systems such as volumetric measurement, treatment planning, and augmented reality-based surgical guidance. In recent years, deep learning in the form of convolutional neural networks (CNNs) has improved the performance of medical image segmentation, but it remains difficult to provide the high generalization performance required in actual clinical practice. Furthermore, although contour features are an important factor in image segmentation, they are hard to employ in a CNN because of the many unclear boundaries in the image. For liver vessel segmentation, a deep learning approach is impractical because it is difficult to obtain training data from complex vessel images; moreover, thin vessels are hard to identify in the original image due to weak intensity contrast and noise. In this dissertation, a CNN with high generalization performance and a contour learning scheme is first proposed for liver segmentation. Secondly, a liver vessel segmentation algorithm is presented that accurately segments even thin vessels. To build a CNN with high generalization performance, the auto-context algorithm is employed. The auto-context algorithm goes through two pipelines: the first predicts the overall area of the liver, and the second predicts the final liver using the first prediction as a prior. This process improves generalization performance because the network internally estimates a shape prior. In addition to auto-context, a contour learning method is proposed that uses only sparse contours rather than the entire contour.
Sparse contours are obtained from the mispredicted parts of the network's final prediction and used for training. Experimental studies show that the proposed network is superior in accuracy to other modern networks. Multiple N-fold tests are also performed to verify the generalization performance. An algorithm for accurate liver vessel segmentation is also proposed by introducing vessel candidate points. To obtain confident vessel candidates, the 3D image is first reduced to 2D through maximum intensity projection. Subsequently, vessel segmentation is performed on the 2D images and the segmented pixels are back-projected into the original 3D space. Finally, a new level set function is proposed that utilizes both the original image and the vessel candidate points. The proposed algorithm can segment thin vessels with high accuracy by mainly relying on vessel candidate points, whose reliability is increased through robust segmentation in the projected 2D images, where complex structures are simplified and thin vessels are more visible. Experimental results show that the proposed algorithm is superior to other active contour models. The proposed algorithms present a new method of segmenting the liver and its vessels. The auto-context algorithm shows that a human-designed curriculum (i.e., shape-prior learning) can improve generalization performance. The proposed contour learning technique can increase the accuracy of a CNN for image segmentation by focusing on its failures, represented by sparse contours. The vessel segmentation shows that minor vessel branches can be successfully segmented through vessel candidate points obtained by reducing the image dimension.
The algorithms presented in this dissertation can be employed for later analysis of liver anatomy that requires accurate segmentation techniques.
Contents: Chapter 1 Introduction (1.1 Background and motivation; 1.2 Problem statement; 1.3 Main contributions; 1.4 Contents and organization). Chapter 2 Related Works (2.1 Overview; 2.2 Convolutional neural networks: architectures, CNNs in medical image segmentation; 2.3 Liver and vessel segmentation: classical methods for liver segmentation, vascular image segmentation, active contour models, vessel topology-based active contour model; 2.4 Motivation). Chapter 3 Liver Segmentation via Auto-Context Neural Network with Self-Supervised Contour Attention (3.1 Overview; 3.2 Single-pass auto-context neural network: skip-attention module, V-transition module, liver-prior inference and auto-context, understanding the network; 3.3 Self-supervising contour attention; 3.4 Learning the network: overall loss function, data augmentation; 3.5 Experimental results: data configurations and targets of comparison, evaluation metrics, accuracy evaluation, ablation study, generalization performance, results from ground-truth variations; 3.6 Discussion). Chapter 4 Liver Vessel Segmentation via Active Contour Model with Dense Vessel Candidates (4.1 Overview; 4.2 Dense vessel candidates: maximum intensity slab images, segmentation of 2D vessel candidates and back-projection; 4.3 Clustering of dense vessel candidates: virtual gradient-assisted regional ACM, localized regional ACM; 4.4 Experimental results: data configurations and environment, 2D segmentation, ACM comparisons, evaluation of bifurcation points, computational performance, ablation study, parameter study; 4.5 Application to portal vein analysis; 4.6 Discussion). Chapter 5 Conclusion and Future Works. Bibliography. Abstract in Korean.
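The MIP-based vessel-candidate step described in this abstract can be sketched as follows. This is a minimal numpy illustration, not the dissertation's implementation: a plain intensity threshold stands in for its 2D segmentation, and the argmax depth map supplies the back-projection.

```python
import numpy as np

def mip_vessel_candidates(volume, threshold):
    """Collapse a 3D volume to 2D by maximum intensity projection (MIP),
    pick bright (vessel-like) 2D pixels, and back-project them to the
    source voxel of each projected maximum."""
    mip = volume.max(axis=0)                 # 2D maximum intensity projection
    depth = volume.argmax(axis=0)            # depth of each projected maximum
    ys, xs = np.nonzero(mip > threshold)     # toy 2D "segmentation" by threshold
    zs = depth[ys, xs]                       # back-projection into 3D
    return list(zip(zs.tolist(), ys.tolist(), xs.tolist()))

vol = np.zeros((4, 5, 5))
vol[2, 1, 3] = 1.0                           # a bright "vessel" voxel
vol[0, 4, 0] = 0.2                           # faint background
candidates = mip_vessel_candidates(vol, threshold=0.5)
assert candidates == [(2, 1, 3)]             # only the bright voxel survives
```

The simplification is the point: thin, low-contrast structures that are hard to find in 3D become brighter and less cluttered in the projection, and the recovered 3D points can then seed a level-set or active-contour refinement.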

    Camera Re-Localization with Data Augmentation by Image Rendering and Image-to-Image Translation

    The self-localization of automobiles, robots, and unmanned aerial vehicles, as well as of pedestrians, is and will remain of high interest for a wide variety of applications. One main task is the autonomous navigation of such vehicles, in which localization within the surrounding scene is a key component. Since cameras are established, permanently installed sensors in automobiles, robots, and unmanned aerial vehicles, the additional effort of also using them for localization tasks is small to nonexistent. The same applies to the self-localization of pedestrians, where smartphones serve as mobile camera platforms. Camera re-localization, in which the pose of a camera is determined with respect to a fixed environment, is a valuable process for providing or supporting localization for vehicles or pedestrians. Cameras are, moreover, inexpensive sensors that are well established in the everyday life of humans and machines. The support provided by camera re-localization is not limited to navigation applications; it can also be used more generally to support image analysis and image processing tasks such as scene reconstruction, detection, classification, and similar applications. To these ends, this work addresses the improvement of the camera re-localization process. Since convolutional neural networks (CNNs) and hybrid solutions for determining camera poses have become competitive with established hand-crafted methods in recent years, this thesis focuses on the former. The main contributions of this work include the design of a CNN for camera pose estimation, with an emphasis on a shallow architecture that meets the requirements of mobile platforms.
This network achieves accuracies on par with deeper CNNs of substantially larger model size. Furthermore, the performance of CNNs depends strongly on the quantity and quality of the training data used for optimization. The further contributions of this thesis therefore concern the rendering of images and image-to-image translation for extending such training data; extending training data in this way is generally called data augmentation (DA). 3D models are used to render images that usefully extend the training data, and generative adversarial networks (GANs) serve for image-to-image translation. While image rendering increases the quantity of an image dataset, image-to-image translation improves the quality of the rendered data. Experiments are carried out both with datasets extended by rendered images and with translated images. Both DA approaches contribute to improving localization accuracy. This work thus improves state-of-the-art camera re-localization through DA.
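A CNN that regresses camera poses is typically trained with an objective in the PoseNet family: position error plus a weighted orientation error on unit quaternions. The sketch below is a generic illustration of that objective, not the loss from this thesis; the weight beta, which balances metres against rotation units, is a tuning assumption.

```python
import numpy as np

def pose_loss(t_pred, q_pred, t_true, q_true, beta=250.0):
    """Position error plus beta-weighted quaternion error (PoseNet-style)."""
    q_pred = q_pred / np.linalg.norm(q_pred)   # predicted rotation to unit quaternion
    pos_err = np.linalg.norm(t_pred - t_true)  # Euclidean position error
    rot_err = np.linalg.norm(q_pred - q_true)  # quaternion distance as rotation error
    return pos_err + beta * rot_err

t_true = np.array([1.0, 2.0, 0.5])
q_true = np.array([1.0, 0.0, 0.0, 0.0])        # identity rotation
perfect = pose_loss(t_true, q_true, t_true, q_true)
off = pose_loss(t_true + 0.1, q_true, t_true, q_true)
assert perfect == 0.0 and off > perfect
```

Because rendered and translated images come with exact synthetic poses, data augmentation of this kind supplies additional (image, pose) pairs for such an objective at no labeling cost.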

    Advanced Biometrics with Deep Learning

    Biometrics, such as fingerprint, iris, face, hand print, hand vein, speech, and gait recognition, as a means of identity management, have become commonplace nowadays for various applications. Biometric systems follow a typical pipeline composed of separate preprocessing, feature extraction, and classification stages. Deep learning, as a data-driven representation learning approach, has been shown to be a promising alternative to conventional data-agnostic, handcrafted preprocessing and feature extraction for biometric systems. Furthermore, deep learning offers an end-to-end learning paradigm that unifies preprocessing, feature extraction, and recognition based solely on biometric data. This Special Issue has collected 12 high-quality, state-of-the-art research papers that deal with challenging issues in advanced biometric systems based on deep learning. The 12 papers can be divided into four categories according to biometric modality: face biometrics, medical electronic signals (EEG and ECG), voice print, and others.

    System-Characterized Artificial Intelligence Approaches for Cardiac cellular systems and Molecular Signature analysis

    This dissertation presents a significant advancement in the field of cardiac cellular systems and molecular signature systems by employing machine learning and generative artificial intelligence techniques. These methodologies are systematically characterized and applied to address critical challenges in these domains. A novel computational model is developed that combines machine learning tools with multi-physics models; its main objective is to accurately predict complex cellular dynamics, taking into account the intricate interactions within the cardiac cellular system. Furthermore, a comprehensive framework based on generative adversarial networks (GANs) is proposed, designed to generate synthetic data that faithfully represents an in-vitro cardiac cellular system; the generated data can be used to enhance the understanding and analysis of the system's behavior. Additionally, a novel AI approach is formulated that integrates deep learning and GAN techniques for Raman characterization, enabling efficient detection of multi-analyte mixtures by leveraging deep learning algorithms and GAN-generated synthetic data. Overall, the integration of machine learning, generative artificial intelligence, and multi-physics modeling provides valuable insights and tools for precise prediction and efficient detection in cardiac cellular systems and molecular signature systems.

    Light Field Diffusion for Single-View Novel View Synthesis

    Single-view novel view synthesis, the task of generating images from new viewpoints based on a single reference image, is an important but challenging problem in computer vision. Recently, the Denoising Diffusion Probabilistic Model (DDPM) has become popular in this area due to its strong ability to generate high-fidelity images. However, current diffusion-based methods rely directly on camera pose matrices as viewing conditions, introducing 3D constraints globally and implicitly. These methods may suffer from inconsistency among images generated from different perspectives, especially in regions with intricate textures and structures. In this work, we present Light Field Diffusion (LFD), a conditional diffusion-based model for single-view novel view synthesis. Unlike previous methods that employ camera pose matrices, LFD transforms the camera view information into a light field encoding and combines it with the reference image. This design introduces local pixel-wise constraints within the diffusion model, thereby encouraging better multi-view consistency. Experiments on several datasets show that LFD can efficiently generate high-fidelity images and maintain better 3D consistency even in intricate regions. Our method generates images of higher quality than NeRF-based models, and we obtain sample quality similar to other diffusion-based models with only one-third of the model size.
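A per-pixel pose condition of the kind this abstract describes can be sketched by giving every pixel the origin and direction of its camera ray, so the model sees a local, pixel-wise code instead of one global pose matrix. The construction below is a generic ray-map sketch under assumed pinhole intrinsics, not LFD's actual encoding.

```python
import numpy as np

def ray_encoding(height, width, K, cam_to_world):
    """Per-pixel (origin, direction) ray map as an (H, W, 6) conditioning signal."""
    ys, xs = np.mgrid[0:height, 0:width]
    # Homogeneous pixel centers, back-projected through the inverse intrinsics.
    pix = np.stack([xs + 0.5, ys + 0.5, np.ones_like(xs, float)], axis=-1)
    dirs_cam = pix @ np.linalg.inv(K).T
    R, t = cam_to_world[:3, :3], cam_to_world[:3, 3]
    dirs = dirs_cam @ R.T                                # rotate rays into world frame
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True) # unit directions
    origins = np.broadcast_to(t, dirs.shape)             # shared camera center
    return np.concatenate([origins, dirs], axis=-1)

K = np.array([[50.0, 0.0, 32.0], [0.0, 50.0, 32.0], [0.0, 0.0, 1.0]])
enc = ray_encoding(64, 64, K, np.eye(4))                 # identity camera pose
assert enc.shape == (64, 64, 6)
```

Concatenated with the reference image along the channel axis, such a map varies per pixel, which is what makes pixel-wise geometric constraints possible inside the denoiser.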