1,514 research outputs found

    Effective Geometric Restoration of Distorted Historical Document for Large-Scale Digitization

    Owing to storage conditions and the non-planar shape of the material, geometric distortion of the 2-D content is widespread in scanned document images. Effective geometric restoration of these distorted images considerably increases the character recognition rate in large-scale digitisation. For large-scale digitisation of historical books, geometric restoration solutions are expected to be accurate, generic, robust, unsupervised and reversible. However, most methods in the literature concentrate on improving restoration accuracy for a specific distortion effect rather than on applicability to large-scale digitisation. This paper proposes an effective mesh-based geometric restoration system (GRLSD) for large-scale digitisation of distorted historical documents. In this system, a dewarping tool based on automatic mesh generation geometrically models and corrects arbitrarily warped historical documents. An XML-based mesh recorder stores the mesh of distortion information for reversible use. A graphical user interface toolkit displays the mesh visually and allows it to be manipulated manually to improve restoration accuracy. Experimental results show that the proposed automatic dewarping approach efficiently corrects arbitrarily warped historical documents, with improved performance over several state-of-the-art geometric restoration methods. With the XML mesh recorder and GUI toolkit, the GRLSD system helps users flexibly monitor and correct ambiguous mesh points, which prevents damage to undistorted historical document images in large-scale digitisation.
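The reversible mesh record described in this abstract can be illustrated with a minimal sketch. It assumes a mesh is a rectangular grid of (x, y) vertices; the element and attribute names (`mesh`, `vertex`, `row`, `col`) are hypothetical and are not the GRLSD schema, which the abstract does not specify.

```python
import xml.etree.ElementTree as ET


def mesh_to_xml(mesh):
    """Serialise a mesh (list of rows of (x, y) vertices) to an XML string.

    Tag and attribute names are illustrative assumptions only.
    """
    root = ET.Element("mesh", rows=str(len(mesh)), cols=str(len(mesh[0])))
    for r, row in enumerate(mesh):
        for c, (x, y) in enumerate(row):
            # repr(float) round-trips exactly, so the record stays reversible.
            ET.SubElement(root, "vertex", row=str(r), col=str(c),
                          x=repr(float(x)), y=repr(float(y)))
    return ET.tostring(root, encoding="unicode")


def mesh_from_xml(text):
    """Rebuild the mesh grid from the XML produced by mesh_to_xml."""
    root = ET.fromstring(text)
    rows, cols = int(root.get("rows")), int(root.get("cols"))
    mesh = [[None] * cols for _ in range(rows)]
    for v in root.iter("vertex"):
        mesh[int(v.get("row"))][int(v.get("col"))] = (
            float(v.get("x")), float(v.get("y")))
    return mesh
```

Storing the vertices rather than the corrected image is what would make a correction undoable: the original scan plus the recorded mesh is enough to redo or revise the dewarping later.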

    Document Image Dewarping Based on the Document Boundary and 3D Reconstruction

    Master's thesis -- Seoul National University Graduate School: College of Natural Sciences, Department of Mathematical Sciences, 2022. 8. 현동훈.
    Most scanned images today are obtained from mobile devices such as cameras, smartphones, and tablets rather than from traditional flatbed scanners. Unlike flatbed scanning, capture with a mobile device may introduce distortions of various forms, such as perspective distortion, fold distortion, and page curl. In this thesis, we propose robust dewarping methods that correct such distortions based on the document boundary and 3D reconstruction. In the first method, we construct a curvilinear grid on the document image using the document boundary and reconstruct the document surface in three-dimensional space. We then rectify the image using a family of local homographies computed from the reconstructed document surface. Although some steps of the proposed method have been proposed separately in other research, our approach exploits and combines their advantages into a robust dewarping process while improving the stability of the overall process. Moreover, we refine the process by correcting the distorted text region boundary, and we develop this refinement into an independent dewarping method that is concise, straightforward, and robust while still producing a well-rectified document image.
    Contents:
    1. Introduction
    2. Review on Camera Geometry: 2.1. Basic Camera Model; 2.2. 3D Reconstruction Problem
    3. Related Works: 3.1. Dewarping Methods based on the Text-lines; 3.2. Dewarping Methods based on the Document Boundary; 3.3. Dewarping Methods based on the Grid Construction; 3.4. Dewarping Methods based on the Document Surface Model in 3D Space
    4. Document Image Dewarping based on the Document Boundary and 3D Reconstruction: 4.1. Input Document Image Processing (4.1.1. Binarization of the Input Document Image; 4.1.2. Perspective Distortion Removal using the Document Boundary); 4.2. Grid Construction on the Document Image; 4.3. 3D Reconstruction of the Document Surface (4.3.1. Geometric Model; 4.3.2. Normalization of the Grid Corners; 4.3.3. 3D Reconstruction of the Document Surface); 4.4. Rectification of the Document Image under a Family of Local Homographies; 4.5. Global Rectification of the Document Image
    5. Document Image Dewarping by Straightening Document Boundary Curves
    6. Conclusion
    Appendix A: A.1. 4-point Algorithm; A.2. Optimization of the Cost Function
    Bibliography
    Abstract (in Korean)
    Acknowledgement (in Korean)
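Rectifying one grid cell with a local homography can be sketched as follows. This is the textbook four-correspondence linear solve with the last entry of H fixed to 1, written in plain Python; it is an assumption that this resembles the thesis's own procedure, and the function names are illustrative.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def homography_from_4_points(src, dst):
    """Return the 3x3 homography H (h33 = 1) mapping each src point to dst."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h1 x + h2 y + h3) / (h7 x + h8 y + 1), and likewise for v.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map a 2-D point through H with the projective division."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

In a grid-based pipeline, `src` would be the four corners of one distorted cell and `dst` the corners of the flat target cell, with one homography estimated per cell.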

    AirCode: Unobtrusive Physical Tags for Digital Fabrication

    We present AirCode, a technique that allows the user to tag physically fabricated objects with given information. An AirCode tag consists of a group of carefully designed air pockets placed beneath the object surface. These air pockets are easily produced during the fabrication process of the object, without any additional material or postprocessing. Meanwhile, the air pockets affect only the scattering light transport under the surface, and thus are hard to notice with the naked eye. With a computational imaging method, however, the tags become detectable. We present a tool that automates the design of air pockets for the user to encode information. The AirCode system also allows the user to retrieve the information from captured images via a robust decoding algorithm. We demonstrate our tagging technique with applications for metadata embedding, robotic grasping, and conveying object affordances. Comment: ACM UIST 2017 Technical Paper

    Document and Text Rectification Using Text- and Feature-Point-Based Objective Function Optimization

    Doctoral thesis -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, 2014. 8. 조남익.
    Many techniques and applications detect and recognize text in images, e.g., document retrieval from camera-captured document images, book readers for the visually impaired, and augmented reality based on text recognition. In these applications, the planar surfaces that contain the text are often distorted in the captured image by the perspective view (e.g., road signs), curvature (e.g., unfolded books), and wrinkles (e.g., old documents). Recovering the original document texture by removing these distortions from camera-captured document images is called document rectification. This dissertation proposes new text surface rectification algorithms for improving text recognition accuracy and visual quality. The proposed methods fall into three types according to the type of input, and their contributions can be summarized as follows.
    The first rectification algorithm uses the dense text-lines in a document to rectify the image. Unlike conventional approaches, the proposed method does not use the text-lines directly. Instead, it uses a discrete representation of text-lines and text-blocks, which are sets of connected components. The geometric distortions caused by page curl and perspective view are modeled as generalized cylindrical surfaces and camera rotation, respectively. With this distortion model and the discrete representation of the features, a cost function is developed whose minimization yields the parameters of the distortion model. The cost function encodes properties of the page such as text-block alignment, line spacing, and the straightness of text-lines. Because the text features are described by sets of discrete points, the cost function is easy to define and is solved well by the Levenberg-Marquardt algorithm. Experiments show that the proposed method works well for various layouts and curved surfaces, and compares favorably with conventional methods on the standard dataset.
    The second algorithm is a unified framework that rectifies and stitches multiple document images using visual feature points instead of text lines. This is similar to the approach of general image stitching algorithms. However, general image stitching usually assumes a fixed camera center, which cannot be taken for granted when capturing a document. To deal with the camera motion between images, a new parametric family of motion models is proposed. In addition, to remove the ambiguity in the reference plane, a new cost function is developed that imposes constraints on the reference plane. This enables the estimation of a physically correct reference plane without prior knowledge. The estimated reference plane can also be used to rectify the stitching result. Furthermore, because it relies on general features, the proposed method can be applied to other planar objects, such as building facades or mural paintings, as well as to camera-captured document images.
    The third rectification method is based on a scene text detection algorithm that is independent of the language model. Conventional methods assume that a character consists of a single connected component (CC), as in the English alphabet. This assumption is brittle for Asian scripts such as Korean, Chinese, and Japanese, where a single character consists of several CCs, so it is difficult to divide CCs into text lines without a language model. To alleviate this problem, the proposed method clusters the candidate regions using a similarity measure that considers inter-character relations. The adjacency measure is trained on a dataset labeled with the bounding boxes of text regions. Non-text regions that remain after clustering are filtered out in a text/non-text classification step. The final text regions are merged or divided into text lines according to their orientation and location. The detected text is rectified using the orientation of the text-line and of vertical strokes. In extensive experiments, the proposed method outperforms state-of-the-art algorithms on English as well as Asian characters.
    Contents:
    1 Introduction: 1.1 Document rectification via text-line based optimization; 1.2 A unified approach of rectification and stitching for document images; 1.3 Rectification via scene text detection; 1.4 Contents
    2 Related work: 2.1 Document rectification (2.1.1 Document dewarping without text-lines; 2.1.2 Document dewarping with text-lines; 2.1.3 Text-block identification and text-line extraction); 2.2 Document stitching; 2.3 Scene text detection
    3 Document rectification based on text-lines: 3.1 Proposed approach (3.1.1 Image acquisition model; 3.1.2 Proposed approach to document dewarping); 3.2 Proposed cost function and its optimization (3.2.1 Design of E_str(·); 3.2.2 Minimization of E_str(·); 3.2.3 Alignment type classification; 3.2.4 Design of E_align(·); 3.2.5 Design of E_spacing(·)); 3.3 Extension to unfolded book surfaces; 3.4 Experimental result (3.4.1 Experiments on synthetic data; 3.4.2 Experiments on real images; 3.4.3 Comparison with existing methods; 3.4.4 Limitations)
    4 Document rectification based on feature detection: 4.1 Proposed approach; 4.2 Proposed cost function and its optimization (4.2.1 Notations; 4.2.2 Homography between the i-th image and E; 4.2.3 Proposed cost function; 4.2.4 Optimization; 4.2.5 Relation to the model in [17]); 4.3 Post-processing (4.3.1 Classification of two cases; 4.3.2 Skew removal); 4.4 Experimental results (4.4.1 Quantitative evaluation on metric reconstruction performance; 4.4.2 Experiments on real images)
    5 Scene text detection and rectification: 5.1 Introduction (5.1.1 Contribution; 5.1.2 Proposed approach); 5.2 Candidate region detection (5.2.1 CC extraction; 5.2.2 Computation of similarity between CCs; 5.2.3 CC clustering); 5.3 Rectification of candidate region; 5.4 Text/non-text classification; 5.5 Experimental result (5.5.1 Experimental results on ICDAR 2011 dataset; 5.5.2 Experimental results on the Asian character dataset)
    6 Conclusion
    Bibliography
    Abstract (Korean)
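The idea of recovering distortion parameters by minimizing a text-line straightness cost with Levenberg-Marquardt can be shown on a toy problem. The sketch below assumes, purely for illustration, that page curl bends a text line into y = c + a·x², and fits (a, c) from discrete points on the line; the dissertation's generalized cylindrical surface model and its alignment and spacing terms are not modelled here, and all names are hypothetical.

```python
def fit_curl(points, iters=50):
    """Fit y = c + a * x**2 to text-line points with a Levenberg-Marquardt loop.

    Assumes at least two points with distinct |x| so the 2x2 system is regular.
    """
    a, c, lam = 0.0, 0.0, 1e-3

    def cost(a_, c_):
        return sum((y - (c_ + a_ * x * x)) ** 2 for x, y in points)

    for _ in range(iters):
        # Accumulate J^T J and J^T r for residuals r_i = y_i - (c + a x_i^2).
        jaa = jac = jcc = ga = gc = 0.0
        for x, y in points:
            r = y - (c + a * x * x)
            ja, jc = -x * x, -1.0  # dr/da and dr/dc
            jaa += ja * ja
            jac += ja * jc
            jcc += jc * jc
            ga += ja * r
            gc += jc * r
        # Solve (J^T J + lam * diag(J^T J)) delta = -J^T r.
        m11, m12, m22 = jaa * (1.0 + lam), jac, jcc * (1.0 + lam)
        det = m11 * m22 - m12 * m12
        da = (-ga * m22 + gc * m12) / det
        dc = (-gc * m11 + ga * m12) / det
        if cost(a + da, c + dc) < cost(a, c):
            a, c, lam = a + da, c + dc, lam / 10.0  # accept, trust model more
        else:
            lam *= 10.0  # reject, fall back toward gradient descent
    return a, c


def straighten(points, a):
    """Remove the fitted curl so the text-line points lie on a horizontal line."""
    return [(x, y - a * x * x) for x, y in points]
```

Because the features are discrete points, the residuals and their derivatives are simple sums, which is the property the abstract credits for making the cost function easy to define and optimize.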

    Investigation And Development Of Flattening Algorithms For Curved Latent Fingerprint Images

    Fingerprints have long been used to identify a person because they are unique and remain unchanged throughout life. However, latent fingerprints are normally acquired from uneven or noisy surfaces with poor contrast, so the extracted minutiae points can be inaccurate, which affects the result of fingerprint matching. Latent fingerprint images therefore need to be pre-processed and enhanced before a latent search. To increase latent matching accuracy, geometric rectification is needed to correct the distortion that uneven surfaces introduce in fingerprint images. This research investigates and develops flattening algorithms that can be adapted to latent fingerprint images on cylindrical surfaces. The boundary of the image is required to detect the curvature of the region that needs to be flattened. The boundary of the area of interest can be acquired using a predefined algorithm or defined by the user through interactive drawing. The flattening algorithm requires a mapping from cylindrical coordinates to image coordinates. Since the curved image appears as a roughly rectangular shape, parabolic and elliptical approximations are used to design the flattening algorithms. Experimental results show that the algorithm that applies the ellipse equation to flatten fingerprint images is able to increase the quality of the minutiae. However, measurements along the horizontal axis show that horizontal distortion is not fully corrected. In summary, both algorithms can flatten curved latent fingerprint images, assuming that the image to be flattened has a vertical cylindrical shape and that the boundary of the cylinder is detectable. The algorithm based on the ellipse approximation performs better than the one based on the parabolic approximation.
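The mapping between cylindrical and image coordinates can be illustrated with a simplified sketch. It assumes a vertical cylinder of known radius with a circular cross-section, viewed under orthographic projection and centred on the image; this is not the parabolic or elliptical approximation studied in the work, and the function names are hypothetical.

```python
import math


def flatten_x(x, radius):
    """Map a horizontal image offset x (from the cylinder axis) to arc length."""
    if abs(x) > radius:
        raise ValueError("x lies outside the visible cylinder")
    return radius * math.asin(x / radius)


def flatten_row(row, radius):
    """Resample one image row so equal output steps are equal arc lengths."""
    half = (len(row) - 1) / 2.0
    out_half = int(radius * math.asin(min(1.0, half / radius)))
    out = []
    for s in range(-out_half, out_half + 1):
        # Inverse mapping: arc length s on the surface back to image column x.
        x = radius * math.sin(s / radius) + half
        out.append(row[min(len(row) - 1, max(0, round(x)))])
    return out
```

Applying `flatten_row` to every row stretches the compressed regions near the cylinder's edges, so the flattened image is wider than the input; nearest-neighbour sampling is used only to keep the sketch short.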

    Design of Immersive Online Hotel Walkthrough System Using Image-Based (Concentric Mosaics) Rendering

    Conventional hotel booking websites present their facilities only through 2D photos. These are static images that cannot be moved or rotated. Image-based virtual walkthrough is a potential technology for attracting more customers in the hospitality industry. In this project, research is carried out to create an image-based rendering (IBR) virtual walkthrough and a panoramic-based walkthrough using only Macromedia Flash Professional 8, Photovista Panorama 3.0 and Reality Studio for the interaction of the images. The web pages for the walkthroughs are built with Macromedia Dreamweaver Professional 8. The images are displayed in Adobe Flash Player 8 or higher. The image-based walkthrough uses a concentric mosaic technique, while the panoramic-based walkthrough applies an image mosaicing technique. The two walkthroughs are compared. The study also examines the relationship between the number of pictures and the smoothness of the walkthrough. Each technique has its advantages: the image-based walkthrough is a real-time walkthrough, since the user can move right, left, forward and backward, whereas the panoramic-based walkthrough does not offer this, because the user can only view 360 degrees from a fixed spot.