1,514 research outputs found

Effective Geometric Restoration of Distorted Historical Document for Large-Scale Digitization

Authors: Antonacopoulos A; Clausner C; Pletschacher S; Qi J; Yang P
Publication venue: Institution of Engineering and Technology (IET)
Publication date: 23/03/2017

Because of storage conditions and the non-planar shape of the material, geometric distortion of the 2-D content is widespread in scanned document images. Effective geometric restoration of these distorted images considerably increases the character recognition rate in large-scale digitisation. For large-scale digitisation of historical books, geometric restoration solutions are expected to be accurate, generic, robust, unsupervised and reversible. However, most methods in the literature concentrate on improving restoration accuracy for a specific distortion effect rather than on applicability to large-scale digitisation. This paper proposes an effective mesh-based geometric restoration system (GRLSD) for large-scale digitisation of distorted historical documents. In this system, a dewarping tool based on automatic mesh generation geometrically models and corrects arbitrarily warped historical documents. An XML-based mesh recorder stores the mesh of distortion information for reversible use. A graphical user interface toolkit visually displays the mesh and allows it to be manipulated manually to improve geometric restoration accuracy. Experimental results show that the proposed automatic dewarping approach efficiently corrects arbitrarily warped historical documents, with improved performance over several state-of-the-art geometric restoration methods. With the XML mesh recorder and GUI toolkit, the GRLSD system helps users flexibly monitor and correct ambiguous mesh points, which prevents damage to undistorted historical document images in large-scale digitisation.
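As a rough illustration of what recording a mesh in XML for reversible use can look like, the following minimal Python sketch writes a grid of mesh nodes to XML and reads it back unchanged. The element and attribute names (`mesh`, `node`, `row`, `col`, `x`, `y`) are hypothetical and are not taken from the GRLSD system, whose actual schema is not given in the abstract.

```python
import xml.etree.ElementTree as ET


def mesh_to_xml(mesh):
    """Serialise a rows x cols grid of (x, y) mesh nodes to an XML string.

    The tag and attribute names are illustrative assumptions only.
    """
    root = ET.Element("mesh", rows=str(len(mesh)), cols=str(len(mesh[0])))
    for r, row in enumerate(mesh):
        for c, (x, y) in enumerate(row):
            # repr() of a float round-trips exactly through float()
            ET.SubElement(root, "node", row=str(r), col=str(c),
                          x=repr(x), y=repr(y))
    return ET.tostring(root, encoding="unicode")


def mesh_from_xml(text):
    """Rebuild the grid of (x, y) nodes from the XML produced above."""
    root = ET.fromstring(text)
    rows, cols = int(root.get("rows")), int(root.get("cols"))
    mesh = [[None] * cols for _ in range(rows)]
    for node in root.iter("node"):
        r, c = int(node.get("row")), int(node.get("col"))
        mesh[r][c] = (float(node.get("x")), float(node.get("y")))
    return mesh


# A 2 x 3 mesh over a slightly warped page region (made-up coordinates).
example_mesh = [
    [(0.0, 0.0), (50.5, 2.25), (100.0, 0.0)],
    [(0.0, 40.0), (50.0, 43.75), (100.0, 40.0)],
]
restored = mesh_from_xml(mesh_to_xml(example_mesh))
```

Because the stored mesh can be reloaded exactly, a correction derived from it can in principle be re-applied or undone later, which is the sense of "reversible" assumed here.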

LJMU Research Online (Liverpool John Moores University)

University of Salford Institutional Repository

White Rose Research Online

Document Image Dewarping Based on the Document Boundary and 3D Reconstruction

Author: 전명재
Publication venue: Seoul National University Graduate School
Publication date: 01/08/2022

Thesis (Master's) -- Seoul National University Graduate School: College of Natural Sciences, Department of Mathematical Sciences, 2022. 8. 현동훈.

Recently, most scanned images are obtained from mobile devices such as cameras, smartphones, and tablets rather than from traditional flatbed scanners. Unlike flatbed scanning, capture with a mobile device can introduce distortions of various forms, such as perspective distortion, fold distortion, and page curl. In this thesis, we propose robust dewarping methods that correct such distortions based on the document boundary and 3D reconstruction. In the first method, we construct a curvilinear grid on the document image using the document boundary and reconstruct the document surface in three-dimensional space. We then rectify the image using a family of local homographies computed from the reconstructed document surface. Although some steps of the proposed method have been proposed separately in other research, our approach combines their advantages into a robust dewarping process and improves the stability of the overall process. Moreover, we refined the process by correcting the distorted text region boundary and developed this step into an independent dewarping method that is concise, straightforward, and robust while still producing a well-rectified document image.

Contents:
1. Introduction
2. Review on Camera Geometry: 2.1 Basic Camera Model; 2.2 3D Reconstruction Problem
3. Related Works: 3.1 Dewarping Methods based on the Text-lines; 3.2 Dewarping Methods based on the Document Boundary; 3.3 Dewarping Methods based on the Grid Construction; 3.4 Dewarping Methods based on the Document Surface Model in 3D Space
4. Document Image Dewarping based on the Document Boundary and 3D Reconstruction: 4.1 Input Document Image Processing (4.1.1 Binarization of the Input Document Image; 4.1.2 Perspective Distortion Removal using the Document Boundary); 4.2 Grid Construction on the Document Image; 4.3 3D Reconstruction of the Document Surface (4.3.1 Geometric Model; 4.3.2 Normalization of the Grid Corners; 4.3.3 3D Reconstruction of the Document Surface); 4.4 Rectification of the Document Image under a Family of Local Homographies; 4.5 Global Rectification of the Document Image
5. Document Image Dewarping by Straightening Document Boundary Curves
6. Conclusion
Appendix A: A.1 4-point Algorithm; A.2 Optimization of the Cost Function
Bibliography
Abstract (in Korean)
Acknowledgement (in Korean)
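The abstract rectifies the image with local homographies, and the contents list an appendix on a 4-point algorithm. As a hedged sketch of the general idea, the Python code below estimates one homography from four point correspondences using the standard direct linear formulation with the last matrix entry fixed to 1. This is a textbook construction, not necessarily the formulation used in the thesis; the function names and sample coordinates are illustrative assumptions.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def homography_from_4_points(src, dst):
    """Return the 3x3 homography mapping four src points onto four dst points.

    Assumes the bottom-right entry can be normalised to 1.
    """
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, x, y):
    """Map the point (x, y) through the homography h."""
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# One distorted grid cell (made-up corners) mapped onto an upright square.
cell = [(0, 0), (10, 1), (9, 8), (1, 7)]
square = [(0, 0), (10, 0), (10, 10), (0, 10)]
H = homography_from_4_points(cell, square)
```

In a grid-based scheme, one such homography per grid cell would map each distorted cell onto its rectified counterpart; how the thesis chooses the target cells from the reconstructed surface is not specified in the abstract.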

SNU Open Repository and Archive

AirCode: Unobtrusive Physical Tags for Digital Fabrication

Authors: Li Dingzeyu; Nair Avinash S.; Nayar Shree K.; Zheng Changxi
Publication venue: Association for Computing Machinery (ACM)
Publication date: 07/08/2017

We present AirCode, a technique that allows the user to tag physically fabricated objects with given information. An AirCode tag consists of a group of carefully designed air pockets placed beneath the object surface. These air pockets are easily produced during the fabrication process of the object, without any additional material or postprocessing. The air pockets affect only the scattering light transport under the surface and are therefore hard to notice with the naked eye. With a computational imaging method, however, the tags become detectable. We present a tool that automates the design of air pockets for the user to encode information. The AirCode system also allows the user to retrieve the information from captured images via a robust decoding algorithm. We demonstrate our tagging technique with applications for metadata embedding, robotic grasping, and conveying object affordances. Comment: ACM UIST 2017 Technical Paper

arXiv.org e-Print Archive

Crossref

Document and Text Rectification Using Text- and Feature-Point-Based Cost Function Optimization

Author: 김범수
Publication venue: Seoul National University Graduate School
Publication date: 01/08/2014

Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, 2014. 8. 조남익.

Many techniques and applications detect and recognize text in images, e.g., document retrieval using camera-captured document images, book readers for the visually impaired, and augmented reality based on text recognition. In these applications, the planar surfaces that contain the text are often distorted in the captured image by the perspective view (e.g., road signs), curvature (e.g., unfolded books), and wrinkles (e.g., old documents). Recovering the original document texture by removing these distortions from camera-captured document images is called document rectification. In this dissertation, new text surface rectification algorithms are proposed to improve text recognition accuracy and visual quality. The proposed methods fall into three types depending on the type of input, and their contributions can be summarized as follows. In the first rectification algorithm, the dense text-lines in the document are used to rectify the image. Unlike conventional approaches, the proposed method does not use the text-lines directly. Instead, it uses a discrete representation of text-lines and text-blocks, which are sets of connected components. The geometric distortions caused by page curl and perspective view are modeled as generalized cylindrical surfaces and camera rotation, respectively. With this distortion model and the discrete representation of the features, a cost function is developed whose minimization yields the parameters of the distortion model. The cost function encodes properties of the page such as text-block alignment, line spacing, and the straightness of text-lines. Because the text features are described by sets of discrete points, the cost function can be easily defined and solved well by the Levenberg-Marquardt algorithm. Experiments show that the proposed method works well for various layouts and curved surfaces, and compares favorably with conventional methods on the standard dataset. The second algorithm is a unified framework that rectifies and stitches multiple document images using visual feature points instead of text-lines. This is similar to the approach of general image stitching algorithms. However, general image stitching usually assumes a fixed camera center, which cannot be taken for granted when capturing documents. To deal with camera motion between images, a new parametric family of motion models is proposed in this dissertation. In addition, to remove the ambiguity in the reference plane, a new cost function is developed that imposes constraints on the reference plane. This enables the estimation of a physically correct reference plane without prior knowledge. The estimated reference plane can also be used to rectify the stitching result. Furthermore, because it employs general features, the proposed method can be applied to other planar objects such as building facades or mural paintings as well as to camera-captured document images. The third rectification method is based on a scene text detection algorithm that is independent of the language model. Conventional methods assume that a character consists of a single connected component (CC), as in the English alphabet. However, this assumption is brittle for Asian scripts such as Korean, Chinese, and Japanese, where a single character consists of several CCs, so it is difficult to divide CCs into text-lines without a language model. To alleviate this problem, the proposed method clusters the candidate regions based on a similarity measure that considers inter-character relations. The adjacency measure is trained on a data set labeled with the bounding boxes of text regions. Non-text regions that remain after clustering are filtered out in a text/non-text classification step. The final text regions are merged or divided into individual text-lines according to their orientation and location. The detected text is rectified using the orientation of the text-line and the vertical strokes. In extensive experiments, the proposed method outperforms state-of-the-art algorithms on English as well as Asian characters.

Contents:
1 Introduction: 1.1 Document rectification via text-line based optimization; 1.2 A unified approach of rectification and stitching for document images; 1.3 Rectification via scene text detection; 1.4 Contents
2 Related work: 2.1 Document rectification (2.1.1 Document dewarping without text-lines; 2.1.2 Document dewarping with text-lines; 2.1.3 Text-block identification and text-line extraction); 2.2 Document stitching; 2.3 Scene text detection
3 Document rectification based on text-lines: 3.1 Proposed approach (3.1.1 Image acquisition model; 3.1.2 Proposed approach to document dewarping); 3.2 Proposed cost function and its optimization (3.2.1 Design of E_str(·); 3.2.2 Minimization of E_str(·); 3.2.3 Alignment type classification; 3.2.4 Design of E_align(·); 3.2.5 Design of E_spacing(·)); 3.3 Extension to unfolded book surfaces; 3.4 Experimental result (3.4.1 Experiments on synthetic data; 3.4.2 Experiments on real images; 3.4.3 Comparison with existing methods; 3.4.4 Limitations)
4 Document rectification based on feature detection: 4.1 Proposed approach; 4.2 Proposed cost function and its optimization (4.2.1 Notations; 4.2.2 Homography between the i-th image and E; 4.2.3 Proposed cost function; 4.2.4 Optimization; 4.2.5 Relation to the model in [17]); 4.3 Post-processing (4.3.1 Classification of two cases; 4.3.2 Skew removal); 4.4 Experimental results (4.4.1 Quantitative evaluation on metric reconstruction performance; 4.4.2 Experiments on real images)
5 Scene text detection and rectification: 5.1 Introduction (5.1.1 Contribution; 5.1.2 Proposed approach); 5.2 Candidate region detection (5.2.1 CC extraction; 5.2.2 Computation of similarity between CCs; 5.2.3 CC clustering); 5.3 Rectification of candidate region; 5.4 Text/non-text classification; 5.5 Experimental result (5.5.1 Experimental results on ICDAR 2011 dataset; 5.5.2 Experimental results on the Asian character dataset)
6 Conclusion
Bibliography
Abstract (Korean)
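The first algorithm in the abstract minimizes a cost over discrete text-line points with the Levenberg-Marquardt algorithm. The Python sketch below shows the flavour of such a fit on a deliberately simplified problem: each text-line is bent by a toy curl term a * x**2 (an assumption made here for illustration, standing in for the dissertation's generalized cylindrical surface and camera rotation), and a single parameter a is recovered by minimizing a straightness cost with a damped Gauss-Newton update in the Levenberg-Marquardt style. All names and the synthetic data are hypothetical.

```python
def straightness_residuals(lines, a):
    """Residuals and d(residual)/da after removing the toy curl a * x**2.

    A line is straight when every corrected y equals the line's mean y.
    """
    res, jac = [], []
    for pts in lines:
        corrected = [y - a * x * x for x, y in pts]
        mean_y = sum(corrected) / len(pts)
        mean_x2 = sum(x * x for x, _ in pts) / len(pts)
        for (x, _), yc in zip(pts, corrected):
            res.append(yc - mean_y)
            jac.append(-(x * x - mean_x2))
    return res, jac


def cost(lines, a):
    """Sum of squared straightness residuals."""
    res, _ = straightness_residuals(lines, a)
    return sum(r * r for r in res)


def fit_curl(lines, a=0.0, lam=1e-3, iters=50):
    """Estimate a with a one-parameter Levenberg-Marquardt style iteration."""
    for _ in range(iters):
        res, jac = straightness_residuals(lines, a)
        grad = sum(j * r for j, r in zip(jac, res))   # J^T r
        hess = sum(j * j for j in jac)                # J^T J
        if hess == 0.0:
            break  # no horizontal spread: a is unobservable
        step = -grad / (hess * (1.0 + lam))           # damped step
        if cost(lines, a + step) < cost(lines, a):
            a, lam = a + step, lam / 10.0             # accept, trust more
        else:
            lam *= 10.0                               # reject, damp more
    return a


# Three synthetic text-lines bent by a known curl (made-up values).
true_a = 0.002
text_lines = [[(float(x), y0 + true_a * x * x) for x in range(-50, 51, 10)]
              for y0 in (10.0, 30.0, 50.0)]
estimated_a = fit_curl(text_lines)
```

The actual cost function in the dissertation also encodes text-block alignment and line spacing, which this sketch omits.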

SNU Open Repository and Archive

Investigation And Development Of Flattening Algorithms For Curved Latent Fingerprint Images

Author: Chong Bei Wei
Publication date: 01/01/2015

Fingerprints have long been used to identify a person because they are unique and remain unchanged throughout life. However, latent fingerprints are normally acquired from uneven or noisy surfaces with poor contrast, so the extracted minutiae points are inaccurate and fingerprint matching suffers. Latent fingerprint images therefore need to be pre-processed and enhanced before a latent search. To increase latent matching accuracy, geometric rectification is needed to correct the distortion that uneven surfaces introduce in fingerprint images. This research investigates and develops flattening algorithms that can be applied to latent fingerprint images on cylindrical surfaces. The boundary of the image is required to detect the curvature that needs to be flattened. The boundary of the area of interest can be obtained with a predefined algorithm or defined by the user through interactive drawing. The flattening algorithm requires a mapping from cylindrical coordinates to image coordinates. Since the curved image appears rectangular in shape, parabolic and ellipse approximations are used to design the flattening algorithms. Experimental results show that the algorithm that applies the ellipse equation to flatten fingerprint images increases the quality of the minutiae. However, measurements along the horizontal axis show that horizontal distortion is not well corrected. In summary, both algorithms can flatten curved latent fingerprint images under the assumptions that the image to be flattened lies on a vertical cylinder and that the boundary of the cylinder is detectable. The algorithm based on the ellipse approximation performs better than the one based on the parabolic approximation.
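To make the mapping between cylindrical and image coordinates concrete, the Python sketch below gives two small geometric helpers for a vertical cylinder. It assumes an orthographic view, a known axis position `cx` and a known radius (all assumptions introduced here; the thesis's own algorithms and parameters are not given in the abstract). `unroll_x` converts an image column to arc length on the cylinder, and `ellipse_sag` gives the vertical drop of an elliptical boundary arc relative to its apex, which is the kind of offset an ellipse approximation would remove.

```python
import math


def unroll_x(x, cx, radius):
    """Arc length along the cylinder for image column x.

    Under an orthographic view, x = cx + radius * sin(theta), so the
    flattened horizontal position is radius * theta.
    """
    t = (x - cx) / radius
    if abs(t) > 1.0:
        raise ValueError("pixel lies outside the cylinder silhouette")
    return radius * math.asin(t)


def ellipse_sag(x, cx, a, b):
    """Vertical drop of an elliptical boundary arc at column x.

    a and b are the horizontal and vertical semi-axes (assumed known,
    e.g. fitted to a detected boundary); the drop is 0 at the apex cx.
    """
    t = (x - cx) / a
    if abs(t) > 1.0:
        raise ValueError("pixel lies outside the ellipse")
    return b * (1.0 - math.sqrt(1.0 - t * t))


# Made-up numbers: cylinder axis at column 100, radius 50 pixels.
centre_pos = unroll_x(100.0, 100.0, 50.0)
edge_pos = unroll_x(150.0, 100.0, 50.0)
edge_drop = ellipse_sag(150.0, 100.0, 50.0, 8.0)
```

Columns near the silhouette edge are stretched the most by the unrolling, which is consistent with horizontal distortion being the harder part to correct.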

Repository@USM

Design of Immersive Online Hotel Walkthrough System Using Image-Based (Concentric Mosaics) Rendering

Author: Abdul Liyo Nedor Nor Farhana
Publication venue: Universiti Teknologi PETRONAS
Publication date: 01/07/2007

Conventional hotel booking websites present their facilities only through 2D photos. Such photos are static and cannot be moved or rotated. Image-based virtual walkthrough is a promising technology for the hospitality industry to attract more customers. This project creates an image-based rendering (IBR) virtual walkthrough and a panoramic-based walkthrough using only Macromedia Flash Professional 8, Photovista Panorama 3.0 and Reality Studio for the interaction of the images. The web pages are built with Macromedia Dreamweaver Professional 8, and the images are displayed in Adobe Flash Player 8 or higher. The image-based walkthrough uses the concentric mosaics technique, while the panoramic-based walkthrough uses image mosaicing. The two walkthroughs are compared, and the study also examines the relation between the number of pictures and the smoothness of the walkthrough. Each technique has its own characteristics: the image-based walkthrough is a real-time walkthrough in which the user can move right, left, forward and backward, whereas the panoramic-based walkthrough is not, because the user can only view 360 degrees from a fixed spot.

UTPedia