1,204 research outputs found

    A Multi-Feature Selection Approach for Gender Identification of Handwriting based on Kernel Mutual Information

    This paper presents a new flexible approach to predicting the gender of writers from their handwriting samples. Handwriting features such as slant, curvature, line separation, chain code, and character shapes can be extracted by different methods. The resulting multi-feature sets therefore contain irrelevant and redundant features, and conflicts among features within a set degrade classification accuracy and increase computing cost. This paper proposes a feature selection approach named Kernel Mutual Information (KMI). KMI reduces redundancies and conflicts, and extracts an optimal subset of features from the writing samples produced by male and female writers. To ensure that KMI can be applied to the various features, this paper describes the handwriting segmentation and handwritten text recognition technology used. Classification is carried out using a Support Vector Machine (SVM) on two databases. The first comes from the ICDAR 2013 competition on gender prediction and provides samples in both Arabic and English. The second is the Registration-Document-Form (RDF) database in Chinese. The proposed and compared methods were evaluated on both databases. The results highlight the importance of feature selection for gender prediction from handwriting.
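To make the idea of mutual-information-based feature selection concrete, here is a minimal sketch that ranks discrete features by their mutual information with the class label. It is plain mutual information, not the paper's kernel variant (KMI), and it scores relevance only, without the redundancy and conflict handling the abstract describes. The function names and the toy data (three already-binned features, labels "F"/"M") are assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of feature values
    py = Counter(ys)            # marginal counts of labels
    pxy = Counter(zip(xs, ys))  # joint counts
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(rows, labels):
    """Return (feature indices sorted by decreasing MI with the labels, scores)."""
    n_features = len(rows[0])
    scores = [
        mutual_information([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


# Hypothetical toy data: each row is one writing sample with three binned features.
rows = [
    (0, 1, 0),
    (0, 0, 1),
    (1, 1, 0),
    (1, 0, 1),
]
labels = ["F", "F", "M", "M"]

order, scores = rank_features(rows, labels)
print(order, scores)
```

In a full pipeline, the top-ranked features would then be passed to a classifier such as the SVM the abstract mentions. That step is omitted here to keep the sketch dependency-free.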

    Document and Scene Text Image Rectification Based on Alignment Properties

    Thesis (Ph.D.) -- Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, August 2017. 조남익 (Nam Ik Cho).
The optical character recognition (OCR) of text images captured by cameras plays an important role in scene understanding. However, OCR of camera-captured images is still considered a challenging problem, even after text detection (localization). This is mainly due to the geometric distortions caused by page curve and perspective view, so rectification has become an essential pre-processing step for recognition. Accordingly, many text image rectification methods have been proposed that recover the fronto-parallel view from a single distorted image. Recently, many researchers have focused on the properties of well-rectified text. In this respect, this dissertation presents novel alignment properties for text image rectification, which are encoded into the proposed cost functions. By minimizing the cost functions, the transformation parameters for rectification are obtained. In detail, they are applied to three topics: document image dewarping, scene text rectification, and curved surface dewarping in real scenes. First, a document image dewarping method is proposed based on the alignments of text-lines and line segments. Conventional text-line based document dewarping methods have problems when handling complex layouts and/or very few text-lines. When there are few aligned text-lines in the image, this usually means that photos, graphics and/or tables take up a large portion of the input instead. Hence, for robust document dewarping, the proposed method uses line segments in the image in addition to the aligned text-lines.
Based on the assumption and observation that all the transformed line segments are still straight (line to line mapping), and that many of them are horizontally or vertically aligned in well-rectified images, the proposed method encodes these properties into the cost function in addition to the text-line based cost. By minimizing the function, the proposed method can obtain transformation parameters for page curve, camera pose, and focal length, which are used for document image rectification. Considering that there are many outliers in line segment directions and misdetected text-lines in some cases, the overall algorithm is designed in an iterative manner. At each step, the proposed method removes the text-lines and line segments that are not well aligned, and then minimizes the cost function with the updated information. Experimental results show that the proposed method is robust to a variety of page layouts. This dissertation also presents a method for scene text rectification. Conventional methods for scene text rectification mainly exploit the glyph property, meaning that the characters in many languages have horizontal/vertical strokes and also some symmetric shapes. However, since they consider only the shape properties of individual characters, without considering the alignments of characters, they work well only for images with a single character, and still yield misaligned results for images with multiple characters. In order to alleviate this problem, the proposed method explicitly imposes alignment constraints on rectified results. To be precise, character alignments as well as glyph properties are encoded in the proposed cost function, and the transformation parameters are obtained by minimizing the function. Also, in order to encode the alignments of characters into the cost function, the proposed method separates the text into individual characters using a projection profile method before optimizing the cost function. 
Then, top and bottom lines are estimated using least squares line fitting with RANSAC. The overall algorithm is designed to perform character segmentation, line fitting, and rectification iteratively. Since the cost function is non-convex and involves many variables, the proposed method also develops an optimization method using the Augmented Lagrange Multiplier method. This dissertation evaluates the proposed method on real and synthetic text images, and experimental results show that the proposed method achieves higher OCR accuracy than the conventional approach and also yields visually pleasing results. Finally, the proposed method can be extended to curved surface dewarping in real scenes. In real scenes, there are many circular objects such as medicine bottles or cans of drinking water, and their curved surfaces can be modeled as Generalized Cylindrical Surfaces (GCS). These curved surfaces contain significant text and figures; however, their text has an irregular structure compared to documents. Therefore, the conventional dewarping methods based on the properties of well-rectified text have problems in their rectification. Based on the observation that many curved surfaces include well-aligned line segments (boundary lines of objects or barcodes), the proposed method rectifies the curved surfaces by exploiting the proposed line segment terms. 
Experimental results on a range of images with curved surfaces of circular objects show that the proposed method performs rectification robustly.
Contents:
1 Introduction; 1.1 Document image dewarping; 1.2 Scene text rectification; 1.3 Curved surface dewarping in real scene; 1.4 Contents
2 Related work; 2.1 Document image dewarping; 2.1.1 Dewarping methods using additional information; 2.1.2 Text-line based dewarping methods; 2.2 Scene text rectification; 2.3 Curved surface dewarping in real scene
3 Document image dewarping; 3.1 Proposed cost function; 3.1.1 Parametric model of dewarping process; 3.1.2 Cost function design; 3.1.3 Line segment properties and cost function; 3.2 Outlier removal and optimization; 3.2.1 Jacobian matrix of the proposed cost function; 3.3 Document region detection and dewarping; 3.4 Experimental results; 3.4.1 Experimental results on text-abundant document images; 3.4.2 Experimental results on non conventional document images; 3.5 Summary
4 Scene text rectification; 4.1 Proposed cost function for rectification; 4.1.1 Cost function design; 4.1.2 Character alignment properties and alignment terms; 4.2 Overall algorithm; 4.2.1 Initialization; 4.2.2 Character segmentation; 4.2.3 Estimation of the alignment parameters; 4.2.4 Cost function optimization for rectification; 4.3 Experimental results; 4.4 Summary
5 Curved surface dewarping in real scene; 5.1 Proposed curved surface dewarping method; 5.1.1 Pre-processing; 5.1 Experimental results; 5.2 Summary
6 Conclusions
Bibliography
Abstract (Korean)
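The abstract above mentions estimating the top and bottom lines of a text region by least squares line fitting with RANSAC. The following is a generic sketch of that combination, not the dissertation's implementation. The function names, the inlier tolerance, the iteration count, and the toy points (imagined as the tops of character boxes, plus two outliers) are all assumptions made for illustration.

```python
import random


def fit_line_least_squares(points):
    """Ordinary least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    a = sxy / sxx  # assumes the x values are not all identical
    return a, mean_y - a * mean_x


def ransac_line(points, n_iters=200, tol=1.0, seed=0):
    """Fit y = a*x + b robustly: RANSAC consensus, then least-squares refit."""
    rng = random.Random(seed)
    best_inliers = []
    for _ in range(n_iters):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:
            continue  # skip vertical candidates; text lines are assumed near-horizontal
        a = (y2 - y1) / (x2 - x1)
        b = y1 - a * x1
        # Inliers are points whose vertical residual is within the tolerance.
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= tol]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    if len(best_inliers) < 2:
        raise ValueError("no usable consensus set found")
    # Refit on the consensus set only, so outliers do not bias the estimate.
    a, b = fit_line_least_squares(best_inliers)
    return a, b, best_inliers


# Hypothetical data: ten points on y = 0.1*x + 5 plus two gross outliers.
points = [(x, 0.1 * x + 5.0) for x in range(0, 100, 10)]
points += [(25, 40.0), (65, -20.0)]

a, b, inliers = ransac_line(points)
print(f"slope={a:.3f}, intercept={b:.3f}, inliers={len(inliers)}")
```

Refitting on the consensus set is the design choice that matters here: a plain least-squares fit over all twelve points would be pulled toward the two outliers, whereas the RANSAC step discards them before the final fit.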

    Extraction of textual information from image for information retrieval

    Ph.D., Doctor of Philosophy

    A Work and its Shapers: The "Most High Scripture of the Rectifying Methods of the Three Heavens" in Early Medieval China

    Religions, following Max Müller, have often been seen by scholars in religious studies as uniform collections of beliefs and practices encoded in stable “sacred books” that direct the conduct of religious actors. These texts were the chief focus of academic students of religion through much of the 20th century, and this approach remains strong in the 21st. However, a growing chorus of dissidents has begun to focus on the lived experience of practitioners and the material objects that structure that experience, and some textual scholars have begun extending this materialist framework to the study of texts. This dissertation is a contribution in that vein from the field of Daoist studies. Now split between two separate texts, the Most High Scripture of the Rectifying Methods of the Three Heavens began as a 4th-century collection of apocalyptic predictions and apotropaic devices designed to deliver a select group of Chinese literati to the heavens of Highest Clarity. Later editors during the early medieval period (ca. 220-589 CE) took one of two paths: for their own reasons, they altered the Rectifying Methods to emphasize either the world’s end or its continuation. Detailed study of these alterations and their contexts shows how individuals and groups used and modified the Rectifying Methods in ways that challenge the conventional relationship between religious text and religious actor. Dissertation/Thesis. Doctoral Dissertation, Religious Studies 201

    Geometric correction of historical Arabic documents

    Geometric deformations in historical documents significantly influence the success of both Optical Character Recognition (OCR) techniques and human readability. They may have been introduced at any time during the life cycle of a document, from when it was first printed to the time it was digitised by an imaging device. This thesis focuses on the challenging domain of geometric correction of Arabic historical documents, where background research has highlighted that existing approaches for geometric correction of Latin-script historical documents are not sensitive to the characteristics of text in Arabic documents and therefore cannot be applied successfully. Text line segmentation and baseline detection algorithms have been investigated in order to propose a new, more suitable algorithm for warped Arabic historical document images. Advanced ideas for performing dewarping and geometric restoration on historical Arabic documents, as dictated by the specific characteristics of the problem, have been implemented. In addition to developing an algorithm to detect accurate baselines of historical printed Arabic documents, the research also contributes a new dataset consisting of historical Arabic documents with different degrees of warping severity. Overall, a new dewarping system, the first for historical Arabic documents, has been developed, taking into account both global and local features of the text image and the patterns of the smooth distortion between text lines. By using the results of the proposed line segmentation and baseline detection methods, it can cope with a variety of distortions, such as page curl, arbitrary warping and fold

    Examining Second Language Reading: A Critical Review of the Singapore Cambridge General Certificate of Education Ordinary-Level Chinese Language Examination

    This mixed methods study critically reviews how the Singapore-Cambridge General Certificate of Education Ordinary-Level Chinese Language Examination (GCE 1162) examines second language reading. The main research question asks, ‘To what degree have the intended measurement objectives of the GCE 1162 reading examination been achieved?’ Four sub research questions address issues of specifications and administration, test-taker characteristics, cognitive parameters and contextual parameters. Resources drawn on include Singapore Ministry of Education and Singapore Examinations and Assessment Board documents, specifically, examination information booklets, syllabuses, committee reports and annual reviews. Subject matter experts were appointed to analyse the reading comprehension passages and test items from 22 sets of GCE 1162 reading examination papers from 2006 to 2016. Semi-structured interviews were carried out with 22 stakeholders involved in coordination, test design, item construction, marking and reviewing. The interviewees included members of an elite policy group with privileged access to test specifications and procedures. Further interviews were carried out with secondary school Chinese language teachers and students, whose perspectives are seldom considered in validation processes. Opinions were also sought from experts in the field of Chinese as a second language, reading and assessment. The study begins with an account of the concepts of validity and reading constructs. Chapter 2 discusses the Singapore education and examination system, foregrounding the history of Chinese language education and the bilingual policy introduced in 1966. A methodology chapter follows. Chapters 4 to 8 separately address each of the four sub research questions, presenting claims, assumptions, supporting evidence and rebuttals. 
The final chapter, Chapter 9, addresses a posteriori inferences, including scoring, criterion-related components, and washback and impact. A cautious conclusion is drawn, namely that the measurement quality of the GCE 1162 reading examination is at a moderately unsatisfactory level.

    A word image coding technique and its applications in information retrieval from imaged documents

    Master's, Master of Science