    A Novel Approach to remove Ink Bleed through Degraded Document Images

    Many published reports deal with the degradation of paper documents: paper aging, background variation caused by noise, uneven illumination or dark spots, and the loss of textual information in degraded documents. Light exposure causes text to fade or ink to flake; degradation of the writing medium, through mould, parasites, dampness or weakness of the medium, leaves text obscured or missing; ink that seeps from the back of a page to the front causes bleed-through interference; and digitizing a document may introduce noise artifacts that further degrade the textual information. Many degraded but truly important old manuscripts and documents are scattered across libraries and archives around the world. With the passage of time, ink from the reverse side begins to interfere with the ink on the front, which hampers the legibility of the documents. Because of their importance, it is essential to restore such documents. In this paper, several algorithms are used in the pre-processing steps, namely the Bernsen algorithm, the Improved Bernsen algorithm and the Canny edge detection method, to initialise the results. Post-processing steps are then proposed at the end so that the algorithm concludes with improved and efficient results.
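
The Bernsen algorithm named in this abstract can be sketched as follows. This is a minimal illustration of the classic local-contrast rule, not the paper's implementation; the Improved Bernsen variant and the post-processing steps are not described in the abstract. The function name, the window `radius`, the `contrast_min` cutoff and the `global_mid` fallback are assumptions chosen for the example.

```python
def bernsen_binarize(image, radius=1, contrast_min=15, global_mid=128):
    """Binarize a grayscale image (list of rows, values 0-255) with Bernsen's rule.

    Returns a same-sized grid where 1 marks ink (dark) and 0 marks background.
    """
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Collect the window around (x, y), clipped at the image border.
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            low, high = min(window), max(window)
            if high - low < contrast_min:
                # Flat region: no local edge, so fall back to a global midpoint.
                threshold = global_mid
            else:
                # Contrasted region: threshold at the local mid-gray.
                threshold = (low + high) / 2
            result[y][x] = 1 if image[y][x] < threshold else 0
    return result


page = [
    [200, 200, 200, 200],
    [200, 40, 40, 200],
    [200, 40, 40, 200],
    [200, 200, 200, 200],
]
binary = bernsen_binarize(page)
```

Because the threshold is recomputed per window, faint bleed-through that sits close to the background level tends to fall into the low-contrast branch, which is one plausible reason such a rule is used as an initial step.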

    Improved wolf algorithm on document images detection using optimum mean technique

    Detecting text in handwritten historical documents provides high-level features for the challenging problem of handwriting recognition. Such handwriting often contains noise, faint or incomplete strokes, strokes with gaps, and competing lines when embedded in a table or form, making it unsuitable for local line-following algorithms or associated binarization schemes. This paper presents a method based on an optimum threshold value, named the Optimum Mean method. The Wolf method fails to detect thin text in non-uniform input images. The proposed method is intended to overcome this problem by deriving a maximum threshold value from the optimum mean. In the experiments, the proposed method obtained a higher F-measure (74.53), a higher PSNR (14.77) and a lower NRM (0.11) than the Wolf method. In conclusion, the proposed method successfully and effectively addresses the weakness of the Wolf method by producing a high-quality output image.
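
The three scores quoted in this abstract can be computed from a binarized image and its ground truth roughly as below. This sketch assumes the definitions commonly used in document binarization benchmarks (F-measure in percent, PSNR for images scaled to 0 and 1, NRM as the mean of the false-negative and false-positive rates); the abstract does not state its exact formulas, and the function name is hypothetical.

```python
import math


def _ratio(numerator, denominator):
    # Guarded division so empty classes do not raise ZeroDivisionError.
    return numerator / denominator if denominator else 0.0


def binarization_scores(predicted, truth):
    """Score a binary image against ground truth (1 = text, 0 = background)."""
    tp = fp = fn = tn = 0
    for pred_row, true_row in zip(predicted, truth):
        for p, t in zip(pred_row, true_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t:
                fn += 1
            else:
                tn += 1
    precision = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    f_measure = 100.0 * _ratio(2 * precision * recall, precision + recall)
    # For 0/1 images the squared error per pixel is 1 on a mismatch, 0 otherwise.
    mse = _ratio(fp + fn, tp + fp + fn + tn)
    psnr = 10.0 * math.log10(1.0 / mse) if mse else math.inf
    # Negative rate metric: average of the miss rate and the false-alarm rate.
    nrm = (_ratio(fn, fn + tp) + _ratio(fp, fp + tn)) / 2.0
    return {"f_measure": f_measure, "psnr": psnr, "nrm": nrm}


truth = [[1, 1, 0, 0], [1, 1, 0, 0]]
predicted = [[1, 0, 0, 0], [1, 1, 1, 0]]
scores = binarization_scores(predicted, truth)
```

Higher F-measure and PSNR and lower NRM indicate a better match, which is the direction of improvement the abstract reports.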

    Improvement of binarization performance using local otsu thresholding

    Ancient documents usually contain several kinds of noise, such as uneven background, show-through, water spills, spots, and blurred text. This noise affects the binarization process. Binarization is an extremely important step in image processing, especially for character recognition. This paper presents an improvement to the Nina binarization technique. The improvements were achieved by reducing the number of processing steps and replacing median filtering with Wiener filtering. First, the document background was approximated using a Wiener filter, and then image subtraction was applied. The manuscript contrast was then adjusted by mapping the image intensity values with an intensity transformation method. Next, local Otsu thresholding was applied. To remove spotting noise, we applied connected-component labeling. The proposed method was tested on H-DIBCO 2014 and on degraded Jawi handwritten ancient documents. It performed better in terms of recall and precision than Otsu, Niblack, Sauvola, Lu, Su, and Nina, especially on documents with show-through, water-spill and combined noise.
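
The local Otsu step can be illustrated with the sketch below. It covers only that one step: the Wiener-filter background estimate, contrast adjustment and connected-component cleanup are left out. The abstract does not say how "local" is defined, so applying Otsu independently to non-overlapping square blocks is an assumption, as are the function names and the `block` size.

```python
def otsu_threshold(values, levels=256):
    """Return the gray level t maximizing between-class variance (classes <= t and > t)."""
    histogram = [0] * levels
    for v in values:
        histogram[v] += 1
    total = len(values)
    total_sum = sum(level * count for level, count in enumerate(histogram))
    best_t, best_variance = 0, -1.0
    weight_low = 0
    sum_low = 0
    for t in range(levels):
        weight_low += histogram[t]
        sum_low += t * histogram[t]
        if weight_low == 0:
            continue  # no pixels at or below t yet
        weight_high = total - weight_low
        if weight_high == 0:
            break  # every pixel is already in the low class
        mean_low = sum_low / weight_low
        mean_high = (total_sum - sum_low) / weight_high
        variance = weight_low * weight_high * (mean_low - mean_high) ** 2
        if variance > best_variance:
            best_t, best_variance = t, variance
    return best_t


def local_otsu(image, block=8):
    """Binarize each block with its own Otsu threshold (1 = ink, 0 = background)."""
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for top in range(0, height, block):
        for left in range(0, width, block):
            rows = range(top, min(top + block, height))
            cols = range(left, min(left + block, width))
            t = otsu_threshold([image[y][x] for y in rows for x in cols])
            for y in rows:
                for x in cols:
                    result[y][x] = 1 if image[y][x] <= t else 0
    return result


# Left block: bright paper with one dark stroke; right block: shaded paper.
shaded_page = [
    [200, 40, 120, 20],
    [200, 200, 120, 120],
]
binary = local_otsu(shaded_page, block=2)
```

Each block gets its own threshold, so the shaded right-hand background is kept as background even though it is darker than the left-hand paper.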

    Identification of the biometric characteristics of the human hand using computer vision for the design of a prototype

    The project Identification of the biometric characteristics of the human hand using computer vision for the design of a prototype was carried out to propose valid descriptors and a suitable classification method to optimize biometric identification of the hand, and thereby to select the appropriate elements for a prototype proposal. Accordingly, a prototype was designed to collect a database from different people, in order to extract the most relevant characteristics that distinguish one hand from another and so identify them. Most devices for acquiring human hand geometry are based on contact designs, which users accept poorly because of hygiene problems (transmission of germs), social problems (in some conservative nations) and security problems (illegal use of latent prints left on the surface of the system). For this reason, the system that was designed focused on contactless data acquisition, improving hygiene, security and user acceptance. Besides being contactless, the system included a sensor that assisted the acquisition of the biometric characteristics: an ultrasonic sensor that informed the user whether the hand was at the desired distance for the image to be taken, giving this biometric system adequate reliability for the project.

    Automatic Road Crack Segmentation Using Thresholding Methods

    Keeping roads in good condition is essential to the economy and everyday life of people in every country. Road cracks are one of the important indicators of road surface degradation. Manual road inspection is tedious and takes a very long time. Hence, automatic road crack segmentation using thresholding methods is proposed in this study. Ten road crack images were pre-processed as an initial step. Then a normalization technique, the L1-Sqrt norm, was applied to the images to reduce the variation of intensities that are skewed to the right. Next, two thresholding methods, Otsu and Sauvola, were used to binarize the images. The experiment on the ten road crack images shows that L1-Sqrt normalization can help increase the performance of road crack segmentation for both the Otsu and Sauvola methods. The results also show that the Sauvola method outperforms the Otsu method in detecting road cracks.
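
The Sauvola method compared in this abstract sets a per-pixel threshold from the local mean m and standard deviation s, T = m * (1 + k * (s / R - 1)). The sketch below shows that rule only; the pre-processing and the L1-Sqrt normalization are omitted because the abstract does not describe how they were applied. The function name, the window `radius`, and the values `k=0.2` and `dynamic_range=128` are assumptions (commonly used defaults for 8-bit images), not the study's settings.

```python
import math


def sauvola_binarize(image, radius=1, k=0.2, dynamic_range=128.0):
    """Binarize a grayscale image (list of rows, 0-255); 1 marks dark crack pixels."""
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Window around (x, y), clipped at the image border.
            window = [
                image[j][i]
                for j in range(max(0, y - radius), min(height, y + radius + 1))
                for i in range(max(0, x - radius), min(width, x + radius + 1))
            ]
            mean = sum(window) / len(window)
            std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
            # Low local contrast pulls the threshold below the mean,
            # so smooth pavement is not split into crack and non-crack.
            threshold = mean * (1.0 + k * (std / dynamic_range - 1.0))
            result[y][x] = 1 if image[y][x] < threshold else 0
    return result


# A vertical dark crack running through brighter pavement.
road = [[180, 180, 60, 180, 180] for _ in range(3)]
crack_mask = sauvola_binarize(road)
```

Unlike a single global Otsu threshold, this rule adapts to local illumination, which is consistent with the abstract's finding that Sauvola performed better on the crack images.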

    Development of Machine Learning Based Binarization Technique of Hand-drawn Floor Plans for Automatic Extraction of Indoor Spatial Information

    Thesis (Master's) -- Seoul National University Graduate School: Department of Civil and Environmental Engineering, College of Engineering, 2022. 8. 유기윤. With recent advances in artificial intelligence and the Internet of Things, there is strong public interest in indoor location-based services that determine a user's position and provide real-time information. Enabling such services requires shaping and modeling the indoor structure to represent indoor space. Accordingly, studies have reconstructed indoor space from a variety of source data such as laser scanners, architectural drawing images, and CAD plans. In particular, automatic extraction of indoor spatial information is far more economical than manual modeling. Floor plan analysis research, which automatically extracts indoor objects such as walls, windows, and stairs from 2D architectural drawing images to build 3D modeling data, is therefore being actively pursued. Existing image-based floor plan analysis studies targeted electronic drawings in which objects and background are clearly separated and objects are drawn in a uniform color. Hand-drawn floor plans made with pen and ink, however, contain more noise and have more irregular background patterns than the drawings used in earlier work. Because object color values also vary with the pen or ink used, existing indoor object extraction algorithms are of limited use. This study therefore performs binarization that separates the objects making up the indoor space from the background in noisy, irregular hand-drawn architectural drawings. It aims to extend the scope of existing automatic indoor spatial information extraction research, which targets electronic drawings, to historical buildings and to old buildings for which only analog drawings exist. Architectural drawings from the Japanese colonial period, drawn in the early 1900s, were used as the analysis data. As paper cultural heritage, these drawings carry various kinds of noise from storage and digitization, and object color values vary with the writing instrument used. Because the pixel values of the noise and the sharpness of the indoor objects also differ from one hand-drawn image to another, a learning-based binarization technique using machine learning models was applied. Binarization proceeds in two stages, according to the type of noise to be removed. The first stage uses a Gaussian mixture model to reduce noise spread widely across the background of the drawing image. The second stage uses a random forest model: features that distinguish objects from background are extracted, and small noise of various shapes is learned and removed. Finally, to validate the proposed method, classification performance was evaluated on a test set not used in training, and image quality was evaluated on the final result images. In the experiments, the random forest model achieved average precision and recall of 0.985 and 0.99, respectively, and the final binarized images scored 16.543 on the signal-to-noise ratio metric. Compared with previous work, the binarization successfully separated indoor objects of varying thickness, such as walls, windows, and partition walls, from the background. To check generalization, the binarization algorithm was also applied to architectural drawings of the Palace of Versailles. Precision and recall there were 0.998 and 0.969, respectively, and the image quality metric showed similarly strong performance to the test set. The study is significant in that it lays the groundwork for extending floor plan analysis research to hand-drawn architectural drawings.
Along with the recent development of artificial intelligence and the Internet of Things, social interest in indoor location-based services that provide real-time information based on user location is growing. For location-based service development, indoor spatial modelling is essential to represent indoor topology. Therefore, many studies have been conducted to extract indoor structure information from various types of data such as laser scanners, architectural drawing images, and CAD plans. In particular, automatic extraction of indoor space information is economically efficient compared to manual modeling, so algorithms for automatically extracting floor plan entities such as walls, windows, and stairs from 2D floor plan images are being actively developed. Previous studies mostly used “clean” floor plan images in which floor plan entities and background are clearly distinguished. However, hand-drawn architectural floor plans created with various types of pens and ink contain a large amount of background noise. In addition, since the pixel intensities of floor plan entities vary with the pen or ink used, there is a limit to applying the previous algorithms. Therefore, this study aims to perform binarization to distinguish floor plan entities from a background with noise and irregular patterns. The purpose of this study is to expand the scope of previous floor plan analysis studies to historical and old buildings. As the dataset, we use architectural drawings of the Japanese colonial period drawn in the early 1900s. These drawings have various types of noise introduced during storage and digitization. Also, floor plan entities appear in different colors depending on the type of materials used. We apply a learning-based binarization algorithm that can be divided into two main steps. The first step is to reduce the noise that is widely distributed across the background of the drawing image using a Gaussian mixture model.
The second step is to extract features that distinguish objects and backgrounds based on the random forest model, and to learn various forms of small noise. For evaluation, we measure the classification performance of the suggested algorithm on a test set. Our binarization algorithm achieves 98.5% precision and a 99.0% F1-score. This study has two main contributions. First, our algorithm successfully distinguishes various types of floor plan entities with different thickness. Second, the scope of automatic extraction of spatial information from floor plan images can be expanded from electronic floor plan images to hand-drawn architectural floor plans.
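
The first stage of this thesis, background modeling with a Gaussian mixture model, can be illustrated with a small expectation-maximization sketch. The abstract does not say how the mixture is configured, so this example assumes the simplest case: pixel intensities modeled by two 1-D Gaussians, with the brighter component taken as paper background. All names, the variance floor and the iteration count are assumptions, and the random forest stage is not shown.

```python
import math


def _log_component(v, weight, mean, variance):
    # Log of weight * N(v | mean, variance); logs avoid numeric underflow.
    return (math.log(weight) - 0.5 * math.log(2 * math.pi * variance)
            - (v - mean) ** 2 / (2 * variance))


def fit_two_gaussians(values, iterations=50):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Returns [(weight, mean, variance), ...] sorted from darkest to brightest mean.
    """
    lo, hi = min(values), max(values)
    means = [float(lo), float(hi)]  # deterministic start at the extremes
    spread = max((hi - lo) ** 2 / 4.0, 1.0)
    variances = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for v in values:
            logs = [_log_component(v, weights[c], means[c], variances[c]) for c in (0, 1)]
            peak = max(logs)
            exps = [math.exp(l - peak) for l in logs]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: re-estimate weight, mean and variance from the responsibilities.
        for c in (0, 1):
            mass = sum(r[c] for r in resp)
            if mass < 1e-12:
                continue  # component has collapsed; keep its old parameters
            weights[c] = mass / len(values)
            means[c] = sum(r[c] * v for r, v in zip(resp, values)) / mass
            var = sum(r[c] * (v - means[c]) ** 2 for r, v in zip(resp, values)) / mass
            variances[c] = max(var, 1.0)  # floor keeps the density finite
    return sorted(zip(weights, means, variances), key=lambda comp: comp[1])


def is_background(v, components):
    """True when the brighter component explains intensity v better than the darker one."""
    dark, bright = components
    return _log_component(v, *bright) > _log_component(v, *dark)


pixels = [30, 32, 34, 36, 200, 202, 204, 206, 208, 210]
components = fit_two_gaussians(pixels)
```

Pixels for which `is_background` returns True could then be suppressed before a classifier handles the remaining small noise, which mirrors the two-stage structure the abstract describes.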

    A recursive Otsu thresholding method for scanned document binarization

    Image Analysis Algorithms for Single-Cell Study in Systems Biology

    With the continuous shift of biology from a qualitative toward a quantitative field of research, digital microscopy and image-based measurements are drawing increased interest. Several methods have been developed for acquiring images of cells and intracellular organelles. Traditionally, acquired images are analyzed manually through visual inspection. The increasing volume of data is challenging the scope of manual analysis, and there is a need to develop methods for automated analysis. This thesis examines the development and application of computational methods for acquisition and analysis of images from single-cell assays. The thesis proceeds with three different aspects. First, a study evaluates several methods for focusing microscopes and proposes a novel strategy to perform focusing in time-lapse imaging. The method relies on the nature of the focus-drift and its predictability. The study shows that focus-drift is a dynamical system with a small randomness. Therefore, a prediction-based method is employed to track the focus-drift over time. A prototype implementation of the proposed method is created by extending the Nikon EZ-C1 Version 3.30 (Tokyo, Japan) imaging platform for acquiring images with a Nikon Eclipse (TE2000-U, Nikon, Japan) microscope. Second, a novel method is formulated to segment individual cells from a dense cluster. The method incorporates multi-resolution analysis with maximum-likelihood estimation (MAMLE) for cell detection. The MAMLE performs cell segmentation in two phases. The initial phase relies on a cutting-edge filter, edge detection in multi-resolution with a morphological operator, and threshold decomposition for adaptive thresholding. It estimates morphological features from the initial results. In the next phase, the final segmentation is constructed by boosting the initial results with the estimated parameters. The MAMLE method is evaluated with de novo data sets as well as with benchmark data from public databases.
An empirical evaluation of the MAMLE method confirms its accuracy. Third, a comparative study is carried out on performance evaluation of state-of-the-art methods for the detection of subcellular organelles. This study includes eleven algorithms developed in different fields for segmentation. The evaluation procedure encompasses a broad set of samples, ranging from benchmark data to synthetic images. The result from this study suggests that no particular method outperforms the others on the test samples. Next, the effect of tetracycline on transcription dynamics of the tetA promoter in Escherichia coli (E. coli) cells is studied. This study measures expressions of RNA by tagging the MS2d-GFP vector with a target gene. The RNAs are observed as intracellular spots in confocal images. The kernel density estimation (KDE) method for detecting the intracellular spots is employed to quantify the individual RNA molecules. The thesis summarizes the results from five publications. Most of the publications are associated with different methods for imaging and analysis of microscopy. Confocal images with E. coli cells are targeted as the primary area of application. However, potential applications beyond the primary target are also made evident. The findings of the research are confirmed empirically.