5,052 research outputs found

    Page boundary extraction of bound historical herbaria

    When digitizing bound historical collections such as herbaria, it is important to extract the main page region so that it can be used for automated processing. The thickness of herbarium books also causes deformations during imaging, which reduce the effectiveness of automatic detection tasks. In this work we address these problems by proposing an automatic page detection algorithm that estimates all the boundaries of the page and performs morphological corrections to reduce deformations. The algorithm extracts features from the Hue, Saturation and Value transformations of an RGB image to detect the main page polygon. It was evaluated on multiple textual and herbarium-type historical collections and obtains over 94% mean intersection over union on all of these datasets. Additionally, an ablation test demonstrates the importance of the morphological corrections.
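    The mean intersection over union (IoU) figure quoted above can be illustrated with a minimal sketch. It assumes the detected page and the ground-truth page are both available as same-sized binary masks; the names `mask_iou` and `mean_iou` are hypothetical and are not taken from the paper, whose evaluation code is not shown here.

```python
def mask_iou(pred, truth):
    """Intersection over union of two same-sized binary masks (lists of 0/1 rows)."""
    inter = 0
    union = 0
    for row_p, row_t in zip(pred, truth):
        for p, t in zip(row_p, row_t):
            if p and t:
                inter += 1
            if p or t:
                union += 1
    # Two empty masks are treated as a perfect match (an assumption of this sketch).
    return inter / union if union else 1.0


def mean_iou(pairs):
    """Average IoU over an iterable of (predicted_mask, truth_mask) pairs."""
    scores = [mask_iou(p, t) for p, t in pairs]
    return sum(scores) / len(scores)


# Toy example: the masks overlap on 2 pixels and cover 6 pixels in total.
pred = [[1, 1, 0],
        [1, 1, 0]]
truth = [[0, 1, 1],
         [0, 1, 1]]
print(mask_iou(pred, truth))
```

    A dataset-level score is then the average of the per-image values, which is what `mean_iou` computes.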

    Extraction of Scores and Average From Algerian High-School Degree Transcripts

    A system for extracting scores and the average from Algerian High School Degree Transcripts is proposed. The system works in several stages and relies on localizing the tables that contain this information. After preprocessing, the system locates the tables using ruling-line information as well as other text information. As a result, the localization approach works even when certain ruling lines are missing, erased or discontinuous. The localized tables are then segmented into columns, and the columns into information cells. Finally, the cells are labeled using prior knowledge of the table structure, which makes it possible to identify the scores and the average. Experiments were conducted on a local dataset to evaluate the performance of the system and to compare it with three public systems at three levels; the results show the effectiveness of the system.

    Real-time quality control of heat sealed bottles

    The present document describes a system for controlling the quality of heat sealed bottles. The system detects defective seals to identify bottles that cannot be sold. A prototype was developed to validate and test the proposed system. In the production line, the bottles are filled with a toxic substance and can only be sold when properly sealed. A leak can be harmful to humans and the environment. Because the seals are not visible from outside the bottle, an image of each seal is obtained using a thermal camera. The hot glue used in the sealing process makes the seal visible in the infrared image. The image is cleaned and converted to black and white, keeping only the seal in the final image. Black pixels have the value 0 and white pixels have the value 1. A signature is then calculated, composed of two arrays containing the number of white pixels in each column and in each row. Both arrays present a U shape when the bottle is sealed. The signature is fed to an artificial neural network that was trained to identify correctly sealed bottles. The classification results are stored in a database. The trained neural network achieved an accuracy of 98.7% and an F1 score of 96.0% in the testing phase. The results show that the inspection process is effective in identifying defective seals and, because it is automated, it can be scaled up to large bottle processing plants. All classified images can be viewed through a web application in which a user can validate the operation and identify errors, which are then used individually for retraining to improve the performance of the machine learning model. The system is non-invasive, automated, and can be applied to common conveyor belts currently used in industrial plants. It can also be adapted to detect different problems in bottles of different shapes.

    This dissertation describes a quality control system for seals on bottles. A prototype was built to test and validate the operation of the system. On the production line, the bottles are filled with a toxic substance and can only be sold when correctly sealed, since a leak endangers the user's health. The difficulty of this process lies in the fact that the seal is not visible, because it sits beneath the opaque cap of the bottle. Since hot glue is used in the sealing process, an image of the seal can be obtained with a thermal camera. This image is then processed in order to isolate the seal in the final image. From the final image a signature is generated, consisting of the concatenation of two lists containing the number of white pixels per column and per row. Both lists have a 'U' shape when the bottle is correctly sealed. A neural network uses the signature to classify the image, identifying poorly sealed bottles. The result is recorded in a database. The trained neural network achieved an accuracy of 98.7% and an F1 score of 96.0% in the training phase, showing that it is efficient at identifying defective seals. The system includes the option of validating the classifications using a web application in which the image history can be reviewed. When an incorrectly classified image is identified, it should be selected and trained again to correct the error and allow the model to learn. This method is neither invasive nor destructive, is automated, and can be used in the production of different products as long as the sealing process is similar.
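    The seal signature described in this abstract (white-pixel counts per column and per row of the binarized thermal image) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the function name `seal_signature` and the toy ring-shaped image are hypothetical, and the image is assumed to be a list of rows holding 0 (black) or 1 (white).

```python
def seal_signature(binary_image):
    """Return (column_sums, row_sums) of white pixels for a 0/1 image."""
    row_sums = [sum(row) for row in binary_image]
    # zip(*rows) transposes the image so each tuple is one column.
    column_sums = [sum(col) for col in zip(*binary_image)]
    return column_sums, row_sums


# Hypothetical toy image of a closed ring, standing in for an intact seal.
ring = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
columns, rows = seal_signature(ring)
print(columns)  # [4, 2, 2, 2, 4]
print(rows)     # [5, 2, 2, 5]

# Concatenating the two lists gives one feature vector for a classifier.
features = columns + rows
```

    For this toy ring both lists are high at the ends and low in the middle, which matches the U shape the abstract associates with a sealed bottle. A broken ring would flatten or skew one of the lists, which is the kind of difference a classifier could pick up.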

    An IoT System for Converting Handwritten Text to Editable Format via Gesture Recognition

    The evolution of the traditional classroom has led to the electronic classroom, i.e. e-learning. This growth does not stop at e-learning or distance learning: the next step after the electronic classroom is the smart classroom. Among the most popular features of the electronic classroom are capturing video or photos of lecture content and extracting handwriting for note-taking. Numerous techniques have been implemented to extract handwriting from video or photos of a lecture, but some of their deficiencies can still be resolved, which can turn the electronic classroom into a smart classroom. In this thesis, we present a real-time IoT system that converts handwritten text into an editable format by implementing hand gesture recognition (HGR) with a Raspberry Pi and a camera. HGR is built using an edge detection algorithm and is used in this system to reduce the computational complexity of previous systems, i.e. removing redundant images and the lecturer's body from the image, and recollecting text from previous images to fill the area from which the lecturer's body has been removed. The Raspberry Pi is used to retrieve and perceive HGR and to build a smart classroom based on IoT. Handwritten images are converted into an editable format using OpenCV and machine learning algorithms. Text conversion includes recognition of uppercase and lowercase letters, numbers, special characters, mathematical symbols, equations, graphs and figures, together with recognition of words, lines, blocks, and paragraphs. With the help of the Raspberry Pi and IoT, the editable lecture notes are delivered to students via a desktop application, which lets them edit notes and images as needed.

    A practical vision system for the detection of moving objects

    The main goal of this thesis is to review and offer robust and efficient algorithms for the detection (or segmentation) of foreground objects in indoor and outdoor scenes, using colour image sequences captured by a stationary camera. For this purpose, the block diagram of a simple vision system is offered in Chapter 2. First, this block diagram gives a precise order of the blocks and the tasks that should be performed to detect moving foreground objects. Second, a check mark (✓) in the top right corner of a block indicates that this thesis contains a review of the most recent algorithms and/or some relevant research about it. In many computer vision applications, segmenting and extracting moving objects in video sequences is an essential task. Background subtraction has been widely used as the first step for this purpose. In this work, a review of the efficiency of a number of important background subtraction and modelling algorithms, along with their major features, is presented. In addition, two background approaches are offered. The first approach is a pixel-based technique, whereas the second works at object level. For each approach, three algorithms are presented. They are called Selective Update Using Non-Foreground Pixels of the Input Image, Selective Update Using Temporal Averaging and Selective Update Using Temporal Median, respectively. The first approach has some deficiencies, which make it incapable of producing a correct dynamic background. The three methods of the second approach use an invariant colour filter and a suitable motion tracking technique, which selectively exclude foreground objects (or blobs) from the background frames. The difference between the three algorithms of the second approach lies in the updating process of the background pixels. It is shown that the Selective Update Using Temporal Median method produces the correct background image for each input frame.
    Representing foreground regions using their boundaries is also an important task. Thus, an appropriate RLE contour tracing algorithm has been implemented for this purpose. However, after the thresholding process, the boundaries of foreground regions often have jagged appearances, so foreground regions may not be recognised reliably due to their corrupted boundaries. A very efficient boundary smoothing method based on the RLE data is proposed in Chapter 7. It smooths only the external and internal boundaries of foreground objects and does not distort their silhouettes. As a result, it is very fast and does not blur the image. Finally, the goal of this thesis has been to present simple, practical and efficient algorithms with few constraints that can run in real time.
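    The idea behind a selective temporal-median background update can be sketched as below. This is only an illustration under assumptions, not the thesis's algorithm: it uses grayscale frames instead of colour, takes the foreground masks as given (the thesis obtains them with a colour filter and motion tracking), and the name `selective_median_background` is hypothetical.

```python
from statistics import median


def selective_median_background(frames, foreground_masks, background):
    """Per-pixel temporal median over the samples not flagged as foreground.

    frames: list of grayscale frames (each a list of rows of numbers)
    foreground_masks: same shape as frames, 1 where a pixel is foreground
    background: previous background, kept wherever no clean sample exists
    """
    height, width = len(background), len(background[0])
    updated = [row[:] for row in background]
    for y in range(height):
        for x in range(width):
            # Keep only the samples where this pixel was not covered by a blob.
            samples = [
                frame[y][x]
                for frame, mask in zip(frames, foreground_masks)
                if not mask[y][x]
            ]
            if samples:
                updated[y][x] = median(samples)
    return updated


# Toy 1x2 sequence: the second pixel is covered by a bright object in frame 2.
frames = [[[10, 10]], [[12, 200]], [[11, 10]]]
masks = [[[0, 0]], [[0, 1]], [[0, 0]]]
result = selective_median_background(frames, masks, [[0, 0]])
print(result)
```

    Excluding masked samples before taking the median keeps a passing object from leaking into the background estimate, which is the motivation for making the update selective.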

    Document Image Dewarping Based on the Document Boundary and 3D Reconstruction

    Thesis (Master's) -- Graduate School of Seoul National University: College of Natural Sciences, Department of Mathematical Sciences, August 2022. 현동훈.

    In recent days, most scanned images are obtained from mobile devices such as cameras, smartphones, and tablets rather than traditional flatbed scanners. Unlike the scanning process of traditional scanners, the capturing process of mobile devices may be accompanied by distortions in various forms, such as perspective distortion, fold distortion, and page curls. In this thesis, we propose robust dewarping methods which correct such distortions based on the document boundary and 3D reconstruction. In the first method, we construct a curvilinear grid on the document image using the document boundary and reconstruct the document surface in three-dimensional space. Then we rectify the image using a family of local homographies computed from the reconstructed document surface. Although some of the steps of the proposed method have been proposed separately in other research, our approach exploits and combines their advantages to propose a robust dewarping process while also improving the stability of the overall process. Moreover, we refined the process by correcting the distorted text region boundary and developed this process into an independent dewarping method which is concise, straightforward, and robust while still producing a well-rectified document image.

    Recently, most scanned images are obtained not from traditional flatbed scanners but from portable devices such as cameras, smartphones, and tablet PCs. Unlike the scanning process of earlier scanners, image capture with portable devices can involve various distortions, such as perspective distortion, distortion caused by folds in the paper, and distortion caused by curling of the paper. In this thesis, we propose robust dewarping methods, based on the document boundary and 3D reconstruction, that can remove these distortions. In the first method, we use the document boundary to build a grid of curves on the document image and reconstruct the document surface in three-dimensional space. We then rectify the image using local homographies computed from the reconstructed document surface. Although some steps of the proposed method have been used individually in other studies, we exploit and combine the advantages of each method to propose a robust dewarping method while increasing the stability of the overall process. In addition, we refined the overall process by correcting the boundary of the distorted text region, and developed this procedure into an independent dewarping method that is concise, intuitive, and robust while producing good results.

    1. Introduction
    2. Review on Camera Geometry
        2.1. Basic Camera Model
        2.2. 3D Reconstruction Problem
    3. Related Works
        3.1. Dewarping Methods based on the Text-lines
        3.2. Dewarping Methods based on the Document Boundary
        3.3. Dewarping Methods based on the Grid Construction
        3.4. Dewarping Methods based on the Document Surface Model in 3D Space
    4. Document Image Dewarping based on the Document Boundary and 3D Reconstruction
        4.1. Input Document Image Processing
            4.1.1. Binarization of the Input Document Image
            4.1.2. Perspective Distortion Removal using the Document Boundary
        4.2. Grid Construction on the Document Image
        4.3. 3D Reconstruction of the Document Surface
            4.3.1. Geometric Model
            4.3.2. Normalization of the Grid Corners
            4.3.3. 3D Reconstruction of the Document Surface
        4.4. Rectification of the Document Image under a Family of Local Homographies
        4.5. Global Rectification of the Document Image
    5. Document Image Dewarping by Straightening Document Boundary Curves
    6. Conclusion
    Appendix A.
        A.1. 4-point Algorithm
        A.2. Optimization of the Cost Function
    Bibliography
    Abstract (in Korean)
    Acknowledgement (in Korean)