95 research outputs found

    3D Image Processing, Analysis, and Software Development of Khmer Inscriptions

    Khmer inscriptions are primary sources of information on the history of Cambodia. Weathering, pollution, intentional damage or destruction, and acts of war have left many inscriptions in poor condition. Historians and linguists have already used the Khmer inscriptions intensively to gain a better understanding of the ancient Khmer civilization. Still, no research project has so far used modern methods of scientific computing to
    - appropriately capture Khmer inscriptions digitally,
    - enhance these data using image processing techniques and transform them into 2D representations,
    - optimally repair or, where possible, supplement damaged or incomplete script,
    - provide tools for digital image processing and analysis of texts in ancient Khmer inscriptions,
    - and thereby contribute to the reconstruction, readability and preservation of these valuable historic documents.
    In doing so, challenges arise that can only be addressed with new concepts and methods, for the following reasons:
    - The inscriptions usually exist as 3D data.
    - Their size or condition is sometimes problematic.
    - The number and the complexity of the characters are large; the possible combinations of consonants, sub-consonants and dependent vowels make the character set comparatively huge.
    The aim of this research project is to overcome these challenges and to help fill the gaps mentioned above. For this purpose, the original inscription data were acquired by 3D scanning, then processed with image processing methods and transformed into a 2D representation, which could then be analyzed further. Single characters were isolated, and methods for their identification were developed. To identify the characters, their topological and geometrical structures were used. The features of the characters were captured by indices that are easy to compute. A given character was identified by repeated filtering.
    The methods were developed using the Khmer script of the pre-Angkorian period as an example, but they can be extended to Khmer scripts of other periods. As no Unicode encoding exists for the pre-Angkorian script, a digital code was developed. Based on the designed mathematical methods and algorithms, a software tool was developed with which ancient Khmer characters and inscriptions can be processed and analyzed. This software tool is a first experimental version, which will be expanded in a further project for general use.
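As a rough illustration of identification by repeated filtering, the sketch below narrows a set of candidate characters one index at a time. The abstract does not say which indices were used, so the index names (`holes`, `endpoints`, `junctions`), the template names and the function `identify` are all hypothetical.

```python
def identify(observed, templates, feature_order):
    """Narrow the candidate characters by filtering on one index at a time.

    observed      -- dict of index name -> value measured on the unknown glyph
    templates     -- dict of character name -> dict of index name -> value
    feature_order -- index names, in the order the filters are applied
    (All names here are illustrative; the abstract does not list the indices.)
    """
    candidates = set(templates)
    for feature in feature_order:
        narrowed = {
            name for name in candidates
            if templates[name][feature] == observed[feature]
        }
        if not narrowed:
            # Assumption: a filter that rejects every candidate is treated
            # as unreliable and skipped rather than returning nothing.
            continue
        candidates = narrowed
        if len(candidates) == 1:
            break
    return sorted(candidates)


if __name__ == "__main__":
    demo_templates = {
        "glyph_a": {"holes": 1, "endpoints": 2, "junctions": 1},
        "glyph_b": {"holes": 1, "endpoints": 0, "junctions": 0},
        "glyph_c": {"holes": 0, "endpoints": 2, "junctions": 0},
    }
    unknown = {"holes": 1, "endpoints": 2, "junctions": 1}
    print(identify(unknown, demo_templates, ["holes", "endpoints", "junctions"]))
```

Applying cheap, discriminative indices first keeps the candidate set small before any more expensive comparison is needed; whether the original tool orders its filters this way is not stated.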

    PLoS One

    Quantitative analysis of the vascular network anatomy is critical for the understanding of the vasculature structure and function. In this study, we have combined microcomputed tomography (microCT) and computational analysis to provide quantitative three-dimensional geometrical and topological characterization of the normal kidney vasculature, and to investigate how 2 core genes of the Wnt/planar cell polarity, Frizzled4 and Frizzled6, affect vascular network morphogenesis. Experiments were performed on frizzled4 (Fzd4-/-) and frizzled6 (Fzd6-/-) deleted mice and littermate controls (WT) perfused with a contrast medium after euthanasia and exsanguination. The kidneys were scanned with a high-resolution (16 μm) microCT imaging system, followed by 3D reconstruction of the arterial vasculature. Computational treatment includes decomposition of 3D networks based on Diameter-Defined Strahler Order (DDSO). We have calculated quantitative (i) Global scale parameters, such as the volume of the vasculature and its fractal dimension (ii) Structural parameters depending on the DDSO hierarchical levels such as hierarchical ordering, diameter, length and branching angles of the vessel segments, and (iii) Functional parameters such as estimated resistance to blood flow alongside the vascular tree and average density of terminal arterioles. In normal kidneys, fractal dimension was 2.07±0.11 (n = 7), and was significantly lower in Fzd4-/- (1.71±0.04; n = 4), and Fzd6-/- (1.54±0.09; n = 3) kidneys. The DDSO number was 5 in WT and Fzd4-/-, and only 4 in Fzd6-/-. Scaling characteristics such as diameter and length of vessel segments were altered in mutants, whereas bifurcation angles were not different from WT. Fzd4 and Fzd6 deletion increased vessel resistance, calculated using the Hagen-Poiseuille equation, for each DDSO, and decreased the density and the homogeneity of the distal vessel segments. 
Our results show that our methodology is suitable for 3D quantitative characterization of vascular networks, and that Fzd4 and Fzd6 genes have a deep patterning effect on arterial vessel morphogenesis that may determine its functional efficiency
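The resistance estimate mentioned in the abstract rests on the Hagen-Poiseuille relation, R = 8μL / (πr⁴), for laminar flow in a cylindrical segment. The sketch below is a minimal illustration of that formula only; the function names and the default viscosity value (3.5 mPa·s, a commonly assumed figure for blood) are assumptions and are not taken from the study.

```python
import math


def poiseuille_resistance(length_m, diameter_m, viscosity_pa_s=3.5e-3):
    """Hydraulic resistance of one cylindrical segment, R = 8*mu*L / (pi*r^4).

    The default viscosity is an assumed value, not one reported in the study.
    """
    radius = diameter_m / 2.0
    return 8.0 * viscosity_pa_s * length_m / (math.pi * radius ** 4)


def series_resistance(segments):
    """Total resistance of segments in series; segments are (length, diameter)."""
    return sum(poiseuille_resistance(length, diameter) for length, diameter in segments)


def parallel_resistance(resistances):
    """Equivalent resistance of branches that share the same inlet and outlet."""
    return 1.0 / sum(1.0 / r for r in resistances)


if __name__ == "__main__":
    wide = poiseuille_resistance(1e-3, 100e-6)
    narrow = poiseuille_resistance(1e-3, 50e-6)
    print(narrow / wide)  # ratio of resistances when the diameter is halved
```

Because resistance scales with the inverse fourth power of the radius, small changes in segment diameter dominate the estimate, which is why diameter-based ordering such as DDSO pairs naturally with this calculation.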

    Extraction of Text from Images and Videos

    Ph.D. DOCTOR OF PHILOSOPHY

    De Novo Protein Structure Modeling and Energy Function Design

    The two major challenges in protein structure prediction are (1) the lack of an accurate energy function and (2) the lack of an efficient search algorithm. A protein energy function that accurately describes the interaction between residues can guide the optimization of a protein conformation, as well as select native or native-like structures from numerous possible conformations. An efficient search algorithm must be able to reduce a conformational space to a reasonable size without missing the native conformation. My PhD research focused on these two directions. A protein energy function, the distance and orientation dependent energy function of amino acid key blocks (DOKB), which contains a distance term, an orientation term, and a highly packed term, was proposed to evaluate the stability of proteins. In this energy function, key blocks of each amino acid were used to represent each residue; a novel reference state was used to normalize block distributions. The dependence between the orientation term and the distance term was revealed, representing the preference for different orientations at different distances between key blocks. Compared with four widely used energy functions on six general benchmark decoy sets, the DOKB performed very well in recognizing native conformations. Additionally, the highly packed term in the DOKB played an important role in stabilizing protein structures containing highly packed residues. The cluster potential adjusted the reference state of highly packed areas and significantly improved the recognition of the native conformations in the ig_structal data set. The DOKB is not only an alternative protein energy function for protein structure prediction; it also provides a different view of the interaction between residues. The top-k search algorithm was optimized for use with proteins containing both α-helices and β-sheets.
Secondary structure elements (SSEs) are visible in cryo-electron microscopy (cryo-EM) density maps. Combined with the SSEs predicted from a protein sequence, it is feasible to determine the topologies, that is, the order and direction of the SSEs in the cryo-EM density map with respect to the SSEs in the protein sequence. Our group member Dr. Al Nasr proposed the top-k search algorithm, which searches for the top-k possible topologies of a target protein. It was the most effective algorithm so far. However, this algorithm only works well for pure α-helix proteins because of the complexity of β-sheet topologies. Based on the known protein structures in the Protein Data Bank (PDB), we noticed that some β-sheet topologies were strongly preferred, whereas others never appeared. The preference for different β-sheet topologies was introduced into the optimized top-k search algorithm to adjust the edge weights between nodes. Compared with the previous results, this optimization significantly improved the performance of the top-k algorithm on proteins containing both α-helices and β-sheets.

    Off-line Arabic Handwriting Recognition System Using Fast Wavelet Transform

    In this research, an off-line handwriting recognition system for the Arabic alphabet is introduced. The system contains three main stages: preprocessing, segmentation and recognition. In the preprocessing stage, the Radon transform was used in the design of algorithms for page, line and word skew correction as well as for word slant correction. In the segmentation stage, a Hough transform approach was used for line extraction. For line-to-word and word-to-character segmentation, a statistical method using a mathematical representation of the binary images of lines and words was used. Unlike most current handwriting recognition systems, our system simulates the human mechanism for image recognition, in which images are encoded and saved in memory as groups according to their similarity to each other. Characters are decomposed into coefficient vectors using the fast wavelet transform; the vectors that represent a character in its different possible shapes are then saved as groups, with one representative for each group. Recognition is achieved by comparing the vector of the character to be recognized with the group representatives. Experiments showed that the proposed system achieves the recognition task with 90.26% accuracy. The system needs at most 3.41 seconds to recognize a single character in a text of 15 lines where each line has 10 words on average.
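The recognition step described above (decompose a character into wavelet coefficients, then compare against one representative per group) can be sketched as follows. This is a minimal illustration under stated assumptions: the abstract does not name the wavelet family, so the Haar wavelet is used here for simplicity, the character is assumed to be already flattened to a 1D vector whose length is a power of two, and the function names and labels are hypothetical.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_transform(signal):
    """Full fast Haar decomposition; the length must be a power of two."""
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = list(signal)
    details = []
    while len(approx) > 1:
        approx, detail = haar_step(approx)
        details = detail + details  # coarser detail coefficients go in front
    return approx + details


def nearest_group(vector, representatives):
    """Return the label whose representative vector is closest (Euclidean)."""
    return min(
        representatives,
        key=lambda label: math.dist(vector, representatives[label]),
    )


if __name__ == "__main__":
    # Hypothetical group representatives built from toy 1D profiles.
    reps = {
        "group_1": haar_transform([0.0, 1.0, 1.0, 0.0]),
        "group_2": haar_transform([1.0, 0.0, 0.0, 1.0]),
    }
    print(nearest_group(haar_transform([0.0, 1.0, 0.9, 0.0]), reps))
```

Comparing against one representative per group, rather than against every stored sample, keeps the number of distance computations proportional to the number of groups.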

    End-Shape Analysis for Automatic Segmentation of Arabic Handwritten Texts

    Word segmentation is an important task for many methods related to document understanding, especially word spotting and word recognition. Several word segmentation approaches have been proposed for Latin-based languages, while only a few have been introduced for Arabic texts. The fact that Arabic writing is cursive by nature and unconstrained, with no clear boundaries between words, makes the processing of Arabic handwritten text a more challenging problem. In this thesis, the design and implementation of an End-Shape Letter (ESL) based segmentation system for Arabic handwritten text is presented. This incorporates four novel aspects: (i) removal of secondary components, (ii) baseline estimation, (iii) ESL recognition, and (iv) the creation of a new off-line CENPARMI ESL database. Arabic texts include small connected components, also called secondary components. Removing these components can improve the performance of several tasks such as baseline estimation. Thus, a robust method to remove secondary components that takes into consideration the challenges of Arabic handwriting is introduced. The method reconstructs the image based on some criteria. The results of this method were subsequently compared with those of two other methods that used the same database. The results show that the proposed method is effective. Baseline estimation is a challenging task for Arabic texts since they include ligatures, overlapping, and secondary components. Therefore, we propose a learning-based approach that addresses these challenges. Our method analyzes the image and extracts baseline-dependent features. Then, the baseline is estimated using a classifier. Algorithms dealing with text segmentation usually analyze the gaps between connected components. These algorithms are based on metric calculation, threshold finding, and/or gap classification.
We use two well-known metrics, bounding box and convex hull, to test the metric-based method on Arabic handwritten texts and to include this technique in our approach. To determine the threshold, an unsupervised learning approach, the Gaussian Mixture Model, is used. Our ESL-based segmentation approach extracts the final letter of a word using a rule-based technique and recognizes these letters using the implemented ESL classifier. To demonstrate the benefit of text segmentation, a holistic word spotting system is implemented. For this system, a word recognition system is implemented. A series of experiments with different sets of features is conducted. The system shows promising results.
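The metric-plus-threshold idea can be sketched as follows: compute bounding-box gaps between neighbouring connected components, fit a two-component Gaussian mixture to the gaps, and treat a gap as a word boundary when the wider component explains it better. This is an assumption-laden sketch rather than the thesis's implementation: the function names, the 1D horizontal box representation, the EM initialisation and the variance floor are all illustrative choices.

```python
import math


def bounding_box_gaps(boxes):
    """boxes: (x_min, x_max) per connected component on one text line.

    Returns the horizontal gap between neighbouring boxes (0 if they overlap).
    """
    ordered = sorted(boxes)
    return [max(0.0, float(nxt[0] - cur[1])) for cur, nxt in zip(ordered, ordered[1:])]


def _log_density(x, weight, mean, var):
    """Log of weight * N(x; mean, var)."""
    return (math.log(weight) - 0.5 * math.log(2.0 * math.pi * var)
            - (x - mean) ** 2 / (2.0 * var))


def fit_two_gaussians(values, iterations=50, min_var=1.0):
    """EM for a two-component 1D Gaussian mixture.

    Returns (weights, means, variances). min_var is an assumed floor that
    keeps a tight cluster from collapsing to zero variance.
    """
    lo, hi = min(values), max(values)
    means = [lo, hi]
    spread = max((hi - lo) ** 2 / 4.0, min_var)
    variances = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibilities, computed in the log domain for stability.
        resp = []
        for x in values:
            logs = [_log_density(x, weights[k], means[k], variances[k]) for k in range(2)]
            top = max(logs)
            p = [math.exp(v - top) for v in logs]
            total = sum(p)
            resp.append([pk / total for pk in p])
        # M-step: re-estimate each component from its weighted points.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-12:
                continue  # leave an empty component unchanged
            weights[k] = nk / len(values)
            means[k] = sum(r[k] * x for r, x in zip(resp, values)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, values)) / nk
            variances[k] = max(var, min_var)
    return weights, means, variances


def is_word_gap(gap, weights, means, variances):
    """True when the component with the larger mean explains the gap better."""
    wide = 0 if means[0] > means[1] else 1
    logs = [_log_density(gap, weights[k], means[k], variances[k]) for k in range(2)]
    return logs[wide] > logs[1 - wide]


if __name__ == "__main__":
    gaps = [2, 3, 2, 4, 3, 20, 22, 19, 2, 3]
    model = fit_two_gaussians(gaps)
    print([is_word_gap(g, *model) for g in gaps])
```

Classifying by the more likely component is equivalent to thresholding at the point where the two weighted densities cross, so no threshold has to be tuned by hand for each writer.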

    Recognition of off-line handwritten cursive text

    The author presents novel algorithms to design unconstrained handwriting recognition systems organized in three parts: In Part One, novel algorithms are presented for processing of Arabic text prior to recognition. Algorithms are described to convert a thinned image of a stroke to a straight line approximation. Novel heuristic algorithms and novel theorems are presented to determine start and end vertices of an off-line image of a stroke. A straight line approximation of an off-line stroke is converted to a one-dimensional representation by a novel algorithm which aims to recover the original sequence of writing. The resulting ordering of the stroke segments is a suitable preprocessed representation for subsequent handwriting recognition algorithms as it helps to segment the stroke. The algorithm was tested against one data set of isolated handwritten characters and another data set of cursive handwriting, each provided by 20 subjects, and has been 91.9% and 91.8% successful for these two data sets, respectively. In Part Two, an entirely novel fuzzy set-sequential machine character recognition system is presented. Fuzzy sequential machines are defined to work as recognizers of handwritten strokes. An algorithm to obtain a deterministic fuzzy sequential machine from a stroke representation, that is capable of recognizing that stroke and its variants, is presented. An algorithm is developed to merge two fuzzy machines into one machine. The learning algorithm is a combination of many described algorithms. The system was tested against isolated handwritten characters provided by 20 subjects resulting in 95.8% recognition rate which is encouraging and shows that the system is highly flexible in dealing with shape and size variations. In Part Three, also an entirely novel text recognition system, capable of recognizing off-line handwritten Arabic cursive text having a high variability is presented. This system is an extension of the above recognition system. 
Tokens are extracted from a one-dimensional representation of a stroke. Fuzzy sequential machines are defined to work as recognizers of tokens. It is shown how to obtain a deterministic fuzzy sequential machine from a token representation that is capable of recognizing that token and its variants. An algorithm for token learning is presented. The tokens of a stroke are re-combined into meaningful strings of tokens. Algorithms to recognize and learn token strings are described. The recognition stage uses algorithms of the learning stage. The process of extracting the best set of basic shapes which represent the best set of token strings that constitute an unknown stroke is described. A method is developed to extract lines from pages of handwritten text, arrange main strokes of extracted lines in the same order as they were written, and present secondary strokes to main strokes. Presented secondary strokes are combined with basic shapes to obtain the final characters by formulating and solving assignment problems for this purpose. Some secondary strokes which remain unassigned are individually manipulated. The system was tested against the handwritings of 20 subjects, yielding overall subword and character recognition rates of 55.4% and 51.1%, respectively.

    Visual-based decision for iterative quality enhancement in robot drawing.

    Kwok, Ka Wai. Thesis (M.Phil.)--Chinese University of Hong Kong, 2005. Includes bibliographical references (leaves 113-116). Abstracts in English and Chinese.
    ABSTRACT
    Chapter 1. INTRODUCTION
        1.1 Artistic robot in western art
        1.2 Chinese calligraphy robot
        1.3 Our robot drawing system
        1.4 Thesis outline
    Chapter 2. ROBOT DRAWING SYSTEM
        2.1 Robot drawing manipulation
        2.2 Input modes
        2.3 Visual-feedback system
        2.4 Footprint study setup
        2.5 Chapter summary
    Chapter 3. LINE STROKE EXTRACTION AND ORDER ASSIGNMENT
        3.1 Skeleton-based line trajectory generation
        3.2 Line stroke vectorization
        3.3 Skeleton tangential slope evaluation using MIC
        3.4 Skeleton-based vectorization using Bezier curve interpolation
        3.5 Line stroke extraction
        3.6 Line stroke order assignment
        3.7 Chapter summary
    Chapter 4. PROJECTIVE RECTIFICATION AND VISION-BASED CORRECTION
        4.1 Projective rectification
        4.2 Homography transformation by selected correspondences
        4.3 Homography transformation using GA
        4.4 Visual-based iterative correction example
        4.5 Chapter summary
    Chapter 5. ITERATIVE ENHANCEMENT ON OFFSET EFFECT AND BRUSH THICKNESS
        5.1 Offset painting effect by Chinese brush pen
        5.2 Iterative robot drawing process
        5.3 Iterative line drawing experimental results
        5.4 Chapter summary
    Chapter 6. GA-BASED BRUSH STROKE GENERATION
        6.1 Brush trajectory representation
        6.2 Brush stroke modeling
        6.3 Stroke simulation using GA
        6.4 Evolutionary computing results
        6.5 Chapter summary
    Chapter 7. BRUSH STROKE FOOTPRINT CHARACTERIZATION
        7.1 Footprint video capturing
        7.2 Footprint image property
        7.3 Experimental results
        7.4 Chapter summary
    Chapter 8. CONCLUSIONS AND FUTURE WORKS
    BIBLIOGRAPHY