7 research outputs found

    Recognition of Arabic handwritten words

    Recognizing Arabic handwritten words is a difficult problem because of the deformations introduced by different writing styles. Moreover, the cursive nature of Arabic writing makes correct segmentation of characters an almost impossible task. While an Arabic word recognition system contains many subsystems, in this work we develop a subsystem to recognize Part of Arabic Words (PAW). We try to solve this problem using three different approaches: implicit segmentation and two variants of the holistic approach. Under the first approach, we report the difficulty of locating characters in PAW using the Scale-Invariant Feature Transform (SIFT); Rothacker reached similar conclusions while this work was being prepared. In the second and third approaches, we use the holistic approach to recognize PAW using a Support Vector Machine (SVM) and Active Shape Models (ASM). Although a few works use SVM to recognize PAW, they use a small dataset; we use a large dataset and a different set of features. We also explain the errors that SVM and ASM make and propose some remedies for these errors as future work.

    Neural network based image capture for 3D reconstruction

    The aim of the thesis is to build a neural network that can choose the frames of a video that carry important information for building a 3D map of the depicted structure, without losing 3D map accuracy. Consecutive frames often contain redundant information that adds nothing significant to the 3D map, and some frames may be, for example, distorted and add nothing to the map at all. How much each frame contributes depends on how the camera is moved while the video is filmed. If all frames of the video are used in the reconstruction, building the 3D map takes a long time and requires a lot of resources, which is especially problematic on embedded devices. This thesis assumes that an embedded device would choose the most informative frames for building the 3D map, and that the map itself would be built afterwards from the saved frames on a desktop computer. A database is built from video feeds for neural network training and testing. To build the training database, a visual simultaneous localization and mapping (vSLAM) algorithm is used to extract features, match points between frames, and estimate the camera movement for each frame of the video feed. To obtain more training samples and make training less time-consuming, the video feeds are divided into short sequences of frames. A structure from motion (SfM) algorithm is used to construct 3D point clouds from image subsets, so that a point cloud is constructed after each added frame. To determine whether a frame carries important information for point cloud construction, the chamfer distance is used to measure how close the point cloud after each added frame is to the point cloud constructed from all the video frames. A class label is then determined for each frame based on the change in chamfer distance. A long short-term memory (LSTM) recurrent neural network structure was chosen because it can learn from the entire sequence of data.
Database construction, neural network training, and validation were all done with Matlab. The result of this master’s thesis is a simple long short-term memory neural network that can choose the important frames from a short sequence of images, but its accuracy needs to be improved further before the presented method can be used in a real embedded device. The custom loss function developed in the thesis did not perform well enough to ensure that exactly one frame is chosen from a group of similar consecutive frames.
The aim of this master’s thesis is to build a neural network that can select the frames of a video that are important for 3D modeling, without reducing the accuracy of the model compared with a model built from all frames of the video. Consecutive frames in a video often contain similar information that adds no accuracy to the 3D model. How much new information a frame contains compared with the previous frame depends on the camera motion and its speed. Building the 3D model takes a great deal of time and computing capacity if all frames of the video are used, which is problematic in embedded systems. This work assumes that an embedded device would film the environment and select the frames containing important information for the 3D reconstruction, after which the selected frames would be stored in the device’s memory. The 3D modeling itself would be done afterwards on a desktop computer. In this work, the database for training the neural networks was built entirely on a desktop computer. The training database was built with a vSLAM method, in which features are extracted from the frames, matched between frames, and used to compute the camera motion between frames. To obtain more samples for the training database, the videos were divided into short frame sequences. This also shortens the computation time needed for training. An SfM method was used to compute the 3D model from the frames; the work uses point clouds. A point cloud was computed after each frame.
A frame is defined as important if adding it to the point cloud computation makes the point cloud more similar, in terms of chamfer distance, to the point cloud computed from all frames of the sequence. The similarity of the point clouds was measured with the chamfer distance after each frame added to the point cloud computation. A class is assigned to each frame depending on how much the chamfer distance decreases. An LSTM recurrent neural network is used as the network structure because it can classify each frame based on the entire preceding frame sequence, not only on the frame currently being classified. Matlab was used in the thesis to build the database and the neural networks. As a result of the thesis, the LSTM recurrent neural network can select the most important frames from short frame sequences, but the selection accuracy must still be improved before the presented system could be used in an embedded system. With the loss functions used in this work, the neural network did not learn to select one and only one frame from a group of frames containing similar information.
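
The chamfer-distance comparison described in this abstract can be illustrated with a minimal sketch. It assumes one common variant of the metric (the sum of the two mean nearest-neighbour Euclidean distances); the abstract does not say which variant the thesis uses, and the thesis itself was implemented in Matlab, not Python. The function names `chamfer_distance` and `distance_drops` are hypothetical, and the mapping from distance change to class label is not given in the abstract, so it is not reproduced here.

```python
import math


def chamfer_distance(cloud_a, cloud_b):
    """Symmetric chamfer distance between two non-empty point clouds.

    Assumed variant: mean nearest-neighbour distance from A to B
    plus mean nearest-neighbour distance from B to A.
    """
    if not cloud_a or not cloud_b:
        raise ValueError("both point clouds must be non-empty")
    a_to_b = sum(min(math.dist(p, q) for q in cloud_b) for p in cloud_a) / len(cloud_a)
    b_to_a = sum(min(math.dist(q, p) for p in cloud_a) for q in cloud_b) / len(cloud_b)
    return a_to_b + b_to_a


def distance_drops(partial_clouds, full_cloud):
    """How much each newly added frame reduces the distance to the full cloud.

    partial_clouds[i] is the (hypothetical) cloud built from frames 0..i.
    The returned list has one entry per frame after the first.
    """
    distances = [chamfer_distance(cloud, full_cloud) for cloud in partial_clouds]
    return [prev - curr for prev, curr in zip(distances, distances[1:])]


# Toy data: the second frame completes the cloud, so the distance drops to zero.
full = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
partials = [[(0.0, 0.0, 0.0)], full]
drops = distance_drops(partials, full)
```

A large drop suggests that the added frame brought the partial reconstruction noticeably closer to the full one; a drop near zero suggests a redundant frame. The brute-force nearest-neighbour search is quadratic in the number of points, so a spatial index would be needed for realistic cloud sizes.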

    Using contour information and segmentation for object registration, modeling and retrieval

    This thesis considers different aspects of the utilization of contour information and of syntactic and semantic image segmentation for object registration, modeling and retrieval in the context of content-based indexing and retrieval in large collections of images. Target applications include retrieval in collections of closed silhouettes, holistic word recognition in handwritten historical manuscripts and shape registration. The thesis also explores the feasibility of contour-based syntactic features for improving the correspondence between the output of bottom-up segmentation and the semantic objects present in the scene, and discusses the feasibility of different strategies for image analysis utilizing contour information in selected application scenarios, e.g. segmentation driven by visual features versus segmentation driven by shape models, or semi-automatic segmentation. There are three contributions in this thesis. The first contribution considers structure analysis based on the shape and spatial configuration of image regions (so-called syntactic visual features) and their utilization for automatic image segmentation. The second contribution is the study of novel shape features, matching algorithms and similarity measures. Various applications of the proposed solutions are presented throughout the thesis, providing the basis for the third contribution, which is a discussion of the feasibility of different recognition strategies utilizing contour information. In each case, the performance and generality of the proposed approach have been analyzed through extensive, rigorous experimentation using test collections that are as large as possible.

    Social work with airports passengers

    Social work at the airport consists of offering social services to passengers. The main methodological position is that people under stress display a particular set of characteristics in appearance and behavior. In such circumstances, a passenger’s actions attract attention. Only a person whom the passenger trusts can help with documents or provide psychological support.

    First Annual Workshop on Space Operations Automation and Robotics (SOAR 87)

    Several topics related to automation and robotics technology are discussed. Automation of checkout, ground support, and logistics; automated software development; man-machine interfaces; neural networks; systems engineering and distributed/parallel processing architectures; and artificial intelligence/expert systems are among the topics covered.

    Maritime expressions: a corpus based exploration of maritime metaphors

    This study uses a purpose-built corpus to explore the linguistic legacy of Britain’s maritime history found in the form of hundreds of specialised ‘Maritime Expressions’ (MEs), such as TAKEN ABACK, ANCHOR and ALOOF, that permeate modern English. Selecting just those expressions commencing with ‘A’, it analyses 61 MEs in detail and describes the processes by which these technical expressions, from a highly specialised occupational discourse community, have made their way into modern English. The Maritime Text Corpus (MTC) comprises 8.8 million words, encompassing a range of text types and registers, selected to provide a cross-section of ‘maritime’ writing. It is analysed using WordSmith analytical software (Scott, 2010), with the 100 million-word British National Corpus (BNC) as a reference corpus. Using the MTC, a list of keywords of specific salience within the maritime discourse has been compiled and, using frequency data, concordances and collocations, these MEs are described in detail and their use and form in the MTC and the BNC are compared. The study examines the transformation from ME to figurative use in the general discourse, in terms of form and metaphoricity. MEs are classified according to their metaphorical strength and their transference from maritime usage into new registers and domains such as those of business, politics, sports and reportage. A revised model of metaphoricity is developed and a new category of figurative expression, the ‘resonator’, is proposed. Additionally, developing the work of Lakoff and Johnson, Kövecses and others on Conceptual Metaphor Theory (CMT), a number of Maritime Conceptual Metaphors are identified and their cultural significance is discussed.
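
The keyword step in this abstract, comparing a study corpus against a reference corpus, can be illustrated with a minimal sketch of log-likelihood keyness, a statistic widely used in corpus linguistics. This is an assumption for illustration: the abstract does not state which keyness statistic or settings the study used in WordSmith. The function name `log_likelihood_keyness` and the word frequencies in the example are hypothetical; only the two corpus sizes come from the abstract.

```python
import math


def log_likelihood_keyness(freq_study, size_study, freq_ref, size_ref):
    """Log-likelihood keyness of one word across two corpora.

    freq_* are the word's raw frequencies; size_* are corpus sizes in words.
    A higher score means the word's frequency differs more between the corpora.
    """
    total_freq = freq_study + freq_ref
    total_size = size_study + size_ref
    # Expected frequencies if the word were equally common in both corpora.
    expected_study = size_study * total_freq / total_size
    expected_ref = size_ref * total_freq / total_size
    score = 0.0
    for observed, expected in ((freq_study, expected_study), (freq_ref, expected_ref)):
        if observed > 0:  # a zero count contributes nothing to the sum
            score += observed * math.log(observed / expected)
    return 2.0 * score


# Corpus sizes from the abstract; the word frequencies are made up.
MTC_SIZE = 8_800_000
BNC_SIZE = 100_000_000
score = log_likelihood_keyness(freq_study=1_200, size_study=MTC_SIZE,
                               freq_ref=900, size_ref=BNC_SIZE)
```

The score alone does not say in which corpus the word is over-represented; comparing the relative frequencies (`freq_study / size_study` against `freq_ref / size_ref`) gives the direction.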