21 research outputs found

    HSI-MSER: Hyperspectral Image Registration Algorithm based on MSER and SIFT

    Image alignment is an essential task in many applications of hyperspectral remote sensing images: before any processing, the images must be registered. Maximally Stable Extremal Regions (MSER) is a feature detection algorithm that extracts regions by thresholding the image at different grey levels. These extremal regions are invariant to image transformations, which makes them well suited for registration. The Scale-Invariant Feature Transform (SIFT) is a well-known keypoint detector and descriptor based on the construction of a Gaussian scale-space. This article presents a hyperspectral remote sensing image registration method based on MSER for feature detection and SIFT for feature description. It efficiently exploits the information contained in the different spectral bands to improve the image alignment. Experimental results over nine hyperspectral images show that the proposed method achieves a higher number of correct registration cases using fewer computational resources than other hyperspectral registration methods. Results are evaluated in terms of both registration accuracy and execution time. Funding: Ministerio de Ciencia e Innovación, Government of Spain (PID2019-104834GB-I00); Consellería de Cultura, Educación e Universidade (Grant Numbers ED431C 2018/19 and 2019-2022 ED431G-2019/04); Junta de Castilla y León under Project VA226P20; European Regional Development Fund (10.13039/501100008530); Ministerio de Universidades, Government of Spain (Grant Number FPU16/03537).
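    A minimal sketch of the core idea, combining MSER detection with SIFT description to register a single pair of grey-level bands, is shown below. It assumes OpenCV (>= 4.4) and NumPy, and it is not the authors' HSI-MSER implementation, which additionally exploits the information in all spectral bands.

        # Illustrative sketch, not the paper's HSI-MSER code: detect MSER regions,
        # describe them with SIFT, match, and estimate a RANSAC homography.
        import cv2
        import numpy as np

        def register_band(ref_band: np.ndarray, mov_band: np.ndarray) -> np.ndarray:
            """Warp the moving band onto the reference band's grid."""
            ref = cv2.normalize(ref_band, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
            mov = cv2.normalize(mov_band, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

            mser = cv2.MSER_create()   # region detector (thresholds at many grey levels)
            sift = cv2.SIFT_create()   # used here only as a descriptor

            kp_ref, des_ref = sift.compute(ref, mser.detect(ref))
            kp_mov, des_mov = sift.compute(mov, mser.detect(mov))

            # Keep matches that pass Lowe's ratio test.
            matcher = cv2.BFMatcher(cv2.NORM_L2)
            good = [m for m, n in matcher.knnMatch(des_mov, des_ref, k=2)
                    if m.distance < 0.75 * n.distance]

            src = np.float32([kp_mov[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
            dst = np.float32([kp_ref[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
            H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)

            return cv2.warpPerspective(mov, H, (ref.shape[1], ref.shape[0]))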

    Efficient Algorithms for Large-Scale Image Analysis

    This work develops highly efficient algorithms for analyzing large images. Applications include object-based change detection and screening. The algorithms are 10-100 times as fast as existing software, sometimes even outperforming FPGA/GPU hardware, because they are designed to exploit the underlying computer architecture. This thesis describes the implementation details and the underlying algorithm engineering methodology, so that both may also be applied to other applications.

    Text-detection and -recognition from natural images

    Text detection and recognition from images could have numerous practical applications in document analysis, such as assistance for visually impaired people; recognition of vehicle license plates; evaluation of articles containing tables, street signs, maps, and diagrams; keyword-based image exploration; document retrieval; recognition of parts within industrial automation; content-based extraction; object recognition; address block location; and text-based video indexing. This research exploited the advantages of artificial intelligence (AI), using machine learning and deep learning, to detect and recognise text from natural images.

    We conducted an in-depth literature review of current detection and recognition methods to identify the existing challenges, wherein variations in text alignment, style, size, and orientation, combined with low image contrast and complex backgrounds, make automatic text extraction a considerably challenging task. As a result, state-of-the-art approaches obtain low detection rates (often below 80%) and recognition rates (often below 60%), which has motivated the development of new approaches. The aim of the study was to develop a robust text detection and recognition method for natural images with high accuracy and recall, able to detect all the text in scene images regardless of the specific characteristics of the text pattern. Furthermore, we aimed to address the two main problems of detecting and recognising arbitrarily shaped text (horizontal, multi-oriented, and curved) in low-resolution scenes and at various scales and sizes.

    We propose a methodology that handles text detection by using novel feature combination and selection to classify text/non-text regions. Text-region candidates were extracted from the grey-scale images using the MSER technique, and a machine learning based method was then applied to refine and validate the initial detections. The effectiveness of features based on the aspect ratio, GLCM, LBP, and HOG descriptors was investigated, and MLP, SVM, and RF text-region classifiers were trained on selections and combinations of these features. The publicly available ICDAR 2003 and ICDAR 2011 datasets were used to evaluate the proposed method. It achieved state-of-the-art performance on both databases, with significant improvements in terms of Precision, Recall, and F-measure; the F-measure for ICDAR 2003 and ICDAR 2011 was 81% and 84%, respectively. The results showed that a suitable feature combination and selection approach can significantly increase the accuracy of the algorithms.

    A new dataset is also proposed to fill the gap in character-level annotation and in the availability of multi-oriented and curved text. It was created particularly for deep learning methods, which require a large and varied range of training data, and includes 2,100 images annotated at the character and word levels, yielding 38,500 samples of English characters and 12,500 words. Furthermore, an augmentation tool is proposed to support the dataset: the lack of such a tool for object detection motivated a tool that updates the positions of the bounding boxes after transformations are applied to the images, which increases the number of samples in the dataset and removes the need for additional manual annotation.

    The final part of the thesis presents a novel approach to text spotting: an end-to-end character detection and recognition framework built on an improved SSD convolutional neural network, in which layers are added to the SSD network and the aspect ratio of characters is taken into account because it differs from that of other objects. Unlike the other methods considered, the proposed method detects and recognises characters by training the model completely end to end. It performed best on the proposed dataset, scoring 90.34, and its F-measure on ICDAR 2015, ICDAR 2013, and SVT was 84.5, 91.9, and 54.8, respectively; on ICDAR 2013 it achieved the second-best accuracy. The proposed method can spot arbitrarily shaped (horizontal, oriented, and curved) scene text.
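    A minimal sketch of the detection stage described above (MSER region proposals filtered by a classifier trained on HOG features) is shown below. It assumes OpenCV and scikit-learn; the training patches and labels are hypothetical placeholders, and this is not the thesis' implementation, which also uses aspect ratio, GLCM, and LBP features and compares MLP, SVM, and RF classifiers.

        # Illustrative sketch: MSER proposals scored as text/non-text by an SVM on HOG features.
        import cv2
        import numpy as np
        from sklearn.svm import SVC

        PATCH = (32, 32)
        hog = cv2.HOGDescriptor(PATCH, (16, 16), (8, 8), (8, 8), 9)

        def hog_features(patch: np.ndarray) -> np.ndarray:
            return hog.compute(cv2.resize(patch, PATCH)).ravel()

        def detect_text_regions(gray: np.ndarray, clf: SVC):
            """Return bounding boxes of MSER regions the classifier accepts as text."""
            mser = cv2.MSER_create()
            _, boxes = mser.detectRegions(gray)
            keep = []
            for (x, y, w, h) in boxes:
                if w < 8 or h < 8:                       # discard tiny regions
                    continue
                feat = hog_features(gray[y:y + h, x:x + w])
                if clf.predict(feat[None, :])[0] == 1:   # 1 = text (hypothetical label)
                    keep.append((x, y, w, h))
            return keep

        # Hypothetical training data: lists of grey-level patches and 0/1 labels.
        # clf = SVC(kernel="rbf").fit([hog_features(p) for p in training_patches], labels)
        # boxes = detect_text_regions(cv2.imread("scene.jpg", cv2.IMREAD_GRAYSCALE), clf)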

    Moving Object Detection and Segmentation for Remote Aerial Video Surveillance

    Unmanned Aerial Vehicles (UAVs) equipped with video cameras provide flexible support for civil and military safety and security. In this thesis, a video processing chain is presented for moving object detection in aerial video surveillance. A Track-Before-Detect (TBD) algorithm is applied to detect motion that is independent of the camera motion. Novel robust and fast object detection and segmentation approaches improve the baseline TBD and outperform current state-of-the-art methods.
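    As an illustration of detecting motion that is independent of the camera motion (not the thesis' TBD chain), a minimal sketch could compensate the camera motion with a feature-based homography before differencing consecutive frames, assuming OpenCV and NumPy:

        # Minimal sketch of camera-motion-compensated frame differencing: align the
        # previous frame to the current one, then threshold the residual difference.
        import cv2
        import numpy as np

        def motion_mask(prev_gray: np.ndarray, curr_gray: np.ndarray) -> np.ndarray:
            orb = cv2.ORB_create(2000)
            kp1, des1 = orb.detectAndCompute(prev_gray, None)
            kp2, des2 = orb.detectAndCompute(curr_gray, None)

            matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(des1, des2)
            src = np.float32([kp1[m.queryIdx].pt for m in matches]).reshape(-1, 1, 2)
            dst = np.float32([kp2[m.trainIdx].pt for m in matches]).reshape(-1, 1, 2)

            # The homography models the dominant, camera-induced motion of the scene.
            H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
            warped = cv2.warpPerspective(prev_gray, H, curr_gray.shape[::-1])

            diff = cv2.absdiff(curr_gray, warped)
            _, mask = cv2.threshold(diff, 25, 255, cv2.THRESH_BINARY)
            return mask  # non-zero where motion is independent of the camera motion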

    Synthetic Aperture Radar (SAR) Meets Deep Learning

    This reprint focuses on the combined application of synthetic aperture radar and deep learning technology, and aims to further promote the development of intelligent SAR image interpretation. A synthetic aperture radar (SAR) is an important active microwave imaging sensor whose all-day and all-weather operating capability gives it an important place in the remote sensing community. Since the United States launched the first SAR satellite, SAR has received much attention in the remote sensing community, e.g., in geological exploration, topographic mapping, disaster forecasting, and traffic monitoring. It is therefore valuable and meaningful to study SAR-based remote sensing applications. In recent years, deep learning, represented by convolutional neural networks, has driven significant progress in the computer vision community, e.g., in face recognition, autonomous driving, and the Internet of Things (IoT). Deep learning enables computational models with multiple processing layers to learn data representations at multiple levels of abstraction, which can greatly improve the performance of various applications. This reprint provides a platform for researchers to address the above challenges and present their innovative and cutting-edge research results on applying deep learning to SAR in various manuscript types, e.g., articles, letters, reviews, and technical reports.

    GEOBIA 2016: Solutions and Synergies, 14-16 September 2016, University of Twente Faculty of Geo-Information and Earth Observation (ITC): open access e-book


    Sustainable Agriculture and Advances of Remote Sensing (Volume 1)

    Agriculture, as the main source of food and the most important economic activity globally, is being affected by the impacts of climate change. To maintain and increase global food production, reduce biodiversity loss, and preserve natural ecosystems, new practices and technologies are required. This book focuses on the latest advances in remote sensing technology and agricultural engineering leading to sustainable agriculture practices. Earth observation data and in situ and proxy remote sensing data are the main sources of information for monitoring and analysing agricultural activities. Particular attention is given to Earth observation satellites and the Internet of Things for data collection, to multispectral and hyperspectral data analysis using machine learning and deep learning, and to WebGIS and the Internet of Things for sharing and publishing the results, among others.

    Robust density modelling using the Student's t-distribution for human action recognition

    The extraction of human features from videos is often inaccurate and prone to outliers. Such outliers can severely affect density modelling when the Gaussian distribution is used as the model, since it is highly sensitive to outliers. The Gaussian distribution is also often used as the base component of graphical models for recognising human actions in videos (hidden Markov models and others), and the presence of outliers can significantly affect the recognition accuracy. In contrast, the Student's t-distribution is more robust to outliers and can be exploited to improve the recognition rate in the presence of abnormal data. In this paper, we present an HMM which uses mixtures of t-distributions as observation probabilities, and experiments over two well-known datasets (Weizmann, MuHAVi) show a remarkable improvement in classification accuracy. © 2011 IEEE
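    A minimal sketch of why a Student's t observation model is more robust than a Gaussian (not the paper's HMM, only the underlying distributional point) fits both distributions to one-dimensional data containing outliers and compares the estimated locations, assuming NumPy and SciPy:

        # Gaussian vs Student's t fit on data with 5% outliers: the Gaussian mean
        # is pulled towards the outliers, the t location typically stays near 0.
        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(0)
        clean = rng.normal(loc=0.0, scale=1.0, size=500)      # inlier features
        outliers = rng.normal(loc=15.0, scale=1.0, size=25)   # abnormal data
        data = np.concatenate([clean, outliers])

        mu_gauss, sigma_gauss = stats.norm.fit(data)          # ML Gaussian fit
        df_t, mu_t, sigma_t = stats.t.fit(data)               # ML Student's t fit

        print(f"Gaussian mean estimate: {mu_gauss:.2f}")
        print(f"Student's t location:   {mu_t:.2f}")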

    Contributions to the content-based image retrieval using pictorial queries

    Widespread access to digital cameras, personal computers, and the Internet has led to the creation of large volumes of data in digital format. In this context, tools designed to organise information and facilitate its search are becoming increasingly relevant. Images are a particular kind of data that require specific description and indexing techniques. The area of computer vision devoted to the study of these techniques is known as Content-Based Image Retrieval (CBIR). CBIR systems do not use text-based descriptions; instead, they rely on features extracted from the images themselves. In contrast to the more than 6,000 languages spoken in the world, descriptions based on visual features represent a universal means of expression. The intense research in the field of CBIR systems has been applied to very diverse knowledge areas; CBIR applications have been developed for medicine, intellectual property protection, journalism, graphic design, information search on the Internet, cultural heritage preservation, and more. One of the key aspects of a CBIR application lies in the design of the user's functions: the user is responsible for formulating the queries from which the image search is performed. We have focused our attention on systems in which the query is formulated from a pictorial representation. We have proposed a taxonomy of query systems composed of four different paradigms: Query-by-Selection, Query-by-Iconic-Composition, Query-by-Sketch, and Query-by-Illustration. Each paradigm incorporates a different level of expressive power for the user: from the simple selection of an image to the creation of a colour illustration, the user takes control of the system's input data. Throughout the chapters of this thesis, we have analysed the influence that each query paradigm exerts on the internal processes of a CBIR system. We have also proposed a set of contributions, which we have exemplified from a practical point of view through a final application.
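    As an illustration of the simplest paradigm above, Query-by-Selection, a minimal sketch (not the thesis' application) could rank a collection by colour-histogram similarity to the image the user selects, assuming OpenCV; the image paths are hypothetical:

        # Query-by-Selection sketch: the selected image is the query, and the
        # collection is ranked by HSV colour-histogram correlation.
        import cv2

        def hsv_histogram(path: str):
            img = cv2.imread(path)
            hist = cv2.calcHist([cv2.cvtColor(img, cv2.COLOR_BGR2HSV)],
                                [0, 1], None, [30, 32], [0, 180, 0, 256])
            return cv2.normalize(hist, hist).flatten()

        def query_by_selection(query_path: str, collection_paths: list[str], top_k: int = 5):
            """Return the top_k most similar images to the selected query image."""
            q = hsv_histogram(query_path)
            scores = [(cv2.compareHist(q, hsv_histogram(p), cv2.HISTCMP_CORREL), p)
                      for p in collection_paths]
            return sorted(scores, reverse=True)[:top_k]

        # results = query_by_selection("query.jpg", ["a.jpg", "b.jpg", "c.jpg"])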