12,401 research outputs found

    Fast Shadow Detection from a Single Image Using a Patched Convolutional Neural Network

    Full text link
    In recent years, various shadow detection methods from a single image have been proposed and used in vision systems; however, most of them are not appropriate for the robotic applications due to the expensive time complexity. This paper introduces a fast shadow detection method using a deep learning framework, with a time cost that is appropriate for robotic applications. In our solution, we first obtain a shadow prior map with the help of multi-class support vector machine using statistical features. Then, we use a semantic- aware patch-level Convolutional Neural Network that efficiently trains on shadow examples by combining the original image and the shadow prior map. Experiments on benchmark datasets demonstrate the proposed method significantly decreases the time complexity of shadow detection, by one or two orders of magnitude compared with state-of-the-art methods, without losing accuracy.Comment: 6 pages, 5 figures, Submitted to IROS 201

    Recording behaviour of indoor-housed farm animals automatically using machine vision technology: a systematic review

    Get PDF
    Large-scale phenotyping of animal behaviour traits is time consuming and has led to increased demand for technologies that can automate these procedures. Automated tracking of animals has been successful in controlled laboratory settings, but recording from animals in large groups in highly variable farm settings presents challenges. The aim of this review is to provide a systematic overview of the advances that have occurred in automated, high throughput image detection of farm animal behavioural traits with welfare and production implications. Peer-reviewed publications written in English were reviewed systematically following Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines. After identification, screening, and assessment for eligibility, 108 publications met these specifications and were included for qualitative synthesis. Data collected from the papers included camera specifications, housing conditions, group size, algorithm details, procedures, and results. Most studies utilized standard digital colour video cameras for data collection, with increasing use of 3D cameras in papers published after 2013. Papers including pigs (across production stages) were the most common (n = 63). The most common behaviours recorded included activity level, area occupancy, aggression, gait scores, resource use, and posture. Our review revealed many overlaps in methods applied to analysing behaviour, and most studies started from scratch instead of building upon previous work. Training and validation sample sizes were generally small (mean±s.d. groups = 3.8±5.8) and in data collection and testing took place in relatively controlled environments. To advance our ability to automatically phenotype behaviour, future research should build upon existing knowledge and validate technology under commercial settings and publications should explicitly describe recording conditions in detail to allow studies to be reproduced

    Automatic human behaviour anomaly detection in surveillance video

    Get PDF
    This thesis work focusses upon developing the capability to automatically evaluate and detect anomalies in human behaviour from surveillance video. We work with static monocular cameras in crowded urban surveillance scenarios, particularly air- ports and commercial shopping areas. Typically a person is 100 to 200 pixels high in a scene ranging from 10 - 20 meters width and depth, populated by 5 to 40 peo- ple at any given time. Our procedure evaluates human behaviour unobtrusively to determine outlying behavioural events, agging abnormal events to the operator. In order to achieve automatic human behaviour anomaly detection we address the challenge of interpreting behaviour within the context of the social and physical environment. We develop and evaluate a process for measuring social connectivity between individuals in a scene using motion and visual attention features. To do this we use mutual information and Euclidean distance to build a social similarity matrix which encodes the social connection strength between any two individuals. We de- velop a second contextual basis which acts by segmenting a surveillance environment into behaviourally homogeneous subregions which represent high tra c slow regions and queuing areas. We model the heterogeneous scene in homogeneous subgroups using both contextual elements. We bring the social contextual information, the scene context, the motion, and visual attention features together to demonstrate a novel human behaviour anomaly detection process which nds outlier behaviour from a short sequence of video. The method, Nearest Neighbour Ranked Outlier Clusters (NN-RCO), is based upon modelling behaviour as a time independent se- quence of behaviour events, can be trained in advance or set upon a single sequence. We nd that in a crowded scene the application of Mutual Information-based social context permits the ability to prevent self-justifying groups and propagate anomalies in a social network, granting a greater anomaly detection capability. Scene context uniformly improves the detection of anomalies in all the datasets we test upon. We additionally demonstrate that our work is applicable to other data domains. We demonstrate upon the Automatic Identi cation Signal data in the maritime domain. Our work is capable of identifying abnormal shipping behaviour using joint motion dependency as analogous for social connectivity, and similarly segmenting the shipping environment into homogeneous regions

    Object detection, recognition and classification using computer vision and artificial intelligence approaches

    Get PDF
    Object detection and recognition has been used extensively in recent years to solve numerus challenges in different fields. Due to the vital roles they play, object detection and recognition has enabled quantum leaps in many industry fields by helping to overcome some serious challenges and obstacles. For example, worldwide security concerns have drawn the attention and stimulated the use of highly intelligent computer vision technology to provide security in different environments and in diverse terrains. In addition, some wildlife is at present exposed to danger and extinction worldwide. Therefore, early detection and recognition of potential threats to wildlife have become essential and timely. The extent of using computer vision and artificial intelligence to convert the seemingly insecure world to a more secure one has been widely accepted. Such technologies are used in monitoring, tracking, organising, analysing objects in a scene and for a number of other countless purposes. [Continues.

    Facial Expression Recognition

    Get PDF

    Local Binary Pattern based algorithms for the discrimination and detection of crops and weeds with similar morphologies

    Get PDF
    In cultivated agricultural fields, weeds are unwanted species that compete with the crop plants for nutrients, water, sunlight and soil, thus constraining their growth. Applying new real-time weed detection and spraying technologies to agriculture would enhance current farming practices, leading to higher crop yields and lower production costs. Various weed detection methods have been developed for Site-Specific Weed Management (SSWM) aimed at maximising the crop yield through efficient control of weeds. Blanket application of herbicide chemicals is currently the most popular weed eradication practice in weed management and weed invasion. However, the excessive use of herbicides has a detrimental impact on the human health, economy and environment. Before weeds are resistant to herbicides and respond better to weed control strategies, it is necessary to control them in the fallow, pre-sowing, early post-emergent and in pasture phases. Moreover, the development of herbicide resistance in weeds is the driving force for inventing precision and automation weed treatments. Various weed detection techniques have been developed to identify weed species in crop fields, aimed at improving the crop quality, reducing herbicide and water usage and minimising environmental impacts. In this thesis, Local Binary Pattern (LBP)-based algorithms are developed and tested experimentally, which are based on extracting dominant plant features from camera images to precisely detecting weeds from crops in real time. Based on the efficient computation and robustness of the first LBP method, an improved LBP-based method is developed based on using three different LBP operators for plant feature extraction in conjunction with a Support Vector Machine (SVM) method for multiclass plant classification. A 24,000-image dataset, collected using a testing facility under simulated field conditions (Testbed system), is used for algorithm training, validation and testing. The dataset, which is published online under the name “bccr-segset”, consists of four subclasses: background, Canola (Brassica napus), Corn (Zea mays), and Wild radish (Raphanus raphanistrum). In addition, the dataset comprises plant images collected at four crop growth stages, for each subclass. The computer-controlled Testbed is designed to rapidly label plant images and generate the “bccr-segset” dataset. Experimental results show that the classification accuracy of the improved LBP-based algorithm is 91.85%, for the four classes. Due to the similarity of the morphologies of the canola (crop) and wild radish (weed) leaves, the conventional LBP-based method has limited ability to discriminate broadleaf crops from weeds. To overcome this limitation and complex field conditions (illumination variation, poses, viewpoints, and occlusions), a novel LBP-based method (denoted k-FLBPCM) is developed to enhance the classification accuracy of crops and weeds with similar morphologies. Our contributions include (i) the use of opening and closing morphological operators in pre-processing of plant images, (ii) the development of the k-FLBPCM method by combining two methods, namely, the filtered local binary pattern (LBP) method and the contour-based masking method with a coefficient k, and (iii) the optimal use of SVM with the radial basis function (RBF) kernel to precisely identify broadleaf plants based on their distinctive features. The high performance of this k-FLBPCM method is demonstrated by experimentally attaining up to 98.63% classification accuracy at four different growth stages for all classes of the “bccr-segset” dataset. To evaluate performance of the k-FLBPCM algorithm in real-time, a comparison analysis between our novel method (k-FLBPCM) and deep convolutional neural networks (DCNNs) is conducted on morphologically similar crops and weeds. Various DCNN models, namely VGG-16, VGG-19, ResNet50 and InceptionV3, are optimised, by fine-tuning their hyper-parameters, and tested. Based on the experimental results on the “bccr-segset” dataset collected from the laboratory and the “fieldtrip_can_weeds” dataset collected from the field under practical environments, the classification accuracies of the DCNN models and the k-FLBPCM method are almost similar. Another experiment is conducted by training the algorithms with plant images obtained at mature stages and testing them at early stages. In this case, the new k-FLBPCM method outperformed the state-of-the-art CNN models in identifying small leaf shapes of canola-radish (crop-weed) at early growth stages, with an order of magnitude lower error rates in comparison with DCNN models. Furthermore, the execution time of the k-FLBPCM method during the training and test phases was faster than the DCNN counterparts, with an identification time difference of approximately 0.224ms per image for the laboratory dataset and 0.346ms per image for the field dataset. These results demonstrate the ability of the k-FLBPCM method to rapidly detect weeds from crops of similar appearance in real time with less data, and generalize to different size plants better than the CNN-based methods

    Fundus image analysis for automatic screening of ophthalmic pathologies

    Full text link
    En los ultimos años el número de casos de ceguera se ha reducido significativamente. A pesar de este hecho, la Organización Mundial de la Salud estima que un 80% de los casos de pérdida de visión (285 millones en 2010) pueden ser evitados si se diagnostican en sus estadios más tempranos y son tratados de forma efectiva. Para cumplir esta propuesta se pretende que los servicios de atención primaria incluyan un seguimiento oftalmológico de sus pacientes así como fomentar campañas de cribado en centros proclives a reunir personas de alto riesgo. Sin embargo, estas soluciones exigen una alta carga de trabajo de personal experto entrenado en el análisis de los patrones anómalos propios de cada enfermedad. Por lo tanto, el desarrollo de algoritmos para la creación de sistemas de cribado automáticos juga un papel vital en este campo. La presente tesis persigue la identificacion automática del daño retiniano provocado por dos de las patologías más comunes en la sociedad actual: la retinopatía diabética (RD) y la degenaración macular asociada a la edad (DMAE). Concretamente, el objetivo final de este trabajo es el desarrollo de métodos novedosos basados en la extracción de características de la imagen de fondo de ojo y clasificación para discernir entre tejido sano y patológico. Además, en este documento se proponen algoritmos de pre-procesado con el objetivo de normalizar la alta variabilidad existente en las bases de datos publicas de imagen de fondo de ojo y eliminar la contribución de ciertas estructuras retinianas que afectan negativamente en la detección del daño retiniano. A diferencia de la mayoría de los trabajos existentes en el estado del arte sobre detección de patologías en imagen de fondo de ojo, los métodos propuestos a lo largo de este manuscrito evitan la necesidad de segmentación de las lesiones o la generación de un mapa de candidatos antes de la fase de clasificación. En este trabajo, Local binary patterns, perfiles granulométricos y la dimensión fractal se aplican de manera local para extraer información de textura, morfología y tortuosidad de la imagen de fondo de ojo. Posteriormente, esta información se combina de diversos modos formando vectores de características con los que se entrenan avanzados métodos de clasificación formulados para discriminar de manera óptima entre exudados, microaneurismas, hemorragias y tejido sano. Mediante diversos experimentos, se valida la habilidad del sistema propuesto para identificar los signos más comunes de la RD y DMAE. Para ello se emplean bases de datos públicas con un alto grado de variabilidad sin exlcuir ninguna imagen. Además, la presente tesis también cubre aspectos básicos del paradigma de deep learning. Concretamente, se presenta un novedoso método basado en redes neuronales convolucionales (CNNs). La técnica de transferencia de conocimiento se aplica mediante el fine-tuning de las arquitecturas de CNNs más importantes en el estado del arte. La detección y localización de exudados mediante redes neuronales se lleva a cabo en los dos últimos experimentos de esta tesis doctoral. Cabe destacar que los resultados obtenidos mediante la extracción de características "manual" y posterior clasificación se comparan de forma objetiva con las predicciones obtenidas por el mejor modelo basado en CNNs. Los prometedores resultados obtenidos en esta tesis y el bajo coste y portabilidad de las cámaras de adquisión de imagen de retina podrían facilitar la incorporación de los algoritmos desarrollados en este trabajo en un sistema de cribado automático que ayude a los especialistas en la detección de patrones anomálos característicos de las dos enfermedades bajo estudio: RD y DMAE.In last years, the number of blindness cases has been significantly reduced. Despite this promising news, the World Health Organisation estimates that 80% of visual impairment (285 million cases in 2010) could be avoided if diagnosed and treated early. To accomplish this purpose, eye care services need to be established in primary health and screening campaigns should be a common task in centres with people at risk. However, these solutions entail a high workload for trained experts in the analysis of the anomalous patterns of each eye disease. Therefore, the development of algorithms for automatic screening system plays a vital role in this field. This thesis focuses on the automatic identification of the retinal damage provoked by two of the most common pathologies in the current society: diabetic retinopathy (DR) and age-related macular degeneration (AMD). Specifically, the final goal of this work is to develop novel methods, based on fundus image description and classification, to characterise the healthy and abnormal tissue in the retina background. In addition, pre-processing algorithms are proposed with the aim of normalising the high variability of fundus images and removing the contribution of some retinal structures that could hinder in the retinal damage detection. In contrast to the most of the state-of-the-art works in damage detection using fundus images, the methods proposed throughout this manuscript avoid the necessity of lesion segmentation or the candidate map generation before the classification stage. Local binary patterns, granulometric profiles and fractal dimension are locally computed to extract texture, morphological and roughness information from retinal images. Different combinations of this information feed advanced classification algorithms formulated to optimally discriminate exudates, microaneurysms, haemorrhages and healthy tissues. Through several experiments, the ability of the proposed system to identify DR and AMD signs is validated using different public databases with a large degree of variability and without image exclusion. Moreover, this thesis covers the basics of the deep learning paradigm. In particular, a novel approach based on convolutional neural networks is explored. The transfer learning technique is applied to fine-tune the most important state-of-the-art CNN architectures. Exudate detection and localisation tasks using neural networks are carried out in the last two experiments of this thesis. An objective comparison between the hand-crafted feature extraction and classification process and the prediction models based on CNNs is established. The promising results of this PhD thesis and the affordable cost and portability of retinal cameras could facilitate the further incorporation of the developed algorithms in a computer-aided diagnosis (CAD) system to help specialists in the accurate detection of anomalous patterns characteristic of the two diseases under study: DR and AMD.En els últims anys el nombre de casos de ceguera s'ha reduït significativament. A pesar d'este fet, l'Organització Mundial de la Salut estima que un 80% dels casos de pèrdua de visió (285 milions en 2010) poden ser evitats si es diagnostiquen en els seus estadis més primerencs i són tractats de forma efectiva. Per a complir esta proposta es pretén que els servicis d'atenció primària incloguen un seguiment oftalmològic dels seus pacients així com fomentar campanyes de garbellament en centres regentats per persones d'alt risc. No obstant això, estes solucions exigixen una alta càrrega de treball de personal expert entrenat en l'anàlisi dels patrons anòmals propis de cada malaltia. Per tant, el desenrotllament d'algoritmes per a la creació de sistemes de garbellament automàtics juga un paper vital en este camp. La present tesi perseguix la identificació automàtica del dany retiniano provocat per dos de les patologies més comunes en la societat actual: la retinopatia diabètica (RD) i la degenaración macular associada a l'edat (DMAE) . Concretament, l'objectiu final d'este treball és el desenrotllament de mètodes novedodos basats en l'extracció de característiques de la imatge de fons d'ull i classificació per a discernir entre teixit sa i patològic. A més, en este document es proposen algoritmes de pre- processat amb l'objectiu de normalitzar l'alta variabilitat existent en les bases de dades publiques d'imatge de fons d'ull i eliminar la contribució de certes estructures retinianas que afecten negativament en la detecció del dany retiniano. A diferència de la majoria dels treballs existents en l'estat de l'art sobre detecció de patologies en imatge de fons d'ull, els mètodes proposats al llarg d'este manuscrit eviten la necessitat de segmentació de les lesions o la generació d'un mapa de candidats abans de la fase de classificació. En este treball, Local binary patterns, perfils granulometrics i la dimensió fractal s'apliquen de manera local per a extraure informació de textura, morfologia i tortuositat de la imatge de fons d'ull. Posteriorment, esta informació es combina de diversos modes formant vectors de característiques amb els que s'entrenen avançats mètodes de classificació formulats per a discriminar de manera òptima entre exsudats, microaneurismes, hemorràgies i teixit sa. Per mitjà de diversos experiments, es valida l'habilitat del sistema proposat per a identificar els signes més comuns de la RD i DMAE. Per a això s'empren bases de dades públiques amb un alt grau de variabilitat sense exlcuir cap imatge. A més, la present tesi també cobrix aspectes bàsics del paradigma de deep learning. Concretament, es presenta un nou mètode basat en xarxes neuronals convolucionales (CNNs) . La tècnica de transferencia de coneixement s'aplica per mitjà del fine-tuning de les arquitectures de CNNs més importants en l'estat de l'art. La detecció i localització d'exudats per mitjà de xarxes neuronals es du a terme en els dos últims experiments d'esta tesi doctoral. Cal destacar que els resultats obtinguts per mitjà de l'extracció de característiques "manual" i posterior classificació es comparen de forma objectiva amb les prediccions obtingudes pel millor model basat en CNNs. Els prometedors resultats obtinguts en esta tesi i el baix cost i portabilitat de les cambres d'adquisión d'imatge de retina podrien facilitar la incorporació dels algoritmes desenrotllats en este treball en un sistema de garbellament automàtic que ajude als especialistes en la detecció de patrons anomálos característics de les dos malalties baix estudi: RD i DMAE.Colomer Granero, A. (2018). Fundus image analysis for automatic screening of ophthalmic pathologies [Tesis doctoral no publicada]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/99745TESI

    Development of Mining Sector Applications for Emerging Remote Sensing and Deep Learning Technologies

    Get PDF
    This thesis uses neural networks and deep learning to address practical, real-world problems in the mining sector. The main focus is on developing novel applications in the area of object detection from remotely sensed data. This area has many potential mining applications and is an important part of moving towards data driven strategic decision making across the mining sector. The scientific contributions of this research are twofold; firstly, each of the three case studies demonstrate new applications which couple remote sensing and neural network based technologies for improved data driven decision making. Secondly, the thesis presents a framework to guide implementation of these technologies in the mining sector, providing a guide for researchers and professionals undertaking further studies of this type. The first case study builds a fully connected neural network method to locate supporting rock bolts from 3D laser scan data. This method combines input features from the remote sensing and mobile robotics research communities, generating accuracy scores up to 22% higher than those found using either feature set in isolation. The neural network approach also is compared to the widely used random forest classifier and is shown to outperform this classifier on the test datasets. Additionally, the algorithms’ performance is enhanced by adding a confusion class to the training data and by grouping the output predictions using density based spatial clustering. The method is tested on two datasets, gathered using different laser scanners, in different types of underground mines which have different rock bolting patterns. In both cases the method is found to be highly capable of detecting the rock bolts with recall scores of 0.87-0.96. The second case study investigates modern deep learning for LiDAR data. Here, multiple transfer learning strategies and LiDAR data representations are examined for the task of identifying historic mining remains. A transfer learning approach based on a Lunar crater detection model is used, due to the task similarities between both the underlying data structures and the geometries of the objects to be detected. The relationship between dataset resolution and detection accuracy is also examined, with the results showing that the approach is capable of detecting pits and shafts to a high degree of accuracy with precision and recall scores between 0.80-0.92, provided the input data is of sufficient quality and resolution. Alongside resolution, different LiDAR data representations are explored, showing that the precision-recall balance varies depending on the input LiDAR data representation. The third case study creates a deep convolutional neural network model to detect artisanal scale mining from multispectral satellite data. This model is trained from initialisation without transfer learning and demonstrates that accurate multispectral models can be built from a smaller training dataset when appropriate design and data augmentation strategies are adopted. Alongside the deep learning model, novel mosaicing algorithms are developed both to improve cloud cover penetration and to decrease noise in the final prediction maps. When applied to the study area, the results from this model provide valuable information about the expansion, migration and forest encroachment of artisanal scale mining in southwestern Ghana over the last four years. Finally, this thesis presents an implementation framework for these neural network based object detection models, to generalise the findings from this research to new mining sector deep learning tasks. This framework can be used to identify applications which would benefit from neural network approaches; to build the models; and to apply these algorithms in a real world environment. The case study chapters confirm that the neural network models are capable of interpreting remotely sensed data to a high degree of accuracy on real world mining problems, while the framework guides the development of new models to solve a wide range of related challenges
    corecore