
    Automatic image annotation applied to habitat classification

    Habitat classification, the process of mapping a site with its habitats, is a crucial activity for monitoring environmental biodiversity. Phase 1 classification, a 10-class four-tier hierarchical scheme, is the most widely used scheme in the UK. To date, no automatic approach has been developed: classification is carried out exclusively by ecologists, and this manual survey process is laborious, expensive and subjective. This thesis presents the first automatic system for Phase 1 classification. Our main contribution is an Automatic Image Annotation (AIA) framework for the automatic classification of Phase 1 habitats. This framework combines five elements to annotate unseen photographs: ground-taken geo-referenced photography, low-level visual features, medium-level semantic information, Random Projection Forests and location-based weighted predictions. Our second contribution is a pair of fully-annotated ground-taken photograph datasets, the first publicly available databases specifically designed for the development of multimedia analysis techniques for ecological applications: Habitat 1K has over 1,000 photographs and 4,000 annotated habitats, and Habitat 3K has over 3,000 images and 11,000 annotated habitats. This is the first time ground-taken photographs have been used for such ecological purposes. Our third contribution is a novel Random Forest-based classifier: Random Projection Forests (RPFs). RPFs use Random Projections as a dimensionality reduction mechanism in their split nodes, a design that makes their training and testing phases more efficient than those of the traditional implementation of Random Forests. Our fourth contribution arises from the limitations of low-level features when classifying visually similar classes. Low-level features have been shown to be inadequate for discriminating high-level semantic concepts, such as habitat classes; currently, only humans possess such high-level knowledge. To obtain this knowledge, we create a new type of feature, called medium-level features, which use a Human-In-The-Loop approach to extract crucial semantic information. Our final contribution is a location-based voting system for RPFs: we exploit the geographical properties of habitats to weight the predictions from the RPFs according to the geographical distance between unseen test photographs and photographs in the training set. Results show that ground-taken photographs are a promising source of information that can be successfully applied to Phase 1 classification. Experiments demonstrate that our AIA approach outperforms traditional Random Forests in terms of recall and precision, and that both of our modifications, the inclusion of medium-level knowledge and the location-based voting system, greatly improve recall and precision even for the most complex habitats. This makes our complete image-annotation system, to the best of our knowledge, the most accurate automatic alternative to manual habitat classification for the complete categorization of Phase 1 habitats.
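
    The two algorithmic pieces named in the abstract are easy to sketch. Below is a minimal, illustrative Python sketch, not the thesis' actual code, of a random-projection split node (project each feature vector onto a random direction, then threshold) together with a distance-decay weighting of the kind a location-based voting system could use; all function names, parameters and the exponential decay are assumptions.

        # Illustrative sketch only: a random-projection split node and a
        # location-weighted vote. Names and details are assumptions, not
        # the thesis' implementation.
        import numpy as np

        def gini(labels):
            """Gini impurity of a label array."""
            _, counts = np.unique(labels, return_counts=True)
            p = counts / counts.sum()
            return 1.0 - np.sum(p ** 2)

        def random_projection_split(X, y, n_candidates=10, rng=None):
            """Pick the random 1-D projection and threshold that best split (X, y)."""
            rng = rng or np.random.default_rng()
            best = None
            for _ in range(n_candidates):
                w = rng.standard_normal(X.shape[1])   # random projection direction
                z = X @ w                             # project samples to one dimension
                t = rng.uniform(z.min(), z.max())     # candidate threshold
                left, right = y[z <= t], y[z > t]
                if len(left) == 0 or len(right) == 0:
                    continue
                score = gini(left) * len(left) + gini(right) * len(right)
                if best is None or score < best[0]:
                    best = (score, w, t)
            if best is None:
                raise ValueError("no valid split found")
            return best[1], best[2]                   # (direction, threshold)

        def location_weighted_vote(tree_probs, tree_coords, test_coord, scale=1000.0):
            """Average per-tree class probabilities, down-weighting trees whose
            associated training photographs lie far from the test photograph."""
            d = np.linalg.norm(tree_coords - test_coord, axis=1)
            w = np.exp(-d / scale)                    # nearer photographs count more
            return (w[:, None] * tree_probs).sum(axis=0) / w.sum()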

    Automated classification of three-dimensional reconstructions of coral reefs using convolutional neural networks

    © The Author(s), 2020. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published as: Hopkinson, B. M., King, A. C., Owen, D. P., Johnson-Roberson, M., Long, M. H., & Bhandarkar, S. M. Automated classification of three-dimensional reconstructions of coral reefs using convolutional neural networks. PLoS One, 15(3), (2020): e0230671, doi: 10.1371/journal.pone.0230671. Coral reefs are biologically diverse and structurally complex ecosystems, which have been severely affected by human actions. Consequently, there is a need for rapid ecological assessment of coral reefs, but current approaches require time-consuming manual analysis, either during a dive survey or on images collected during a survey. Reef structural complexity is essential for ecological function but is challenging to measure and often relegated to simple metrics such as rugosity. Recent advances in computer vision and machine learning offer the potential to alleviate some of these limitations. We developed an approach to automatically classify 3D reconstructions of reef sections and assessed its accuracy. 3D reconstructions of reef sections were generated using commercial Structure-from-Motion software with images extracted from video surveys. To generate a 3D classified map, locations on the 3D reconstruction were mapped back into the original images to extract multiple views of each location. Several approaches were tested for merging information from multiple views of a point into a single classification; all used convolutional neural networks to classify or extract features from the images, but differed in the strategy employed for merging information, namely voting, probability averaging, or a learned neural-network layer. All approaches performed similarly, achieving overall classification accuracies of ~96% and >90% accuracy on most classes. With this high classification accuracy, these approaches are suitable for many ecological applications. This study was funded by grants from the Alfred P. Sloan Foundation (BMH, BR2014-049; https://sloan.org) and the National Science Foundation (MHL, OCE-1657727; https://www.nsf.gov). The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.
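
    The two simplest fusion strategies the paper compares, per-view voting and probability averaging, can be illustrated in a few lines of Python. This is a hedged sketch, not the authors' code; the array shape convention is an assumption.

        # For one 3-D point visible in several survey images, `view_probs`
        # holds a CNN's softmax output for each view: shape (n_views, n_classes).
        import numpy as np

        def fuse_by_averaging(view_probs):
            """Probability averaging: mean the per-view distributions, then argmax."""
            mean_probs = view_probs.mean(axis=0)
            return int(mean_probs.argmax()), mean_probs

        def fuse_by_voting(view_probs):
            """Voting: each view casts its argmax class; the majority wins."""
            votes = view_probs.argmax(axis=1)
            return int(np.bincount(votes).argmax())

    The paper's third strategy, a learned neural-network fusion layer, would replace these fixed rules with trained weights over the per-view features.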

    GELDA: A generative language annotation framework to reveal visual biases in datasets

    Bias analysis is a crucial step in the process of creating fair datasets for training and evaluating computer vision models. The bottleneck in dataset analysis is annotation, which typically requires: (1) specifying a list of attributes relevant to the dataset domain, and (2) classifying each image-attribute pair. While the second step has made rapid progress in automation, the first has remained human-centered, requiring an experimenter to compile lists of in-domain attributes. However, an experimenter may have limited foresight, leading to annotation "blind spots," which in turn can lead to flawed downstream dataset analyses. To combat this, we propose GELDA, a nearly automatic framework that leverages large generative language models (LLMs) to propose and label various attributes for a domain. GELDA takes a user-defined domain caption (e.g., "a photo of a bird," "a photo of a living room") and uses an LLM to hierarchically generate attributes. In addition, GELDA uses the LLM to decide which of a set of vision-language models (VLMs) to use to classify each attribute in images. Results on real datasets show that GELDA can generate accurate and diverse visual attribute suggestions, and uncover biases such as confounding between class labels and background features. Results on synthetic datasets demonstrate that GELDA can be used to evaluate the biases of text-to-image diffusion models and generative adversarial networks. Overall, we show that while GELDA is not accurate enough to replace human annotators, it can serve as a complementary tool to help humans analyze datasets in a cheap, low-effort, and flexible manner. Comment: 21 pages, 15 figures, 9 tables.
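
    The two-stage loop the abstract describes can be outlined as follows. `ask_llm` and `score_with_vlm` are hypothetical stubs standing in for an LLM client and a CLIP-style image-text scorer, and the per-attribute VLM selection step is omitted, so this is an assumption-laden sketch of the idea rather than GELDA's actual API.

        # Assumed two-stage annotation loop: an LLM proposes in-domain
        # attributes from the domain caption, then a VLM labels every
        # image-attribute pair. `ask_llm` and `score_with_vlm` are
        # hypothetical stubs, not GELDA's real interface.
        def ask_llm(prompt: str) -> list[str]:
            raise NotImplementedError("plug in any chat-completion client")

        def score_with_vlm(image, text: str) -> float:
            raise NotImplementedError("plug in a CLIP-style image-text scorer")

        def annotate(domain_caption: str, images: list) -> dict:
            # Stage 1: hierarchical attribute generation from the caption.
            categories = ask_llm(f"List attribute categories for: {domain_caption}")
            values = {c: ask_llm(f"List possible values of '{c}' in {domain_caption}")
                      for c in categories}
            # Stage 2: per image and category, keep the best-scoring value.
            return {i: {c: max(vs, key=lambda v: score_with_vlm(img, v))
                        for c, vs in values.items()}
                    for i, img in enumerate(images)}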

    Camera methods for the assessment of coastal biodiversity in low visibility environments

    Coastal marine environments are important ecological, economic and social areas, providing valuable services such as coastal protection, areas of recreation and tourism, fishing, climate regulation, biotic materials and biofuels. Marine renewable energy developments in the coastal environment are becoming a key objective for many countries globally. Assessing and monitoring the impacts of these developments on features such as coastal biodiversity is a difficult prospect, owing to the complexity of marine processes at the locations in which these developments are targeted. This thesis explores the main challenges faced when assessing biodiversity in dynamic coastal environments, in particular those susceptible to high levels of turbidity. Various underwater camera techniques were trialled in reduced visibility environments, including baited remote underwater video (BRUV), drop-down video and hydroacoustic methods. This research successfully refined BRUV guidelines for the North-East Atlantic region and identified key methodological and environmental factors influencing data collected during BRUV deployments. Key findings included mackerel as the recommended bait type in this region, and the importance of collecting consistent metadata when using these methods. In areas of high turbidity, clear liquid optical chambers (CLOCs) were successfully used to enhance the quality of information gathered using underwater cameras when monitoring benthic fauna and fish assemblages. CLOCs were applied to both conventional BRUV camera systems and benthic drop-down camera systems, improving image quality, species- and habitat-level identification, and recorded taxonomic richness. Evaluation of the ARIS 3000 imaging sonar's capability to visualise distinguishing features of motile fauna in low visibility environments showed mixed results: morphologically distinct species such as elasmobranchs were much clearer in the footage than individuals belonging to finfish families. A combined approach of optical and hydroacoustic camera methods may be most suitable for adequately assessing coastal biodiversity in low visibility environments.

    Feature discovery and visualization of robot mission data using convolutional autoencoders and Bayesian nonparametric topic models

    The gap between our ability to collect interesting data and our ability to analyze these data is growing at an unprecedented rate. Recent algorithmic attempts to fill this gap have employed unsupervised tools to discover structure in data. Some of the most successful approaches have used probabilistic models to uncover latent thematic structure in discrete data. Despite the success of these models on textual data, they have not generalized as well to image data, in part because of the spatial and temporal structure that may exist in an image stream. We introduce a novel unsupervised machine learning framework that incorporates the ability of convolutional autoencoders to discover features from images that directly encode spatial information, within a Bayesian nonparametric topic model that discovers meaningful latent patterns within discrete data. By using this hybrid framework, we overcome the fundamental dependency of traditional topic models on rigidly hand-coded data representations, while simultaneously encoding spatial dependency in our topics without adding model complexity. We apply this model to the motivating application of high-level scene understanding and mission summarization for exploratory marine robots. Our experiments on a seafloor dataset collected by a marine robot show that the proposed hybrid framework outperforms current state-of-the-art approaches on the task of unsupervised seafloor terrain characterization. Comment: 8 pages.
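
    The feature-discovery half of such a framework is a standard construction. The following minimal PyTorch sketch of a convolutional autoencoder (architecture, input size and latent width are assumptions, not the paper's) shows how a compact feature vector is learned from reconstruction alone; presumably each image's latent code would then be discretised into the "words" consumed by the topic model.

        # Minimal convolutional autoencoder: the encoder compresses a
        # 3x64x64 image to a latent vector, the decoder reconstructs it.
        import torch
        import torch.nn as nn

        class ConvAutoencoder(nn.Module):
            def __init__(self, latent_dim=64):
                super().__init__()
                self.encoder = nn.Sequential(          # 3x64x64 -> latent_dim
                    nn.Conv2d(3, 16, 4, stride=2, padding=1), nn.ReLU(),
                    nn.Conv2d(16, 32, 4, stride=2, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.ReLU(),
                    nn.Flatten(),
                    nn.Linear(64 * 8 * 8, latent_dim),
                )
                self.decoder = nn.Sequential(          # latent_dim -> 3x64x64
                    nn.Linear(latent_dim, 64 * 8 * 8),
                    nn.Unflatten(1, (64, 8, 8)),
                    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                    nn.ConvTranspose2d(32, 16, 4, stride=2, padding=1), nn.ReLU(),
                    nn.ConvTranspose2d(16, 3, 4, stride=2, padding=1), nn.Sigmoid(),
                )

            def forward(self, x):
                z = self.encoder(x)
                return self.decoder(z), z   # reconstruction + learned feature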

    Analysis of Coastal Areas Using SAR Images: A Case Study of the Dutch Wadden Sea Region

    The increased availability of civil synthetic aperture radar (SAR) satellite images with different resolutions allows us to compare the imaging capabilities of these instruments, to assess the quality of the available data and to investigate different areas (e.g., the Wadden Sea region). In our investigation, we propose to explore the content of TerraSAR-X and Sentinel-1A satellite images via a data mining approach in which the main steps are patch tiling, feature extraction, classification, semantic annotation and visual-statistical analytics. Once all the extracted categories are mapped and quantified, the next step is to interpret them from an environmental point of view. The objective of our study is the application of semi-automated SAR image interpretation; its novelty is the automated multiclass categorisation of coastal areas. We found that the north-west of the Netherlands can be interpreted routinely as land surfaces by our satellite image analyses, while for the Wadden Sea we can discriminate the different water levels and their impact on the visibility of the tidal flats. This necessitates a selection of time series data spanning a full tidal cycle.
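
    The first step of that pipeline, patch tiling, is simple to sketch in Python; the tile size and stride here are assumptions, not the values used in the study.

        # Cut a large SAR amplitude image into fixed-size square patches,
        # keeping each patch's top-left pixel coordinates so per-patch
        # classifications can later be mapped back onto the scene.
        import numpy as np

        def tile_image(image: np.ndarray, patch: int = 256, stride: int = 256):
            tiles = []
            rows, cols = image.shape
            for r in range(0, rows - patch + 1, stride):
                for c in range(0, cols - patch + 1, stride):
                    tiles.append(((r, c), image[r:r + patch, c:c + patch]))
            return tiles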

    Perspectives in visual imaging for marine biology and ecology: from acquisition to understanding

    Durden J, Schoening T, Althaus F, et al. Perspectives in Visual Imaging for Marine Biology and Ecology: From Acquisition to Understanding. In: Hughes RN, Hughes DJ, Smith IP, Dale AC, eds. Oceanography and Marine Biology: An Annual Review. Vol 54. Boca Raton: CRC Press; 2016: 1-72.

    A realistic fish-habitat dataset to evaluate algorithms for underwater visual analysis

    Visual analysis of complex fish habitats is an important step towards sustainable fisheries for human consumption and environmental protection. Deep learning methods have shown great promise for scene analysis when trained on large-scale datasets. However, current datasets for fish analysis tend to focus on the classification task within constrained, plain environments, which do not capture the complexity of underwater fish habitats. To address this limitation, we present DeepFish, a benchmark suite with a large-scale dataset to train and test methods for several computer vision tasks. The dataset consists of approximately 40,000 images collected underwater from 20 habitats in the marine environments of tropical Australia. The dataset originally contained only classification labels, so we collected point-level and segmentation labels to provide a more comprehensive fish analysis benchmark. These labels enable models to learn to automatically monitor fish counts, identify fish locations, and estimate fish sizes. Our experiments provide an in-depth analysis of the dataset characteristics and a performance evaluation of several state-of-the-art approaches on our benchmark. Although models pre-trained on ImageNet have performed successfully on this benchmark, there is still room for improvement. This benchmark therefore serves as a testbed to motivate further development in this challenging domain of underwater computer vision.
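
    Point-level labels support counting and localisation evaluation with very little machinery. The following is a hedged sketch of one common scheme, not DeepFish's own evaluation code; the greedy matching and the pixel radius are assumptions.

        # Greedily match predicted fish locations to annotated points within
        # a pixel radius, yielding true positives, false positives and misses.
        import numpy as np

        def match_points(pred, true, radius=10.0):
            unmatched = list(true)
            tp = 0
            for p in pred:
                if not unmatched:
                    break
                d = [np.hypot(p[0] - t[0], p[1] - t[1]) for t in unmatched]
                i = int(np.argmin(d))
                if d[i] <= radius:
                    tp += 1
                    unmatched.pop(i)
            fp, fn = len(pred) - tp, len(unmatched)
            return tp, fp, fn

        def count_error(pred, true):
            """Absolute count error, the usual metric for point-based counting."""
            return abs(len(pred) - len(true))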