Automated shark detection using computer vision
With the technological advancements of UAVs, researchers are finding more ways to harness their capabilities to reduce expenses in everyday society. Machine vision is at the forefront of this research and in particular image recognition. Training a machine to identify objects and differentiate them from others plays an integral role in the advancement of artificial intelligence. This project aims to design an algorithm capable of automatically detecting sharks from a UAV. Testing is performed by post-processing aerial footage of sharks taken from helicopters and drones, and analysing the reliability of the algorithm.
Initially this research project involved analysing aerial photography of sharks, dissecting the images into the individual colour channels that made up the RGB and HSV colour spaces and identifying methods to detect the shark blobs. Once an adaptive threshold of the brightness channel was designed, filters were curated specific to the environments presented in the obtained aerial footage to reject false positives. These methods were considerably successful in both rejecting false positives and consistently detecting the sharks in the video feed.
The methods produced in this dissertation leave room for future work in the shark detection field. With more reliable data, improvements such as a Kalman filter to detect and track moving blobs could be implemented to produce a robust shark detection and tracking system.
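The thresholding-and-filtering idea in this abstract can be illustrated with a minimal sketch. It is not the dissertation's algorithm: the cutoff rule (mean minus `k` standard deviations of the brightness channel), the assumption that sharks appear darker than the surrounding water, and the area limits `min_area` and `max_area` are all hypothetical choices made for illustration.

```python
from statistics import mean, pstdev


def brightness(pixel):
    """HSV value (brightness) of an (r, g, b) pixel: its largest component."""
    return max(pixel)


def adaptive_dark_mask(image, k=1.5):
    """Mark pixels darker than mean - k * std of this frame's brightness.

    The cutoff adapts to each frame; the rule itself is an assumption.
    """
    values = [brightness(p) for row in image for p in row]
    cutoff = mean(values) - k * pstdev(values)
    return [[brightness(p) < cutoff for p in row] for row in image]


def find_blobs(mask):
    """Return 4-connected groups of True pixels as lists of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    blobs = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            stack, blob = [(r, c)], []
            seen[r][c] = True
            while stack:  # iterative flood fill
                y, x = stack.pop()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            blobs.append(blob)
    return blobs


def detect_candidates(image, k=1.5, min_area=4, max_area=200):
    """Threshold the frame, then reject blobs outside a plausible size range."""
    mask = adaptive_dark_mask(image, k)
    return [b for b in find_blobs(mask) if min_area <= len(b) <= max_area]
```

The area filter stands in for the environment-specific false-positive filters the abstract mentions; real footage would likely need additional shape or colour checks.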
Automatic Detectors for Underwater Soundscape Measurements
Environmental impact regulations require that marine industrial operators quantify their contribution to underwater noise scenes. Automation of such assessments becomes feasible with the successful categorisation of sounds into broader classes based on source types: biological, anthropogenic and physical. Previous approaches to passive acoustic monitoring have mostly been limited to a few specific sources of interest. In this study, source-independent signal detectors are developed and a framework is presented for the automatic categorisation of underwater sounds into the aforementioned classes.
A realistic fish-habitat dataset to evaluate algorithms for underwater visual analysis
Visual analysis of complex fish habitats is an important step towards sustainable fisheries for human consumption and environmental protection. Deep Learning methods have shown great promise for scene analysis when trained on large-scale datasets. However, current datasets for fish analysis tend to focus on the classification task within constrained, plain environments which do not capture the complexity of underwater fish habitats. To address this limitation, we present DeepFish as a benchmark suite with a large-scale dataset to train and test methods for several computer vision tasks. The dataset consists of approximately 40 thousand images collected underwater from 20 habitats in the marine environments of tropical Australia. The dataset originally contained only classification labels. Thus, we collected point-level and segmentation labels to have a more comprehensive fish analysis benchmark. These labels enable models to learn to automatically monitor fish count, identify their locations, and estimate their sizes. Our experiments provide an in-depth analysis of the dataset characteristics, and the performance evaluation of several state-of-the-art approaches based on our benchmark. Although models pre-trained on ImageNet have successfully performed on this benchmark, there is still room for improvement. Therefore, this benchmark serves as a testbed to motivate further development in this challenging domain of underwater computer vision.
A biologically inspired spiking model of visual processing for image feature detection
To enable fast, reliable feature matching or tracking in scenes, features need to be discrete and meaningful, and hence edge or corner features, commonly called interest points, are often used for this purpose. Experimental research has illustrated that biological vision systems use neuronal circuits to extract particular features such as edges or corners from visual scenes. Inspired by this biological behaviour, this paper proposes a biologically inspired spiking neural network for the purpose of image feature extraction. Standard digital images are processed and converted to spikes in a manner similar to the processing that transforms light into spikes in the retina. Using a hierarchical spiking network, various types of biologically inspired receptive fields are used to extract progressively complex image features. The performance of the network is assessed by examining the repeatability of extracted features, with visual results presented using both synthetic and real images.
PICES Press, Vol. 24, No. 1, Winter 2016
PICES science in 2015: A note from the Science Board Chairman (pp. 1-7); 2015 PICES awards (pp. 8-10); Face to face with oceanographers: PICES outreach (pp. 11-13); An update on the FUTURE science program (pp. 14-15); International Scientific Symposium on "Harmful algal blooms and climate change" (pp. 16-17); International Scientific Conference on "Our common future under climate change" (pp. 18-19); PICES/ICES Workshop on "Modelling effects of climate change on fish and fisheries" (pp. 20-23); The mussel Mytilus galloprovincialis on Japanese tsunami marine debris (pp. 24-28); Moving towards more sustainable shrimp and tilapia aquaculture in Karawang, Indonesia (pp. 29-30); New leadership in PICES (pp. 31-32); Alexander S. Bychkov: Connecting regional organizations on a global scale (pp. 33-33); Japanese translation of "Guide to Best Practices for Ocean CO2 Measurements" (pp. 34-34); Global ocean carbon dioxide (CO2) uptake: Distribution and temporal variation (pp. 35-35); For the e-bookshelf: "Impacts of the Fukushima Nuclear Accident on Fish and Fishing Grounds" (pp. 36-37); PICES interns (pp. 38-38); PICES calendar of events (pp. 39-39); The state of the western North Pacific during the 2015 warm season (pp. 40-41); The Bering Sea: Current status and recent trends (pp. 42-45); The Blob (Part Three): Going, going, gone? (pp. 46-48)
Dynamic Physical-Layer Secured Link in a Mobile MIMO VLC System
This paper proposes a novel approach to provide a privately secured multiple-input and multiple-output visible light communication (VLC) link under mobility conditions. In the proposed system, a private secured VLC link is adaptively allocated to a mobile user at all times, with movement tracking provided by a camera-based detection system. The generation of the dynamic location-based scrambling matrix is introduced, providing a secured communication zone within a full normal coverage illumination area. An extensive range of numerical evaluations and practical experiments is carried out to demonstrate and evaluate the proposed system performance in different environment configurations including mobility, camera resolution, link range, and environment light intensity. We demonstrate that the proposed system is fully capable of securely steering the information with respect to a receiver location with a high level of reliability.
Semantics and planning based workflow composition and execution for video processing
Traditional workflow systems have several drawbacks, e.g. in their inabilities to rapidly
react to changes, to construct workflow automatically (or with user involvement) and
to improve performance autonomously (or with user involvement) in an incremental
manner according to specified goals. Overcoming these limitations would be highly
beneficial for complex domains where such adversities are exhibited. Video processing
is one such domain that increasingly requires attention as larger amounts of images and
videos are becoming available to persons who are not technically adept in modelling
the processes that are involved in constructing complex video processing workflows.
Conventional video and image processing systems, on the other hand, are developed
by programmers possessing image processing expertise. These systems are tailored
to produce highly specialised hand-crafted solutions for very specific tasks, making
them rigid and non-modular. The knowledge-based vision community have attempted
to produce more modular solutions by incorporating ontologies. However,
they have not been maximally utilised to encompass aspects such as application context
descriptions (e.g. lighting and clearness effects) and qualitative measures.
This thesis aims to tackle some of the research gaps yet to be addressed by the
workflow and knowledge-based image processing communities by proposing a novel
workflow composition and execution approach within an integrated framework. This
framework distinguishes three levels of abstraction via the design, workflow and processing
layers. The core technologies that drive the workflow composition mechanism
are ontologies and planning. Video processing problems provide a fitting domain for
investigating the effectiveness of this integrated method, as tackling such problems has
not been fully explored by the workflow, planning and ontological communities despite
their combined beneficial traits to confront this known hard problem. In addition, the
pervasiveness of video data has proliferated the need for more automated assistance
for image processing-naive users, but no adequate support has been provided as of yet.
A video and image processing ontology that comprises three sub-ontologies was
constructed to capture the goals, video descriptions and capabilities (video and image
processing tools). The sub-ontologies are used for representation and inference. In
particular, they are used in conjunction with an enhanced Hierarchical Task Network
(HTN) domain independent planner to help with performance-based selection of solution
steps based on preconditions, effects and postconditions. The planner, in turn,
makes use of process models contained in a process library when deliberating on the
steps and then consults the capability ontology to retrieve a suitable tool at each step.
Two key features of the planner are its ability to support workflow execution (interleaving
planning with execution) and its ability to operate in automatic or semi-automatic
(interactive) mode. The first feature is highly desirable for video processing problems
because execution of image processing steps yields visual results that are intuitive
and verifiable by the human user, as automatic validation is non-trivial. In the semi-automatic mode,
the planner is interactive and prompts the user to make a tool selection
when there is more than one tool available to perform a task. The user makes the tool
selection based on the recommended descriptions provided by the workflow system.
Once planning is complete, the result of applying the tool of their choice is presented
to the user textually and visually for verification. This plays a pivotal role in providing
the user with control and the ability to make informed decisions. Hence, the planner
extends the capabilities of typical planners by guiding the user to construct more
optimal solutions. Video processing problems can also be solved in more modular,
reusable and adaptable ways as compared to conventional image processing systems.
The integrated approach was evaluated on a test set consisting of videos originating
from open-sea environments of varying quality. Experiments to evaluate the efficiency,
adaptability to the user's changing needs and user learnability of this approach were conducted
on users who did not possess image processing expertise. The findings indicate
that using this integrated workflow composition and execution method: 1) provides a
speed up of over 90% in execution time for video classification tasks using full automatic
processing compared to manual methods without loss of accuracy; 2) is more
flexible and adaptable in response to changes in user requests (be it in the task, constraints
to the task or descriptions of the video) than modifying existing image processing
programs when the domain descriptions are altered; 3) assists the user in selecting
optimal solutions by providing recommended descriptions.
Application of statistical learning theory to plankton image analysis
Submitted to the Joint Program in Applied Ocean Science and Engineering
in partial fulfillment of the requirements for the degree of Doctor of Philosophy
At the Massachusetts Institute of Technology
and the Woods Hole Oceanographic Institution
June 2006
A fundamental problem in limnology and oceanography is the inability to quickly
identify and map distributions of plankton. This thesis addresses the problem by
applying statistical machine learning to video images collected by an optical sampler,
the Video Plankton Recorder (VPR). The research is focused on development
of a real-time automatic plankton recognition system to estimate plankton abundance.
The system includes four major components: pattern representation/feature
measurement, feature extraction/selection, classification, and abundance estimation.
After an extensive study on a traditional learning vector quantization (LVQ)
neural network (NN) classifier built on shape-based features and different pattern
representation methods, I developed a classification system that combines multi-scale
co-occurrence matrix features with a support vector machine (SVM) classifier. This new method
outperforms the traditional shape-based-NN classifier method by 12% in classification
accuracy. Subsequent plankton abundance estimates are improved in the regions of
low relative abundance by more than 50%.
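As an illustration of the texture features named above, the following is a minimal sketch of grey-level co-occurrence matrices computed at several pixel offsets. It is not the thesis implementation: treating "multi-scale" as a set of offset distances, the choice of horizontal and vertical directions, and the two summary statistics (`contrast` and `energy`) are assumptions, and the input is assumed to be already quantised to integer grey levels in `0..levels-1`.

```python
def cooccurrence_matrix(image, levels, dx, dy):
    """Normalised grey-level co-occurrence matrix for pixel offset (dx, dy).

    Entry [i][j] is the fraction of pixel pairs whose first pixel has grey
    level i and whose pixel at the offset has grey level j.
    """
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    if total == 0:  # offset larger than the image: no pairs to count
        return [[0.0] * levels for _ in range(levels)]
    return [[n / total for n in row] for row in counts]


def contrast(glcm):
    """Large when co-occurring grey levels differ strongly."""
    return sum(p * (i - j) ** 2
               for i, row in enumerate(glcm) for j, p in enumerate(row))


def energy(glcm):
    """Large when a few grey-level pairs dominate (uniform texture)."""
    return sum(p * p for row in glcm for p in row)


def multiscale_texture_features(image, levels, distances=(1, 2, 4)):
    """Concatenate contrast and energy over several offset distances."""
    features = []
    for d in distances:
        for dx, dy in ((d, 0), (0, d)):  # horizontal, then vertical offset
            glcm = cooccurrence_matrix(image, levels, dx, dy)
            features += [contrast(glcm), energy(glcm)]
    return features
```

A feature vector of this kind could then be passed to any classifier, such as an SVM from a library of choice.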
Both the NN and SVM classifiers have no rejection metrics. In this thesis, two
rejection metrics were developed. One was based on the Euclidean distance in the
feature space for NN classifier. The other used dual classifier (NN and SVM) voting as
output. Using the dual-classification method alone yields almost as good abundance
estimation as human labeling on a test-bed of real world data. However, the distance
rejection metric for NN classifier might be more useful when the training samples are
not "good", i.e., representative of the field data.
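The two rejection ideas described in this paragraph can be sketched as follows. This is a simplified illustration rather than the thesis code: a nearest-prototype rule stands in for the LVQ network, and the `max_distance` threshold, the prototype vectors and the class labels in the usage example are hypothetical.

```python
import math


def nearest_prototype(features, prototypes):
    """Return (label, distance) of the closest prototype.

    `prototypes` is a list of (label, vector) pairs, standing in for the
    codebook vectors of a trained LVQ network.
    """
    best_label, best_dist = None, math.inf
    for label, vector in prototypes:
        dist = math.dist(features, vector)  # Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


def classify_with_distance_rejection(features, prototypes, max_distance):
    """Reject (return None) when the sample is far from every prototype."""
    label, dist = nearest_prototype(features, prototypes)
    return label if dist <= max_distance else None


def classify_with_dual_voting(label_a, label_b):
    """Accept a label only when two independent classifiers agree."""
    return label_a if label_a == label_b else None
```

For example, with prototypes `[("copepod", [0.0, 0.0]), ("diatom", [10.0, 0.0])]` and `max_distance=3.0`, the sample `[1.0, 0.0]` is accepted as `"copepod"`, while `[5.0, 40.0]` is rejected because it lies far from both prototypes.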
In summary, this thesis advances the current state-of-the-art plankton recognition
system by demonstrating that multi-scale texture-based features are more suitable
for classifying field-collected images. The system was verified on a very large real-world
dataset in a systematic way for the first time. The accomplishments include developing
a multi-scale co-occurrence matrix and support vector machine system, a dual-classification
system, automatic correction in abundance estimation, and the ability to get accurate
abundance estimates from real-time automatic classification. The methods developed are
generic and are likely to work on a range of other image classification applications.
This work was supported by National Science Foundation Grants OCE-9820099
and Woods Hole Oceanographic Institution academic program.