1,603 research outputs found

    Machine learning methods for histopathological image analysis

    Full text link
    Abundant accumulation of digital histopathological images has led to increased demand for their analysis, such as computer-aided diagnosis using machine learning techniques. However, digital pathological images and the related tasks have some issues that need to be considered. In this mini-review, we introduce applications of digital pathological image analysis using machine learning algorithms, address some problems specific to such analysis, and propose possible solutions. Comment: 23 pages, 4 figures

    A Survey on Audio-Video based Defect Detection through Deep Learning in Railway Maintenance

    Get PDF
    Within Artificial Intelligence, Deep Learning (DL) represents a paradigm that has shown unprecedented performance in image and audio processing, supporting or even replacing humans in defect and anomaly detection. The railway sector is expected to benefit from DL, especially in predictive maintenance applications, where smart audio and video sensors can be leveraged yet kept distinct from safety-critical functions. Such separation is crucial, as it allows system dependability to be improved with no impact on safety certification. This is further supported by the development of DL in other transportation domains, such as automotive and avionics, which opens knowledge transfer opportunities and highlights the potential of the paradigm in railways. In order to summarize the recent state of the art and inquire about future opportunities, this paper reviews DL approaches, published up to August 31st, 2021, for the analysis of data generated by acoustic and visual sensors in railway maintenance applications. The current state of the research is investigated and evaluated using a structured and systematic method, in order to highlight promising approaches and successful applications, as well as to identify available datasets, current limitations, open issues, challenges, and recommendations for future research directions.

    Deep Learning Methods for Remote Sensing

    Get PDF
    Remote sensing is a field where important physical characteristics of an area are extracted using emitted radiation, generally captured by satellite cameras, sensors onboard aerial vehicles, etc. Captured data help researchers develop solutions to sense and detect various characteristics such as forest fires, flooding, changes in urban areas, crop diseases, soil moisture, etc. The recent impressive progress in artificial intelligence (AI) and deep learning has sparked innovations in technologies, algorithms, and approaches and led to results that were unachievable until recently in multiple areas, among them remote sensing. This book consists of sixteen peer-reviewed papers covering new advances in the use of AI for remote sensing.

    Weed Recognition in Agriculture: A Mask R-CNN Approach

    Get PDF
    Recent interdisciplinary collaboration on deep learning has led to growing interest in its application in the agriculture domain. Weed control and management are among the crucial tasks in agriculture for maintaining high crop productivity. The first phase of weed control and management is to successfully recognize the weed plants, followed by providing a suitable management plan. Due to the complexities of agricultural images, such as similar colour and texture, we need a deep neural network that uses pixel-wise grouping to identify plant species. In this thesis, we analysed the performance of one of the most popular deep neural networks aimed at solving instance segmentation (pixel-wise analysis) problems, Mask R-CNN, for weed plant recognition (detection and classification) using field images and aerial images. For the field image study, we used Mask R-CNN to recognize crop plants and weed plants on the Crop/Weed Field Image Dataset (CWFID). However, the CWFID's limitations are that it labels all weed plants as a single class and that all of the crop plants come from a single organic carrot field. To tackle this problem and expand our study, we created a synthetic dataset with 80 weed plant species and tested it with Mask R-CNN. For our aerial image study, we predominantly focused on detecting one specific invasive weed, Persicaria perfoliata, or Mile-A-Minute (MAM). In general, supervised models are slow to produce results on aerial images, primarily due to large image sizes and the scarcity of well-annotated datasets, making it relatively harder to recognize species from higher altitudes. To address this issue, we propose a three-level (leaves, trees, forest) hierarchy to recognize the species using Unmanned Aerial Vehicles (UAVs). To create a dataset with weed clusters resembling MAM, we used a localized style transfer technique based on the VGG-19 architecture to transfer the style of the available MAM images onto portions of the aerial images' content. We also generated another dataset at a relatively low altitude and, testing it with Mask R-CNN, reached ~92% AP50 on these low-altitude resized images.
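    As a rough illustration of the instance-segmentation inference described in this abstract, the sketch below runs an off-the-shelf Mask R-CNN (torchvision's maskrcnn_resnet50_fpn) on a single field image. The three-class label set, checkpoint path, and score threshold are illustrative assumptions, not the thesis's actual configuration.

```python
# Minimal Mask R-CNN inference sketch for crop/weed instance segmentation.
# Assumes a model fine-tuned on a 3-class problem (background, crop, weed);
# the checkpoint path and class names are illustrative placeholders.
import torch
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.transforms.functional import to_tensor
from PIL import Image

CLASS_NAMES = ["__background__", "crop", "weed"]  # assumed label set

def load_model(checkpoint_path: str, num_classes: int = 3):
    # Build the detector with the desired number of classes and load fine-tuned weights.
    model = maskrcnn_resnet50_fpn(weights=None, num_classes=num_classes)
    model.load_state_dict(torch.load(checkpoint_path, map_location="cpu"))
    model.eval()
    return model

@torch.no_grad()
def segment_plants(model, image_path: str, score_threshold: float = 0.5):
    image = to_tensor(Image.open(image_path).convert("RGB"))
    output = model([image])[0]  # Mask R-CNN returns boxes, labels, scores, masks
    results = []
    for box, label, score, mask in zip(
        output["boxes"], output["labels"], output["scores"], output["masks"]
    ):
        if score < score_threshold:
            continue
        results.append({
            "class": CLASS_NAMES[int(label)],
            "score": float(score),
            "box": box.tolist(),
            "mask": (mask[0] > 0.5).numpy(),  # binary per-pixel mask
        })
    return results

if __name__ == "__main__":
    model = load_model("weed_maskrcnn.pth")      # hypothetical checkpoint
    detections = segment_plants(model, "field_image.png")
    print(f"{len(detections)} plants detected")
```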

    Dual Progressive Transformations for Weakly Supervised Semantic Segmentation

    Full text link
    Weakly supervised semantic segmentation (WSSS), which aims to mine object regions using only class-level labels, is a challenging task in computer vision. Current state-of-the-art CNN-based methods usually adopt Class Activation Maps (CAMs) to highlight potential object areas; however, they may suffer from part-activation issues. To this end, we make an early attempt to explore the global feature attention mechanism of the vision transformer in the WSSS task. However, since the transformer lacks the inductive biases of CNN models, it cannot boost performance directly and may yield over-activation problems. To tackle these drawbacks, we propose a Convolutional Neural Networks Refined Transformer (CRT) to mine globally complete and locally accurate class activation maps. To validate the effectiveness of the proposed method, extensive experiments are conducted on the PASCAL VOC 2012 and CUB-200-2011 datasets. Experimental evaluations show that the proposed CRT achieves new state-of-the-art performance on both the weakly supervised semantic segmentation task and the weakly supervised object localization task, outperforming other methods by a large margin.
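    For context, the sketch below shows the generic Class Activation Map recipe that CAM-based WSSS methods build on: weight the last convolutional feature map by the classifier weights of the target class, then upsample. It uses a stock ResNet-50 purely for illustration and does not reproduce the proposed CRT architecture.

```python
# Generic Class Activation Map (CAM) sketch: weight the last convolutional
# feature map by the classifier weights of the target class, then upsample.
# A stock ResNet-50 stands in for the backbone; CRT's CNN/transformer
# combination is not reproduced here.
import torch
import torch.nn.functional as F
from torchvision.models import resnet50, ResNet50_Weights

model = resnet50(weights=ResNet50_Weights.DEFAULT).eval()

features = {}
def hook(_module, _inp, out):
    features["conv"] = out  # (1, 2048, h, w) feature map before global pooling
model.layer4.register_forward_hook(hook)

@torch.no_grad()
def class_activation_map(image: torch.Tensor, target_class: int) -> torch.Tensor:
    """image: (1, 3, H, W) normalized tensor; returns an (H, W) CAM in [0, 1]."""
    _ = model(image)
    fmap = features["conv"][0]                      # (2048, h, w)
    weights = model.fc.weight[target_class]         # (2048,)
    cam = torch.einsum("c,chw->hw", weights, fmap)  # weighted sum over channels
    cam = F.relu(cam)
    cam = F.interpolate(cam[None, None], size=image.shape[-2:], mode="bilinear",
                        align_corners=False)[0, 0]
    cam -= cam.min()
    return cam / (cam.max() + 1e-8)

# Example: cam = class_activation_map(preprocessed_image, target_class=283)
```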

    Towards Developing Computer Vision Algorithms and Architectures for Real-world Applications

    Get PDF
    Computer vision technology automatically extracts high-level, meaningful information from visual data such as images or videos, and object recognition and detection algorithms are essential in most computer vision applications. In this dissertation, we focus on developing algorithms for real-life computer vision applications, presenting innovative algorithms for object segmentation and feature extraction for object and action recognition in video data, sparse feature selection algorithms for medical image analysis, and automated feature extraction using convolutional neural networks for blood cancer grading. To detect and classify objects in video, the objects have to be separated from the background, and the discriminant features are then extracted from the region of interest before being fed to a classifier. Effective object segmentation and feature extraction are often application specific and pose major challenges for object detection and classification tasks. In this dissertation, we present an effective object-flow-based ROI generation algorithm for segmenting moving objects in video data, which can be applied in surveillance and self-driving vehicles. Optical flow can also serve as a feature for human action recognition, and we show how optical flow features combined with a pre-trained convolutional neural network improve the performance of human action recognition algorithms. Both algorithms outperformed the state of the art at the time. Medical images and videos pose unique challenges for image understanding, mainly because tissues and cells are often irregularly shaped, colored, and textured, and hand-selecting the most discriminant features is often difficult, so an automated feature selection method is desired. Sparse learning is a technique for extracting the most discriminant and representative features from raw visual data. However, sparse learning with L1 regularization only takes sparsity in the feature dimension into consideration; we improve the algorithm so that it also selects the type of features, and less important or noisy feature types are removed from the feature set entirely. We demonstrate this algorithm on endoscopy images to detect unhealthy abnormalities in the esophagus and stomach, such as ulcers and cancer. Besides the sparsity constraint, other application-specific constraints and prior knowledge may also need to be incorporated into the loss function of sparse learning to obtain the desired results. We demonstrate how to incorporate a similar-inhibition constraint and gaze and attention priors into sparse dictionary selection for gastroscopic video summarization, enabling intelligent key frame extraction from gastroscopic video data. With recent advances in multi-layer neural networks, automatic end-to-end feature learning has become feasible. Convolutional neural networks mimic the mammalian visual cortex and can automatically extract the most discriminant features from training samples. We present a convolutional neural network with a hierarchical classifier to grade the severity of follicular lymphoma, a type of blood cancer, reaching 91% accuracy, on par with analysis by expert pathologists.
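    As a rough illustration of the feature-type selection idea mentioned above (zeroing out whole feature types rather than individual dimensions, as plain L1 does), the sketch below runs proximal gradient descent on a least-squares objective with a group-sparsity (group-lasso) penalty. The toy data, group sizes, and regularization weight are illustrative assumptions, not the dissertation's actual formulation.

```python
# Feature-type selection via a group-sparsity penalty: features of the same
# type share one group, and the whole group is zeroed out when that type is
# uninformative. Plain L1 (lasso) would only zero individual dimensions.
# Proximal gradient on least squares; a sketch of the idea only.
import numpy as np

def group_lasso_fit(X, y, groups, lam=0.1, lr=None, n_iter=500):
    """X: (n, d), y: (n,), groups: list of index arrays, one per feature type."""
    n, d = X.shape
    if lr is None:
        lr = n / np.linalg.norm(X, 2) ** 2   # step size from the Lipschitz constant
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n          # gradient of (1/2n)||Xw - y||^2
        w = w - lr * grad
        for g in groups:                      # block soft-thresholding per feature type
            norm = np.linalg.norm(w[g])
            w[g] = 0.0 if norm == 0 else max(0.0, 1 - lr * lam / norm) * w[g]
    return w

# Toy usage: three feature types (e.g. color, texture, shape), only the first informative.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 9))
y = X[:, :3] @ np.array([1.5, -2.0, 0.7]) + 0.1 * rng.standard_normal(200)
groups = [np.arange(0, 3), np.arange(3, 6), np.arange(6, 9)]
w = group_lasso_fit(X, y, groups, lam=0.5)
print([np.linalg.norm(w[g]) for g in groups])  # uninformative groups shrink toward zero
```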
Developing real-world computer vision applications is more than developing core vision algorithms to extract and understand information from visual data; it is also subject to many practical requirements and constraints, such as hardware and computing infrastructure, cost, robustness to lighting changes and deformation, and ease of use and deployment. The general processing pipeline and system architecture of computer-vision-based applications share many similar design principles. We developed common processing components and a generic framework for computer vision applications, along with a versatile scale-adaptive template matching algorithm for object detection. We demonstrate the design principles and best practices by developing and deploying a complete computer vision application in real life, a multi-channel water level monitoring system, whose techniques and design methodology can be generalized to other real-life applications. General software engineering principles, such as modularity, abstraction, robustness to requirement changes, and generality, are all demonstrated in this research.
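As a rough sketch of the idea behind scale-adaptive template matching, the snippet below runs OpenCV's normalized cross-correlation at several template scales and keeps the best-scoring location. The scale range and file names are illustrative assumptions; the dissertation's actual algorithm is not reproduced here.

```python
# Multi-scale template matching sketch: resize the template across a range of
# scales, run normalized cross-correlation at each scale, and keep the best
# match. Illustrates the general idea of making template matching robust to
# size changes between template and scene.
import cv2
import numpy as np

def match_template_multiscale(image_gray, template_gray,
                              scales=np.linspace(0.5, 2.0, 16)):
    best = None
    for s in scales:
        t = cv2.resize(template_gray, None, fx=s, fy=s,
                       interpolation=cv2.INTER_LINEAR)
        if t.shape[0] > image_gray.shape[0] or t.shape[1] > image_gray.shape[1]:
            continue  # template larger than the image at this scale
        result = cv2.matchTemplate(image_gray, t, cv2.TM_CCOEFF_NORMED)
        _, max_val, _, max_loc = cv2.minMaxLoc(result)
        if best is None or max_val > best[0]:
            best = (max_val, max_loc, t.shape[::-1], s)  # score, top-left, (w, h), scale
    return best

# Usage (paths are placeholders):
# image = cv2.imread("gauge_scene.png", cv2.IMREAD_GRAYSCALE)
# template = cv2.imread("gauge_template.png", cv2.IMREAD_GRAYSCALE)
# score, top_left, (w, h), scale = match_template_multiscale(image, template)
```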

    Design Knowledge for Deep-Learning-Enabled Image-Based Decision Support Systems

    Get PDF
    With the ever-increasing societal dependence on electricity, one of the critical tasks in power supply is maintaining the power line infrastructure. In the process of making informed, cost-effective, and timely decisions, maintenance engineers must rely on human-created, heterogeneous, structured, and also largely unstructured information. The maturing research on vision-based power line inspection, driven by advancements in deep learning, offers first possibilities to move towards more holistic, automated, and safe decision-making. However, current research focuses solely on the extraction of information rather than its implementation in decision-making processes. The paper addresses this shortcoming by designing, instantiating, and evaluating a holistic deep-learning-enabled image-based decision support system artifact for power line maintenance at a distribution system operator in southern Germany. Following the design science research paradigm, two main components of the artifact are designed: a deep-learning-based model component responsible for automatic fault detection of power line parts, and a user-oriented interface responsible for presenting the captured information in a way that enables more informed decisions. As a basis for both components, preliminary design requirements are derived from the literature and the application field. Drawing on justificatory knowledge from deep learning as well as decision support systems, tentative design principles are derived. Based on these design principles, a prototype of the artifact is implemented that allows for rigorous evaluation of the design knowledge in multiple evaluation episodes covering different angles. Through a technical experiment, the technical novelty of the artifact’s capability to capture selected faults (regarding insulators and safety pins) in unmanned aerial vehicle (UAV)-captured image data (the model component) is validated. Subsequent interviews, surveys, and workshops in a natural environment confirm the usefulness of the model component as well as the user interface component. The evaluation provides evidence that (1) the image processing approach manages to address the gap of power line component inspection and (2) the proposed holistic design knowledge for image-based decision support systems enables more informed decision-making. The paper therefore contributes to research and practice in three ways. First, the technical feasibility of detecting certain maintenance-intensive parts of power lines with the help of unique UAV image data is shown. Second, the distribution system operator's specific problem is solved by supporting maintenance decisions with the proposed image-based decision support system. Third, precise design knowledge for image-based decision support systems is formulated that can inform future system designs of a similar nature.
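    To make the two-component split described in this abstract concrete, the sketch below separates a fault-detection model component (any detector exposed as a callable) from a thin decision-support layer that ranks findings for the maintenance engineer. The class names, threshold, and output fields are illustrative assumptions rather than the paper's actual artifact.

```python
# Sketch of the two-component split: a model component that flags faults in
# UAV images, and a decision-support layer that turns raw detections into a
# prioritized review list. Labels and thresholds are illustrative assumptions.
from dataclasses import dataclass
from typing import Callable, List

FAULT_CLASSES = ["broken_insulator", "missing_safety_pin"]  # assumed label set

@dataclass
class Detection:
    image_id: str
    fault_class: str
    confidence: float

def decision_support_report(images: List[str],
                            detect: Callable[[str], List[Detection]],
                            review_threshold: float = 0.5) -> List[dict]:
    """Run the model component over all images and rank findings for review."""
    findings = []
    for image_id in images:
        for det in detect(image_id):
            if det.fault_class in FAULT_CLASSES and det.confidence >= review_threshold:
                findings.append({
                    "image": image_id,
                    "fault": det.fault_class,
                    "confidence": round(det.confidence, 2),
                    # The engineer, not the model, makes the final call.
                    "action": "schedule manual inspection",
                })
    # Highest-confidence findings first, so engineers triage the riskiest parts.
    return sorted(findings, key=lambda f: f["confidence"], reverse=True)
```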