1,976 research outputs found

    Deep learning in remote sensing: a review

    Get PDF
    Standing at the paradigm shift towards data-intensive science, machine learning techniques are becoming increasingly important. In particular, as a major breakthrough in the field, deep learning has proven as an extremely powerful tool in many fields. Shall we embrace deep learning as the key to all? Or, should we resist a 'black-box' solution? There are controversial opinions in the remote sensing community. In this article, we analyze the challenges of using deep learning for remote sensing data analysis, review the recent advances, and provide resources to make deep learning in remote sensing ridiculously simple to start with. More importantly, we advocate remote sensing scientists to bring their expertise into deep learning, and use it as an implicit general model to tackle unprecedented large-scale influential challenges, such as climate change and urbanization.Comment: Accepted for publication IEEE Geoscience and Remote Sensing Magazin

    EchoFusion: Tracking and Reconstruction of Objects in 4D Freehand Ultrasound Imaging without External Trackers

    Get PDF
    Ultrasound (US) is the most widely used fetal imaging technique. However, US images have limited capture range, and suffer from view dependent artefacts such as acoustic shadows. Compounding of overlapping 3D US acquisitions into a high-resolution volume can extend the field of view and remove image artefacts, which is useful for retrospective analysis including population based studies. However, such volume reconstructions require information about relative transformations between probe positions from which the individual volumes were acquired. In prenatal US scans, the fetus can move independently from the mother, making external trackers such as electromagnetic or optical tracking unable to track the motion between probe position and the moving fetus. We provide a novel methodology for image-based tracking and volume reconstruction by combining recent advances in deep learning and simultaneous localisation and mapping (SLAM). Tracking semantics are established through the use of a Residual 3D U-Net and the output is fed to the SLAM algorithm. As a proof of concept, experiments are conducted on US volumes taken from a whole body fetal phantom, and from the heads of real fetuses. For the fetal head segmentation, we also introduce a novel weak annotation approach to minimise the required manual effort for ground truth annotation. We evaluate our method qualitatively, and quantitatively with respect to tissue discrimination accuracy and tracking robustness.Comment: MICCAI Workshop on Perinatal, Preterm and Paediatric Image analysis (PIPPI), 201

    A Novel GAN-Based Anomaly Detection and Localization Method for Aerial Video Surveillance at Low Altitude

    Get PDF
    The last two decades have seen an incessant growth in the use of Unmanned Aerial Vehicles (UAVs) equipped with HD cameras for developing aerial vision-based systems to support civilian and military tasks, including land monitoring, change detection, and object classification. To perform most of these tasks, the artificial intelligence algorithms usually need to know, a priori, what to look for, identify. or recognize. Actually, in most operational scenarios, such as war zones or post-disaster situations, areas and objects of interest are not decidable a priori since their shape and visual features may have been altered by events or even intentionally disguised (e.g., improvised explosive devices (IEDs)). For these reasons, in recent years, more and more research groups are investigating the design of original anomaly detection methods, which, in short, are focused on detecting samples that differ from the others in terms of visual appearance and occurrences with respect to a given environment. In this paper, we present a novel two-branch Generative Adversarial Network (GAN)-based method for low-altitude RGB aerial video surveillance to detect and localize anomalies. We have chosen to focus on the low-altitude sequences as we are interested in complex operational scenarios where even a small object or device can represent a reason for danger or attention. The proposed model was tested on the UAV Mosaicking and Change Detection (UMCD) dataset, a one-of-a-kind collection of challenging videos whose sequences were acquired between 6 and 15 m above sea level on three types of ground (i.e., urban, dirt, and countryside). Results demonstrated the effectiveness of the model in terms of Area Under the Receiving Operating Curve (AUROC) and Structural Similarity Index (SSIM), achieving an average of 97.2% and 95.7%, respectively, thus suggesting that the system can be deployed in real-world applications

    Alignment control using visual servoing and mobilenet single-shot multi-box detection (SSD): a review

    Get PDF
    The concept is highly critical for robotic technologies that rely on visual feedback. In this context, robot systems tend to be unresponsive due to reliance on pre-programmed trajectory and path, meaning the occurrence of a change in the environment or the absence of an object. This review paper aims to provide comprehensive studies on the recent application of visual servoing and DNN. PBVS and Mobilenet-SSD were chosen algorithms for alignment control of the film handler mechanism of the portable x-ray system. It also discussed the theoretical framework features extraction and description, visual servoing, and Mobilenet-SSD. Likewise, the latest applications of visual servoing and DNN was summarized, including the comparison of Mobilenet-SSD with other sophisticated models. As a result of a previous study presented, visual servoing and MobileNet-SSD provide reliable tools and models for manipulating robotics systems, including where occlusion is present. Furthermore, effective alignment control relies significantly on visual servoing and deep neural reliability, shaped by different parameters such as the type of visual servoing, feature extraction and description, and DNNs used to construct a robust state estimator. Therefore, visual servoing and MobileNet-SSD are parameterized concepts that require enhanced optimization to achieve a specific purpose with distinct tools

    Object-Aware Tracking and Mapping

    Get PDF
    Reasoning about geometric properties of digital cameras and optical physics enabled researchers to build methods that localise cameras in 3D space from a video stream, while – often simultaneously – constructing a model of the environment. Related techniques have evolved substantially since the 1980s, leading to increasingly accurate estimations. Traditionally, however, the quality of results is strongly affected by the presence of moving objects, incomplete data, or difficult surfaces – i.e. surfaces that are not Lambertian or lack texture. One insight of this work is that these problems can be addressed by going beyond geometrical and optical constraints, in favour of object level and semantic constraints. Incorporating specific types of prior knowledge in the inference process, such as motion or shape priors, leads to approaches with distinct advantages and disadvantages. After introducing relevant concepts in Chapter 1 and Chapter 2, methods for building object-centric maps in dynamic environments using motion priors are investigated in Chapter 5. Chapter 6 addresses the same problem as Chapter 5, but presents an approach which relies on semantic priors rather than motion cues. To fully exploit semantic information, Chapter 7 discusses the conditioning of shape representations on prior knowledge and the practical application to monocular, object-aware reconstruction systems

    Efficient and Accurate Segmentation of Defects in Industrial CT Scans

    Get PDF
    Industrial computed tomography (CT) is an elementary tool for the non-destructive inspection of cast light-metal or plastic parts. A comprehensive testing not only helps to ensure the stability and durability of a part, it also allows reducing the rejection rate by supporting the optimization of the casting process and to save material (and weight) by producing equivalent but more filigree structures. With a CT scan it is theoretically possible to locate any defect in the part under examination and to exactly determine its shape, which in turn helps to draw conclusions about its harmfulness. However, most of the time the data quality is not good enough to allow segmenting the defects with simple filter-based methods which directly operate on the gray-values—especially when the inspection is expanded to the entire production. In such in-line inspection scenarios the tight cycle times further limit the available time for the acquisition of the CT scan, which renders them noisy and prone to various artifacts. In recent years, dramatic advances in deep learning (and convolutional neural networks in particular) made even the reliable detection of small objects in cluttered scenes possible. These methods are a promising approach to quickly yield a reliable and accurate defect segmentation even in unfavorable CT scans. The huge drawback: a lot of precisely labeled training data is required, which is utterly challenging to obtain—particularly in the case of the detection of tiny defects in huge, highly artifact-afflicted, three-dimensional voxel data sets. Hence, a significant part of this work deals with the acquisition of precisely labeled training data. Firstly, we consider facilitating the manual labeling process: our experts annotate on high-quality CT scans with a high spatial resolution and a high contrast resolution and we then transfer these labels to an aligned ``normal'' CT scan of the same part, which holds all the challenging aspects we expect in production use. Nonetheless, due to the indecisiveness of the labeling experts about what to annotate as defective, the labels remain fuzzy. Thus, we additionally explore different approaches to generate artificial training data, for which a precise ground truth can be computed. We find an accurate labeling to be crucial for a proper training. We evaluate (i) domain randomization which simulates a super-set of reality with simple transformations, (ii) generative models which are trained to produce samples of the real-world data distribution, and (iii) realistic simulations which capture the essential aspects of real CT scans. Here, we develop a fully automated simulation pipeline which provides us with an arbitrary amount of precisely labeled training data. First, we procedurally generate virtual cast parts in which we place reasonable artificial casting defects. Then, we realistically simulate CT scans which include typical CT artifacts like scatter, noise, cupping, and ring artifacts. Finally, we compute a precise ground truth by determining for each voxel the overlap with the defect mesh. To determine whether our realistically simulated CT data is eligible to serve as training data for machine learning methods, we compare the prediction performance of learning-based and non-learning-based defect recognition algorithms on the simulated data and on real CT scans. In an extensive evaluation, we compare our novel deep learning method to a baseline of image processing and traditional machine learning algorithms. This evaluation shows how much defect detection benefits from learning-based approaches. In particular, we compare (i) a filter-based anomaly detection method which finds defect indications by subtracting the original CT data from a generated ``defect-free'' version, (ii) a pixel-classification method which, based on densely extracted hand-designed features, lets a random forest decide about whether an image element is part of a defect or not, and (iii) a novel deep learning method which combines a U-Net-like encoder-decoder-pair of three-dimensional convolutions with an additional refinement step. The encoder-decoder-pair yields a high recall, which allows us to detect even very small defect instances. The refinement step yields a high precision by sorting out the false positive responses. We extensively evaluate these models on our realistically simulated CT scans as well as on real CT scans in terms of their probability of detection, which tells us at which probability a defect of a given size can be found in a CT scan of a given quality, and their intersection over union, which gives us information about how precise our segmentation mask is in general. While the learning-based methods clearly outperform the image processing method, the deep learning method in particular convinces by its inference speed and its prediction performance on challenging CT scans—as they, for example, occur in in-line scenarios. Finally, we further explore the possibilities and the limitations of the combination of our fully automated simulation pipeline and our deep learning model. With the deep learning method yielding reliable results for CT scans of low data quality, we examine by how much we can reduce the scan time while still maintaining proper segmentation results. Then, we take a look on the transferability of the promising results to CT scans of parts of different materials and different manufacturing techniques, including plastic injection molding, iron casting, additive manufacturing, and composed multi-material parts. Each of these tasks comes with its own challenges like an increased artifact-level or different types of defects which occasionally are hard to detect even for the human eye. We tackle these challenges by employing our simulation pipeline to produce virtual counterparts that capture the tricky aspects and fine-tuning the deep learning method on this additional training data. With that we can tailor our approach towards specific tasks, achieving reliable and robust segmentation results even for challenging data. Lastly, we examine if the deep learning method, based on our realistically simulated training data, can be trained to distinguish between different types of defects—the reason why we require a precise segmentation in the first place—and we examine if the deep learning method can detect out-of-distribution data where its predictions become less trustworthy, i.e. an uncertainty estimation

    3D Sensor Placement and Embedded Processing for People Detection in an Industrial Environment

    Get PDF
    Papers I, II and III are extracted from the dissertation and uploaded as separate documents to meet post-publication requirements for self-arciving of IEEE conference papers.At a time when autonomy is being introduced in more and more areas, computer vision plays a very important role. In an industrial environment, the ability to create a real-time virtual version of a volume of interest provides a broad range of possibilities, including safety-related systems such as vision based anti-collision and personnel tracking. In an offshore environment, where such systems are not common, the task is challenging due to rough weather and environmental conditions, but the result of introducing such safety systems could potentially be lifesaving, as personnel work close to heavy, huge, and often poorly instrumented moving machinery and equipment. This thesis presents research on important topics related to enabling computer vision systems in industrial and offshore environments, including a review of the most important technologies and methods. A prototype 3D sensor package is developed, consisting of different sensors and a powerful embedded computer. This, together with a novel, highly scalable point cloud compression and sensor fusion scheme allows to create a real-time 3D map of an industrial area. The question of where to place the sensor packages in an environment where occlusions are present is also investigated. The result is algorithms for automatic sensor placement optimisation, where the goal is to place sensors in such a way that maximises the volume of interest that is covered, with as few occluded zones as possible. The method also includes redundancy constraints where important sub-volumes can be defined to be viewed by more than one sensor. Lastly, a people detection scheme using a merged point cloud from six different sensor packages as input is developed. Using a combination of point cloud clustering, flattening and convolutional neural networks, the system successfully detects multiple people in an outdoor industrial environment, providing real-time 3D positions. The sensor packages and methods are tested and verified at the Industrial Robotics Lab at the University of Agder, and the people detection method is also tested in a relevant outdoor, industrial testing facility. The experiments and results are presented in the papers attached to this thesis.publishedVersio
    • …
    corecore