830 research outputs found

    Comparative evaluation of instrument segmentation and tracking methods in minimally invasive surgery

    Get PDF
    Intraoperative segmentation and tracking of minimally invasive instruments is a prerequisite for computer- and robotic-assisted surgery. Since additional hardware like tracking systems or the robot encoders are cumbersome and lack accuracy, surgical vision is evolving as promising techniques to segment and track the instruments using only the endoscopic images. However, what is missing so far are common image data sets for consistent evaluation and benchmarking of algorithms against each other. The paper presents a comparative validation study of different vision-based methods for instrument segmentation and tracking in the context of robotic as well as conventional laparoscopic surgery. The contribution of the paper is twofold: we introduce a comprehensive validation data set that was provided to the study participants and present the results of the comparative validation study. Based on the results of the validation study, we arrive at the conclusion that modern deep learning approaches outperform other methods in instrument segmentation tasks, but the results are still not perfect. Furthermore, we show that merging results from different methods actually significantly increases accuracy in comparison to the best stand-alone method. On the other hand, the results of the instrument tracking task show that this is still an open challenge, especially during challenging scenarios in conventional laparoscopic surgery

    Fire detection using deep learning methods

    Get PDF
    Fire detection is an important task in the field of safety and emergency prevention. In recent years, deep learning methods have shown high efficiency in solving various computer vision problems, including detecting objects in images. In this paper, monitoring wildfires was considered, which allows you to quickly respond to them and prevent their spread using deep learning methods. For the experiment, images from the satellite and images from the FireWatch sensor were taken as initial data. In this work, the deep learning algorithms you only look once (YOLO), convolutional neural network (CNN), and fast recurrent neural network (FastRNN) were considered, which makes it possible to determine the accuracy of a natural fire. As a result of the experiments, an automated fire recognition algorithm using YOLOv4 deep learning methods was created. It is expected that the results of the study will show that deep learning methods can be successfully applied to detect fire in images. This may lead to the development of automated monitoring systems capable of quickly and reliably detecting fire situations, which will help improve safety and reduce the risk of fires

    Forest and Water Bodies Segmentation Through Satellite Images Using U-Net

    Full text link
    Global environment monitoring is a task that requires additional attention in the contemporary rapid climate change environment. This includes monitoring the rate of deforestation and areas affected by flooding. Satellite imaging has greatly helped monitor the earth, and deep learning techniques have helped to automate this monitoring process. This paper proposes a solution for observing the area covered by the forest and water. To achieve this task UNet model has been proposed, which is an image segmentation model. The model achieved a validation accuracy of 82.55% and 82.92% for the segmentation of areas covered by forest and water, respectively

    Comparative study on machine learning algorithms for early fire forest detection system using geodata

    Get PDF
    Forest fires have caused considerable losses to ecologies, societies and economies worldwide. To minimize these losses and reduce forest fires, modeling and predicting the occurrence of forest fires are meaningful because they can support forest fire prevention and management. In recent years, the convolutional neural network (CNN) has become an important state-of-the-art deep learning algorithm, and its implementation has enriched many fields. Therefore, a competitive spatial prediction model for automatic early detection of wild forest fire using machine learning algorithms can be proposed. This model can help researchers to predict forest fires and identify risk zonas. System using machine learning algorithm on geodata will be able to notify in real time the interested parts and authorities by providing alerts and presenting on maps based on geographical treatments for more efficacity and analyzing of the situation. This research extends the application of machine learning algorithms for early fire forest prediction to detection and representation in geographical information system (GIS) maps

    Harnessing Big Data and Machine Learning for Event Detection and Localization

    Get PDF
    Anomalous events are rare and significantly deviate from expected pattern and other data instances, making them hard to predict. Correctly and timely detecting anomalous severe events can help reduce risks and losses. Many anomalous event detection techniques are studied in the literature. Recently, big data and machine learning based techniques have shown a remarkable success in a wide range of fields. It is important to tailor big data and machine learning based techniques for each application; otherwise it may result in expensive computation, slow prediction, false alarms, and improper prediction granularity.First, we aim to address the above challenges by harnessing big data and machine learning techniques for fast and reliable prediction and localization of severe events. Firstly, to improve storage failure prediction, we develop a new lightweight and high performing tensor decomposition-based method, named SEFEE, for storage error forecasting in large-scale enterprise storage systems. SEFEE employs tensor decomposition technique to capture latent spatio-temporal information embedded in storage event logs. By utilizing the latent spatio-temporal information, we can make accurate storage error forecasting without training requirements of typical machine learning techniques. The training-free method allows for live prediction of storage errors and their locations in the storage system based on previous observations that had been used in tensor decomposition pipeline to extract meaningful latent correlations. Moreover, we propose an extension to include severity of the errors as contextual information to improve the accuracy of tensor decomposition which in turn improves the prediction accuracy. We further provide detailed characterization of NetApp dataset to provide additional insight into the dynamics of typical large-scale enterprise storage systems for the community.Next, we focus on another application -- AI-driven Wildfire prediction. Wildfires cause billions of dollars in property damages and loss of lives, with harmful health threats. We aim to correctly detect and localize wildfire events in the early stage and also classify wildfire smoke based on perceived pixel density of camera images. Due to the lack of publicly available dataset for early wildfire smoke detection, we first collect and process images from the AlertWildfire camera network. The images are annotated with bounding boxes and densities for deep learning methods to use. We then adapt a transformer-based end-to-end object detection model for wildfire detection using our dataset. The dataset and detection model together form as a benchmark named the Nevada smoke detection benchmark, or Nemo for short. Nemo is the first open-source benchmark for wildfire smoke detection with the focus of the early incipient stage. We further provide a weakly supervised Nemo version to enable wider support as a benchmark

    Smoke plume segmentation of wildfire images

    Get PDF
    Aquest treball s'emmarca dins del camp d'estudi de les xarxes neuronals en Aprenentatge profund. L'objectiu del projecte és analitzar i aplicar les xarxes neuronals que hi ha avui dia en el mercat per resoldre un problema en específic. Aquest és tracta de la segmentació de plomalls de fum en incendis forestals. S'ha desenvolupat un estudi de les xarxes neuronals utilitzades per resoldre problemes de segmentació d'imatges i també una reconstrucció posterior en 3D d'aquests plomalls de fum. L'algorisme finalment escollit és tracta del model UNet, una xarxa neuronal convolucional basada en l'estructura d'autoencoders amb connexions de pas, que desenvolupa tasques d'autoaprenentatge per finalment obtenir una predicció de la classe a segmentar entrenada, en aquest cas plomalls. de fum. Posteriorment, una comparativa entre algoritmes tradicionals i el model UNet aplicat fent servir aprenentatge profund s'ha realitzat, veient que tant quantitativament com qualitativament s'aconsegueix els millors resultats aplicant el model UNet, però a la vegada comporta més temps de computació. Tots aquests models s'han desenvolupat amb el llenguatge de programació Python utilitzant els llibres d'aprenentatge automàtic Tensorflow i Keras. Dins del model UNet s'han dut a terme múltiples experiments per obtenir els diferents valors dels hiperparàmetres més adequats per a l'aplicació del projecte, obtenint una precisió del 93.45 % en el model final per a la segmentació de fum en imatges d'incendis. forestals.Este trabajo se enmarca dentro del campo de estudio de las redes neuronales en aprendizaje profundo. El objetivo del proyecto es analizar y aplicar las redes neuronales que existen hoy en día en el mercado para resolver un problema en específico. Éste se trata de la segmentación de penachos de humo en incendios forestales. Se ha desarrollado un estudio de las redes neuronales utilizadas para resolver problemas de segmentación de imágenes y también una reconstrucción posterior en 3D de estos penachos de humo. El algoritmo finalmente escogido se trata del modelo UNet, una red neuronal convolucional basada en la estructura de autoencoders con conexiones de paso, que desarrolla tareas de autoaprendizaje para finalmente obtener una predicción de la clase a segmentar entrenada, en este caso penachos de humo. Posteriormente, una comparativa entre algoritmos tradicionales y el modelo UNet aplicado utilizando aprendizaje profundo se ha realizado, viendo que tanto cuantitativa como cualitativamente se consigue los mejores resultados aplicando el modelo UNet, pero a la vez conlleva más tiempo de computación. Todos estos modelos se han desarrollado con el lenguaje de programación Python utilizando libros de aprendizaje automático Tensorflow y Keras. Dentro del modelo UNet se han llevado a cabo múltiples experimentos para obtener los distintos valores de los hiperparámetros más adecuados para la aplicación del proyecto, obteniendo una precisión del 93.45 % en el modelo final para la segmentación de humo en imágenes de incendios forestales.This work is framed within the field of study of neural networks in Deep Learning. The aim of the project is to analyse and apply the neural networks that exist today in the market to solve a specific problem. This is about the segmentation of smoke plumes in forest fires. A study of the neural networks used to solve image segmentation problems and also a subsequent 3D reconstruction of these smoke plumes has been developed. The algorithm finally chosen is the UNet model, a convolutional neural network based on the structure of autoencoders with step connections, which develops self-learning tasks to finally obtain a prediction of the class to be trained, in this case smoke plumes. Also, a comparison between traditional algorithms and the UNet model applied using deep learning has been carried out, seeing that both quantitatively and qualitatively the best results are achieved by applying the UNet model, but at the same time it involves more computing time. All these models have been developed in the Python programming language using the Tensorflow and Keras machine learning books. Within the UNet model, multiple experiments have been carried out to obtain the different hyperparameter values most suitable for the project application, obtaining an accuracy of 93.45% in the final model for smoke segmentation in wildfire images

    Multi-teacher knowledge distillation as an effective method for compressing ensembles of neural networks

    Full text link
    Deep learning has contributed greatly to many successes in artificial intelligence in recent years. Today, it is possible to train models that have thousands of layers and hundreds of billions of parameters. Large-scale deep models have achieved great success, but the enormous computational complexity and gigantic storage requirements make it extremely difficult to implement them in real-time applications. On the other hand, the size of the dataset is still a real problem in many domains. Data are often missing, too expensive, or impossible to obtain for other reasons. Ensemble learning is partially a solution to the problem of small datasets and overfitting. However, ensemble learning in its basic version is associated with a linear increase in computational complexity. We analyzed the impact of the ensemble decision-fusion mechanism and checked various methods of sharing the decisions including voting algorithms. We used the modified knowledge distillation framework as a decision-fusion mechanism which allows in addition compressing of the entire ensemble model into a weight space of a single model. We showed that knowledge distillation can aggregate knowledge from multiple teachers in only one student model and, with the same computational complexity, obtain a better-performing model compared to a model trained in the standard manner. We have developed our own method for mimicking the responses of all teachers at the same time, simultaneously. We tested these solutions on several benchmark datasets. In the end, we presented a wide application use of the efficient multi-teacher knowledge distillation framework. In the first example, we used knowledge distillation to develop models that could automate corrosion detection on aircraft fuselage. The second example describes detection of smoke on observation cameras in order to counteract wildfires in forests.Comment: Doctoral dissertation in the field of computer science, machine learning. Application of knowledge distillation as aggregation of ensemble models. Along with several uses. 140 pages, 67 figures, 13 table

    AVA: A Video Dataset of Spatio-temporally Localized Atomic Visual Actions

    Get PDF
    This paper introduces a video dataset of spatio-temporally localized Atomic Visual Actions (AVA). The AVA dataset densely annotates 80 atomic visual actions in 430 15-minute video clips, where actions are localized in space and time, resulting in 1.58M action labels with multiple labels per person occurring frequently. The key characteristics of our dataset are: (1) the definition of atomic visual actions, rather than composite actions; (2) precise spatio-temporal annotations with possibly multiple annotations for each person; (3) exhaustive annotation of these atomic actions over 15-minute video clips; (4) people temporally linked across consecutive segments; and (5) using movies to gather a varied set of action representations. This departs from existing datasets for spatio-temporal action recognition, which typically provide sparse annotations for composite actions in short video clips. We will release the dataset publicly. AVA, with its realistic scene and action complexity, exposes the intrinsic difficulty of action recognition. To benchmark this, we present a novel approach for action localization that builds upon the current state-of-the-art methods, and demonstrates better performance on JHMDB and UCF101-24 categories. While setting a new state of the art on existing datasets, the overall results on AVA are low at 15.6% mAP, underscoring the need for developing new approaches for video understanding.Comment: To appear in CVPR 2018. Check dataset page https://research.google.com/ava/ for detail

    De-smokeGCN: Generative Cooperative Networks for Joint Surgical Smoke Detection and Removal

    Get PDF
    Surgical smoke removal algorithms can improve the quality of intra-operative imaging and reduce hazards in image-guided surgery, a highly desirable post-process for many clinical applications. These algorithms also enable effective computer vision tasks for future robotic surgery. In this paper, we present a new unsupervised learning framework for high-quality pixel-wise smoke detection and removal. One of the well recognized grand challenges in using convolutional neural networks (CNNs) for medical image processing is to obtain intra-operative medical imaging datasets for network training and validation, but availability and quality of these datasets are scarce. Our novel training framework does not require ground-truth image pairs. Instead, it learns purely from computer-generated simulation images. This approach opens up new avenues and bridges a substantial gap between conventional non-learning based methods and which requiring prior knowledge gained from extensive training datasets. Inspired by the Generative Adversarial Network (GAN), we have developed a novel generative-collaborative learning scheme that decomposes the de-smoke process into two separate tasks: smoke detection and smoke removal. The detection network is used as prior knowledge, and also as a loss function to maximize its support for training of the smoke removal network. Quantitative and qualitative studies show that the proposed training framework outperforms the state-of-the-art de-smoking approaches including the latest GAN framework (such as PIX2PIX). Although trained on synthetic images, experimental results on clinical images have proved the effectiveness of the proposed network for detecting and removing surgical smoke on both simulated and real-world laparoscopic images
    • …
    corecore