
    Can Image Enhancement be Beneficial to Find Smoke Images in Laparoscopic Surgery?

    Laparoscopic surgery has a limited field of view. Laser ablation in laparoscopic surgery causes smoke, which inevitably degrades the surgeon's visibility. It is therefore of vital importance to remove the smoke so that clear visualization is possible. Before a desmoking technique can be applied, one needs to know whether an image contains smoke; to date, no method exists that can classify smoke/non-smoke images with complete accuracy. In this work, we propose a new enhancement method that enhances the informative details in RGB images for discrimination of smoke/non-smoke images. Our proposed method utilizes the weighted least squares (WLS) optimization framework. For feature extraction, we use statistical features based on the bivariate histogram distribution of gradient magnitude (GM) and Laplacian of Gaussian (LoG). We then train an SVM classifier on the binary smoke/non-smoke classification task. We demonstrate the effectiveness of our method on the Cholec80 dataset. Experiments using our proposed enhancement method show promising results, with improvements of 4% in accuracy and 4% in F1-score over the baseline performance of RGB images. In addition, our approach improves over the saturation-histogram-based classification methods Saturation Analysis (SAN) and Saturation Peak Analysis (SPA) by 1%/5% and 1%/6% in accuracy/F1-score.
    Comment: In proceedings of IS&T Color and Imaging Conference (CIC 26). Congcong Wang and Vivek Sharma contributed equally to this work and are listed in alphabetical order.
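    The GM/LoG bivariate-histogram feature described above can be sketched as follows; the box smoothing, bin count, and normalization are illustrative assumptions rather than the paper's exact settings:

    ```python
    import numpy as np

    def gm_log_features(gray, bins=8):
        """Flattened bivariate histogram of gradient magnitude (GM) and a
        Laplacian-of-Gaussian-style response, as a feature vector for a
        smoke/non-smoke classifier. `gray` is a 2D float array in [0, 1].
        The crude smoothing and 8x8 binning are illustrative choices."""
        # Light smoothing (stand-in for the Gaussian in LoG).
        smooth = (gray
                  + np.roll(gray, 1, axis=0) + np.roll(gray, -1, axis=0)
                  + np.roll(gray, 1, axis=1) + np.roll(gray, -1, axis=1)) / 5.0
        # Gradient magnitude via central differences.
        gy, gx = np.gradient(gray)
        gm = np.sqrt(gx ** 2 + gy ** 2)
        # Laplacian of the smoothed image (sum of second derivatives).
        sy, sx = np.gradient(smooth)
        lap = np.gradient(sx, axis=1) + np.gradient(sy, axis=0)
        # Joint (GM, |LoG|) histogram, normalized to a distribution.
        hist, _, _ = np.histogram2d(gm.ravel(), np.abs(lap).ravel(), bins=bins)
        return (hist / hist.sum()).ravel()
    ```

    A standard SVM (e.g. scikit-learn's `SVC`) can then be trained on these vectors for the binary smoke/non-smoke task.
    
    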

    Discriminatively Trained Latent Ordinal Model for Video Classification

    We study the problem of video classification for facial analysis and human action recognition. We propose a novel weakly supervised learning method that models a video as a sequence of automatically mined, discriminative sub-events (e.g., the onset and offset phases for "smile", or running and jumping for "high jump"). The proposed model is inspired by recent work on Multiple Instance Learning and latent SVM/HCRF; it extends such frameworks to approximately model the ordinal aspect of videos. We obtain consistent improvements over relevant competitive baselines on four challenging, publicly available video-based facial analysis datasets for prediction of expression, clinical pain and intent in dyadic conversations, and on three challenging human action datasets. We also validate the method with qualitative results and show that they largely support the intuitions behind the method.
    Comment: Paper accepted in IEEE TPAMI. arXiv admin note: substantial text overlap with arXiv:1604.0150

    Early Fire Detection on Video Using LBP and Spread Ascending of Smoke

    This paper proposes a methodology for early fire detection based on visual smoke characteristics such as movement, color, gray tones and dynamic texture, i.e., diverse but representative and discriminant characteristics, as well as smoke's ascending expansion, which are sequentially processed to find candidate smoke regions. Once a region with movement is detected, the smoke-colored pixels inside it are estimated to obtain a more detailed description of the candidate smoke region. Next, to increase system efficiency and reduce false alarms, each region is characterized using the local binary pattern, which analyzes its texture, and is classified by means of a multi-layer perceptron. Finally, the ascending expansion of the candidate region is analyzed, and those regions that maintain or increase their ascending growth over a time span are considered smoke regions, triggering an alarm. Evaluations were performed using two different classifiers, namely the multi-layer perceptron and the support vector machine, on a standard smoke video database. Evaluation results show that the proposed system provides fire detection accuracy between 97.85% and 99.83%.
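    The texture and ascending-growth steps above can be sketched as below; the basic 8-neighbour LBP and the monotone top-edge test are minimal stand-ins for the paper's pipeline, not its exact formulation:

    ```python
    import numpy as np

    def lbp8(gray):
        """Basic 8-neighbour local binary pattern: each pixel gets one bit
        per neighbour that is >= the centre value (edge-padded borders)."""
        p = np.pad(gray, 1, mode="edge")
        c = p[1:-1, 1:-1]
        code = np.zeros(c.shape, dtype=np.uint8)
        offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
                   (1, 1), (1, 0), (1, -1), (0, -1)]
        for bit, (dy, dx) in enumerate(offsets):
            nb = p[1 + dy:p.shape[0] - 1 + dy, 1 + dx:p.shape[1] - 1 + dx]
            code |= (nb >= c).astype(np.uint8) << bit
        return code

    def lbp_histogram(gray):
        """256-bin normalized LBP histogram: the texture descriptor that
        would be fed to the MLP (or SVM) classifier."""
        h = np.bincount(lbp8(gray).ravel(), minlength=256)
        return h / h.sum()

    def is_ascending(top_rows):
        """True if a candidate region's top edge never moves down (row
        index never increases) over a sequence of frames, i.e. the region
        maintains its ascending growth."""
        return all(b <= a for a, b in zip(top_rows, top_rows[1:]))
    ```

    A region whose LBP histogram is classified as smoke and whose top edge keeps rising over the observation window would trigger the alarm.
    
    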

    Articulated Multi-Instrument 2D Pose Estimation Using Fully Convolutional Networks

    Instrument detection, pose estimation and tracking in surgical videos are important vision components for computer-assisted interventions. While significant advances have been made in recent years, articulation detection is still a major challenge. In this paper, we propose a deep neural network for articulated multi-instrument 2D pose estimation, which is trained on detailed annotations of endoscopic and microscopic datasets. Our model is formed by a fully convolutional detection-regression network. Joints and associations between joint pairs in our instrument model are located by the detection subnetwork and subsequently refined through a regression subnetwork. Based on the model's output, the poses of the instruments are inferred using maximum bipartite graph matching. Our estimation framework is powered by deep learning techniques without any direct kinematic information from a robot. Our framework is tested on single-instrument RMIT data, and also on multi-instrument EndoVis and in vivo data, with promising results. In addition, the dataset annotations are publicly released along with our code and model.
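    The final association step, maximum bipartite matching between detected joints and instrument hypotheses, can be sketched with a small exhaustive matcher (workable here because each instrument has only a handful of joints); the score matrix and its meaning are illustrative assumptions, not the paper's exact formulation:

    ```python
    from itertools import permutations

    import numpy as np

    def best_matching(score):
        """Maximum-weight bipartite matching by exhaustive search.
        `score[i, j]` is an illustrative association score between detected
        joint i and instrument slot j (square matrix assumed). Returns the
        assignment (slot index per joint) and its total score. For the few
        joints per instrument this brute force is adequate; larger problems
        would use the Hungarian algorithm instead."""
        n = score.shape[0]
        best_total, best_perm = -np.inf, None
        for perm in permutations(range(n)):
            total = sum(score[i, j] for i, j in enumerate(perm))
            if total > best_total:
                best_total, best_perm = total, perm
        return best_perm, best_total
    ```
    
    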

    Early Flame Detection System Using Real-Time Machine-Vision and Image Analysis

    From 2010 to 2019, 110,811 fires with losses were reported to the Office of the Fire Marshal and Emergency Management in Ontario. In the USA, local fire departments responded to 1,338,500 fires in 2020. These fires caused 3,500 civilian deaths, 15,200 civilian injuries and $21.9 billion in property damage. A fire occurs in a structure in the USA every 64 seconds. These and similar recent statistics from different parts of the world indicate that current point-type fire detection technology has failed to eliminate the hazards of death, injury and economic loss caused by fire. This research aims to utilize the latest digital video processing and computer vision technology to develop a more efficient flame detection system. Due to rapid developments in digital cameras, IoT and 5G telecommunication technologies, computer-vision-based fire detection has received growing attention from researchers in recent years. A computer-vision-based fire detection system can be as simple as a single IoT camera that detects a fire early, before it gets out of control and becomes a serious threat, triggers a local alarm, and sends remote warning signals to the fire department and emergency management officials. The proposed system requires neither high capital costs nor high operation and maintenance costs, since it runs on top of the existing infrastructure of the digital security and surveillance network. Moreover, the proposed system has broad potential for indoor and outdoor applications in urban areas, and it is easily expandable by adding more IP cameras to the existing network. The proposed system incorporates two stages. Stage I detects fire candidate regions in the live video stream based on colour and motion information; Stage II passes each candidate region to a trained Convolutional Neural Network (CNN) classification model to classify the region as fire or non-fire.
The main innovation of this approach is its simplicity and suitability for real-time use without compromising accuracy. The experimental results show that the system's training and validation accuracies reach 100% and 98%, respectively. Applying the proposed framework as an additional layer of protection integrated into existing indoor and outdoor digital security and surveillance systems is expected to provide early fire detection, allowing firefighters and rescue teams to arrive at the scene in the early stages of a fire and offering building occupants priceless minutes to evacuate hazardous locations. This proposal would save lives and minimize economic losses to public and private property.
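    Stage I above can be sketched as a per-pixel test combining frame differencing with a simple fire-colour rule (R > G > B with a bright red channel); the rule and thresholds are illustrative assumptions, not the thesis's actual model:

    ```python
    import numpy as np

    def fire_candidate_mask(prev_frame, frame, motion_thresh=0.05, red_thresh=0.55):
        """Stage-I sketch: a pixel is a fire candidate if it changed between
        consecutive frames (motion) and matches a crude fire-colour rule.
        Frames are HxWx3 float RGB arrays in [0, 1]; both thresholds are
        illustrative. Surviving regions would go to the Stage-II CNN."""
        motion = np.abs(frame - prev_frame).mean(axis=2) > motion_thresh
        r, g, b = frame[..., 0], frame[..., 1], frame[..., 2]
        colour = (r > g) & (g > b) & (r > red_thresh)
        return motion & colour
    ```

    Connected components of the resulting mask would then be cropped and passed to the CNN classifier.
    
    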