13,468 research outputs found

    Multi-Task Recurrent Neural Network for Surgical Gesture Recognition and Progress Prediction

    Surgical gesture recognition is important for surgical data science and computer-aided intervention. Even with robotic kinematic information, automatically segmenting surgical steps presents numerous challenges because surgical demonstrations are characterized by high variability in style, duration and order of actions. In order to extract discriminative features from the kinematic signals and boost recognition accuracy, we propose a multi-task recurrent neural network for simultaneous recognition of surgical gestures and estimation of a novel formulation of surgical task progress. To show the effectiveness of the presented approach, we evaluate it on the JIGSAWS dataset, which is currently the only publicly available dataset for surgical gesture recognition featuring robot kinematic data. We demonstrate that recognition performance improves in multi-task frameworks with progress estimation, without any additional manual labelling or training. Comment: Accepted to ICRA 202

    Detecting and generating overlapping nested communities

    Nestedness has been observed in a variety of networks but has been primarily viewed in the context of bipartite networks. Numerous metrics quantify nestedness and some clustering methods identify fully nested parts of graphs, but all with similar limitations. Clustering approaches also fail to uncover the overlap between fully nested subgraphs, as they assign each vertex to a single group only. In this paper, we look at the nestedness of a network through an auxiliary graph, in which a directed edge represents a nested relationship between the two corresponding vertices of the network. We present an algorithm that recovers this so-called community graph and finds the overlapping fully nested subgraphs of a network. We also introduce an algorithm for generating graphs with such a nested structure, given by a community graph. This algorithm can be used to test a nested community detection algorithm of this kind, and potentially to evaluate different metrics of nestedness as well. Finally, we evaluate our nested community detection algorithm on a large variety of networks, both bipartite and non-bipartite. We derive a new metric from the community graph to quantify the nestedness of both bipartite and non-bipartite networks.
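
The auxiliary graph described in this abstract can be illustrated with a minimal sketch. It assumes one common definition of a nested relationship, namely that vertex u is nested in vertex v when the neighbours of u (ignoring v itself) form a subset of the neighbours of v (ignoring u). The abstract does not state the paper's exact definition, and the function name `nested_pairs` and the toy graph are hypothetical.

```python
from itertools import permutations


def nested_pairs(adjacency):
    """Return directed edges (u, v) meaning "u is nested in v".

    Assumption (not taken from the paper): u is nested in v when the
    neighbours of u, ignoring v, are a subset of the neighbours of v,
    ignoring u. Vertices left with no neighbours are skipped so that
    they do not trivially nest in everything.
    """
    edges = set()
    for u, v in permutations(adjacency, 2):
        neighbours_u = adjacency[u] - {v}
        neighbours_v = adjacency[v] - {u}
        if neighbours_u and neighbours_u <= neighbours_v:
            edges.add((u, v))
    return edges


# A small, perfectly nested bipartite graph: rows a, b, c and columns x, y, z.
graph = {
    "a": {"x", "y", "z"},
    "b": {"x", "y"},
    "c": {"x"},
    "x": {"a", "b", "c"},
    "y": {"a", "b"},
    "z": {"a"},
}

community_graph = nested_pairs(graph)
```

On this toy input, `community_graph` contains a chain on each side of the bipartition (c in b in a, and z in y in x), which is the kind of directed structure the abstract calls a community graph.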

    Interaction Analysis in Smart Work Environments through Fuzzy Temporal Logic

    Interaction analysis is defined as the generation of situation descriptions from machine perception. World models created through machine perception are used by a reasoning engine based on fuzzy metric temporal logic and situation graph trees, with optional parameter learning and clustering as preprocessing, to deduce knowledge about the observed scene. The system is evaluated in a case study on automatic behavior report generation for staff training purposes in crisis response control rooms

    VIDEO FOREGROUND LOCALIZATION FROM TRADITIONAL METHODS TO DEEP LEARNING

    Detection of visual attention regions (VAR), such as moving objects, has become an integral part of many computer vision applications, including pattern recognition, object detection and classification, video surveillance, autonomous driving, and human-machine interaction (HMI). Moving object identification using bounding boxes has matured to the level of localizing objects along their rigid borders, a process called foreground localization (FGL). Over the decades, many image segmentation methodologies have been studied, devised, and extended to suit video FGL. Even so, video foreground (FG) segmentation remains an intriguing and appealing problem because of its ill-posed nature and myriad applications. Maintaining spatial and temporal coherence, particularly at object boundaries, remains challenging and computationally burdensome. It becomes harder still when the background is dynamic (swaying tree branches or a shimmering water body), when illumination varies, when moving objects cast shadows, or when the video sequences have jittery frames caused by vibrating or unstable camera mounts on a surveillance post or moving robot. At the same time, in the analysis of traffic flow or human activity, the performance of an intelligent system depends substantially on how robustly it localizes the VAR, i.e., the FG. The natural question is how best to deal with these challenges. The goal of this thesis is therefore to investigate plausible real-time implementations for FGL, from traditional approaches to modern deep learning (DL) models, that are applicable to many video content-aware applications (VCAA). It focuses mainly on improving existing methodologies by harnessing multimodal spatial and temporal cues for a delineated FGL. 
The first part of the dissertation is dedicated to enhancing conventional sample-based and Gaussian mixture model (GMM)-based video FGL using a probability mass function (PMF), temporal median filtering, the fusion of CIEDE2000 color similarity, color distortion, and illumination measures, and an appropriately chosen adaptive threshold to extract the FG pixels. Subjective and objective evaluations show the improvements over a number of similar conventional methods. The second part of the thesis focuses on exploiting and improving deep convolutional neural networks (DCNN) for the same problem. To that end, three models akin to an encoder-decoder (EnDec) network are implemented with various strategies to improve the quality of the FG segmentation. These strategies include, but are not limited to, double encoding - slow decoding feature learning, multi-view receptive field feature fusion, and the incorporation of spatiotemporal cues through long short-term memory (LSTM) units in both the subsampling and upsampling subnetworks. Experimental studies are carried out on conditions ranging from baseline to challenging video sequences to demonstrate the effectiveness of the proposed DCNNs. The analysis demonstrates their architectural efficiency over other methods, while quantitative and qualitative experiments show that the proposed models perform competitively with the state of the art.
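
As a rough illustration of the temporal median filtering mentioned in this abstract, the sketch below builds a background model as the per-pixel median over a stack of grayscale frames and marks as foreground the pixels that deviate from it. It is a generic textbook version, not the thesis's method: the fixed `threshold` stands in for the adaptive threshold the abstract describes, and the function names and toy frames are hypothetical.

```python
from statistics import median


def median_background(frames):
    """Per-pixel temporal median over a list of equally sized 2D frames."""
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [median(frame[r][c] for frame in frames) for c in range(width)]
        for r in range(height)
    ]


def foreground_mask(frame, background, threshold):
    """Mark pixels whose absolute difference from the background exceeds threshold.

    Assumption: a single fixed threshold, used here only for simplicity.
    """
    return [
        [abs(pixel - bg) > threshold for pixel, bg in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


# Three tiny 2x2 grayscale frames; one pixel is briefly covered by a bright object.
frames = [
    [[10, 10], [10, 10]],
    [[12, 200], [10, 11]],
    [[11, 10], [9, 10]],
]

background = median_background(frames)
mask = foreground_mask(frames[1], background, threshold=30)
```

The median ignores the transient bright value at the top-right pixel, so the background stays close to 10 there and only that pixel is flagged in `mask`.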

    Spatio-temporal human action detection and instance segmentation in videos

    With exponential growth in the number of video capturing devices and the amount of digital video content, automatic video understanding is now at the forefront of computer vision research. This thesis presents a series of models for automatic human action detection in videos and also addresses the space-time action instance segmentation problem. Both action detection and instance segmentation play vital roles in video understanding. Firstly, we propose a novel human action detection approach based on a frame-level deep feature representation combined with a two-pass dynamic programming approach. The method obtains a frame-level action representation by leveraging recent advances in deep learning based action recognition and object detection methods. To combine the complementary appearance and motion cues, we introduce a new fusion technique which significantly improves the detection performance. Further, we cast temporal action detection as two energy optimisation problems which are solved using the Viterbi algorithm. Exploiting a video-level representation further allows the network to learn the inter-frame temporal correspondence between action regions, and should therefore be a better solution to the action detection problem than a frame-level representation. Secondly, we propose a novel deep network architecture which learns a video-level action representation by classifying and regressing 3D region proposals spanning two successive video frames. The proposed model is end-to-end trainable and can be jointly optimised for both proposal generation and action detection objectives in a single training step. We name our new network "AMTnet" (Action Micro-Tube regression Network). We further extend the AMTnet model by incorporating optical flow features to encode motion patterns of actions. Finally, we address the problem of action instance segmentation in which multiple concurrent actions of the same class may be segmented out of an image sequence. 
By taking advantage of recent work on action foreground-background segmentation, we are able to associate each action tube with class-specific segmentations. We demonstrate the performance of our proposed models on challenging action detection benchmarks, achieving new state-of-the-art results across the board and significantly increasing detection speed at test time.
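
The Viterbi step mentioned in this abstract can be sketched in its generic form: given a per-frame score for each label, find the label sequence that maximises the total score minus a penalty for every label change between consecutive frames. This is a standard dynamic programme, not the thesis's two-pass formulation; the score table, the `switch_penalty` parameter, and the function name are assumptions made for illustration.

```python
def viterbi(scores, switch_penalty):
    """Return the best label path for scores[t][k] (frame t, label k).

    The path maximises the sum of per-frame scores minus switch_penalty
    for each change of label between consecutive frames.
    """
    n_labels = len(scores[0])
    best = list(scores[0])  # best total score of any path ending in each label
    back = []               # back[t-1][k]: best previous label for label k at frame t
    for t in range(1, len(scores)):
        new_best, pointers = [], []
        for k in range(n_labels):
            candidates = [
                best[j] - (switch_penalty if j != k else 0.0)
                for j in range(n_labels)
            ]
            j_star = max(range(n_labels), key=candidates.__getitem__)
            new_best.append(candidates[j_star] + scores[t][k])
            pointers.append(j_star)
        best = new_best
        back.append(pointers)

    # Trace the optimal path backwards from the best final label.
    label = max(range(n_labels), key=best.__getitem__)
    path = [label]
    for pointers in reversed(back):
        label = pointers[label]
        path.append(label)
    return path[::-1]


# Hypothetical per-frame scores for two labels (0 = background, 1 = action).
scores = [[0.9, 0.1], [0.4, 0.6], [0.8, 0.2]]
smooth_path = viterbi(scores, switch_penalty=0.5)
greedy_path = viterbi(scores, switch_penalty=0.0)
```

With no penalty the path simply follows the highest score in each frame, while a positive penalty suppresses the single-frame switch in the middle; this smoothing effect is why such energies are useful for linking frame-level detections over time.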

    Creationism and evolution

    In Tower of Babel, Robert Pennock wrote that “defenders of evolution would help their case immeasurably if they would reassure their audience that morality, purpose, and meaning are not lost by accepting the truth of evolution.” We first consider the thesis that the creationist movement exploits moral concerns to spread its ideas against the theory of evolution. We analyze its arguments and possible reasons why they are easily accepted. Creationists usually employ two contradictory strategies to expose the purported moral degradation that comes with accepting the theory of evolution. On the one hand, they claim that evolutionary theory is immoral. On the other hand, they think of evolutionary theory as amoral. Both objections come naturally in a monotheistic view, but we can find similar conclusions about the supposed moral aspects of evolution in non-religiously inspired discussions. Meanwhile, the creationism-evolution debate mainly focuses, understandably, on what constitutes good science. We consider the need for moral reassurance and analyze reassuring arguments from philosophers. Philosophers may stress that science does not prescribe and is therefore not immoral, but this reaction opens the door to the objection of amorality that evolution, at least as a naturalistic world view, supposedly endorses. We suggest that the topic of morality and its relation to the acceptance of evolution may need more empirical research.

    Hilbert's Metamathematical Problems and Their Solutions

    This dissertation examines, from a metalogical perspective, several of the problems that Hilbert discovered in the foundations of mathematics. The problems manifest themselves in four different aspects of Hilbert’s views: (i) Hilbert’s axiomatic approach to the foundations of mathematics; (ii) his response to criticisms of set theory; (iii) his response to intuitionist criticisms of classical mathematics; (iv) Hilbert’s contribution to the specification of the role of logical inference in mathematical reasoning. This dissertation argues that Hilbert’s axiomatic approach was guided primarily by model-theoretical concerns. Accordingly, the ultimate aim of his consistency program was to prove the model-theoretical consistency of mathematical theories. It turns out that carrying out such consistency proofs requires a suitable modification of ordinary first-order logic, and that independence-friendly (IF) logic provides the appropriate conceptual framework for this modification. It is then shown how the model-theoretical consistency of arithmetic can be proved by using IF logic as its basic logic. Hilbert’s other problems, manifesting themselves as aspects (ii), (iii), and (iv), can likewise be approached by using the resources of IF logic; the most notable are the problem of the status of the axiom of choice, the problem of the role of the law of excluded middle, and the problem of giving an elementary account of quantification. It is shown that by means of IF logic one can carry out Hilbertian solutions to all these problems. The two major results concerning aspects (ii), (iii) and (iv) are the following: (a) The axiom of choice is a logical principle; (b) The law of excluded middle divides metamathematical methods into elementary and non-elementary ones. It is argued that these results show that IF logic helps to vindicate Hilbert’s nominalist philosophy of mathematics. 
On the basis of an elementary approach to logic, which enriches the expressive resources of ordinary first-order logic, this dissertation shows how the different problems that Hilbert discovered in the foundations of mathematics can be solved