73 research outputs found

    Semantic-aware Video Representation for Few-shot Action Recognition

    Recent work on action recognition leverages 3D features and textual information to achieve state-of-the-art performance. However, most current few-shot action recognition methods still rely on 2D frame-level representations, often require additional components to model temporal relations, and employ complex distance functions to align these representations accurately. In addition, existing methods struggle to integrate textual semantics effectively: some resort to concatenation or addition of textual and visual features, and some use text merely as additional supervision, without truly achieving feature fusion and information transfer across modalities. In this work, we propose a simple yet effective Semantic-Aware Few-Shot Action Recognition (SAFSAR) model to address these issues. We show that directly leveraging a 3D feature extractor, combined with an effective feature-fusion scheme and a simple cosine similarity for classification, can yield better performance without the need for extra components for temporal modeling or complex distance functions. We introduce an innovative scheme to encode the textual semantics into the video representation, which adaptively fuses features from text and video and encourages the visual encoder to extract more semantically consistent features. With this scheme, SAFSAR achieves alignment and fusion in a compact way. Experiments on five challenging few-shot action recognition benchmarks under various settings demonstrate that the proposed SAFSAR model significantly improves the state-of-the-art performance. Comment: WACV202
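
The cosine-similarity classification step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that each class is represented by the mean of its support features (a prototype), which the abstract does not state; the feature vectors, labels, and function names below are hypothetical and are not taken from SAFSAR.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def class_prototypes(support):
    """Average the support features of each class (assumed prototype rule)."""
    prototypes = {}
    for label, feats in support.items():
        dim = len(feats[0])
        prototypes[label] = [sum(f[i] for f in feats) / len(feats) for i in range(dim)]
    return prototypes

def classify(query, prototypes):
    """Return the label whose prototype is most cosine-similar to the query."""
    return max(prototypes, key=lambda label: cosine_similarity(query, prototypes[label]))

# Hypothetical 2-way 2-shot episode with 3-dimensional features.
support = {
    "jump": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "run": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
}
prototypes = class_prototypes(support)
prediction = classify([0.7, 0.3, 0.0], prototypes)
print(prediction)
```

In a real pipeline the vectors would come from the fused video-text representation; here they are hand-written numbers chosen only to make the comparison visible.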

    A Dynamical Systems Approach to Classification of Surgical Gestures in Kinematic and Video Data

    In Computer Assisted Intervention (CAI) systems, a surgeon performs the surgery using an interface connected to a computer that remotely controls a set of surgical tools attached to a robot. Such systems are particularly appealing for minimally invasive surgeries since they allow for a larger and more precise set of movements than traditional laparoscopic interventions, and provide enhanced vision capabilities such as 3D vision and augmented reality. These features directly translate into benefits for patients such as smaller incisions, less pain and quicker healing. However, the benefits of the technology might be reduced by the steep learning curve associated with CAI systems. This makes it necessary to establish a fair and objective criterion for evaluating and assessing the skills of a novice surgeon. Furthermore, it is desirable to automate the process in order to avoid constant supervision by an expert surgeon, which is time-consuming, subjective and rather inefficient. It is therefore necessary to develop algorithmic methods that extract information from kinematic cues provided by the robot and from video recordings of the interventions. A common approach is to divide the surgical procedure into smaller actions, forming a vocabulary able to describe different surgical tasks. Following such an approach requires a method capable of providing temporal segmentation, recognition of the action and final skill assessment. Prior work has usually modeled the interactions between these atomic actions using generative models such as Hidden Markov Models, Factor Analysis and Switching Linear Dynamical Systems. In this thesis, we focus on the classification problem and assume segmented data. We propose to follow a discriminative approach using Linear Dynamical Systems (LDS) to model and characterize a particular action. We develop new methods for extracting meaningful representations by means of averaging in the space of LDSs. These representative points are then used in a discriminative framework for surgical gesture classification. We propose a novel SVM classification method for time series that reduces computation at the expense of some degradation in performance. Our contributions are fairly general and can be applied to any temporal signal coming from an LDS.

    Topology Optimization For Energy-Efficient Communications In Consensus Wireless Networks

    Over the past years there has been increasing interest in developing distributed computation methods over wireless networks. A new communication paradigm has emerged in which distributed algorithms such as consensus play a key role. A special case is wireless sensor networks (WSNs), which have found application in a large variety of problems such as environmental monitoring, surveillance, and localization, to cite a few. One major design issue in WSNs is energy efficiency. Nodes are typically battery-powered devices, so it is critical to make proper use of the scarce energy resources. This motivates the search for conditions that favor energy-efficient communication. It is well known that the rate at which information spreads across the network depends on the network topology, and that finding the optimal topology is a hard combinatorial problem. However, using convex optimization tools, we propose a method that seeks the optimal topology in a consensus wireless network that uses broadcast messages. Our results show that exploiting the broadcast nature of the wireless channel leads to more energy-efficient configurations than using dedicated unicast messages, and that our algorithm performs very close to the optimal solution.
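
The dependence of consensus speed on topology can be seen in a small simulation. The following is a minimal sketch of standard average consensus with a fixed step size, not the topology-optimization method of the paper; the graphs, step sizes, tolerance, and function name are illustrative assumptions.

```python
def consensus_steps(neighbors, values, step, tol=1e-6, max_iter=10000):
    """Run x_i <- x_i + step * sum_j (x_j - x_i) until all nodes agree within tol.

    Returns the number of iterations used and the final node values.
    """
    x = list(values)
    for k in range(max_iter):
        if max(x) - min(x) < tol:
            return k, x
        x = [xi + step * sum(x[j] - xi for j in neighbors[i])
             for i, xi in enumerate(x)]
    return max_iter, x

n = 6
# Two hypothetical topologies on the same six nodes.
ring = {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}
complete = {i: [j for j in range(n) if j != i] for i in range(n)}
values = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]  # initial measurements, average 2.5

# Step sizes are chosen below 1 / (maximum degree) so the iteration is stable.
ring_steps, ring_x = consensus_steps(ring, values, step=0.3)
full_steps, full_x = consensus_steps(complete, values, step=1.0 / n)
print(ring_steps, full_steps)
```

The complete graph reaches the average much faster than the ring, but every node must reach every other node, which is exactly the kind of communication cost that makes the choice of topology an energy trade-off.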

    Enhancing local-Transmitting less-Improving global

    Super-resolving a natural image is an ill-posed problem. The classical approach is based on the registration and subsequent interpolation of a given set of low-resolution images. However, achieving satisfactory results typically requires combining a large number of them. Such an approach would be impractical over heterogeneous rate-constrained wireless networks due to the associated communication cost and the limited data available. In this paper, we present an approach for local image enhancement following the finite rate of innovation sampling framework, and motivate its application to the super-resolution problem over heterogeneous networks. Local estimates can be exchanged among the nodes of the network in order to regularize the super-resolution problem while, at the same time, reducing data exchange.

    Shape from bandwidth: the 2-D orthogonal projection case

    Could bandwidth, one of the most classic concepts in signal processing, have a new purpose? In this paper, we investigate the feasibility of using bandwidth to infer shape from a single image. As a first analysis, we limit our attention to orthographic projection and assume a 2-D world. We show that, under certain conditions, a single image of a surface painted with a bandlimited texture is enough to deduce the surface up to an equivalence class. This equivalence class is unavoidable, since it stems from surface transformations that are invisible to orthographic projections. A proof-of-concept algorithm is presented and tested with both a simulation and a simple practical experiment.

    Unlabeled Sensing: Reconstruction Algorithm and Theoretical Guarantees

    We are often interested in reconstructing an unknown signal from partial measurements. It is typically assumed that the location (temporal or spatial) of the samples is known and that the only distortion present in the observations is additive measurement noise. However, there are applications where such location information is lost. In this paper, we consider the situation in which the order of the noisy samples produced by a linear measurement system is missing. Previous work on this topic has only considered the noiseless case and exhaustive-search combinatorial algorithms. We propose a much more efficient algorithm based on a geometrical viewpoint of the problem. We also study the uniqueness of the solution under different choices of the sampling matrix, and its robustness to noise for the case of two-dimensional signals. Finally, we provide simulation results to confirm the theoretical findings of the paper.
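
To make the problem setting concrete, the following is a minimal sketch of the exhaustive-search baseline that the abstract attributes to earlier work, not the geometric algorithm the paper proposes. It assumes a two-dimensional unknown and a noiseless toy example; the sampling matrix, signal, and function names are hypothetical.

```python
from itertools import permutations

def least_squares_2d(A, y):
    """Solve min ||A x - y||^2 for x in R^2 via the 2x2 normal equations."""
    a11 = sum(r[0] * r[0] for r in A)
    a12 = sum(r[0] * r[1] for r in A)
    a22 = sum(r[1] * r[1] for r in A)
    b1 = sum(r[0] * yi for r, yi in zip(A, y))
    b2 = sum(r[1] * yi for r, yi in zip(A, y))
    det = a11 * a22 - a12 * a12
    x = ((a22 * b1 - a12 * b2) / det, (a11 * b2 - a12 * b1) / det)
    residual = sum((r[0] * x[0] + r[1] * x[1] - yi) ** 2 for r, yi in zip(A, y))
    return x, residual

def brute_force_unlabeled(A, shuffled):
    """Try every ordering of the samples and keep the best least-squares fit."""
    best_x, best_res = None, float("inf")
    for order in permutations(shuffled):
        x, res = least_squares_2d(A, order)
        if res < best_res:
            best_x, best_res = x, res
    return best_x, best_res

# Hypothetical sampling matrix and unknown signal x = (2, -1).
A = [[1, 0], [0, 1], [1, 1], [1, 2]]
shuffled = [1, 0, 2, -1]  # the samples A x = [2, -1, 1, 0] in lost order
x_hat, residual = brute_force_unlabeled(A, shuffled)
print(x_hat, residual)
```

The loop visits all m! orderings of the m samples, which is why exhaustive search becomes impractical beyond very small problems and why a more efficient algorithm is of interest.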

    Sampling at unknown locations: Uniqueness and reconstruction under constraints

    Traditional sampling results assume that the sample locations are known. Motivated by simultaneous localization and mapping (SLAM) and structure from motion (SfM), we investigate sampling at unknown locations. Without further constraints, the problem is often hopeless. For example, we recently showed that, for polynomial and bandlimited signals, it is possible to find two signals, arbitrarily far from each other, that fit the measurements. However, we also showed that this can be overcome by adding constraints to the sample positions. In this paper, we show that these constraints lead to a uniform sampling of a composition of functions. Furthermore, the formulation retains the key aspects of the SLAM and SfM problems, whilst providing uniqueness in many cases. We demonstrate this by studying two simple examples of constrained sampling at unknown locations. In the first, we consider sampling a periodic bandlimited signal composed with an unknown linear function. We derive the sampling requirements for uniqueness and present an algorithm that recovers both the bandlimited signal and the linear warping. Furthermore, we prove that, when the requirements for uniqueness are not met, the cases of multiple solutions have measure zero. For our second example, we consider polynomials sampled such that the sampling positions are constrained by a rational function. We previously proved that, if a specific sampling requirement is met, uniqueness is achieved. In addition, we present an alternating minimization scheme for solving the resulting non-convex optimization problem. Finally, fully reproducible simulation results are provided to support our theoretical analysis.

    SF3B1-mutant MDS as a distinct disease subtype: a proposal from the International Working Group for the Prognosis of MDS

    The 2016 revision of the World Health Organization classification of tumors of hematopoietic and lymphoid tissues is characterized by a closer integration of morphology and molecular genetics. Notwithstanding, the myelodysplastic syndrome (MDS) with isolated del(5q) remains so far the only MDS subtype defined by a genetic abnormality. Approximately half of MDS patients carry somatic mutations in spliceosome genes, with SF3B1 being the most commonly mutated one. SF3B1 mutation identifies a condition characterized by ring sideroblasts (RS), ineffective erythropoiesis, and indolent clinical course. A large body of evidence supports recognition of SF3B1-mutant MDS as a distinct nosologic entity. To further validate this notion, we interrogated the data set of the International Working Group for the Prognosis of MDS (IWG-PM). Based on the findings of our analyses, we propose the following diagnostic criteria for SF3B1-mutant MDS: (1) cytopenia as defined by standard hematologic values, (2) somatic SF3B1 mutation, (3) morphologic dysplasia (with or without RS), and (4) bone marrow blasts <5% and peripheral blood blasts <1%. Selected concomitant genetic lesions represent exclusion criteria for the proposed entity. In patients with clonal cytopenia of undetermined significance, SF3B1 mutation is almost invariably associated with subsequent development of overt MDS with RS, suggesting that this genetic lesion might provide presumptive evidence of MDS in the setting of persistent unexplained cytopenia. Diagnosis of SF3B1-mutant MDS has considerable clinical implications in terms of risk stratification and therapeutic decision making. In fact, this condition has a relatively good prognosis and may respond to luspatercept with abolishment of the transfusion requirement. (Blood. 2020;136(2):157-170)