
    Detecting Semantic Parts on Partially Occluded Objects

    In this paper, we address the task of detecting semantic parts on partially occluded objects. We consider a scenario where the model is trained on non-occluded images but tested on occluded images. The motivation is that there are infinitely many occlusion patterns in the real world, which cannot be fully covered by the training data. Models should therefore be inherently robust and adaptive to occlusion instead of fitting the occlusion patterns seen during training. Our approach detects semantic parts by accumulating the confidence of local visual cues. Specifically, the method uses a simple voting scheme, based on log-likelihood ratio tests and spatial constraints, to combine the evidence of local cues. These cues are called visual concepts, which are derived by clustering the internal states of deep networks. We evaluate our voting scheme on the VehicleSemanticPart dataset with dense part annotations. We randomly place two, three or four irrelevant objects onto the target object to generate test images with various occlusions. Experiments show that our algorithm outperforms several competitors in semantic part detection when occlusions are present. Comment: Accepted to BMVC 2017 (13 pages, 3 figures)
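The voting idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cue tuple layout, the `vote_for_part` and `best_location` names, the square voting window, and the probabilities are all assumptions made for illustration. Each detected cue casts a log-likelihood-ratio vote at the location where it expects the part to be, and the votes are summed into a score map.

```python
import math

def vote_for_part(cues, height, width, radius=0):
    """Accumulate log-likelihood-ratio votes from local cues into a score map.

    Each cue is a hypothetical tuple (row, col, d_row, d_col, p_part, p_bg):
    (row, col) is where the cue fired, (d_row, d_col) is an assumed learned
    offset from the cue to the part centre, and p_part / p_bg are the assumed
    probabilities of observing the cue when the part is present / absent.
    """
    score = [[0.0] * width for _ in range(height)]
    for row, col, d_row, d_col, p_part, p_bg in cues:
        llr = math.log(p_part / p_bg)  # evidence contributed by this cue
        centre_r, centre_c = row + d_row, col + d_col
        # Spatial constraint: the vote is spread over a small window only.
        for r in range(centre_r - radius, centre_r + radius + 1):
            for c in range(centre_c - radius, centre_c + radius + 1):
                if 0 <= r < height and 0 <= c < width:
                    score[r][c] += llr
    return score

def best_location(score):
    """Return the (row, col) with the highest accumulated score."""
    cells = ((r, c) for r in range(len(score)) for c in range(len(score[0])))
    return max(cells, key=lambda rc: score[rc[0]][rc[1]])

# Two cues on either side of the part both vote for cell (2, 3).
cues = [(2, 2, 0, 1, 0.8, 0.2), (2, 4, 0, -1, 0.6, 0.3)]
full = vote_for_part(cues, height=5, width=6)
# If the second cue is hidden by an occluder, the first still votes.
occluded = vote_for_part(cues[:1], height=5, width=6)
```

Because the score is a sum of independent votes, losing one cue lowers the peak but does not move it, which is the kind of robustness to occlusion the abstract describes.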

    Stylized Table Tennis Robots Skill Learning with Incomplete Human Demonstrations

    In recent years, Reinforcement Learning (RL) has become a popular technique for training robot controllers. However, for complex dynamic robot control tasks, RL-based methods often produce controllers with unrealistic styles. In contrast, humans can learn well-stylized skills under supervision. For example, people learn table tennis skills by imitating the motions of coaches. Such reference motions are often incomplete, e.g. performed without an actual ball. Inspired by this, we propose an RL-based algorithm to train a robot that can learn the playing style from such incomplete human demonstrations. We collect data through the teaching-and-dragging method. We also propose data augmentation techniques to enable our robot to adapt to balls of different velocities. Finally, we evaluate our policy in different simulators with varying dynamics. Comment: Submitted to ICRA 202

    Visual Concepts and Compositional Voting

    It is very attractive to formulate vision in terms of pattern theory \cite{Mumford2010pattern}, where patterns are defined hierarchically by compositions of elementary building blocks. But applying pattern theory to real-world images is currently less successful than discriminative methods such as deep networks. Deep networks, however, are black boxes that are hard to interpret and can easily be fooled by adding occluding objects. It is natural to wonder whether, by better understanding deep networks, we can extract building blocks that can be used to develop pattern-theoretic models. This motivates us to study the internal representations of a deep network using vehicle images from the PASCAL3D+ dataset. We use clustering algorithms to study the population activities of the features and extract a set of visual concepts, which we show are visually tight and correspond to semantic parts of vehicles. To analyze this, we annotate these vehicles by their semantic parts to create a new dataset, VehicleSemanticParts, and evaluate visual concepts as unsupervised part detectors. We show that visual concepts perform fairly well but are outperformed by supervised discriminative methods such as Support Vector Machines (SVM). We next give a more detailed analysis of visual concepts and how they relate to semantic parts. Following this, we use the visual concepts as building blocks for a simple pattern-theoretic model, which we call compositional voting. In this model, several visual concepts combine to detect semantic parts. We show that this approach is significantly better than discriminative methods like SVM and deep networks trained specifically for semantic part detection. Finally, we return to studying occlusion by creating an annotated dataset with occlusion, called VehicleOcclusion, and show that compositional voting outperforms even deep networks when the amount of occlusion becomes large. Comment: Accepted by Annals of Mathematical Sciences and Applications
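As a rough illustration of extracting visual concepts by clustering network features, here is a minimal sketch. The abstract names only "clustering algorithms", so the choice of plain K-means, the L2 normalization step, the caller-supplied initial centres, and the toy 2-D vectors standing in for deep-network feature vectors are all assumptions.

```python
import math

def normalize(vector):
    """Scale a feature vector to unit L2 norm (left unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, initial_centres, iterations=10):
    """Plain K-means; each resulting centre plays the role of one 'visual concept'."""
    centres = [list(c) for c in initial_centres]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: attach every vector to its nearest centre.
        for i, v in enumerate(vectors):
            labels[i] = min(range(len(centres)),
                            key=lambda k: squared_distance(v, centres[k]))
        # Update step: move each centre to the mean of its members.
        for k in range(len(centres)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                centres[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centres

# Toy stand-ins for feature vectors taken at four image positions.
features = [normalize(v) for v in [[1.0, 0.1], [1.0, 0.2], [0.1, 1.0], [0.2, 1.0]]]
labels, centres = kmeans(features, initial_centres=[features[0], features[2]])
```

In this toy run, positions with similar features share a label, which mirrors the claim that a cluster of features corresponds to a visually consistent part.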

    Bayesian dense inverse searching algorithm for real-time stereo matching in minimally invasive surgery

    This paper reports a CPU-level real-time stereo matching method for surgical images (10 Hz on 640 × 480 images with a single core of an i5-9400). The proposed method is built on the fast "dense inverse searching" algorithm, which estimates the disparity of the stereo images. Overlapping image patches (arbitrary square image segments) from the images at different scales are aligned under the photometric consistency assumption. We propose a Bayesian framework to evaluate the probability of the optimized patch disparity at different scales. Moreover, we introduce a spatial Gaussian mixed probability distribution to address the pixel-wise probability within the patch. In-vivo and synthetic experiments show that our method can handle ambiguities resulting from textureless surfaces and the photometric inconsistency caused by the Lambertian reflectance. Our Bayesian method correctly balances the probability of the patch for stereo images at different scales. Experiments indicate that the estimated depth has higher accuracy and fewer outliers than the baseline methods in the surgical scenario.

    Diffusion–reaction–induced stress in moving boundary cylindrical Li-ion battery electrodes

    Lithium (Li) inserted into or extracted from the electrode in a Li-ion battery causes stress, which may fracture the electrode. A moving boundary model of a cylindrical Li-ion battery electrode that accounts for the reversible electrochemical reaction is developed. The volumetric change created by Li diffusion and the formation of the reversible reaction product generates diffusion–reaction-induced stress in the electrode. The constitutive relation among Li concentration, reaction product, and stress is derived, and numerical solutions for the concentration, reaction product, and stress fields are obtained. The effects of phase transformation and the reversible electrochemical reaction on Li diffusion and stress in a cylindrical Li-ion battery electrode are analyzed.

    Analysis of Nakamoto Consensus, Revisited

    In the Bitcoin white paper, Nakamoto proposed a very simple Byzantine fault tolerant consensus algorithm that is also known as Nakamoto consensus. Despite its simplicity, some existing analyses of Nakamoto consensus appear to be long and involved. In this technical report, we aim to make such analysis simple and transparent so that we can teach it to senior undergraduate students and graduate students in our institutions. This report is largely based on a 3-hour tutorial given by one of the authors in June 2019. Comment: 8 pages

    Incorporating Heterogeneous User Behaviors and Social Influences for Predictive Analysis

    Behavior prediction based on historical behavioral data has practical real-world significance. It has been applied in recommendation, academic performance prediction, and other areas. With the refinement of user data description, the development of new functions, and the fusion of multiple data sources, heterogeneous behavioral data containing multiple types of behaviors are becoming increasingly common. In this paper, we aim to incorporate heterogeneous user behaviors and social influences for behavior prediction. To this end, we propose a variant of Long Short-Term Memory (LSTM) that can consider context information while modeling a behavior sequence, a projection mechanism that can model multi-faceted relationships among different types of behaviors, and a multi-faceted attention mechanism that can dynamically identify informative periods from different facets. Many kinds of behavioral data are spatio-temporal. We propose an unsupervised way to construct a social behavior graph from spatio-temporal data and to model social influences. Moreover, a residual learning-based decoder is designed to automatically construct multiple high-order cross features based on the social behavior representation and other types of behavior representations. Qualitative and quantitative experiments on real-world datasets demonstrate the effectiveness of this model.