134 research outputs found

    Study of object recognition and identification based on shape and texture analysis

    Get PDF
    The objective of object recognition is to enable computers to recognize image patterns without human intervention. According to its applications, it is mainly divided into two parts: recognition of object categories and detection/identification of objects. My thesis studied the techniques of object feature analysis and identification strategies, which solve the object recognition problem by employing effective and perceptually important object features. The shape information is of particular interest and a review of the shape representation and description is presented, as well as the latest research work on object recognition. In the second chapter of the thesis, a novel content-based approach is proposed for efficient shape classification and retrieval of 2D objects. Two object detection approaches, which are designed according to the characteristics of the shape context and SIFT descriptors, respectively, are analyzed and compared. It is found that the identification strategy constructed on a single type of object feature is only able to recognize the target object under specific conditions which the identifier is adapted to. These identifiers are usually designed to detect the target objects which are rich in the feature type captured by the identifier. In addition, this type of feature often distinguishes the target object from the complex scene. To overcome this constraint, a novel prototyped-based object identification method is presented to detect the target object in the complex scene by employing different types of descriptors to capture the heterogeneous features. All types of descriptors are modified to meet the requirement of the detection strategy’s framework. Thus this new method is able to describe and identify various kinds of objects whose dominant features are quite different. The identification system employs the cosine similarity to evaluate the resemblance between the prototype image and image windows on the complex scene. Then a ‘resemblance map’ is established with values on each patch representing the likelihood of the target object’s presence. The simulation approved that this novel object detection strategy is efficient, robust and of scale and rotation invariance

    Ultrafast laser-induced spin/electron dynamic in advanced material

    Get PDF
    I have commissioned the instrumentation of a one-colour time-resolved optical pump-probe experimental system and studied carrier dynamics in a monolayer MoSe2 sample. The system has been first tested and optimized by performing time-resolved measurements at room temperatures in an intrinsic GaAs wafer. Different modulation methods have been extensively investigated in order to increase SNR ratio. Comparing the experiment data and simulation results, double-chopping modulation is found to significantly enhance the signal-to-noise ratio. Transient reflectivity and Kerr rotation have been obtained in the GaAs test sample under various pump fluency, polarisation, and wavelength. The carrier dynamics in the GaAs sample can be reproduced and three distinct time scales have been revealed: τ_1~1.5 ps,τ_2~10 ps,τ_3~100 ps, which are consistent to what have been reported. The same experimental methods are then used to investigate the carrier dynamics of a monolayer MoSe2. From the wavelength dependent transient reflectivity data, the band gap of the MoSe2 sample is confirmed to be lower than 820 nm at room temperature, which is greatly reduced compared with its reported value at zero temperature. The reflectivity is found falling to negative after the initial positive peak at wavelengths longer than 790 nm. We suggest this phenomenon arise from the bound of trion and nonradioactive combination of excitons. All the reflectivity data is fitted by a bi-exponential decay function. fitting weight constants for these two dynamics have a clear dependence on laser wavelength. Transient Kerr rotation is also observed over the range of wavelength from 790 nm to 820 nm. The decay of the Kerr rotation is within the first picosecond and doesn’t show an obvious wavelength dependence, which suggest an ultrafast spin relaxation time in MoSe2

    Leveraging Work-Related Stressors for Employee Innovation: The Moderating Role of Enterprise Social Networking Use

    Get PDF
    Enterprise social networking (ESN) techniques have been widely adopted by firms to provide a platform for public communication among employees. This study investigates how the relationships between stressors (i.e., challenge and hindrance stressors) and employee innovation are moderated by task-oriented and relationship-oriented ESN use. Since challenge-hindrance stressors and employee innovation are individual-level variables and task-oriented ESN use and relationship-oriented ESN use are team-level variables, we thus use hierarchical linear model to test this cross-level model. The results of a survey of 191 employees in 50 groups indicate that two ESN use types differentially moderate the relationship between stressors and employee innovation. Specifically, task-oriented ESN use positively moderates the effects of the two stressors on employee innovation, while relationship-oriented ESN use negatively moderates the relationship between the two stressors and employee innovation. In addition, we find that challenge stressors significantly improve employee innovation. Theoretical and practical implications are discussed

    What Does Stable Diffusion Know about the 3D Scene?

    Full text link
    Recent advances in generative models like Stable Diffusion enable the generation of highly photo-realistic images. Our objective in this paper is to probe the diffusion network to determine to what extent it 'understands' different properties of the 3D scene depicted in an image. To this end, we make the following contributions: (i) We introduce a protocol to evaluate whether a network models a number of physical 'properties' of the 3D scene by probing for explicit features that represent these properties. The probes are applied on datasets of real images with annotations for the property. (ii) We apply this protocol to properties covering scene geometry, scene material, support relations, lighting, and view dependent measures. (iii) We find that Stable Diffusion is good at a number of properties including scene geometry, support relations, shadows and depth, but less performant for occlusion. (iv) We also apply the probes to other models trained at large-scale, including DINO and CLIP, and find their performance inferior to that of Stable Diffusion

    Evaluating Explanation Methods for Vision-and-Language Navigation

    Full text link
    The ability to navigate robots with natural language instructions in an unknown environment is a crucial step for achieving embodied artificial intelligence (AI). With the improving performance of deep neural models proposed in the field of vision-and-language navigation (VLN), it is equally interesting to know what information the models utilize for their decision-making in the navigation tasks. To understand the inner workings of deep neural models, various explanation methods have been developed for promoting explainable AI (XAI). But they are mostly applied to deep neural models for image or text classification tasks and little work has been done in explaining deep neural models for VLN tasks. In this paper, we address these problems by building quantitative benchmarks to evaluate explanation methods for VLN models in terms of faithfulness. We propose a new erasure-based evaluation pipeline to measure the step-wise textual explanation in the sequential decision-making setting. We evaluate several explanation methods for two representative VLN models on two popular VLN datasets and reveal valuable findings through our experiments.Comment: Accepted by ECAI 202

    Learning Environment-Aware Affordance for 3D Articulated Object Manipulation under Occlusions

    Full text link
    Perceiving and manipulating 3D articulated objects in diverse environments is essential for home-assistant robots. Recent studies have shown that point-level affordance provides actionable priors for downstream manipulation tasks. However, existing works primarily focus on single-object scenarios with homogeneous agents, overlooking the realistic constraints imposed by the environment and the agent's morphology, e.g., occlusions and physical limitations. In this paper, we propose an environment-aware affordance framework that incorporates both object-level actionable priors and environment constraints. Unlike object-centric affordance approaches, learning environment-aware affordance faces the challenge of combinatorial explosion due to the complexity of various occlusions, characterized by their quantities, geometries, positions and poses. To address this and enhance data efficiency, we introduce a novel contrastive affordance learning framework capable of training on scenes containing a single occluder and generalizing to scenes with complex occluder combinations. Experiments demonstrate the effectiveness of our proposed approach in learning affordance considering environment constraints. Project page at https://chengkaiacademycity.github.io/EnvAwareAfford/Comment: In 37th Conference on Neural Information Processing Systems (NeurIPS 2023). Website at https://chengkaiacademycity.github.io/EnvAwareAfford

    Image-based Visual Servo Control for Aerial Manipulation Using a Fully-Actuated UAV

    Full text link
    Using Unmanned Aerial Vehicles (UAVs) to perform high-altitude manipulation tasks beyond just passive visual application can reduce the time, cost, and risk of human workers. Prior research on aerial manipulation has relied on either ground truth state estimate or GPS/total station with some Simultaneous Localization and Mapping (SLAM) algorithms, which may not be practical for many applications close to infrastructure with degraded GPS signal or featureless environments. Visual servo can avoid the need to estimate robot pose. Existing works on visual servo for aerial manipulation either address solely end-effector position control or rely on precise velocity measurement and pre-defined visual visual marker with known pattern. Furthermore, most of previous work used under-actuated UAVs, resulting in complicated mechanical and hence control design for the end-effector. This paper develops an image-based visual servo control strategy for bridge maintenance using a fully-actuated UAV. The main components are (1) a visual line detection and tracking system, (2) a hybrid impedance force and motion control system. Our approach does not rely on either robot pose/velocity estimation from an external localization system or pre-defined visual markers. The complexity of the mechanical system and controller architecture is also minimized due to the fully-actuated nature. Experiments show that the system can effectively execute motion tracking and force holding using only the visual guidance for the bridge painting. To the best of our knowledge, this is one of the first studies on aerial manipulation using visual servo that is capable of achieving both motion and force control without the need of external pose/velocity information or pre-defined visual guidance.Comment: Accepted by 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS

    Score-PA: Score-based 3D Part Assembly

    Full text link
    Autonomous 3D part assembly is a challenging task in the areas of robotics and 3D computer vision. This task aims to assemble individual components into a complete shape without relying on predefined instructions. In this paper, we formulate this task from a novel generative perspective, introducing the Score-based 3D Part Assembly framework (Score-PA) for 3D part assembly. Knowing that score-based methods are typically time-consuming during the inference stage. To address this issue, we introduce a novel algorithm called the Fast Predictor-Corrector Sampler (FPC) that accelerates the sampling process within the framework. We employ various metrics to assess assembly quality and diversity, and our evaluation results demonstrate that our algorithm outperforms existing state-of-the-art approaches. We release our code at https://github.com/J-F-Cheng/Score-PA_Score-based-3D-Part-Assembly.Comment: BMVC 202
    • …
    corecore