
    Motion-Adaptive Few-Shot Fault Detection Method for Industrial Robot Gearboxes via a Residual Convolutional Neural Network

    Get PDF
    Thesis (Master's) -- Seoul National University Graduate School: Department of Mechanical Engineering, College of Engineering, August 2020. Advisor: Byeng D. Youn.
    Nowadays, industrial robots are indispensable equipment for automated manufacturing processes because they can perform repetitive tasks with consistent precision and accuracy. However, a fault in an industrial robot can lead to an unexpected shutdown of the production line and significant economic losses, so fault detection is important. The gearbox, one of the main drivetrain components of an industrial robot, is often subjected to high torque loads, and faults occur frequently. When faults occur in the gearbox, the amplitude and frequency of the torque signal are modulated, which changes the characteristics of the torque signal. Although several previous studies have proposed fault detection methods for industrial robots using torque signals, it remains a challenge to extract fault-related features under various environmental and operating conditions and to detect faults in the complex motions used at industrial sites. To overcome these difficulties, this paper proposes a novel motion-adaptive few-shot (MAFS) fault detection method for industrial robot gearboxes using torque ripples, via a one-dimensional (1D) residual-convolutional neural network (Res-CNN) and binary-supervised domain adaptation (BSDA). The overall procedure of the proposed method is as follows. First, a moving-average filter is applied to the torque signal to extract its trend, and the high-frequency torque ripples are obtained as the residual between the original and filtered signals. Second, the state of the pre-processed torque ripples is classified under various operating and environmental conditions. The Res-CNN network is shown to 1) distinguish small differences between normal and faulty torque ripples effectively, and 2) focus on important regions of the input data through its attention effect. Third, a Siamese network is constructed from a network pre-trained on the source domain, which consists of simple motions, and faults are detected in the target domain, which consists of complex motions, through BSDA. As a result, 1) the similarities of the physical mechanisms jointly shared by the torque ripples of simple and complex motions are learned, and 2) gearbox faults are detected adaptively while the industrial robot executes complex motions. The proposed method achieved the highest accuracy among deep-learning-based methods under few-shot conditions in which only one cycle each of normal and faulty complex-motion data is available. In addition, the transferable regions of the torque ripples after domain adaptation were highlighted using 1D guided Grad-CAM. The effectiveness of the proposed method was validated with experimental data from multi-axial welding motions at constant and transient speeds, which are commonly executed in real industrial settings such as automobile manufacturing lines. Furthermore, the proposed method is expected to be applicable to other types of motions, such as inspection, painting, and assembly. The source code is available at https://github.com/oyt9306/MAFS.
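    The pre-processing step above separates the high-frequency ripple content from the slowly varying trend of the torque signal. A minimal sketch in Python, assuming a simple centered moving average (the window length `window` is a hypothetical parameter; the abstract does not state the value used):

        import numpy as np

        def extract_torque_ripples(torque: np.ndarray, window: int = 51) -> np.ndarray:
            """Split a torque signal into trend and high-frequency ripples.

            The trend is estimated with a moving-average filter; the ripples
            are the residual between the original and the filtered signal.
            """
            kernel = np.ones(window) / window
            trend = np.convolve(torque, kernel, mode="same")
            return torque - trend

    The ripple segments would then be labeled per operating condition and fed to the Res-CNN classifier.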
    Table of Contents:
    Chapter 1. Introduction: 1.1 Research Motivation; 1.2 Scope of Research; 1.3 Thesis Layout
    Chapter 2. Research Backgrounds: 2.1 Interpretations of Torque Ripples (2.1.1 Causes of torque ripples; 2.1.2 Modulations on torque ripples due to gearbox faults); 2.2 Architectures of Res-CNN (2.2.1 Convolutional Operation; 2.2.2 Pooling Operation; 2.2.3 Activation; 2.2.4 Batch Normalization; 2.2.5 Residual Learning); 2.3 Domain Adaptation (DA) (2.3.1 Few-shot domain adaptation)
    Chapter 3. Motion-Adaptive Few-Shot (MAFS) Fault Detection Method: 3.1 Pre-processing; 3.2 Network Pre-training; 3.3 Binary-Supervised Domain Adaptation (BSDA)
    Chapter 4. Experimental Validations: 4.1 Experimental Settings; 4.2 Pre-trained Network Generation; 4.3 Motion-Adaptation with Few-Shot Learning
    Chapter 5. Conclusion and Future Work: 5.1 Conclusion; 5.2 Contribution; 5.3 Future Work
    Bibliography; Appendix A. 1D Guided Grad-CAM; Abstract (in Korean)
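    The Siamese/BSDA step described in the abstract above learns from pairs drawn across the simple-motion (source) and complex-motion (target) domains. As a rough illustration, here is a pair-based contrastive objective of the kind Siamese networks commonly optimize; this is a generic formulation, not the thesis's exact BSDA loss, and `margin` is a hypothetical hyperparameter:

        import torch
        import torch.nn.functional as F

        def contrastive_loss(z_src: torch.Tensor, z_tgt: torch.Tensor,
                             same_label: torch.Tensor, margin: float = 1.0) -> torch.Tensor:
            """Pull same-class source/target embeddings together and push
            different-class embeddings at least `margin` apart."""
            d = F.pairwise_distance(z_src, z_tgt)
            pos = same_label * d.pow(2)                           # same class: shrink distance
            neg = (1.0 - same_label) * F.relu(margin - d).pow(2)  # different class: enforce margin
            return (pos + neg).mean()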

    Enriching spatial keyframe animations with captured motion

    Get PDF
    While motion capture (mocap) achieves realistic character animation at great cost, keyframing is capable of producing less realistic but more controllable animations. In this work we show how to combine the Spatial Keyframing (SK) Framework of IGARASHI et al. [1] and multidimensional projection techniques to reuse mocap data in several ways. Additionally, we show that multidimensional projection can also be used for visualization and motion analysis. We also propose a method for mocap compaction with the help of SK's pose reconstruction (backprojection) algorithm. Finally, we present a novel multidimensional projection optimization technique that significantly enhances SK-based reconstruction and can also be applied to other contexts where a backprojection algorithm is available.
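    Spatial keyframing reconstructs a full character pose from a low-dimensional control position by blending key poses with scattered-data interpolation. A minimal Gaussian-weighted variant in Python (a simplification of the RBF interpolation of IGARASHI et al. [1]; `sigma` is a hypothetical bandwidth):

        import numpy as np

        def backproject(p: np.ndarray, key_positions: np.ndarray,
                        key_poses: np.ndarray, sigma: float = 1.0) -> np.ndarray:
            """Reconstruct a pose for control point `p` (shape (2,)) by blending
            key poses (n_keys, n_dofs) with normalized Gaussian weights
            centered at the key positions (n_keys, 2)."""
            d2 = np.sum((key_positions - p) ** 2, axis=1)
            w = np.exp(-d2 / (2.0 * sigma ** 2))
            w /= w.sum()
            return w @ key_poses

    A compaction scheme can exploit this backprojection: only the keyframes need to be stored, and intermediate poses are re-synthesized on demand.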

    Personalized Cinemagraphs using Semantic Understanding and Collaborative Learning

    Full text link
    Cinemagraphs are a compelling way to convey dynamic aspects of a scene. In these media, dynamic and still elements are juxtaposed to create an artistic and narrative experience. Creating a high-quality, aesthetically pleasing cinemagraph requires isolating objects in a semantically meaningful way and then selecting good start times and looping periods for those objects to minimize visual artifacts (such as tearing). To achieve this, we present a new technique that uses object recognition and semantic segmentation as part of an optimization method to automatically create cinemagraphs from videos that are both visually appealing and semantically meaningful. Given a scene with multiple objects, there are many cinemagraphs one could create. Our method evaluates these multiple candidates and presents the best one, as determined by a model trained to predict human preferences in a collaborative way. We demonstrate the effectiveness of our approach with multiple results and a user study. Comment: To appear in ICCV 2017. Total 17 pages including the supplementary material.
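    Choosing start times and looping periods to avoid artifacts such as tearing can be cast as minimizing the appearance mismatch between a loop's first and last frames. A brute-force sketch of that inner criterion (the paper's full optimization also weighs semantic and learned-preference terms; `min_period` is a hypothetical lower bound):

        import numpy as np

        def best_loop(frames: np.ndarray, min_period: int = 8) -> tuple:
            """Return (start, period) minimizing the seam error between
            frames[start] and frames[start + period] for a video of shape
            (n_frames, H, W, C); frames assumed float to avoid uint8 overflow."""
            n = len(frames)
            best, best_cost = (0, min_period), np.inf
            for s in range(n - min_period):
                for t in range(min_period, n - s):
                    cost = np.mean((frames[s] - frames[s + t]) ** 2)
                    if cost < best_cost:
                        best, best_cost = (s, t), cost
            return best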

    Motion Synthesis and Control for Autonomous Agents using Generative Models and Reinforcement Learning

    Get PDF
    Imitating and predicting human motions have wide applications in both graphics and robotics, from developing realistic models of human movement and behavior in immersive virtual worlds and games to improving autonomous navigation for service agents deployed in the real world. Traditional approaches for motion imitation and prediction typically rely on pre-defined rules to model agent behaviors or use reinforcement learning with manually designed reward functions. Despite impressive results, such approaches cannot effectively capture the diversity of motor behaviors and the decision making capabilities of human beings. Furthermore, manually designing a model or reward function to explicitly describe human motion characteristics often involves laborious fine-tuning and repeated experiments, and may suffer from generalization issues. In this thesis, we explore data-driven approaches using generative models and reinforcement learning to study and simulate human motions. Specifically, we begin with motion synthesis and control of physically simulated agents imitating a wide range of human motor skills, and then focus on improving the local navigation decisions of autonomous agents in multi-agent interaction settings. For physics-based agent control, we introduce an imitation learning framework built upon generative adversarial networks and reinforcement learning that enables humanoid agents to learn motor skills from a few examples of human reference motion data. Our approach generates high-fidelity motions and robust controllers without needing to manually design and finetune a reward function, allowing at the same time interactive switching between different controllers based on user input. Based on this framework, we further propose a multi-objective learning scheme for composite and task-driven control of humanoid agents. Our multi-objective learning scheme balances the simultaneous learning of disparate motions from multiple reference sources and multiple goal-directed control objectives in an adaptive way, enabling the training of efficient composite motion controllers. Additionally, we present a general framework for fast and robust learning of motor control skills. Our framework exploits particle filtering to dynamically explore and discretize the high-dimensional action space involved in continuous control tasks, and provides a multi-modal policy as a substitute for the commonly used Gaussian policies. For navigation learning, we leverage human crowd data to train a human-inspired collision avoidance policy by combining knowledge distillation and reinforcement learning. Our approach enables autonomous agents to take human-like actions during goal-directed steering in fully decentralized, multi-agent environments. To inform better control in such environments, we propose SocialVAE, a variational autoencoder based architecture that uses timewise latent variables with socially-aware conditions and a backward posterior approximation to perform agent trajectory prediction. Our approach improves current state-of-the-art performance on trajectory prediction tasks in daily human interaction scenarios and more complex scenes involving interactions between NBA players. We further extend SocialVAE by exploiting semantic maps as context conditions to generate map-compliant trajectory prediction. Our approach processes context conditions and social conditions occurring during agent-agent interactions in an integrated manner through the use of a dual-attention mechanism. 
We demonstrate the real-time performance of our approach and its ability to provide high-fidelity, multi-modal predictions on various large-scale vehicle trajectory prediction tasks.
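    In the adversarial imitation framework described above, the reward signal typically comes from a discriminator trained to tell reference motion data apart from policy rollouts. A common GAIL-style sketch (an assumption about the general family of such methods, not the thesis's exact reward):

        import torch

        def imitation_reward(disc: torch.nn.Module, state: torch.Tensor,
                             action: torch.Tensor) -> torch.Tensor:
            """Reward is high when the discriminator believes the
            (state, action) pair came from the reference motion data."""
            with torch.no_grad():
                logit = disc(torch.cat([state, action], dim=-1))
                d = torch.sigmoid(logit)
            # -log(1 - D) grows as the policy fools the discriminator
            return -torch.log(1.0 - d + 1e-8)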

    Representation Learning: A Review and New Perspectives

    Full text link
    The success of machine learning algorithms generally depends on data representation, and we hypothesize that this is because different representations can entangle and hide more or less the different explanatory factors of variation behind the data. Although specific domain knowledge can be used to help design representations, learning with generic priors can also be used, and the quest for AI is motivating the design of more powerful representation-learning algorithms implementing such priors. This paper reviews recent work in the area of unsupervised feature learning and deep learning, covering advances in probabilistic models, auto-encoders, manifold learning, and deep networks. This motivates longer-term unanswered questions about the appropriate objectives for learning good representations, for computing representations (i.e., inference), and the geometrical connections between representation learning, density estimation and manifold learning.
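    As a concrete instance of the representation learners this review covers, consider a minimal auto-encoder: the low-dimensional bottleneck pressures the network to encode the dominant explanatory factors of the data. A sketch (layer sizes are arbitrary assumptions):

        import torch.nn as nn

        class AutoEncoder(nn.Module):
            """Encode x to a compact code z, then reconstruct x from z;
            the code z serves as a learned representation."""
            def __init__(self, d_in: int = 784, d_z: int = 32):
                super().__init__()
                self.enc = nn.Sequential(nn.Linear(d_in, 256), nn.ReLU(),
                                         nn.Linear(256, d_z))
                self.dec = nn.Sequential(nn.Linear(d_z, 256), nn.ReLU(),
                                         nn.Linear(256, d_in))
            def forward(self, x):
                z = self.enc(x)
                return self.dec(z), z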

    CALC2.0: Combining Appearance, Semantic and Geometric Information for Robust and Efficient Visual Loop Closure

    Full text link
    Traditional attempts for loop closure detection typically use hand-crafted features, relying on geometric and visual information only, whereas more modern approaches tend to use semantic, appearance or geometric features extracted from deep convolutional neural networks (CNNs). While these approaches are successful in many applications, they do not utilize all of the information that a monocular image provides, and many of them, particularly the deep-learning based methods, require user-chosen thresholding to actually close loops -- which may impact generality in practical applications. In this work, we address these issues by extracting all three modes of information from a custom deep CNN trained specifically for the task of place recognition. Our network is built upon a combination of a semantic segmentator, Variational Autoencoder (VAE) and triplet embedding network. The network is trained to construct a global feature space to describe both the visual appearance and semantic layout of an image. Then local keypoints are extracted from maximally-activated regions of low-level convolutional feature maps, and keypoint descriptors are extracted from these feature maps in a novel way that incorporates ideas from successful hand-crafted features. These keypoints are matched globally for loop closure candidates, and then used as a final geometric check to refute false positives. As a result, the proposed loop closure detection system requires no touchy thresholding, and is highly robust to false positives -- achieving better precision-recall curves than the state-of-the-art NetVLAD, and with real-time speeds. Comment: Appears in IROS 2019.
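    The triplet embedding component trains the global descriptor so that two images of the same place lie closer in feature space than images of different places. A standard triplet-margin sketch (the paper's exact margin and mining strategy are not given here; `margin` is a hypothetical value):

        import torch.nn.functional as F

        def triplet_loss(anchor, positive, negative, margin: float = 0.5):
            """Same-place pairs (anchor, positive) should be closer than
            different-place pairs (anchor, negative) by at least `margin`."""
            d_pos = F.pairwise_distance(anchor, positive)
            d_neg = F.pairwise_distance(anchor, negative)
            return F.relu(d_pos - d_neg + margin).mean()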