
    MACHINE VISION RECOGNITION OF THREE-DIMENSIONAL SPECULAR SURFACE FOR GAS TUNGSTEN ARC WELD POOL

    Observing the weld pool surface and measuring its geometrical parameters is key to developing next-generation intelligent welding machines that can mimic a skilled human welder, who observes the weld pool to adjust welding parameters. It also provides an effective way to improve and validate welding process models. Although different techniques have been applied in the past few years, the dynamic specular weld pool surface and the strong welding arc complicate these approaches and make observation and measurement difficult. In this dissertation, a novel machine vision system to measure the three-dimensional gas tungsten arc weld pool surface is proposed, which takes advantage of the specular reflection. In the designed system, a structured laser pattern is projected onto the weld pool surface, and its reflection from the specular pool surface is imaged on an imaging plane and recorded by a high-speed camera with a narrow band-pass filter. The deformation of the molten weld pool surface distorts the reflected pattern. To derive the deformed surface of the weld pool, an image processing algorithm is first developed to detect the reflection points in the reflected laser pattern. The reflection points are then matched with their respective incident rays according to the findings of correspondence simulations. As a result, a set of matched incident rays and reflection points is obtained, and an iterative surface reconstruction scheme is proposed to derive the three-dimensional pool surface from this data set based on the law of reflection. The reconstructed results prove the effectiveness of the system. Using the proposed surface measurement (machine vision) system, the fluctuation of weld pool surface parameters has been studied. In addition, the measurement error has been analyzed and its sources identified in order to improve the measurement system for better accuracy. The achievements in this dissertation provide useful guidance for further studies of on-line pool measurement and welding quality control.
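The reconstruction step above rests on the law of specular reflection. A minimal sketch of the two geometric facts involved, assuming plain tuple vectors (the function names and the normal-from-pair helper are illustrative, not the dissertation's actual implementation):

```python
import math

def reflect(d, n):
    """Specular reflection of direction d about unit normal n: r = d - 2(d.n)n."""
    dot = sum(di * ni for di, ni in zip(d, n))
    return tuple(di - 2.0 * dot * ni for di, ni in zip(d, n))

def normal_from_pair(d_in, d_out):
    """Surface normal implied by a matched incident/reflected direction pair:
    the unit bisector between the outgoing and negated incoming directions."""
    h = tuple(o - i for i, o in zip(d_in, d_out))
    norm = math.sqrt(sum(c * c for c in h))
    return tuple(c / norm for c in h)

# A ray going straight down reflects straight up off a horizontal surface.
print(reflect((0.0, -1.0), (0.0, 1.0)))           # (0.0, 1.0)
print(normal_from_pair((0.0, -1.0), (0.0, 1.0)))  # (0.0, 1.0)
```

Inverting `normal_from_pair` over many matched ray pairs is what lets an iterative scheme recover local surface slopes, and hence the pool surface, from the distorted pattern.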

    Decoupled DETR: Spatially Disentangling Localization and Classification for Improved End-to-End Object Detection

    The introduction of DETR represents a new paradigm for object detection. However, its decoder conducts classification and box localization using shared queries and cross-attention layers, leading to suboptimal results. We observe that different regions of interest in the visual feature map are suited to query classification and box localization, even for the same object: salient regions provide vital information for classification, while the boundaries around them are more favorable for box regression. Unfortunately, this spatial misalignment between the two tasks greatly hinders DETR's training. Therefore, in this work, we focus on decoupling the localization and classification tasks in DETR. To achieve this, we introduce a new design scheme called spatially decoupled DETR (SD-DETR), which includes a task-aware query generation module and a disentangled feature learning process. We elaborately design the task-aware query initialization process and divide the cross-attention block in the decoder so that the task-aware queries match different visual regions. Meanwhile, we also observe a prediction misalignment problem between high classification confidence and precise localization, so we propose an alignment loss to further guide the spatially decoupled DETR training. Through extensive experiments, we demonstrate that our approach achieves a significant improvement on the MSCOCO dataset compared to previous work. For instance, we improve the performance of Conditional DETR by 4.5 AP. By spatially disentangling the two tasks, our method overcomes the misalignment problem and greatly improves the performance of DETR for object detection. Comment: accepted by ICCV 2023
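The core idea above, separate task-aware queries attending to different regions of the same feature map, can be sketched with plain scaled dot-product attention. This is an illustrative toy (the shapes, the single head, and the random features are assumptions), not the SD-DETR decoder:

```python
import numpy as np

def attention(q, k, v):
    """Single-head scaled dot-product attention."""
    scores = q @ k.T / np.sqrt(k.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 32))   # flattened visual feature map (100 locations)
cls_q = rng.normal(size=(10, 32))    # classification queries -> salient regions
loc_q = rng.normal(size=(10, 32))    # localization queries  -> object boundaries

cls_feat = attention(cls_q, feats, feats)  # features used for class prediction
loc_feat = attention(loc_q, feats, feats)  # features used for box regression
print(cls_feat.shape, loc_feat.shape)      # (10, 32) (10, 32)
```

Because the two query sets produce different attention maps over the same features, each task can draw evidence from the regions that suit it, which is the misalignment the shared-query design cannot express.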

    Shadow of general rotating black hole

    The Johannsen black hole (BH) is a generic rotating BH admitting three constants of motion (energy, angular momentum, and the Carter constant) and is characterized by four deviation parameters besides mass and spin, which could provide a model-independent probe of the no-hair theorem. We study the effects of the deviation parameters on the BH shadow, as well as the effects of spin. Using the shadow boundaries of M87* and Sgr A*, the deviation parameters are constrained for the first time. The detailed results depend on the spin a and the inclination angle θ0. Assuming a = 0.2 and θ0 = 15°, the deviation parameter α13 is constrained within ∼[-3.5, 6] by the M87* observation and [-3, 0.5] by the Sgr A* observation. We also show images of a Johannsen BH surrounded by a Page-Thorne thin accretion disk as seen by a remote observer, computed with a ray-tracing method, and discuss how the deviation parameters deform the accretion disk image, which could be tested by future observations with higher sensitivities. Comment: 14 pages, 9 figures

    Towards Large-scale Masked Face Recognition

    During the COVID-19 pandemic, almost everyone wore masks, which poses a huge challenge for deep-learning-based face recognition algorithms. In this paper, we present our championship solutions in the ICCV MFR WebFace260M and InsightFace unconstrained tracks. We focus on four challenges in large-scale masked face recognition: super-large-scale training, data noise handling, balancing masked and non-masked face recognition accuracy, and designing an inference-friendly model architecture. We hope that the discussion of these four aspects can guide future research toward more robust masked face recognition systems. Comment: the top-1 solution for the ICCV2021-MFR challenge

    Mechanism underlying synergic activation of Tyrosinase promoter by MITF and IRF4

    Background: The transcription factor interferon regulatory factor 4 (IRF4) has been linked to human pigmentation by genome-wide association studies (GWASs). The SNP rs12203592 [T/C], located in intron 4 of IRF4, shows the strongest link to pigmentation phenotypes, including freckling, sun sensitivity, and eye and hair color. Previous studies indicated a functional cooperation of IRF4 with microphthalmia-associated transcription factor (MITF), a causative gene of Waardenburg syndrome (WS), to synergistically trans-activate Tyrosinase (TYR). However, the underlying mechanism is still unknown. Methods: To investigate the importance of DNA binding in the synergistic effect of IRF4, reporter plasmids with mutant TYR promoters were generated to locate the IRF4 DNA-binding sites in the Tyrosinase minimal promoter. By constructing truncated MITF and IRF4 mutant plasmids, the regions necessary for the synergistic function of these two proteins were also located. Results: The cooperative effect between MITF and IRF4 was specific to the TYR promoter, and the DNA binding of IRF4 was critical for the synergistic function. IRF4 DNA-binding sites in the TYR promoter were identified. The trans-activation domains of IRF4 (aa 134-207 and aa 300-420) were both important for the synergistic function, whereas the auto-mask domain (aa 207-300) appeared to mask the synergistic effect. Mutational analysis of MITF indicated that both its DNA-binding and transcriptional activation domains were required for this effect. Conclusions: We show that IRF4 potently synergizes with MITF to activate the TYR promoter in a manner dependent on the DNA binding of IRF4. The synergistic domains in both IRF4 and MITF were identified by mutational analysis. The identification of IRF4 as a partner of MITF in the regulation of TYR may reveal an important molecular function for IRF4 in the genesis of melanocytes and in the pathogenic mechanism of WS.

    Teach-DETR: Better Training DETR with Teachers

    In this paper, we present a novel training scheme, namely Teach-DETR, to learn better DETR-based detectors from versatile teacher detectors. We show that the predicted boxes from teacher detectors, which can be either RCNN-based or DETR-based, are an effective medium for transferring their knowledge to train a more accurate and robust DETR model. This new training scheme can easily incorporate the predicted boxes from multiple teacher detectors, each of which provides parallel supervision to the student DETR. Our strategy introduces no additional parameters and adds negligible computational cost to the original detector during training. During inference, Teach-DETR brings zero additional overhead and maintains the merit of requiring no non-maximum suppression. Extensive experiments show that our method leads to consistent improvements for various DETR-based detectors. Specifically, we improve the state-of-the-art detector DINO, with a Swin-Large backbone, 4 scales of feature maps, and a 36-epoch training schedule, from 57.8% to 58.9% mean average precision on the MSCOCO 2017 validation set. Code will be available at https://github.com/LeonHLJ/Teach-DETR
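The parallel supervision from several teachers can be sketched as an auxiliary box loss: every teacher box pulls its nearest student prediction toward it. The greedy nearest-neighbor matching and pure L1 distance here are simplifying assumptions (DETR-style training uses Hungarian one-to-one matching and an L1 + GIoU box loss):

```python
import numpy as np

def teacher_box_loss(student_boxes, teacher_box_sets):
    """Average L1 distance from each teacher box, across all teachers,
    to its nearest student box. Boxes are (cx, cy, w, h) rows."""
    total, count = 0.0, 0
    for boxes in teacher_box_sets:
        for t in boxes:
            dists = np.abs(student_boxes - t).sum(axis=1)
            total += dists.min()
            count += 1
    return total / max(count, 1)

student = np.array([[0.5, 0.5, 0.2, 0.2], [0.1, 0.1, 0.3, 0.3]])
teacher_a = np.array([[0.5, 0.5, 0.2, 0.2]])              # agrees with the student
teacher_b = np.array([[0.1, 0.1, 0.3, 0.4]])              # slightly taller box
print(teacher_box_loss(student, [teacher_a, teacher_b]))  # ~0.05
```

Because each teacher contributes its own terms to the sum, the supervision signals are independent and additive, which is what lets boxes from heterogeneous detectors be mixed freely.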

    Gen-L-Video: Multi-Text to Long Video Generation via Temporal Co-Denoising

    Leveraging large-scale image-text datasets and advancements in diffusion models, text-driven generative models have made remarkable strides in image generation and editing. This study explores extending this text-driven ability to the generation and editing of multi-text-conditioned long videos. Current methodologies for video generation and editing, while innovative, are often confined to extremely short videos (typically fewer than 24 frames) and are limited to a single text condition. These constraints significantly limit their applications, given that real-world videos usually consist of multiple segments, each bearing different semantic information. To address this challenge, we introduce a novel paradigm dubbed Gen-L-Video, capable of extending off-the-shelf short-video diffusion models to generate and edit videos comprising hundreds of frames with diverse semantic segments, without additional training and while preserving content consistency. We have implemented three mainstream text-driven video generation and editing methodologies and extended them with our proposed paradigm to accommodate longer videos with a variety of semantic segments. Our experimental outcomes reveal that our approach significantly broadens the generative and editing capabilities of video diffusion models, offering new possibilities for future research and applications. The code is available at https://github.com/G-U-N/Gen-L-Video
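The temporal co-denoising idea, running a short-clip model on overlapping windows and fusing the per-frame predictions so that neighboring windows agree on shared frames, can be sketched as a windowed average. The window size, stride, and 1-D "frames" are illustrative assumptions, not the paper's exact scheme:

```python
import numpy as np

def co_denoise(predict_window, num_frames, window=8, stride=4):
    """Average per-frame noise predictions over overlapping windows.
    predict_window(start, length) returns one prediction per frame."""
    acc = np.zeros(num_frames)
    cnt = np.zeros(num_frames)
    for s in range(0, num_frames - window + 1, stride):
        acc[s:s + window] += predict_window(s, window)
        cnt[s:s + window] += 1
    return acc / cnt

# A toy denoiser that predicts each frame's global index: after fusion the
# overlapping windows agree, so frame i maps back to i for all 16 frames.
fused = co_denoise(lambda s, w: np.arange(s, s + w, dtype=float), 16)
print(fused)
```

Frames covered by several windows receive the mean of those windows' predictions, which is the mechanism that keeps adjacent segments consistent without any extra training.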