20 research outputs found

    LAD-RCNN:A Powerful Tool for Livestock Face Detection and Normalization

    With the demand for standardized large-scale livestock farming and the development of artificial intelligence technology, much research on animal face recognition has been carried out on pigs, cattle, sheep and other livestock. Face recognition consists of three sub-tasks: face detection, face normalization and face identification. Most animal face recognition studies focus on face detection and face identification. Animals are often uncooperative when photographed, so the collected animal face images are often in arbitrary orientations. The use of non-standard images may significantly reduce the performance of a face recognition system. However, there is no study on normalizing animal face images with arbitrary orientations. In this study, we developed a light-weight angle detection and region-based convolutional network (LAD-RCNN) containing a new rotation angle coding method that can detect the rotation angle and the location of an animal face in one stage. LAD-RCNN has a frame rate of 72.74 FPS (including all steps) on a single GeForce RTX 2080 Ti GPU. LAD-RCNN has been evaluated on multiple datasets, including a goat dataset and goat infrared images. Evaluation results show that the AP of face detection was more than 95% and the deviation between the detected rotation angle and the ground-truth rotation angle was less than 0.036 (i.e. 6.48°) on all the test datasets. This shows that LAD-RCNN has excellent performance on livestock face and direction detection, and is therefore well suited to livestock face detection and normalization. Code is available at https://github.com/SheepBreedingLab-HZAU/LAD-RCNN/ Comment: 8 figures, 5 tables
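
The abstract equates an angle deviation of 0.036 with 6.48°, which implies that angles are expressed as a fraction of 180°. The sketch below only reproduces that unit conversion. The function name and the 180° full-scale value are assumptions inferred from the two numbers quoted above, not taken from the LAD-RCNN code.

```python
# Hypothetical helper: converts a normalized angle deviation to degrees.
# The 180-degree full scale is an assumption inferred from 0.036 -> 6.48 deg.
def normalized_to_degrees(deviation: float, full_scale_deg: float = 180.0) -> float:
    """Scale a normalized angle deviation to degrees."""
    return deviation * full_scale_deg


if __name__ == "__main__":
    # Deviation bound quoted in the abstract.
    print(round(normalized_to_degrees(0.036), 2))
```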

    Optimizing the design of nanostructures for improved thermal conduction within confined spaces

    Maintaining a constant temperature is of particular importance to the normal operation of electronic devices. To address this problem, this paper proposes an optimum design of nanostructures made of highly thermally conductive nanomaterials to provide outstanding heat dissipation from the confined interior (possibly nanosized) to the micro-spaces of electronic devices. The design incorporates a carbon nanocone for conducting heat from the interior to the exterior of a miniature electronic device, with the optimum diameter, $D_0$, of the nanocone satisfying the relationship $D_0^2(x) \propto x^{1/2}$, where $x$ is the position along the length direction of the carbon nanocone. Branched structures made of single-walled carbon nanotubes (CNTs) are shown to be particularly suitable for the purpose. It was found that the total thermal resistance of a branched structure reaches a minimum when the diameter ratio, $\beta^*$, satisfies the relationship $\beta^* = \gamma^{-0.25b} N^{-1/k^*}$, where $\gamma$ is the length ratio, $b$ = 0.3 to approximately 0.4 for single-walled CNTs, $b$ = 0.6 to approximately 0.8 for multiwalled CNTs, $k^* = 2$ and $N$ is the bifurcation number ($N$ = 2, 3, 4 ...). The findings of this research provide a blueprint for designing miniaturized electronic devices with outstanding heat dissipation
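
The two relationships in this abstract can be evaluated numerically. The following is a minimal sketch, not the authors' code. The function names, the reference-point scaling of the nanocone profile, and the sample inputs are assumptions made for illustration. Only the exponents and the value k* = 2 come from the abstract.

```python
# Hypothetical helpers illustrating the two scaling relations in the abstract.

def optimal_diameter_ratio(gamma: float, b: float, n_branches: int,
                           k_star: float = 2.0) -> float:
    """Diameter ratio beta* = gamma**(-0.25*b) * N**(-1/k*)."""
    if gamma <= 0 or n_branches < 2:
        raise ValueError("gamma must be positive and N must be at least 2")
    return gamma ** (-0.25 * b) * n_branches ** (-1.0 / k_star)


def nanocone_diameter(x: float, d_ref: float, x_ref: float) -> float:
    """Nanocone diameter with D0(x)**2 proportional to x**(1/2).

    This gives D0 proportional to x**(1/4). The proportionality constant is
    fixed by an assumed reference point where D0(x_ref) equals d_ref.
    """
    return d_ref * (x / x_ref) ** 0.25


if __name__ == "__main__":
    # Illustrative inputs only: equal branch lengths (gamma = 1),
    # b at the low end of the single-walled range, two branches.
    print(optimal_diameter_ratio(gamma=1.0, b=0.3, n_branches=2))
    print(nanocone_diameter(x=16.0, d_ref=1.0, x_ref=1.0))
```

With gamma = 1 the length term drops out and the ratio reduces to N^(-1/2), which makes the effect of the bifurcation number easy to inspect on its own.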

    Learning Profitable NFT Image Diffusions via Multiple Visual-Policy Guided Reinforcement Learning

    We study the task of generating profitable Non-Fungible Token (NFT) images from user-input texts. Recent advances in diffusion models have shown great potential for image generation. However, existing works can fall short in generating visually-pleasing and highly-profitable NFT images, mainly due to the lack of 1) plentiful and fine-grained visual attribute prompts for an NFT image, and 2) effective optimization metrics for generating high-quality NFT images. To solve these challenges, we propose a Diffusion-based generation framework with Multiple Visual-Policies as rewards (i.e., Diffusion-MVP) for NFT images. The proposed framework consists of a large language model (LLM), a diffusion-based image generator, and a series of visual rewards by design. First, the LLM enhances a basic human input (such as "panda") by generating more comprehensive NFT-style prompts that include specific visual attributes, such as "panda with Ninja style and green background." Second, the diffusion-based image generator is fine-tuned using a large-scale NFT dataset to capture fine-grained image styles and accessory compositions of popular NFT elements. Third, we further propose to utilize multiple visual-policies as optimization goals, including visual rarity levels, visual aesthetic scores, and CLIP-based text-image relevances. This design ensures that our proposed Diffusion-MVP is capable of minting NFT images with high visual quality and market value. To facilitate this research, we have collected the largest publicly available NFT image dataset to date, consisting of 1.5 million high-quality images with corresponding texts and market values. Extensive experiments including objective evaluations and user studies demonstrate that our framework can generate NFT images showing more visually engaging elements and higher market value, compared with SOTA approaches

    MM-Diffusion: Learning Multi-Modal Diffusion Models for Joint Audio and Video Generation

    We propose the first joint audio-video generation framework that brings engaging watching and listening experiences simultaneously, towards high-quality realistic videos. To generate joint audio-video pairs, we propose a novel Multi-Modal Diffusion model (i.e., MM-Diffusion) with two coupled denoising autoencoders. In contrast to existing single-modal diffusion models, MM-Diffusion consists of a sequential multi-modal U-Net designed for a joint denoising process. Two subnets for audio and video learn to gradually generate aligned audio-video pairs from Gaussian noise. To ensure semantic consistency across modalities, we propose a novel random-shift based attention block bridging the two subnets, which enables efficient cross-modal alignment and thus mutually reinforces audio and video fidelity. Extensive experiments show superior results in unconditional audio-video generation and zero-shot conditional tasks (e.g., video-to-audio). In particular, we achieve the best FVD and FAD on the Landscape and AIST++ dancing datasets. Turing tests with 10k votes further demonstrate dominant preferences for our model. The code and pre-trained models can be downloaded at https://github.com/researchmm/MM-Diffusion. Comment: Accepted by CVPR 202

    A PolSAR Image Segmentation Algorithm Based on Scattering Characteristics and the Revised Wishart Distance

    A novel segmentation algorithm for polarimetric synthetic aperture radar (PolSAR) images is proposed in this paper. The method is composed of two essential components: a merging order and a merging predicate. The similarity measured by the complex-kind Hotelling–Lawley trace (HLT) statistic is used to decide the merging order. The merging predicate is determined by the scattering characteristics and the revised Wishart distance between adjacent pixels, which can greatly improve the performance in speckle suppression and detail preservation. A postprocessing step is applied to obtain a satisfactory result after the merging operation. The decomposition and merging processes are iteratively executed until the termination criterion is met. The superiority of the proposed method was verified with experiments on two RADARSAT-2 PolSAR images and a Gaofen-3 PolSAR image, which demonstrated that the proposed method can obtain more accurate segmentation results and shows a better performance in speckle suppression and detail preservation than the other algorithms

    Tailoring Nonlinear Metamaterials for the Controlling of Spatial Quantum Entanglement

    The high designability of metamaterials has made them an attractive platform for devising novel optoelectronic devices. The demonstration of nonlinear metamaterials further indicates their potential in developing quantum applications. Here, we investigate designing nonlinear metamaterials consisting of the 3-fold (C3) rotationally symmetrical nanoantennas for generating and modulating entangled photons in the spatial degrees of freedom. Through tailoring the geometry and orientation of the nanoantennas, the parametric down conversion process inside the metamaterials can be locally engineered to generate entangled states with desired spatial properties. As the orbital angular momentum (OAM) states are valuable for enhancing the data capacity of quantum information systems, the photonic OAM entanglement is practically considered. With suitable nanostructure design, the generation of OAM entangled states is shown to be effectively realized in the discussed nonlinear metamaterial system. The nonlinear metamaterials present a perspective to provide a flexible platform for quantum photonic applications

    Optimizing the design of nanostructures for improved thermal conduction within confined spaces

    Maintaining a constant temperature is of particular importance to the normal operation of electronic devices. To address this problem, this paper proposes an optimum design of nanostructures made of highly thermally conductive nanomaterials to provide outstanding heat dissipation from the confined interior (possibly nanosized) to the micro-spaces of electronic devices. The design incorporates a carbon nanocone for conducting heat from the interior to the exterior of a miniature electronic device, with the optimum diameter, $D_0$, of the nanocone satisfying the relationship $D_0^2(x) \propto x^{1/2}$, where $x$ is the position along the length direction of the carbon nanocone. Branched structures made of single-walled carbon nanotubes (CNTs) are shown to be particularly suitable for the purpose. It was found that the total thermal resistance of a branched structure reaches a minimum when the diameter ratio, $\beta^*$, satisfies the relationship $\beta^* = \gamma^{-0.25b} N^{-1/k^*}$, where $\gamma$ is the length ratio, $b$ = 0.3 to approximately 0.4 for single-walled CNTs, $b$ = 0.6 to approximately 0.8 for multiwalled CNTs, $k^* = 2$ and $N$ is the bifurcation number ($N$ = 2, 3, 4 ...). The findings of this research provide a blueprint for designing miniaturized electronic devices with outstanding heat dissipation. PACS numbers: 44.10.+i, 44.05.+e, 66.70.-f, 61.48.De

    CO2 outgassing from the Yellow River network and its implications for riverine carbon cycle

    ©2015. American Geophysical Union. All Rights Reserved. CO2 outgassing across the water-air interface is an important, but poorly quantified, component of the riverine carbon cycle, largely because the data needed for flux calculations are spatially and temporally sparse. Based on compiled data sets measured throughout the Yellow River watershed and chamber measurements on the main stem, this study investigates CO2 evasion and assesses its implications for the riverine carbon cycle. Fluxes of CO2 evasion show significant spatial and seasonal variations. High effluxes are estimated in regions with intense rock weathering or severe soil erosion that mobilizes organic carbon into the river network. By integrating seasonal changes of water surface area and gas transfer velocity (k), the CO2 efflux is estimated at 7.9 ± 1.2 Tg C yr⁻¹ with a mean k of 42.1 ± 16.9 cm h⁻¹. Unlike in lake and estuarine environments, where wind is the main generator of turbulence, k is more strongly correlated with flow velocity changes. CO2 evasion in the Yellow River network constitutes an important pathway in its riverine carbon cycling. Analysis of the watershed-scale carbon budget indicates that 35% of the carbon exported into the Yellow River network from land is degassed during fluvial transport. The CO2 efflux is comparable to the carbon burial rate, and both are larger than the fluvial export to the ocean. Comparing CO2 evasion with ecosystem productivity in the Yellow River watershed shows that its ecosystem carbon sink has previously been overestimated by >50%. Present efflux estimates are associated with uncertainty, and future work is needed to mechanistically understand CO2 evasion from the highly turbid waters.
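
The efflux estimate and the 35% degassed fraction quoted above together imply a rough size for the land-to-river carbon input. The back-of-envelope sketch below is not a result reported in the abstract. It assumes the 35% figure applies directly to the 7.9 ± 1.2 Tg C yr⁻¹ efflux, and the function name is hypothetical.

```python
# Back-of-envelope check: implied land-to-river carbon input, assuming the
# degassed fraction (35%) applies directly to the quoted CO2 efflux.
def implied_land_input(efflux_tg_c_per_yr: float, degassed_fraction: float) -> float:
    """Return the carbon input (Tg C per year) implied by efflux / fraction."""
    if not 0.0 < degassed_fraction <= 1.0:
        raise ValueError("degassed_fraction must be in (0, 1]")
    return efflux_tg_c_per_yr / degassed_fraction


if __name__ == "__main__":
    central, spread = 7.9, 1.2  # efflux and its quoted uncertainty
    for efflux in (central - spread, central, central + spread):
        print(f"{efflux:.1f} -> {implied_land_input(efflux, 0.35):.1f} Tg C per year")
```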

    Reconstruction of the first metatarsophalangeal joint by vascular anastomotic transplantation of fibular head: A case report

    Foot injuries with soft tissue and bone defects are very common, and reconstructing an irreparable first metatarsophalangeal joint is very difficult in clinical practice. In this paper, partial fibular head free transplantation was used to reconstruct the articular surface defect of the first metatarsal head and restore the first metatarsophalangeal joint in a clinical case. After 18 months of follow-up, the patient achieved satisfactory first metatarsophalangeal joint function