21 research outputs found

    Vision Sensor based Action Recognition for Improving Efficiency and Quality under the Environment of Industry 4.0

    In the environment of Industry 4.0, human beings are still an important influencing factor of efficiency and quality, which are the core of product life cycle management. Hence, monitoring and analyzing humans' actions are essential. This paper proposes a vision sensor based method to evaluate the accuracy of operators' actions. Each action of operators is recognized in real time by a Convolutional Neural Network (CNN) based classification model in which hierarchical clustering is introduced to minimize the effects of action uncertainty. Warnings are triggered when incorrect actions occur in real time, and applications of action analysis of workers on a reducer assembling line show the effectiveness of the proposed method. The research is expected to provide guidance for operators to correct their actions to reduce the cost of quality defects and improve the efficiency of the workforce.

    Energy Optimization in Multi-UAV-Assisted Edge Data Collection System

    In the IoT (Internet of Things) system, the introduction of UAV (Unmanned Aerial Vehicle) as a new data collection platform can solve the problem that IoT devices are unable to transmit data over long distances due to the limitation of their battery energy. However, the unreasonable distribution of UAVs will still lead to the problem of the high total energy consumption of the system. In this work, to deal with the problem, a deployment model of a mobile edge computing (MEC) system based on multi-UAV is proposed. The goal of the model is to minimize the energy consumption of the system in the process of data transmission by optimizing the deployment of UAVs. The DEVIPSK (differential evolution algorithm with variable population size based on a mutation strategy pool initialized by K-Means) is proposed to solve the model. In DEVIPSK, the population is initialized by K-Means to obtain better initial positions of UAVs. Besides, considering the limitation of the fixed mutation strategy in the traditional evolutionary algorithm, a mutation strategy pool is used to update the positions of UAVs. The experimental results show the superiority of the DEVIPSK and provide guidance for the deployment of UAVs in the field of edge data collection in the IoT system

    Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification

    Recent progress in large language models (LLMs) like GPT-4 and PaLM-2 has brought significant advancements in addressing math reasoning problems. In particular, OpenAI's latest version of GPT-4, known as GPT-4 Code Interpreter, shows remarkable performance on challenging math datasets. In this paper, we explore the effect of code on enhancing LLMs' reasoning capability by introducing different constraints on the Code Usage Frequency of GPT-4 Code Interpreter. We found that its success can be largely attributed to its powerful skills in generating and executing code, evaluating the output of code execution, and rectifying its solution when receiving unreasonable outputs. Based on this insight, we propose a novel and effective prompting method, explicit code-based self-verification (CSV), to further boost the mathematical reasoning potential of GPT-4 Code Interpreter. This method employs a zero-shot prompt on GPT-4 Code Interpreter to encourage it to use code to self-verify its answers. In instances where the verification state registers as "False", the model shall automatically amend its solution, analogous to our approach of rectifying errors during a mathematics examination. Furthermore, we recognize that the states of the verification result indicate the confidence of a solution, which can improve the effectiveness of majority voting. With GPT-4 Code Interpreter and CSV, we achieve an impressive zero-shot accuracy on the MATH dataset (53.9% → 84.3%).
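
    The idea that verification states can act as confidence signals in majority voting can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the "Uncertain" state, and the numeric weights are hypothetical assumptions chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical weights per verification state; these values are assumptions,
# not figures taken from the abstract.
STATE_WEIGHTS = {"True": 1.0, "Uncertain": 0.5, "False": 0.2}


def weighted_majority_vote(samples, weights=STATE_WEIGHTS):
    """Return the answer with the largest total weight.

    samples: iterable of (answer, verification_state) pairs.
    """
    scores = defaultdict(float)
    for answer, state in samples:
        scores[answer] += weights[state]
    # max() keeps the first answer seen when totals tie.
    return max(scores, key=scores.get)


# Five sampled solutions: "41" has more raw votes, but "42" passed verification.
samples = [
    ("42", "True"),
    ("41", "False"),
    ("41", "False"),
    ("41", "Uncertain"),
    ("42", "True"),
]
print(weighted_majority_vote(samples))
```

    In this toy input, a plain majority vote would pick "41" (three votes to two), while the weighted vote favours "42" because its two samples were verified as "True".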

    Layered Rendering Diffusion Model for Zero-Shot Guided Image Synthesis

    This paper introduces innovative solutions to enhance spatial controllability in diffusion models reliant on text queries. We present two key innovations: Vision Guidance and the Layered Rendering Diffusion (LRDiff) framework. Vision Guidance, a spatial layout condition, acts as a clue in the perturbed distribution, greatly narrowing down the search space, to focus on the image sampling process adhering to the spatial layout condition. The LRDiff framework constructs an image-rendering process with multiple layers, each of which applies the vision guidance to instructively estimate the denoising direction for a single object. Such a layered rendering strategy effectively prevents issues like unintended conceptual blending or mismatches, while allowing for more coherent and contextually accurate image synthesis. The proposed method provides a more efficient and accurate means of synthesising images that align with specific spatial and contextual requirements. We demonstrate through our experiments that our method provides better results than existing techniques both quantitatively and qualitatively. We apply our method to three practical applications: bounding box-to-image, semantic mask-to-image and image editing

    JourneyDB: A Benchmark for Generative Image Understanding

    While recent advancements in vision-language models have had a transformative impact on multi-modal comprehension, the extent to which these models possess the ability to comprehend generated images remains uncertain. Synthetic images, in comparison to real data, encompass a higher level of diversity in terms of both content and style, thereby presenting significant challenges for the models to fully grasp. In light of this challenge, we introduce a comprehensive dataset, referred to as JourneyDB, that caters to the domain of generative images within the context of multi-modal visual understanding. Our meticulously curated dataset comprises 4 million distinct and high-quality generated images, each paired with the corresponding text prompts that were employed in their creation. Furthermore, we introduce an external subset with results of another 22 text-to-image generative models, which makes JourneyDB a comprehensive benchmark for evaluating the comprehension of generated images. On our dataset, we have devised four benchmarks to assess the performance of generated image comprehension in relation to both content and style interpretation. These benchmarks encompass prompt inversion, style retrieval, image captioning, and visual question answering. Lastly, we evaluate the performance of state-of-the-art multi-modal models when applied to the JourneyDB dataset, providing a comprehensive analysis of their strengths and limitations in comprehending generated content. We anticipate that the proposed dataset and benchmarks will facilitate further research in the field of generative content understanding. The dataset is publicly available at https://journeydb.github.io. Comment: Accepted to the Thirty-seventh Conference on Neural Information Processing Systems (NeurIPS 2023).

    Numerical Simulation of Thermal Conductivity of Foam Glass Based on the Steady-State Method

    The effects of fly ash, sodium carbonate content, foaming temperature and foaming time on foam glass aperture sizes and their distribution were analyzed by the orthogonal experimental design. Results from the steady-state method showed a normal distribution of the number of apertures with change in average aperture, which ranges from 0.1 to 2.0 mm for more than 93% of apertures. For a given porosity, the thermal conductivity decreases with the increase of the aperture size. The apertures in the sample have obvious effects in blocking the heat flow transmission: heat flow is quickly diverted to both sides when it encounters an aperture. When the thickness of the sample is constant, the thermal resistance of the foam glass sample increases with increasing porosity, leading to better thermal insulation. Furthermore, our results suggest that the more evenly distributed and orderly arranged the apertures are in the foam glass material, the larger the thermal resistance of the material and hence, the better the thermal insulation.

    Optimization of preparation process and performance analysis of fly ash foam glass

    Foam glass was prepared with fly ash and glass powder as main raw materials, sodium carbonate as foaming agent, and trisodium phosphate as foam-stabilizing agent. The influence of the amount of fly ash and sodium carbonate, foaming temperature and foaming time on the compressive strength, flexural strength, apparent density and thermal conductivity of foam glass was studied by orthogonal experiment, and the optimum technological conditions for preparing foam glass were obtained. The pore structure, morphology, pore size distribution and crystal precipitation of foam glass were investigated by means of Occhio Scan v750, Nano Measurer, SEM and XRD. The results show that the amount of fly ash has a significant influence on the mechanical properties and thermal conductivity of foam glass, the foaming temperature has the greatest influence on the apparent density, and the influence of sodium carbonate content on the average pore size is the most obvious. The pore numbers of the 9 sets of samples are approximately normally distributed with the change of average pore size, and pores with average sizes of 0.1-2.0 mm account for more than 85%. A certain amount of crystal precipitates inside the foam glass, and the major and secondary crystalline phases are nepheline and diopside, respectively.

    Improving Genomic Prediction Accuracy in the Chinese Holstein Population by Combining with the Nordic Holstein Reference Population

    The size of the reference population is critical in order to improve the accuracy of genomic prediction. Indeed, improving genomic prediction accuracy by combining multinational reference populations has proven to be effective. In this study, we investigated the improvement of genomic prediction accuracy in seven complex traits (i.e., milk yield; fat yield; protein yield; somatic cell count; body conformation; feet and legs; and mammary system conformation) by combining the Chinese and Nordic Holstein reference populations. The estimated genetic correlations between the Chinese and Nordic Holstein populations are high with respect to protein yield, fat yield, and milk yield (ranging from 0.621 to 0.720), moderate with respect to somatic cell count (0.449), and low for the three conformation traits (ranging from 0.144 to 0.236). When utilizing the joint reference data and a two-trait GBLUP model, the genomic prediction accuracy in the Chinese Holsteins improves considerably with respect to the traits with moderate-to-high genetic correlations, whereas the improvement in Nordic Holsteins is small. Compared with the single-population analysis, using the joint reference population for genomic prediction in younger animals results in a 2.3 to 8.1 percent improvement in accuracy. Meanwhile, 10 replications of five-fold cross-validation were also implemented in order to evaluate the performance of joint genomic prediction, resulting in a 1.6 to 5.2 percent increase in accuracy. With respect to joint genomic prediction, the bias was found to be quite low. However, for traits with low genetic correlations, the joint reference data do not improve the prediction accuracy substantially for either population.

    Pyramid Fusion Transformer for Semantic Segmentation

    The recently proposed MaskFormer gives a refreshed perspective on the task of semantic segmentation: it shifts from the popular pixel-level classification paradigm to a mask-level classification method. In essence, it generates paired probabilities and masks corresponding to category segments and combines them during inference for the segmentation maps. In our study, we find that a per-mask classification decoder on top of a single-scale feature is not effective enough to extract reliable probabilities or masks. To mine for rich semantic information across the feature pyramid, we propose a transformer-based Pyramid Fusion Transformer (PFT) for per-mask semantic segmentation with multi-scale features. The proposed transformer decoder performs cross-attention between the learnable queries and each spatial feature from the feature pyramid in parallel and uses cross-scale inter-query attention to exchange complementary information. We achieve competitive performance on three widely used semantic segmentation datasets. In particular, on the ADE20K validation set, our result with a Swin-B backbone surpasses that of MaskFormer with a much larger Swin-L backbone in both single-scale and multi-scale inference, achieving 54.1 mIoU and 55.7 mIoU respectively. Using a Swin-L backbone, we achieve single-scale 56.1 mIoU and multi-scale 57.4 mIoU, obtaining state-of-the-art performance on the dataset. Extensive experiments on three widely used semantic segmentation datasets verify the effectiveness of our proposed method.

    Mechanical Properties and Constitutive Model of the Cement-Improved Loess under Freeze-Thaw Conditions

    Cement-improved loess (CIL) is used as a common filler for subgrade construction projects in loess areas. The freeze-thaw (F-T) conditions have a significant effect on the stability of cement-improved loess subgrades in seasonally frozen regions. In this paper, the CIL samples, experiencing different numbers of F-T cycles at varying freezing temperatures, were used in consolidated undrained triaxial compression tests to investigate the effect of F-T conditions on the mechanical properties of CIL. The results show the stress-strain curves of CIL are of a strain-softening type with strong elastic brittleness. The initial tangent modulus of CIL increases with the growing confining pressure and gradually decreases with the increase in the F-T cycle number and the decreasing freezing temperature. It loses 46.4% of its original value after the twelfth F-T cycle with the confining pressure of 150 kPa and at the freezing temperature of −15 °C. The strength of CIL decreases with the increasing F-T cycle number, but it gradually stabilizes after the sixth F-T cycle. The strength also decreases with the reduction in the freezing temperature. It loses 37.7% of its original value after the twelfth F-T cycle with the confining pressure of 150 kPa and the freezing temperature of −15 °C. To express the nonlinear correlation between the strength and confining pressure under F-T conditions, the Weibull function was applied and a nonlinear Mohr-Coulomb strength criterion was proposed. Through introducing a breakage rate function and a local strain coefficient, a binary-medium constitutive model consisting of bonded elements (soil-particle cohesion) and frictional elements (soil particles or soil aggregations) was established to describe the stress-strain relationships of CIL under F-T conditions. The test results indicated that the model can well describe the strain-softening phenomenon of the stress-strain curve of CIL and reflect the breakage mechanism of CIL.