72 research outputs found

    Enhancing Diffusion Models with Text-Encoder Reinforcement Learning

    Text-to-image diffusion models are typically trained to optimize the log-likelihood objective, which makes it difficult to meet specific requirements of downstream tasks, such as image aesthetics and image-text alignment. Recent research addresses this issue by refining the diffusion U-Net using human rewards through reinforcement learning or direct backpropagation. However, many of these methods overlook the importance of the text encoder, which is typically pretrained and fixed during training. In this paper, we demonstrate that by finetuning the text encoder through reinforcement learning, we can enhance the text-image alignment of the results, thereby improving the visual quality. Our primary motivation comes from the observation that the current text encoder is suboptimal, often requiring careful prompt adjustment. While finetuning the U-Net can partially improve performance, the model still suffers from the suboptimal text encoder. Therefore, we propose to use reinforcement learning with low-rank adaptation to finetune the text encoder based on task-specific rewards, an approach referred to as TexForce. We first show that finetuning the text encoder can improve the performance of diffusion models. Then, we illustrate that TexForce can be simply combined with existing U-Net finetuned models to get much better results without additional training. Finally, we showcase the adaptability of our method in diverse applications, including the generation of high-quality face and hand images.

    Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-level Vision

    The rapid evolution of Multi-modality Large Language Models (MLLMs) has catalyzed a shift in computer vision from specialized models to general-purpose foundation models. Nevertheless, the abilities of MLLMs on low-level visual perception and understanding are still inadequately assessed. To address this gap, we present Q-Bench, a holistic benchmark crafted to systematically evaluate potential abilities of MLLMs in three realms: low-level visual perception, low-level visual description, and overall visual quality assessment. a) To evaluate the low-level perception ability, we construct the LLVisionQA dataset, consisting of 2,990 diverse-sourced images, each equipped with a human-asked question focusing on its low-level attributes. We then measure the correctness of MLLMs in answering these questions. b) To examine the description ability of MLLMs on low-level information, we propose the LLDescribe dataset, consisting of long expert-labelled golden low-level text descriptions of 499 images, and a GPT-involved comparison pipeline between outputs of MLLMs and the golden descriptions. c) Besides these two tasks, we further measure how well their visual quality assessments align with human opinion scores. Specifically, we design a softmax-based strategy that enables MLLMs to predict quantifiable quality scores, and evaluate them on various existing image quality assessment (IQA) datasets. Our evaluation across the three abilities confirms that MLLMs possess preliminary low-level visual skills. However, these skills are still unstable and relatively imprecise, indicating the need for specific enhancements of MLLMs towards these abilities. We hope that our benchmark can encourage the research community to delve deeper to discover and enhance these untapped potentials of MLLMs. Project Page: https://vqassessment.github.io/Q-Bench. Comment: 25 pages, 14 figures, 9 tables, preprint version.
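The abstract above mentions a softmax-based strategy for turning MLLM outputs into numeric quality scores but does not give the formula. A minimal sketch of one plausible form follows, assuming the score is a two-way softmax over the model's logits for two opposing answer tokens. The token pairing (for example "good" versus "poor") and the function name are illustrative assumptions, not details confirmed by the abstract.

```python
import math


def softmax_quality_score(logit_good: float, logit_poor: float) -> float:
    """Map two opposing token logits to a score in (0, 1).

    Assumption: the model exposes a logit for a positive token and a
    negative token at the answer position; both names are hypothetical.
    """
    # Subtract the max logit first so exp() cannot overflow.
    m = max(logit_good, logit_poor)
    e_good = math.exp(logit_good - m)
    e_poor = math.exp(logit_poor - m)
    # Two-way softmax: probability mass assigned to the positive token.
    return e_good / (e_good + e_poor)


# Equal logits give a neutral score; a larger positive logit raises it.
neutral = softmax_quality_score(0.0, 0.0)
favourable = softmax_quality_score(2.0, 0.0)
```

Scores produced this way are continuous, so they can be rank-correlated against human opinion scores rather than compared as discrete labels.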

    Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

    Multi-modality foundation models, as represented by GPT-4V, have brought a new paradigm for low-level visual perception and understanding tasks, in which a single model can respond to a broad range of natural human instructions. While existing foundation models have shown exciting potential on low-level visual tasks, their related abilities are still preliminary and need to be improved. In order to enhance these models, we conduct a large-scale subjective experiment collecting a vast amount of real human feedback on low-level vision. Each feedback item follows a pathway that starts with a detailed description of the low-level visual appearance (*e.g.* clarity, color, brightness) of an image, and ends with an overall conclusion, with an average length of 45 words. The constructed **Q-Pathway** dataset includes 58K detailed human feedback items on 18,973 images with diverse low-level appearance. Moreover, to enable foundation models to robustly respond to diverse types of questions, we design a GPT-participated conversion to process this feedback into 200K instruction-response pairs in diverse formats. Experimental results indicate that **Q-Instruct** consistently elevates low-level perception and understanding abilities across several foundation models. We anticipate that our datasets can pave the way for a future in which general intelligence can perceive and understand low-level visual appearance and evaluate visual quality like a human. Our dataset, model zoo, and demo are published at: https://q-future.github.io/Q-Instruct. Comment: 16 pages, 11 figures, pages 12-16 as appendix.

    Organic Matter Regulates Ammonia-Oxidizing Bacterial and Archaeal Communities in the Surface Sediments of Ctenopharyngodon idellus Aquaculture Ponds

    Ammonia-oxidizing bacteria (AOB) and archaea (AOA) play important roles in nitrogen removal in aquaculture ponds, but their distribution and the environmental factors that drive their distribution are largely unknown. In this study, we collected surface sediment samples from Ctenopharyngodon idellus ponds in three different areas in China that practice aquaculture. The community structure of AOB and AOA and physicochemical characteristics in the ponds were investigated. The results showed that AOA were more abundant than AOB in all sampling ponds except one, but sediment AOB and AOA numbers varied greatly between ponds. Correlation analyses indicated a significant correlation between the abundance of AOB and arylsulfatase, as well as the abundance of AOA and total nitrogen (TN) and arylsulfatase. In addition, AOB/AOA ratio was found to be significantly correlated with the microbial biomass carbon. AOB were grouped into seven clusters affiliated to Nitrosospira and Nitrosomonas, and AOA were grouped into six clusters affiliated to Nitrososphaera, Nitrososphaera sister group, and Nitrosopumilus. AOB/AOA diversity in the surface sediments of aquaculture ponds varied according to the levels of total organic carbon (TOC), and AOB and AOA diversity was significantly correlated with arylsulfatase and β-glucosidase, respectively. The compositions of the AOB communities were also found to be significantly influenced by sediment eutrophic status (TOC and TN levels), and pH. In addition, concentrations of acid phosphatase and arylsulfatase in surface sediments were significantly correlated with the prominent bacterial amoA genotypes, and concentrations of TOC and urease were found to be significantly correlated with the prominent archaeal amoA genotype compositions. 
    Taken together, our results indicated that AOB and AOA communities in the surface sediments of Ctenopharyngodon idellus aquaculture ponds are regulated by organic matter and its availability to the microorganisms.

    Ultrasensitive piezoelectric sensor based on two-dimensional Na2Cl crystals with periodic atom vacancies

    Pursuing ultrasensitivity in pressure sensors has been a long-standing goal. Here, we report a piezoelectric sensor that exhibits outstanding pressure-sensing performance, including a peak sensitivity of up to 3.5 × 10^6 kPa^-1 in the pressure range of 1-100 mPa and a detection limit of less than 1 mPa, superior to the current state-of-the-art pressure sensors. These properties are attributed to the high percentage of periodic atom vacancies in the two-dimensional Na2Cl crystals formed within the multilayered graphene oxide membrane in the sensor, which provides giant polarization with high stability. The sensor can even clearly detect the airflow fluctuations surrounding a flapping butterfly, the tiny signals that have long been elusive in the famous "butterfly effect". The finding represents a step towards next-generation pressure sensors for various precision applications.

    Effect of Stress Amplitude on the Damping of Recycled Aggregate Concrete

    Damping characterizes the energy dissipation capacity of materials and structures, and it is affected by several external factors such as vibration frequency, stress history, temperature, and stress amplitude. This study investigates the relationship between the damping and the stress amplitude of environment-friendly recycled aggregate concrete (RAC). First, a function model relating a member’s loss factor to stress amplitude was derived based on Lazan’s damping-stress function. Then, the influence of stress amplitude on the loss tangent of RAC was experimentally investigated. Finally, parameters used to determine the newly derived function were obtained by numerical fitting. It is shown that the member’s loss factor is affected not only by the stress amplitude but also by factors such as the cross-section shape, boundary conditions, load type, and loading position. The loss tangent of RAC increases with the stress amplitude, even at low stress amplitude. The damping energy exponent of RAC is not identically equal to 2.0, indicating that the damping is nonlinear. It is also found that the energy dissipation capacity of RAC is superior to that of natural aggregate concrete (NAC), and the energy dissipation capacity can be further improved by adding modified admixtures.
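Lazan's damping-stress function is commonly written as D = J·σ^n, where D is the damping energy per cycle, σ the stress amplitude, and n the damping energy exponent; n = 2 corresponds to linear damping. The sketch below shows how such an exponent could be recovered by a least-squares fit in log-log space. It is an illustration only: the data are synthetic values generated inside the code, not measurements from the study, and the function name is hypothetical.

```python
import math


def fit_damping_exponent(stress, energy):
    """Fit D = J * sigma**n by linear regression of log(D) on log(sigma).

    Returns (J, n). Inputs must be positive and of equal length.
    """
    xs = [math.log(s) for s in stress]
    ys = [math.log(d) for d in energy]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    n = sxy / sxx                        # slope = damping energy exponent
    J = math.exp(mean_y - n * mean_x)    # intercept gives the coefficient
    return J, n


# Synthetic, noise-free data (NOT from the paper): J = 0.5, n = 2.3.
stress = [1.0, 2.0, 4.0, 8.0]
energy = [0.5 * s ** 2.3 for s in stress]
J_fit, n_fit = fit_damping_exponent(stress, energy)
# An exponent that differs from 2.0 signals amplitude-dependent damping.
is_nonlinear = abs(n_fit - 2.0) > 1e-6
```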

    Optimal Control for Hybrid Energy Storage Electric Vehicle to Achieve Energy Saving Using Dynamic Programming Approach

    In this paper, the efficiency characteristics of the battery, supercapacitor (SC), direct current (DC)-DC converter and electric motor in a hybrid power system of an electric vehicle (EV) are analyzed. In addition, an optimal efficiency model of the hybrid power system is proposed based on the models of its components. A rule-based strategy is then proposed based on the projection partition of the composite power system efficiency, which gives it a strong adaptive adjustment ability. Simulation results under the New European Driving Cycle (NEDC) condition show that the efficiency of the rule-based strategy is higher than that of a single power system. Furthermore, in order to explore the maximum energy-saving potential of hybrid power electric vehicles, a dynamic programming (DP) optimization method is proposed on the basis of a model of the whole hybrid power system, which takes into account various energy consumption factors of the whole system. Simulation results show that, compared to the battery-only EV, the hybrid power system controlled by the rule-based strategy decreases energy consumption by 13.4% under the NEDC condition, while the power-split strategy derived from the DP approach reduces energy consumption by 17.6%. The results show that, compared with the rule-based strategy, the optimized DP strategy has higher system efficiency and lower energy consumption.
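To make the idea of a DP-derived power split concrete, here is a toy sketch of backward dynamic programming over a discretized supercapacitor energy state. Everything in it is an assumption for illustration: the quadratic battery loss model, the integer power units, the parameter values, and the requirement that the supercapacitor ends at its initial energy. It is not the efficiency model or drive cycle used in the paper.

```python
def battery_cost(power, r=0.05):
    """Energy drawn from the battery for one step (assumed loss model).

    The quadratic term stands in for resistive losses that grow with
    the square of the battery power.
    """
    return power + r * power * power


def optimal_split(demand, cap=4, e0=2, u_max=2, r=0.05):
    """Backward DP over supercapacitor energy levels 0..cap.

    u > 0 discharges the supercapacitor, u < 0 charges it; the battery
    supplies demand[t] - u. Returns (minimum total cost, list of u).
    """
    inf = float("inf")
    steps = len(demand)
    value = [[inf] * (cap + 1) for _ in range(steps + 1)]
    policy = [[0] * (cap + 1) for _ in range(steps)]
    value[steps][e0] = 0.0  # terminal constraint: end at the initial energy
    for t in range(steps - 1, -1, -1):
        for e in range(cap + 1):
            for u in range(-u_max, u_max + 1):
                e_next = e - u
                if not 0 <= e_next <= cap:
                    continue  # would over-discharge or over-charge
                cost = battery_cost(demand[t] - u, r) + value[t + 1][e_next]
                if cost < value[t][e]:
                    value[t][e] = cost
                    policy[t][e] = u
    # Forward pass: read the optimal plan off the stored policy.
    plan, e = [], e0
    for t in range(steps):
        u = policy[t][e]
        plan.append(u)
        e -= u
    return value[0][e0], plan


demand = [1, 5, 1, 5, 1, 1]  # hypothetical power demand per time step
dp_cost, plan = optimal_split(demand)
battery_only_cost = sum(battery_cost(d) for d in demand)
```

In this toy model the supercapacitor charges during low-demand steps and discharges during peaks, which flattens the battery power and reduces the quadratic loss term relative to the battery-only case.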

    STUDY ON DAMAGE ASSESSMENT FOR BEAM TYPE STRUCTURES BY VIRTUAL LOAD METHOD

    The virtual load method is studied in this paper for damage assessment of beam-type structures. The main advantage of this technique is that damage assessment can be accomplished by using only the vibration modes of the current structure. A series of virtual load vectors is constructed to obtain the same pure bending state for different segments of the beam. The curvatures of these segments loaded in pure bending are then compared to identify the damage locations and extents. It was found that the proposed method is well suited to damage assessment of existing beam-type structures.

    Reuse of engineering waste soil and recycled fine aggregate to manufacture eco-friendly unfired clay bricks: Experimental assessment, data-driven modeling and environmental friendliness evaluation

    In order to explore the possible application of engineering waste soil (EWS) and recycled fine aggregate (RFA) in cement-based unfired clay bricks (CUCB), this paper uses an orthogonal experiment to investigate the combined utilization of EWS and RFA to fabricate eco-friendly CUCB. The study employs an L9(3^4) orthogonal table with four factors and three levels for designing mix proportions. The factors are the water-to-cement ratio (w/c), the cement-to-EWS ratio (c/e), the RFA-to-EWS ratio (r/e) and the additive content. CUCB were tested for density, compressive strength, and flexural strength, and the impact of each factor on these properties was analyzed. The test results reveal that the eco-friendly CUCB exhibit lightweight characteristics and desirable mechanical properties. The ratio of flexural strength to compressive strength for eco-friendly CUCB ranges from 0.23 to 0.28, indicating that eco-friendly CUCB have good toughness. Data-driven models were developed to relate the target properties (i.e., density, compressive strength, and flexural strength) to the four factors. The optimal mix proportion for physical and mechanical properties was determined to be w/c = 0.55, c/e = 0.3, r/e = 0.4 and additive content = 10%, with predicted compressive and flexural strengths of 19.68 MPa and 5.19 MPa, respectively. In addition, scanning electron microscopy (SEM) was performed to identify the strength enhancement mechanism of eco-friendly CUCB with varying mix proportions. Environmental friendliness evaluation shows that the optimal mix proportion is more environmentally friendly for fabricating CUCB from the perspective of strength-normalized carbon footprint and energy consumption. Using a large quantity of cement cannot increase the compressive strength; the excess cement only acts as filler in the microstructure of eco-friendly CUCB.
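An L9(3^4) orthogonal table covers four factors at three levels each in nine runs instead of the 81 runs of a full factorial design. The sketch below builds one standard form of the table with modular arithmetic and checks the defining property that every pair of columns contains each of the nine level combinations exactly once. Factor labels are taken from the abstract; the level indices 1 to 3 are placeholders, since the actual level values are not listed there, and the row order of the table used in the study may differ.

```python
from itertools import combinations

FACTORS = ["w/c", "c/e", "r/e", "additive content"]


def l9_orthogonal_array():
    """Return the 9 runs of an L9(3^4) array as tuples of levels 1..3."""
    runs = []
    for a in range(3):
        for b in range(3):
            # Columns 3 and 4 are linear combinations of a and b modulo 3,
            # which makes every pair of columns balanced.
            runs.append((a + 1, b + 1, (a + b) % 3 + 1, (a + 2 * b) % 3 + 1))
    return runs


def is_orthogonal(runs):
    """Check that each pair of columns shows all 9 level pairs once."""
    n_cols = len(runs[0])
    for i, j in combinations(range(n_cols), 2):
        pairs = {(run[i], run[j]) for run in runs}
        if len(pairs) != 9:
            return False
    return True


design = l9_orthogonal_array()
# Each run maps factor name -> level index (1, 2 or 3).
labelled = [dict(zip(FACTORS, run)) for run in design]
```

Because every level of each factor appears in exactly three runs, the mean response per level can be compared factor by factor, which is the usual basis for range analysis in orthogonal experiments.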