
    AI-Generated Images as Data Source: The Dawn of Synthetic Era

    The advancement of visual intelligence is intrinsically tethered to the availability of large-scale data. In parallel, generative Artificial Intelligence (AI) has unlocked the potential to create synthetic images that closely resemble real-world photographs. This prompts a compelling inquiry: how much could visual intelligence benefit from the advance of generative AI? This paper explores the innovative concept of harnessing these AI-generated images as new data sources, reshaping traditional modeling paradigms in visual intelligence. In contrast to real data, AI-generated data exhibit remarkable advantages, including unmatched abundance and scalability, the rapid generation of vast datasets, and the effortless simulation of edge cases. Building on the success of generative AI models, we examine the potential of their generated data in a range of applications, from training machine learning models to simulating scenarios for computational modeling, testing, and validation. We probe the technological foundations that support this groundbreaking use of generative AI, engaging in an in-depth discussion of the ethical, legal, and practical considerations that accompany this transformative paradigm shift. Through an exhaustive survey of current technologies and applications, this paper presents a comprehensive view of the synthetic era in visual intelligence. A project associated with this paper can be found at https://github.com/mwxely/AIGS. (Comment: 20 pages, 11 figures)
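
    As a minimal illustration of the data-sourcing paradigm surveyed above, the sketch below (hypothetical, not taken from the paper) uses an off-the-shelf text-to-image diffusion model to synthesize a small labeled image set for downstream training; the model checkpoint, label set, and folder layout are all assumptions.

        # Minimal sketch: synthesizing a labeled image dataset with a
        # text-to-image diffusion model (assumed setup, not the paper's pipeline).
        from pathlib import Path

        import torch
        from diffusers import StableDiffusionPipeline

        pipe = StableDiffusionPipeline.from_pretrained(
            "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
        ).to("cuda")

        classes = ["tabby cat", "golden retriever", "red fox"]  # hypothetical label set
        out_dir = Path("synthetic_dataset")

        for label in classes:
            (out_dir / label).mkdir(parents=True, exist_ok=True)
            for i in range(4):  # a handful of samples per class, for illustration
                image = pipe(f"a photo of a {label}", num_inference_steps=30).images[0]
                image.save(out_dir / label / f"{i:04d}.png")

        # The folder-per-class layout can then be read with standard loaders such as
        # torchvision.datasets.ImageFolder to train or evaluate a classifier.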

    StyleGaussian: Instant 3D Style Transfer with Gaussian Splatting

    We introduce StyleGaussian, a novel 3D style transfer technique that allows instant transfer of any image's style to a 3D scene at 10 frames per second (fps). Leveraging 3D Gaussian Splatting (3DGS), StyleGaussian achieves style transfer without compromising its real-time rendering ability and multi-view consistency. It achieves instant style transfer in three steps: embedding, transfer, and decoding. Initially, 2D VGG scene features are embedded into the reconstructed 3D Gaussians. Next, the embedded features are transformed according to a reference style image. Finally, the transformed features are decoded into the stylized RGB image. StyleGaussian has two novel designs. The first is an efficient feature rendering strategy that first renders low-dimensional features and then maps them into high-dimensional features when embedding the VGG features; it cuts memory consumption significantly and enables 3DGS to render the high-dimensional, memory-intensive features. The second is a K-nearest-neighbor-based 3D CNN. Working as the decoder for the stylized features, it eliminates the 2D CNN operations that compromise strict multi-view consistency. Extensive experiments show that StyleGaussian achieves instant 3D stylization with superior stylization quality while preserving real-time rendering and strict multi-view consistency. Project page: https://kunhao-liu.github.io/StyleGaussian
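
    A schematic sketch of the embed-transfer-decode pipeline described above, assuming per-Gaussian features stored as a simple (N, C) tensor, an AdaIN-style transform for the transfer step, and a linear head standing in for the KNN-based 3D CNN decoder; none of these choices are claimed to match the authors' implementation.

        # Schematic sketch of embed -> transfer -> decode on per-Gaussian features.
        # Shapes and the AdaIN-style transform are assumptions for illustration only.
        import torch

        def adain(content: torch.Tensor, style: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
            """Match per-channel mean/std of the content features to the style features."""
            c_mean, c_std = content.mean(0, keepdim=True), content.std(0, keepdim=True) + eps
            s_mean, s_std = style.mean(0, keepdim=True), style.std(0, keepdim=True) + eps
            return (content - c_mean) / c_std * s_std + s_mean

        N, C = 100_000, 32                    # number of Gaussians, low-dim feature size
        gaussian_feats = torch.randn(N, C)    # stand-in for features embedded in 3D Gaussians
        style_feats = torch.randn(4096, C)    # stand-in for VGG features of the style image

        transferred = adain(gaussian_feats, style_feats)   # step 2: transfer

        # Step 3 (decode): in the paper this is a KNN-based 3D CNN mapping transferred
        # features to RGB; a simple per-point linear head stands in for it here.
        decoder = torch.nn.Linear(C, 3)
        rgb = torch.sigmoid(decoder(transferred))          # (N, 3) stylized colors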

    DivAvatar: Diverse 3D Avatar Generation with a Single Prompt

    Text-to-Avatar generation has recently made significant strides due to advances in diffusion models. However, most existing work remains constrained by limited diversity, producing avatars with only subtle differences in appearance for a given text prompt. We design DivAvatar, a novel framework that generates diverse avatars, empowering 3D creatives with a multitude of distinct and richly varied 3D avatars from a single text prompt. Unlike most existing work that exploits scene-specific 3D representations such as NeRF, DivAvatar fine-tunes a 3D generative model (i.e., EVA3D), allowing diverse avatar generation simply from noise sampling at inference time. DivAvatar has two key designs that help achieve generation diversity and visual quality. The first is a noise sampling technique used during the training phase, which is critical in generating diverse appearances. The second is a semantic-aware zoom mechanism combined with a novel depth loss: the former produces appearances of high textual fidelity by fine-tuning specific body parts separately, and the latter greatly improves geometry quality by smoothing the generated mesh in feature space. Extensive experiments show that DivAvatar is highly versatile in generating avatars of diverse appearances.
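
    As a rough illustration of what a depth-based smoothness penalty can look like, the sketch below implements a generic total-variation loss on a rendered depth map; note that DivAvatar's actual depth loss smooths the generated mesh in feature space and is defined differently, so this is only a stand-in.

        # Generic depth-smoothness (total-variation) penalty on a rendered depth map.
        # Illustrative stand-in only; not the feature-space loss used in DivAvatar.
        import torch

        def depth_tv_loss(depth: torch.Tensor) -> torch.Tensor:
            """depth: (B, H, W) rendered depth; returns a scalar smoothness penalty."""
            dx = (depth[:, :, 1:] - depth[:, :, :-1]).abs().mean()
            dy = (depth[:, 1:, :] - depth[:, :-1, :]).abs().mean()
            return dx + dy

        depth = torch.rand(1, 256, 256, requires_grad=True)  # stand-in rendered depth
        loss = depth_tv_loss(depth)
        loss.backward()  # gradients flow back to whatever produced the depth map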

    Pose-Free Neural Radiance Fields via Implicit Pose Regularization

    Pose-free neural radiance fields (NeRF) aim to train NeRF with unposed multi-view images, and this line of work has achieved impressive success in recent years. Most existing works share the pipeline of first training a coarse pose estimator with rendered images, followed by a joint optimization of the estimated poses and the neural radiance field. However, as the pose estimator is trained with only rendered images, its pose estimation is usually biased or inaccurate for real images due to the domain gap between real and rendered images, leading to poor robustness of pose estimation for real images and, further, to local minima in the joint optimization. We design IR-NeRF, an innovative pose-free NeRF that introduces implicit pose regularization to refine the pose estimator with unposed real images and improve the robustness of pose estimation for real images. Given a collection of 2D images of a specific scene, IR-NeRF constructs a scene codebook that stores scene features and implicitly captures the scene-specific pose distribution as a prior. The robustness of pose estimation can thus be promoted with this scene prior, according to the rationale that a 2D real image can be well reconstructed from the scene codebook only when its estimated pose lies within the pose distribution. Extensive experiments show that IR-NeRF achieves superior novel view synthesis and outperforms the state of the art consistently across multiple synthetic and real datasets. (Comment: Accepted by ICCV 2023)
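
    The sketch below illustrates the rationale in the last part of the abstract with a toy codebook-based reconstruction check: an image whose features can be closely reconstructed from the scene codebook is treated as consistent with the scene's pose distribution. The feature shapes and the nearest-entry lookup are assumptions and do not reproduce IR-NeRF's actual formulation.

        # Toy codebook-based reconstruction check (assumed shapes and lookup rule).
        import torch

        K, D = 512, 256                       # codebook size, feature dimension (assumed)
        codebook = torch.randn(K, D)          # stand-in for the learned scene codebook

        def reconstruction_error(img_feats: torch.Tensor) -> torch.Tensor:
            """img_feats: (M, D) feature tokens extracted from a real image."""
            # Replace each token by its nearest codebook entry, then measure the gap.
            dists = torch.cdist(img_feats, codebook)     # (M, K) pairwise distances
            nearest = codebook[dists.argmin(dim=1)]      # (M, D) nearest entries
            return (img_feats - nearest).pow(2).mean()

        feats = torch.randn(1024, D)            # stand-in image features
        pose_reg = reconstruction_error(feats)  # small error -> pose likely in-distribution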

    Weakly Supervised 3D Open-vocabulary Segmentation

    Open-vocabulary segmentation of 3D scenes is a fundamental function of human perception and thus a crucial objective in computer vision research. However, this task is heavily impeded by the lack of large-scale and diverse 3D open-vocabulary segmentation datasets for training robust and generalizable models. Distilling knowledge from pre-trained 2D open-vocabulary segmentation models helps, but it compromises the open-vocabulary capability, as the 2D models are mostly fine-tuned on close-vocabulary datasets. We tackle the challenges in 3D open-vocabulary segmentation by exploiting the pre-trained foundation models CLIP and DINO in a weakly supervised manner. Specifically, given only the open-vocabulary text descriptions of the objects in a scene, we distill the open-vocabulary multimodal knowledge and object reasoning capability of CLIP and DINO into a neural radiance field (NeRF), which effectively lifts 2D features into view-consistent 3D segmentation. A notable aspect of our approach is that it does not require any manual segmentation annotations for either the foundation models or the distillation process. Extensive experiments show that our method even outperforms fully supervised models trained with segmentation annotations in certain scenes, suggesting that 3D open-vocabulary segmentation can be effectively learned from 2D images and text-image pairs. Code is available at https://github.com/Kunhao-Liu/3D-OVS. (Comment: Accepted to NeurIPS 2023)
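
    To make the CLIP side of the distillation concrete, the sketch below assigns open-vocabulary labels by matching per-pixel features against CLIP text embeddings; the per-pixel features here are random stand-ins for what the NeRF would render, and the prompt list and use of the openai/CLIP package are assumptions.

        # Illustrative sketch: open-vocabulary labeling by matching per-pixel features
        # (random stand-ins here) against CLIP text embeddings.
        import clip
        import torch

        device = "cuda" if torch.cuda.is_available() else "cpu"
        model, _ = clip.load("ViT-B/32", device=device)

        class_names = ["a sofa", "a wooden table", "a potted plant"]  # hypothetical prompts
        with torch.no_grad():
            text_emb = model.encode_text(clip.tokenize(class_names).to(device)).float()
            text_emb = text_emb / text_emb.norm(dim=-1, keepdim=True)        # (3, 512)

        H, W = 128, 128
        pixel_feats = torch.randn(H * W, text_emb.shape[-1], device=device)  # stand-in for
        pixel_feats = pixel_feats / pixel_feats.norm(dim=-1, keepdim=True)   # rendered features

        labels = (pixel_feats @ text_emb.T).argmax(dim=-1).reshape(H, W)     # per-pixel class id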

    Never Lost in the Middle: Improving Large Language Models via Attention Strengthening Question Answering

    While large language models (LLMs) are equipped with longer text input capabilities than before, they struggle to locate correct information in long contexts. The "lost in the middle" problem challenges most LLMs: accuracy declines dramatically when the correct information is located in the middle of the context. To overcome this crucial issue, this paper proposes to enhance the information-searching and reflection ability of LLMs in long contexts via specially designed tasks called Attention Strengthening Multi-doc QA (ASM QA). Trained on these tasks, our model excels at focusing more precisely on the desired information. Experimental results show substantial improvements in multi-doc QA and other benchmarks, surpassing state-of-the-art models by a 13.7% absolute gain in shuffled settings and by 21.5% in the passage retrieval task. We release our model, Ziya-Reader, to promote related research in the community.
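
    As a minimal illustration of the shuffled multi-document QA setting in which the "lost in the middle" effect is measured, the sketch below builds a prompt where the gold passage is placed at a random position among distractors; the prompt format and field names are assumptions, not the ASM QA recipe from the paper.

        # Build a shuffled multi-document QA prompt: the gold passage is inserted at a
        # random position among distractors (it may land in the middle of the context).
        import random

        def build_example(question: str, gold: str, distractors: list[str], seed: int = 0) -> str:
            rng = random.Random(seed)
            docs = distractors[:]
            gold_pos = rng.randint(0, len(docs))   # position of the gold passage
            docs.insert(gold_pos, gold)
            numbered = "\n\n".join(f"[Document {i + 1}] {d}" for i, d in enumerate(docs))
            return f"{numbered}\n\nQuestion: {question}\nAnswer:"

        prompt = build_example(
            question="Which city hosts the described observatory?",
            gold="The observatory described in the report is located in Quito.",
            distractors=["Unrelated passage about rainfall.", "Unrelated passage about trade."],
        )
        print(prompt)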

    Preoperative Strength Training for Clinical Outcomes Before and After Total Knee Arthroplasty: A Systematic Review and Meta-Analysis

    Background: There is increasing interest in preoperative strength training for promoting post-operative rehabilitation, but the effectiveness of preoperative strength training for clinical outcomes after total knee arthroplasty (TKA) remains controversial.
    Objective: This study aims to systematically evaluate the effect of preoperative strength training on clinical outcomes before and after TKA.
    Methods: We systematically searched the PubMed, Cochrane Library, Web of Science, and EMBASE databases from inception to November 17, 2021. A meta-analysis was performed to evaluate the effects of preoperative strength training on clinical outcomes before and after TKA.
    Results: Seven randomized controlled trials (RCTs) were included (n = 306). Immediately before TKA, the pooled results showed significant improvements in pain, knee function, functional ability, stiffness, and physical function in the strength training group compared with the control group, but not in quadriceps strength, ROM, or total WOMAC score. Compared with the control group, strength training showed statistically significant improvements in post-operative knee function, ROM, and functional ability at less than 1 month and at 3 months; in post-operative quadriceps strength, stiffness, and total WOMAC score at 3 months; and in post-operative pain at 6 months. However, strength training showed no statistically significant improvement in post-operative quadriceps strength at less than 1 month, 6, and 12 months; in post-operative pain at less than 1 month, 3, and 12 months; in post-operative knee function at 6 and 12 months; or in post-operative physical function at 3 months.
    Conclusions: Preoperative strength training may be beneficial to early rehabilitation after TKA, but its long-term efficacy needs to be further determined. At the same time, more caution should be exercised when interpreting the clinical efficacy of preoperative strength training for TKA.
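
    For readers unfamiliar with how per-trial results are combined, the sketch below shows a textbook inverse-variance (fixed-effect) pooling of mean differences with made-up numbers; it is only a generic illustration, not the analysis model or data reported in this review.

        # Textbook inverse-variance (fixed-effect) pooling of mean differences.
        # The numbers are placeholders, not data from the included trials.
        import math

        # (mean difference, standard error) for three hypothetical trials
        trials = [(-1.2, 0.5), (-0.4, 0.3), (-0.9, 0.6)]

        weights = [1.0 / se ** 2 for _, se in trials]
        pooled = sum(w * md for (md, _), w in zip(trials, weights)) / sum(weights)
        pooled_se = math.sqrt(1.0 / sum(weights))
        ci = (pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se)

        print(f"pooled MD = {pooled:.2f}, 95% CI = ({ci[0]:.2f}, {ci[1]:.2f})")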

    Big data research guided by sociological theory: a triadic dialogue among big data analysis, theory, and predictive models

    Computational social science has integrated social science theories and methodology with big data analysis. It has opened a number of new topics for big data analysis and enabled qualitative and quantitative sociological research to provide the ground truth for testing the results of data mining. At the same time, threads of evidence obtained by data mining can inform the development of theory and thereby guide the construction of predictive models to infer and explain more phenomena. Using the example of Internet data from China's venture capital industry, this paper demonstrates the triadic dialogue among data mining, sociological theory, and predictive models, and forms a methodology of big data analysis guided by sociological theories.

    Report on the development of EU law in 2022: digital and green transition, supply chain law, and reciprocal market openness

    Published: 5 December 2023
    The transformation of the EU advanced amidst crises and challenges in 2022. This report is an overview of the development of EU law in 2022, focusing on three dimensions: the digital and green transition, supply chain law, and reciprocal market openness. In the area of digital transition, the EU strengthened the legal instruments to address emerging issues of competition and data barriers arising from data regulation and the digital market. With regard to the green transition, the EU put forward a set of legislative proposals on the green economy in relation to climate objectives, such as the renewed EU Emissions Trading System and the Carbon Border Adjustment Mechanism. In terms of supply chain law, the EU relied on normative power to exert influence on global supply chains and introduced unilateral legal instruments with extraterritorial effects, including the Corporate Sustainability Due Diligence legislation and the Forced Labour Ban. It also proposed the Single Market Emergency Instrument and the Chips Act to strengthen the resilience of the internal market and its supply chains. In addition, the EU emphasises reciprocal market openness based on a level playing field in the internal market. The EU's International Public Procurement Instrument and the Foreign Subsidies Regulation entered into force, reinforcing its regulatory toolkit and raising the thresholds for market access. In bilateral relations, the EU proceeded with the ongoing negotiations on free trade agreements with India and New Zealand, respectively. At the multilateral level, the EU participated in the concluded negotiation on modernising the Energy Charter Treaty, which was later put on hold due to disagreements raised by some member states.

    Analysis of the Refrigeration Performance of the Refrigerated Warehouse with Ice Thermal Energy Storage Driven Directly by Variable Photovoltaic Capacity

    An independent solar photovoltaic (PV) refrigerated warehouse system with ice thermal energy storage is constructed in this paper. In this system, the vapour compression refrigeration cycle is driven directly by a PV array, and the frequency of the compressor varies with the solar radiation intensity. The refrigeration performance and matching characteristics of the system driven by different PV capacities are studied. The results show that the solar radiation intensity required for the compressor to work at a given frequency decreases by approximately 7.8% when the ratio of PV capacity to compressor rated power increases by 10%, and the time required for the temperature in the refrigerated warehouse to drop from ambient temperature to 0°C is reduced by 32 min on average. The energy efficiency ratio of the vapour compression refrigeration subsystem and the coefficient of performance (COP) of the refrigerated warehouse system increase with the ratio of PV capacity to compressor rated power, α. When α increases from 1 to 1.3, the COP grows only slowly. For a PV direct-drive refrigerated warehouse system with a compressor rated power of 4.4 kW, the suitable ratio α is about 1.3. When the refrigerated warehouse system is driven directly by a 5.4 kW PV array, the overall COP is approximately 0.19. In the cycle mode of refrigeration and cold energy storage during the day and cold energy release at night, the stored cold energy can still meet the refrigeration required by the load for 48 hours after eight days of continuous operation. According to the current market price of cold storage, over the service life of the system, the income per unit volume of cold storage is about 2.2 times the investment.
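
    A small arithmetic sketch of the two quantities discussed above: the PV-capacity-to-rated-power ratio α and an overall COP figure. Defining the overall COP as cold energy delivered per unit of solar energy incident on the array is one plausible reading of the abstract, and the energy values are placeholders rather than measurements from the study.

        # Arithmetic sketch of the ratio alpha and an overall COP estimate.
        compressor_rated_kw = 4.4
        pv_capacity_kw = 5.4

        alpha = pv_capacity_kw / compressor_rated_kw
        print(f"alpha = {alpha:.2f}")             # ~1.23 for the 5.4 kW array

        # One plausible definition of overall COP: useful cold energy delivered
        # divided by solar energy incident on the array (placeholder values).
        cold_energy_kwh = 38.0                    # placeholder daily cold delivered
        incident_solar_kwh = 200.0                # placeholder daily irradiation on the array
        print(f"overall COP = {cold_energy_kwh / incident_solar_kwh:.2f}")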