43 research outputs found
AI-Generated Images as Data Source: The Dawn of Synthetic Era
The advancement of visual intelligence is intrinsically tethered to the
availability of large-scale data. In parallel, generative Artificial
Intelligence (AI) has unlocked the potential to create synthetic images that
closely resemble real-world photographs. This prompts a compelling inquiry: how
much could visual intelligence benefit from advances in generative AI? This
paper explores the innovative concept of harnessing these AI-generated images
as new data sources, reshaping traditional modeling paradigms in visual
intelligence. In contrast to real data, AI-generated data exhibit remarkable
advantages, including unmatched abundance and scalability, the rapid generation
of vast datasets, and the effortless simulation of edge cases. Building on the
success of generative AI models, we examine the potential of their generated
data in a range of applications, from training machine learning models to
simulating scenarios for computational modeling, testing, and validation. We
probe the technological foundations that support this groundbreaking use of
generative AI, engaging in an in-depth discussion on the ethical, legal, and
practical considerations that accompany this transformative paradigm shift.
Through an exhaustive survey of current technologies and applications, this
paper presents a comprehensive view of the synthetic era in visual
intelligence. A project associated with this paper can be found at
https://github.com/mwxely/AIGS.
Comment: 20 pages, 11 figures
StyleGaussian: Instant 3D Style Transfer with Gaussian Splatting
We introduce StyleGaussian, a novel 3D style transfer technique that allows
instant transfer of any image's style to a 3D scene at 10 frames per second
(fps). Leveraging 3D Gaussian Splatting (3DGS), StyleGaussian achieves style
transfer without compromising its real-time rendering ability and multi-view
consistency. It achieves instant style transfer with three steps: embedding,
transfer, and decoding. Initially, 2D VGG scene features are embedded into
reconstructed 3D Gaussians. Next, the embedded features are transformed
according to a reference style image. Finally, the transformed features are
decoded into the stylized RGB. StyleGaussian has two novel designs. The first
is an efficient feature rendering strategy that first renders low-dimensional
features and then maps them into high-dimensional features while embedding VGG
features. It cuts the memory consumption significantly and enables 3DGS to
render the high-dimensional memory-intensive features. The second is a
K-nearest-neighbor-based 3D CNN. Working as the decoder for the stylized
features, it eliminates the 2D CNN operations that compromise strict multi-view
consistency. Extensive experiments show that StyleGaussian achieves instant 3D
stylization with superior stylization quality while preserving real-time
rendering and strict multi-view consistency. Project page:
https://kunhao-liu.github.io/StyleGaussian
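The "render low-dimensional features, then map them into high-dimensional features" strategy in the StyleGaussian abstract can be illustrated with a small sketch. It assumes the low-to-high mapping is linear and that rendering a pixel is a weighted blend of per-Gaussian features; neither detail is stated in the abstract, and the sizes and names (`lift`, `render`) are hypothetical. Under those assumptions, blending first and lifting afterwards gives the same pixel feature as lifting every Gaussian first, while the renderer only handles the small features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
num_gaussians, low_dim, high_dim = 64, 32, 256

# Per-Gaussian low-dimensional features and one pixel's blending weights.
low_feats = rng.normal(size=(num_gaussians, low_dim))
weights = rng.random(num_gaussians)
weights /= weights.sum()

# Assumed linear map from low- to high-dimensional features.
lift = rng.normal(size=(low_dim, high_dim))


def render(feats, w):
    """Blend per-Gaussian features into a single pixel feature."""
    return w @ feats


# Memory-heavy route: lift every Gaussian to high_dim, then blend.
pixel_expensive = render(low_feats @ lift, weights)

# Cheaper route: blend the low-dimensional features, then lift once per pixel.
pixel_cheap = render(low_feats, weights) @ lift

# Both routes agree because blending and the assumed map are linear.
same = np.allclose(pixel_expensive, pixel_cheap)
```

The equivalence relies on linearity only; a non-linear mapping would not commute with the blend in this way.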
DivAvatar: Diverse 3D Avatar Generation with a Single Prompt
Text-to-Avatar generation has recently made significant strides due to
advancements in diffusion models. However, most existing work remains
constrained by limited diversity, producing avatars with subtle differences in
appearance for a given text prompt. We design DivAvatar, a novel framework that
generates diverse avatars, empowering 3D creatives with a multitude of distinct
and richly varied 3D avatars from a single text prompt. Different from most
existing work that exploits scene-specific 3D representations such as NeRF,
DivAvatar finetunes a 3D generative model (i.e., EVA3D), allowing diverse
avatar generation simply by sampling noise at inference time. DivAvatar has
two key designs that help achieve generation diversity and visual quality. The
first is a noise sampling technique used during the training phase, which is
critical for generating diverse appearances. The second combines a
semantic-aware zoom mechanism and a novel depth loss: the former produces
appearances of high textual fidelity by fine-tuning specific body parts
separately, and the latter greatly improves geometry quality by smoothing the
generated mesh in the feature space. Extensive experiments show that DivAvatar
is highly versatile in generating avatars of diverse appearances.
Pose-Free Neural Radiance Fields via Implicit Pose Regularization
Pose-free neural radiance fields (NeRF) aim to train NeRF with unposed
multi-view images and have achieved impressive success in recent years.
Most existing works share a pipeline that first trains a coarse pose estimator
on rendered images and then jointly optimizes the estimated poses and the
neural radiance field. However, as the pose estimator is trained with only
rendered images, the pose estimation is usually biased or inaccurate for real
images due to the domain gap between real images and rendered images, leading
to poor robustness for the pose estimation of real images and further local
minima in joint optimization. We design IR-NeRF, an innovative pose-free NeRF
that introduces implicit pose regularization to refine the pose estimator with
unposed real images and improve the robustness of the pose estimation for real
images. With a collection of 2D images of a specific scene, IR-NeRF constructs
a scene codebook that stores scene features and captures the scene-specific
pose distribution implicitly as priors. Thus, the robustness of pose estimation
can be promoted with the scene priors according to the rationale that a 2D real
image can be well reconstructed from the scene codebook only when its estimated
pose lies within the pose distribution. Extensive experiments show that IR-NeRF
achieves superior novel view synthesis and outperforms the state-of-the-art
consistently across multiple synthetic and real datasets.
Comment: Accepted by ICCV202
Weakly Supervised 3D Open-vocabulary Segmentation
Open-vocabulary segmentation of 3D scenes is a fundamental function of human
perception and thus a crucial objective in computer vision research. However,
this task is heavily impeded by the lack of large-scale and diverse 3D
open-vocabulary segmentation datasets for training robust and generalizable
models. Distilling knowledge from pre-trained 2D open-vocabulary segmentation
models helps, but it compromises the open-vocabulary property because the 2D models
are mostly finetuned on closed-vocabulary datasets. We tackle the challenges
in 3D open-vocabulary segmentation by exploiting pre-trained foundation models
CLIP and DINO in a weakly supervised manner. Specifically, given only the
open-vocabulary text descriptions of the objects in a scene, we distill the
open-vocabulary multimodal knowledge and object reasoning capability of CLIP
and DINO into a neural radiance field (NeRF), which effectively lifts 2D
features into view-consistent 3D segmentation. A notable aspect of our approach
is that it does not require any manual segmentation annotations for either the
foundation models or the distillation process. Extensive experiments show that
our method even outperforms fully supervised models trained with segmentation
annotations in certain scenes, suggesting that 3D open-vocabulary segmentation
can be effectively learned from 2D images and text-image pairs. Code is
available at https://github.com/Kunhao-Liu/3D-OVS.
Comment: Accepted to NeurIPS 202
Never Lost in the Middle: Improving Large Language Models via Attention Strengthening Question Answering
While large language models (LLMs) now accept longer text inputs than
before, they struggle to locate the correct information in
long contexts. The "lost in the middle" problem challenges most LLMs, referring
to the dramatic decline in accuracy when correct information is located in the
middle. To overcome this crucial issue, this paper proposes to enhance the
information searching and reflection ability of LLMs in long contexts via
specially designed tasks called Attention Strengthening Multi-doc QA (ASM QA).
Following these tasks, our model excels in focusing more precisely on the
desired information. Experimental results show substantial improvement in
Multi-doc QA and other benchmarks, surpassing state-of-the-art models by a 13.7%
absolute gain in shuffled settings and by 21.5% on the passage retrieval task. We
release our model, Ziya-Reader, to promote related research in the community.
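The "lost in the middle" effect described above is usually probed by moving the passage that contains the answer through different positions in a multi-document context. The following is a minimal sketch of such a probe, not the ASM QA tasks themselves; the function names and the prompt layout are hypothetical and are not taken from the paper.

```python
def build_multidoc_prompt(question, gold_doc, distractors, gold_position):
    """Place the answer-bearing document at a chosen index among distractors."""
    if not 0 <= gold_position <= len(distractors):
        raise ValueError("gold_position must be between 0 and len(distractors)")
    docs = list(distractors)
    docs.insert(gold_position, gold_doc)
    lines = [f"Document [{i}]: {doc}" for i, doc in enumerate(docs, start=1)]
    lines.append(f"Question: {question}")
    return "\n".join(lines)


def position_sweep(question, gold_doc, distractors):
    """Build one prompt per possible position of the gold document."""
    return [
        build_multidoc_prompt(question, gold_doc, distractors, position)
        for position in range(len(distractors) + 1)
    ]
```

Scoring a model on each prompt returned by `position_sweep` and plotting accuracy against position would show whether accuracy dips when the gold document sits in the middle of the context.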
Preoperative Strength Training for Clinical Outcomes Before and After Total Knee Arthroplasty: A Systematic Review and Meta-Analysis
Background: There is increasing interest in preoperative strength training for promoting post-operative rehabilitation, but its effectiveness for clinical outcomes after total knee arthroplasty (TKA) remains controversial.
Objective: This study aims to systematically evaluate the effect of preoperative strength training on clinical outcomes before and after TKA.
Methods: We systematically searched the PubMed, Cochrane Library, Web of Science, and EMBASE databases from inception to November 17, 2021. A meta-analysis was performed to evaluate the effects of preoperative strength training on clinical outcomes before and after TKA.
Results: Seven randomized controlled trials (RCTs) were included (n = 306). Immediately before TKA, the pooled results showed significant improvements in pain, knee function, functional ability, stiffness, and physical function in the strength training group compared with the control group, but not in strength (quadriceps), ROM, or WOMAC (total). Compared with the control group, strength training produced a statistically significant improvement in post-operative knee function, ROM, and functional ability at less than 1 month and at 3 months; in post-operative strength (quadriceps), stiffness, and WOMAC (total) at 3 months; and in post-operative pain at 6 months. However, strength training produced no statistically significant improvement in post-operative strength (quadriceps) at less than 1 month, 6 months, or 12 months; in post-operative pain at less than 1 month, 3 months, or 12 months; in post-operative knee function at 6 or 12 months; or in post-operative physical function at 3 months.
Conclusions: Preoperative strength training may be beneficial to early rehabilitation after TKA, but the long-term efficacy needs to be further determined. At the same time, more caution should be exercised when interpreting the clinical efficacy of preoperative strength training for TKA.
Big data research guided by sociological theory: a triadic dialogue among big data analysis, theory, and predictive models
Computational social science has integrated social science theories and methodology with big data analysis. It has opened a number of new topics for big data analysis and enabled qualitative and quantitative sociological research to provide the ground truth for testing the results of data mining. At the same time, threads of evidence obtained by data mining can inform the development of theory and thereby guide the construction of predictive models to infer and explain more phenomena. Using the example of the Internet data of China’s venture capital industry, this paper shows the triadic dialogue among data mining, sociological theory, and predictive models and forms a methodology of big data analysis guided by sociological theories.
Report on the development of EU law in 2022 : digital and green transition, supply chain law, and reciprocal market openness
Published: 5 December 2023
The transformation of the EU advanced amidst crises and challenges in 2022. This report is an overview of the development of EU law in 2022, focusing on three dimensions: the digital and green transition, the supply chain law, and reciprocal market openness. In the area of digital transition, the EU strengthened the legal instruments to address emerging issues of competition and data barriers arising from data regulation and the digital market. With regard to green transition, the EU put forward a set of legislative proposals on the green economy in relation to climate objectives, such as the renewed EU Emissions Trading System and the Carbon Border Adjustment Mechanism. In terms of the supply chain law, the EU relied on normative power to exert influence on global supply chains and introduced unilateral legal instruments with extraterritorial effects, including the Corporate Sustainability Due Diligence legislation and the Forced Labour Ban. It also proposed the Single Market Emergency Instrument and the Chips Act to strengthen the resilience of the internal market and its supply chains. In addition, the EU emphasises reciprocal market openness based on competition on a level playing field in the internal market. The EU’s International Public Procurement Instrument and the Foreign Subsidies Regulation entered into force, reinforcing its regulatory toolkit and raising the thresholds for market access. For bilateral relations, the EU proceeded with the ongoing negotiations on free trade agreements with India and New Zealand, respectively. At the multilateral level, the EU participated in the concluded negotiation on modernising the Energy Charter Treaty, which was later put on hold due to disagreements raised by some member states.
Analysis of the Refrigeration Performance of the Refrigerated Warehouse with Ice Thermal Energy Storage Driven Directly by Variable Photovoltaic Capacity
An independent solar photovoltaic (PV) refrigerated warehouse system with ice thermal energy storage is constructed in this paper. In this system, the vapour compression refrigeration cycle is directly driven by a PV array, and the frequency of the compressor varies with the solar radiation intensity. The refrigeration performance and the matching characteristics of the system driven by different PV capacities are studied. The results show that the intensity of solar radiation required for the compressor to work at the same frequency decreases by approximately 7.8% when the ratio of PV capacity to compressor-rated power increases by 10%, and the time required for the temperature in the refrigerated warehouse to drop from ambient temperature to 0°C is reduced by 32 min on average. The energy efficiency ratio of the vapour compression refrigeration subsystem and the coefficient of performance (COP) of the refrigerated warehouse system increase with the ratio of PV capacity to compressor-rated power α. When α increases from 1 to 1.3, the growth rate of the COP is very slow. For the PV direct-drive refrigerated warehouse system with a compressor-rated power of 4.4 kW, the suitable ratio of PV capacity to compressor-rated power α is about 1.3. When the refrigerated warehouse system is driven directly by a 5.4 kW PV array, the overall COP is approximately 0.19. In the cycle mode of refrigeration and cold energy storage during the day and cold energy release at night, the stored cold energy can still meet the refrigeration required by the load for 48 hours after eight days of continuous operation. According to the current market price of cold storage, during the service life of the system, the income per unit volume of cold storage is about 2.2 times the investment.
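The ratio α in this abstract is simply PV capacity divided by compressor-rated power, so the quoted figures can be related with a short calculation. The sketch below uses only the numbers stated above (4.4 kW compressor, 5.4 kW PV array, α of about 1.3); the function and variable names are hypothetical.

```python
COMPRESSOR_RATED_KW = 4.4  # compressor-rated power quoted in the abstract


def capacity_ratio(pv_kw, compressor_kw=COMPRESSOR_RATED_KW):
    """Ratio alpha of PV capacity to compressor-rated power."""
    return pv_kw / compressor_kw


def pv_capacity_for_ratio(alpha, compressor_kw=COMPRESSOR_RATED_KW):
    """PV capacity in kW that yields a given ratio alpha."""
    return alpha * compressor_kw


# The 5.4 kW array in the abstract corresponds to alpha of roughly 1.23.
alpha_for_5_4_kw = capacity_ratio(5.4)

# An alpha of 1.3 on a 4.4 kW compressor corresponds to about 5.72 kW of PV.
pv_kw_for_alpha_1_3 = pv_capacity_for_ratio(1.3)
```

On these numbers, the 5.4 kW array sits slightly below the suggested ratio of about 1.3.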