23 research outputs found
Frozen CLIP Model is An Efficient Point Cloud Backbone
The pretraining-finetuning paradigm has demonstrated great success in NLP and
2D image fields because of the high-quality representation ability and
transferability of their pretrained models. However, pretraining such a strong
model is difficult in the 3D point cloud field since the training data is
limited and point cloud collection is expensive. This paper introduces
Efficient Point Cloud Learning (EPCL), an effective and efficient point cloud
learner for directly training high-quality point cloud models with a frozen
CLIP model. Our EPCL connects the 2D and 3D modalities by semantically aligning
the 2D features and point cloud features without paired 2D-3D data.
Specifically, the input point cloud is divided into a sequence of tokens and
directly fed into the frozen CLIP model to learn point cloud representation.
Furthermore, we design a task token to narrow the gap between 2D images and 3D
point clouds. Comprehensive experiments on 3D detection, semantic segmentation,
classification and few-shot learning demonstrate that the 2D CLIP model can be
an efficient point cloud backbone and our method achieves state-of-the-art
accuracy on both real-world and synthetic downstream tasks. Code will be
available. Comment: Technical report
Experts Weights Averaging: A New General Training Scheme for Vision Transformers
Structural re-parameterization is a general training scheme for Convolutional
Neural Networks (CNNs), which achieves performance improvement without
increasing inference cost. As Vision Transformers (ViTs) are gradually
surpassing CNNs in various visual tasks, one may ask: does a training scheme
specifically for ViTs exist that can also achieve performance improvement
without increasing inference cost? Recently, Mixture-of-Experts (MoE) has
attracted increasing attention, as it can efficiently scale up the capacity of
Transformers at a fixed cost through sparsely activated experts. Considering
that MoE can also be viewed as a multi-branch structure, can we utilize MoE to
implement a ViT training scheme similar to structural re-parameterization? In
this paper, we affirmatively answer these questions, with a new general
training strategy for ViTs. Specifically, we decouple the training and
inference phases of ViTs. During training, we replace some Feed-Forward
Networks (FFNs) of the ViT with specially designed, more efficient MoEs that
assign tokens to experts by random uniform partition, and perform Experts
Weights Averaging (EWA) on these MoEs at the end of each iteration. After
training, we convert each MoE into an FFN by averaging the experts,
transforming the model back into the original ViT for inference. We further provide
a theoretical analysis to show why and how it works. Comprehensive experiments
across various 2D and 3D visual tasks, ViT architectures, and datasets validate
the effectiveness and generalizability of the proposed training scheme.
Besides, our training scheme can also be applied to improve performance when
fine-tuning ViTs. Last but equally important, the proposed EWA technique can
significantly improve the effectiveness of naive MoE on various small 2D visual
datasets and 3D visual tasks. Comment: 12 pages, 2 figures
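The training-time routing and the final expert merge described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Experts are reduced to flat weight lists, "random uniform partition" is read here as a random split into equally sized token groups, and the per-iteration update in `ewa_step` (with its hypothetical `share_rate` parameter) is an assumed form that pulls each expert toward the expert mean; the abstract does not give the exact rule.

```python
import random


def partition_tokens(num_tokens, num_experts, rng):
    """Randomly split token indices into equally sized groups, one per expert.

    Assumption: "random uniform partition" means a shuffled, even split.
    """
    if num_tokens % num_experts != 0:
        raise ValueError("num_tokens must be divisible by num_experts")
    indices = list(range(num_tokens))
    rng.shuffle(indices)
    size = num_tokens // num_experts
    return [indices[i * size:(i + 1) * size] for i in range(num_experts)]


def average_experts(experts):
    """Element-wise mean of the expert weight vectors (the merged FFN weights)."""
    n = len(experts)
    return [sum(ws) / n for ws in zip(*experts)]


def ewa_step(experts, share_rate):
    """Assumed per-iteration update: interpolate each expert toward the mean.

    share_rate is a hypothetical parameter; 0 leaves experts unchanged,
    1 makes them all equal to the mean.
    """
    mean = average_experts(experts)
    return [
        [(1.0 - share_rate) * w + share_rate * m for w, m in zip(expert, mean)]
        for expert in experts
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    print(partition_tokens(8, 4, rng))            # token groups, one per expert
    experts = [[1.0, 2.0], [3.0, 6.0]]
    experts = ewa_step(experts, share_rate=0.5)   # after one training iteration
    print(average_experts(experts))               # weights of the merged FFN
```

Under this assumed update, the expert mean is unchanged by `ewa_step`, so merging after training gives the same weights whether or not the last averaging step was applied.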
LAMM: Language-Assisted Multi-Modal Instruction-Tuning Dataset, Framework, and Benchmark
Large language models have become a potential pathway toward achieving
artificial general intelligence. Recent works on multi-modal large language
models have demonstrated their effectiveness in handling visual modalities. In
this work, we extend the research of MLLMs to point clouds and present the
LAMM-Dataset and LAMM-Benchmark for 2D image and 3D point cloud understanding.
We also establish an extensible framework to facilitate the extension of MLLMs
to additional modalities. Our main contribution is three-fold: 1) We present
the LAMM-Dataset and LAMM-Benchmark, which cover almost all high-level vision
tasks for 2D and 3D vision. Extensive experiments validate the effectiveness of
our dataset and benchmark. 2) We demonstrate the detailed methods of
constructing instruction-tuning datasets and benchmarks for MLLMs, which will
enable future research on MLLMs to scale up and extend to other domains, tasks,
and modalities faster. 3) We provide a preliminary but promising MLLM training
framework optimized for extension to new modalities. We also provide baseline models,
comprehensive experimental observations, and analysis to accelerate future
research. Code and datasets are now available at
https://github.com/OpenLAMM/LAMM. Comment: 37 pages, 33 figures. Code available at
https://github.com/OpenLAMM/LAMM; Project page: https://openlamm.github.io
Multi-index cutting parameters optimization for surface quality and cutting energy consumption of boring
Saving energy is one route to sustainable development. Machine tools, as important manufacturing equipment, are characterized by high energy consumption and high emissions. Reducing energy consumption and carbon emissions without sacrificing processing quality requires cutting parameters that balance the conflict between machining quality and cutting energy consumption, which plays an important role in energy saving and emission reduction. In this paper, processing quality (residual stress and surface roughness) and cutting energy consumption are selected as the optimization indicators and analyzed. Weighted grey correlation analysis is used to obtain the multi-index grey correlation degree and to determine the multi-index weight coefficients. Based on weighted grey correlation analysis and the multi-index orthogonal optimization method, the cutting parameters of the boring process are optimized; the optimal parameter combination is a cutting depth of 0.05 mm, a cutting speed of 120 m/min, and a feed rate of 80 mm/min.
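A minimal sketch of weighted grey correlation (grey relational) grading, as commonly formulated, may help make the ranking step concrete. It is not the paper's procedure. It assumes every indicator is smaller-is-better, uses the conventional distinguishing coefficient `zeta = 0.5`, and takes the weights as given; the trial values and weights in the usage example are hypothetical, not data from the study.

```python
def normalize_smaller_is_better(column):
    """Map a column to [0, 1] so that the smallest raw value scores 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [1.0 for _ in column]  # constant column: all trials equally good
    return [(hi - x) / (hi - lo) for x in column]


def weighted_grey_grades(trials, weights, zeta=0.5):
    """Return one weighted grey relational grade per trial.

    trials:  list of rows, one row of indicator values per experiment
    weights: one weight per indicator (assumed to sum to 1)
    zeta:    distinguishing coefficient, conventionally 0.5
    """
    columns = [list(col) for col in zip(*trials)]
    normalized = [normalize_smaller_is_better(col) for col in columns]
    # Deviation from the ideal reference sequence, which is 1 for every indicator.
    deltas = [[1.0 - v for v in col] for col in normalized]
    flat = [d for col in deltas for d in col]
    d_min, d_max = min(flat), max(flat)
    if d_max == 0.0:
        # All trials coincide with the reference; every coefficient is 1.
        return [float(sum(weights))] * len(trials)
    grades = []
    for i in range(len(trials)):
        grade = 0.0
        for k, w in enumerate(weights):
            coeff = (d_min + zeta * d_max) / (deltas[k][i] + zeta * d_max)
            grade += w * coeff
        grades.append(grade)
    return grades


if __name__ == "__main__":
    # Hypothetical rows: [residual stress, surface roughness, cutting energy]
    trials = [
        [210.0, 0.82, 5.1],
        [185.0, 0.95, 4.6],
        [240.0, 0.71, 5.8],
    ]
    weights = [0.3, 0.4, 0.3]  # hypothetical weights
    grades = weighted_grey_grades(trials, weights)
    best = max(range(len(grades)), key=grades.__getitem__)
    print(grades, "best trial index:", best)
```

The trial with the highest grade is the one closest to the ideal on the weighted indicators. In an orthogonal design, the grades would then be averaged per factor level to pick the preferred level of each cutting parameter.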
Mudrocks Lithofacies Characteristics and North-South Hydrocarbon Generation Difference of the Shahejie Formation in the Dongpu Sag
Lacustrine mudrocks are composed of minerals and organic matter (OM). The origin and preservation of OM are two controlling factors of the hydrocarbon generation capacity of mudrocks. Studying the deposition process from the perspective of the OM and the sedimentary environment is a key method in source rock research. Following this idea, the reason for the discrepancy in hydrocarbon production between the northern and southern parts of Dongpu Sag is analyzed and discussed. The lacustrine mudrocks of the Shahejie Formation in Dongpu Sag are sampled and analyzed for mineralogy, microstructure, elemental geochemistry, and OM characteristics. The mudrocks are then divided into three lithofacies: silt-rich massive mudstone, homogeneous massive mudstone, and laminated mudstone. Each lithofacies shows distinct characteristics, and their hydrocarbon generation ability increases in that order. Further discussion shows that the differences in hydrocarbon generation are caused by the sedimentary environment. The water depth, salinity, and reducibility of the sedimentary environments of these three lithofacies also increase in that order. The correlation analysis indicates that the environment controls the origin, accumulation, and preservation of OM in each lithofacies and thereby causes the large differences in hydrocarbon generation capacity. In Dongpu Sag, the proportion of laminated mudstone is much higher in the northern part, which leads to greater oil and gas production than in the southern part. In source rock research, both the lithofacies characteristics and the sedimentary environments that control them should be studied.
Correction: Targeting Leptin as a Therapeutic Strategy against Ovarian Cancer Peritoneal Metastasis
ENANTIOSELECTIVE TOTAL SYNTHESES OF 13-ACETYL-12-HYDROXY-PODOCARPANE-8,11,13-TRIENE-7-ONE
Effectiveness of enteral feeding protocol on clinical outcomes in critically ill patients: A before and after study.
An enteral nutrition (EN) feeding protocol has been proposed to have a positive impact on critically ill patients. However, current studies show conflicting results. The present study aimed to investigate whether an enteral feeding protocol could improve clinical outcomes in critically ill patients. A before (stage 1) and after (stage 2) interventional study was performed in 10 tertiary care hospitals. All patients expected to stay in the intensive care unit (ICU) for over three days were potentially eligible. Clinical outcomes such as 28-day mortality, ICU length of stay, duration of mechanical ventilation (MV), and nosocomial infection were compared between the two stages. A total of 410 patients were enrolled during the study period, including 236 in stage 1 and 174 in stage 2. The EN feeding protocol increased the proportion of EN on day 2 (41.8±22.3 vs. 50.0±28.3%; p = 0.006) and day 6 (70.3±25.2 vs. 77.6±25.8%; p = 0.006). EN percentages tended to be higher in stage 1 than in stage 2 on other days, but statistical significance was not reached. There was no difference in 28-day mortality between stages 1 and 2 (0.14 vs. 0.14; p = 0.984). Implementation of the EN feeding protocol marginally reduced ICU length of stay (19.44±18.48 vs. 16.29±16.19 days; p = 0.077). There was no difference in the duration of MV between stage 1 and stage 2 (14.24±14.49 vs. 14.51±17.55 days; p = 0.877). The study found that the EN feeding protocol increased the proportion of EN feeding but failed to reduce 28-day mortality, incidence of nosocomial infection, or duration of MV.