Enhancing Rock Image Segmentation in Digital Rock Physics: A Fusion of Generative AI and State-of-the-Art Neural Networks
In digital rock physics, analysing microstructures from CT and SEM scans is
crucial for estimating properties like porosity and pore connectivity.
Traditional segmentation methods like thresholding and CNNs often fall short in
accurately detailing rock microstructures and are sensitive to noise. U-Net
improved segmentation accuracy but required many expert-annotated samples, a
laborious and error-prone process due to complex pore shapes. Our study
employed an advanced generative AI model, the diffusion model, to overcome
these limitations. This model generated a large dataset of paired CT/SEM
images and binary segmentation masks from a small initial dataset. We assessed
the efficacy of three neural networks, U-Net, Attention U-Net, and TransUNet,
for segmenting
these enhanced images. The diffusion model proved to be an effective data
augmentation technique, improving the generalization and robustness of deep
learning models. TransUNet, incorporating Transformer structures, demonstrated
superior segmentation accuracy and IoU, outperforming both U-Net and
Attention U-Net. Our research advances rock image segmentation by combining the
diffusion model with cutting-edge neural networks, reducing dependence on
extensive expert-annotated data and boosting segmentation accuracy and
robustness. TransUNet sets a new standard in digital rock physics, paving the
way for future geoscience and engineering breakthroughs.
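The model comparison above rests on the IoU metric. A minimal sketch of computing IoU for binary pore masks (illustrative code, not the study's implementation):

```python
import numpy as np

def binary_iou(pred: np.ndarray, target: np.ndarray) -> float:
    """Intersection-over-Union between two binary segmentation masks."""
    pred, target = pred.astype(bool), target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return 1.0 if union == 0 else float(inter) / float(union)

# Toy 4x4 "pore" masks: prediction and ground truth share 1 of 3 marked pixels.
pred = np.zeros((4, 4), dtype=int)
pred[0, :2] = 1          # predicted pore pixels (0,0), (0,1)
target = np.zeros((4, 4), dtype=int)
target[0, 1:3] = 1       # true pore pixels (0,1), (0,2)
print(binary_iou(pred, target))  # 1 intersection / 3 union
```

A perfect prediction gives 1.0; disjoint masks give 0.0, which is why IoU is a stricter summary of mask quality than pixel accuracy on class-imbalanced rock images.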
MotionBERT: A Unified Perspective on Learning Human Motion Representations
We present a unified perspective on tackling various human-centric video
tasks by learning human motion representations from large-scale and
heterogeneous data resources. Specifically, we propose a pretraining stage in
which a motion encoder is trained to recover the underlying 3D motion from
noisy partial 2D observations. The motion representations acquired in this way
incorporate geometric, kinematic, and physical knowledge about human motion,
which can be easily transferred to multiple downstream tasks. We implement the
motion encoder with a Dual-stream Spatio-temporal Transformer (DSTformer)
neural network. It captures long-range spatio-temporal relationships among
the skeletal joints comprehensively and adaptively, as exemplified by the
lowest 3D pose estimation error to date when trained from scratch. Furthermore,
our proposed framework achieves state-of-the-art performance on all three
downstream tasks by fine-tuning the pretrained motion encoder with a simple
regression head (1-2 layers), demonstrating the versatility of the
learned motion representations. Code and models are available at
https://motionbert.github.io/
Comment: ICCV 2023 camera-ready
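The pretraining objective, recovering 3D motion from noisy partial 2D observations, implies a corruption step applied to 2D keypoints. A minimal sketch with illustrative mask ratio and noise scale (assumptions, not MotionBERT's actual settings):

```python
import numpy as np

def corrupt_2d(pose2d: np.ndarray, mask_ratio: float = 0.15,
               noise_std: float = 0.02, rng=None):
    """Add Gaussian noise to 2D joints and zero out a random subset.
    A motion encoder would be trained to recover clean 3D motion from
    such noisy, partial 2D input."""
    if rng is None:
        rng = np.random.default_rng()
    noisy = pose2d + rng.normal(scale=noise_std, size=pose2d.shape)
    mask = rng.random(pose2d.shape[:-1]) < mask_ratio  # per-joint dropout mask
    noisy[mask] = 0.0                                  # masked joints become "missing"
    return noisy, mask

rng = np.random.default_rng(1)
pose = rng.normal(size=(16, 17, 2))     # (frames, joints, xy)
noisy, mask = corrupt_2d(pose, rng=rng)
print(noisy.shape, mask.shape)          # (16, 17, 2) (16, 17)
```

The reconstruction loss would then compare the encoder's 3D output against ground-truth motion only where supervision exists.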
Linear Gaussian Bounding Box Representation and Ring-Shaped Rotated Convolution for Oriented Object Detection
In oriented object detection, current representations of oriented bounding
boxes (OBBs) often suffer from the boundary discontinuity problem. Designing
continuous regression losses does not fundamentally solve it.
Although Gaussian bounding box (GBB) representation avoids this problem,
directly regressing GBB is susceptible to numerical instability. We propose
linear GBB (LGBB), a novel OBB representation. By linearly transforming the
elements of GBB, LGBB avoids the boundary discontinuity problem and has high
numerical stability. In addition, existing convolution-based rotation-sensitive
feature extraction methods only have local receptive fields, resulting in slow
feature aggregation. We propose ring-shaped rotated convolution (RRC), which
adaptively rotates feature maps to arbitrary orientations to extract
rotation-sensitive features under a ring-shaped receptive field, rapidly
aggregating features and contextual information. Experimental results
demonstrate that LGBB and RRC achieve state-of-the-art performance.
Furthermore, integrating LGBB and RRC into various models effectively improves
detection accuracy.
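A common construction of a Gaussian bounding box from an OBB shows why the representation sidesteps angular boundary discontinuity. The specific linear transform defining LGBB is in the paper; the sketch below only illustrates the OBB-to-GBB step:

```python
import numpy as np

def obb_to_gbb(cx, cy, w, h, theta):
    """Map an oriented box (cx, cy, w, h, theta) to a 2D Gaussian
    (mean, covariance): Sigma = R diag(w^2/4, h^2/4) R^T."""
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    S = np.diag([w ** 2 / 4.0, h ** 2 / 4.0])
    mean = np.array([cx, cy])
    cov = R @ S @ R.T
    return mean, cov

# The same physical box described with theta and theta + pi yields one Gaussian,
# so the regression target has no jump at the angular boundary.
m1, c1 = obb_to_gbb(0, 0, 4, 2, 0.3)
m2, c2 = obb_to_gbb(0, 0, 4, 2, 0.3 + np.pi)
print(np.allclose(c1, c2))  # True
```

Because the covariance is invariant under theta -> theta + pi (and under swapping w, h together with a 90-degree rotation), equivalent box descriptions collapse to one target, which is the property the GBB family exploits.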
Zero-Shot Digital Rock Image Segmentation with a Fine-Tuned Segment Anything Model
Accurate image segmentation is crucial in reservoir modelling and material
characterization, enhancing oil and gas extraction efficiency through detailed
reservoir models. This precision offers insights into rock properties,
advancing digital rock physics understanding. However, creating pixel-level
annotations for complex CT and SEM rock images is challenging due to their size
and low contrast, lengthening analysis time. This has spurred interest in
advanced semi-supervised and unsupervised segmentation techniques in digital
rock image analysis, promising more efficient, accurate, and less
labour-intensive methods. Meta AI's Segment Anything Model (SAM) revolutionized
image segmentation in 2023, offering interactive and automated segmentation
with zero-shot capabilities, essential for digital rock physics with limited
training data and complex image features. Despite its advanced features, SAM
struggles with rock CT/SEM images due to their absence in its training set and
the low-contrast nature of grayscale images. Our research fine-tunes SAM for
rock CT/SEM image segmentation, optimizing parameters and handling large-scale
images to improve accuracy. Experiments on rock CT and SEM images show that
fine-tuning significantly enhances SAM's performance, enabling high-quality
mask generation in digital rock image analysis. Our results demonstrate the
feasibility and effectiveness of the fine-tuned SAM model (RockSAM) for rock
images, offering segmentation without extensive training or complex labelling.
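Handling large-scale images typically means tiling them down to a model's fixed input size. A minimal sketch with assumed tile and overlap values (illustrative, not necessarily the paper's settings):

```python
import numpy as np

def tile_image(img: np.ndarray, tile: int = 1024, overlap: int = 128):
    """Split a large grayscale CT/SEM image into overlapping tiles;
    yields (y, x, patch). The overlap lets per-tile masks be blended
    afterwards to hide seams at tile boundaries."""
    step = tile - overlap
    H, W = img.shape[:2]
    for y in range(0, max(H - overlap, 1), step):
        for x in range(0, max(W - overlap, 1), step):
            yield y, x, img[y:y + tile, x:x + tile]

# A 2048x2048 scan with 1024-pixel tiles and 128-pixel overlap gives a 3x3 grid.
tiles = list(tile_image(np.zeros((2048, 2048)), tile=1024, overlap=128))
print(len(tiles))  # 9
```

Each tile would then be segmented independently and the resulting masks stitched back at the recorded (y, x) offsets.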
Differential Features of Culprit Intracranial Atherosclerotic Lesions: A Whole-Brain Vessel Wall Imaging Study in Patients With Acute Ischemic Stroke.
Background: Intracranial atherosclerotic disease tends to affect multiple arterial segments. Using whole-brain vessel wall imaging, we sought to study the differences in plaque features among various types of plaques in patients with a recent unilateral anterior circulation ischemic stroke.
Methods and Results: Sixty-one patients with unilateral anterior circulation ischemic stroke were referred to undergo whole-brain vessel wall imaging (before and after contrast) within 1 month of symptom onset for intracranial atherosclerotic disease evaluation. Each plaque was classified as a culprit, probably culprit, or nonculprit lesion, according to its likelihood of causing the stroke. The associations between plaque features (thickening pattern, plaque-wall contrast ratio, high signal on T1-weighted images, plaque contrast enhancement ratio, enhancement grade, and enhancement pattern) and culprit lesions were estimated using mixed multivariable logistic regression after adjustment for maximum wall thickness. In 52 patients without motion corruption in whole-brain vessel wall imaging, a total of 178 intracranial plaques in the anterior circulation were identified, including 52 culprit lesions (29.2%), 51 probably culprit lesions (28.7%), and 75 nonculprit lesions (42.1%). High signal on T1-weighted images (adjusted odds ratio, 9.1; 95% confidence interval, 1.9-44.1; P=0.006), grade 2 contrast enhancement (enhancement ratio of plaque ≥ enhancement ratio of pituitary; adjusted odds ratio, 17.4; 95% confidence interval, 1.8-164.9; P=0.013), and type 2 enhancement pattern (≥50% cross-sectional wall involvement; adjusted odds ratio, 10.1; 95% confidence interval, 1.3-82.2; P=0.030) were independently associated with culprit lesions.
Conclusions: High signal on T1-weighted images, grade 2 contrast enhancement, and type 2 enhancement pattern are associated with cerebrovascular ischemic events, which may provide valuable insights into risk stratification.
Atorvastatin Represses the Angiotensin 2-Induced Oxidative Stress and Inflammatory Response in Dendritic Cells via the PI3K/Akt/Nrf2 Pathway
Dendritic cells (DCs), which are highly proficient antigen-presenting cells, play a complex role in both the initiation and progression of atherosclerosis. We tested the hypothesis that the anti-inflammatory and antioxidant effects of atorvastatin may be partly mediated by the phosphatidylinositol 3-kinase/protein kinase B/transcription factor nuclear factor-erythroid 2-related factor 2 (PI3K/Akt/Nrf2) pathway via the attenuation of DC maturation, thus reducing the inflammatory and oxidative stress responses. This study showed that angiotensin 2 (Ang 2) induced the maturation of DCs, stimulated CD83, CD40, CD80, and CD86 expression, and increased the secretion of IL-12p70, IL-6, and TNF-α. These effects were suppressed by atorvastatin. Atorvastatin also lowered the levels of reactive oxygen species (ROS) and malondialdehyde (MDA), counteracting their initial increases in response to Ang 2 stimulation. Atorvastatin activated Nrf2 via the PI3K/Akt pathway and thereby promoted Nrf2 translocation from the cytoplasm to the nucleus in bone marrow-derived dendritic cells (BMDCs), a process that was reversed by the PI3K inhibitor LY294002. Therefore, the regulation of Nrf2 expression by the PI3K/Akt pathway plays an important role in the regulation of the statin-mediated antioxidant and anti-inflammatory responses in DCs.
UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase
Point-, voxel-, and range-views are three representative forms of point
clouds. All of them have accurate 3D measurements but lack color and texture
information. RGB images are a natural complement to these point cloud views,
and fully exploiting the complementary information of the two modalities
enables more robust perception. In this paper, we present a unified
multi-modal LiDAR segmentation
network, termed UniSeg, which leverages the information of RGB images and three
views of the point cloud, and accomplishes semantic segmentation and panoptic
segmentation simultaneously. Specifically, we first design the Learnable
cross-Modal Association (LMA) module to automatically fuse voxel-view and
range-view features with image features; it fully utilizes the rich semantic
information of images and is robust to calibration errors. Then, the enhanced
voxel-view and range-view features are transformed to the point space, where
three views of point cloud features are further fused adaptively by the
Learnable cross-View Association module (LVA). Notably, UniSeg achieves
promising results in three public benchmarks, i.e., SemanticKITTI, nuScenes,
and Waymo Open Dataset (WOD); it ranks 1st on two challenges of two benchmarks,
including the LiDAR semantic segmentation challenge of nuScenes and panoptic
segmentation challenges of SemanticKITTI. Besides, we construct the OpenPCSeg
codebase, which is the largest and most comprehensive outdoor LiDAR
segmentation codebase. It contains most of the popular outdoor LiDAR
segmentation algorithms and provides reproducible implementations. The
OpenPCSeg codebase will be made publicly available at
https://github.com/PJLab-ADG/PCSeg
Comment: ICCV 2023; 21 pages; 9 figures; 18 tables; code at https://github.com/PJLab-ADG/PCSeg
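A learnable cross-modal association can be illustrated with single-head cross-attention, where point-cloud features query projected image features. This is a simplified numpy stand-in under assumed shapes, not the actual LMA module:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_fuse(pts, img, Wq, Wk, Wv):
    """Point features (N, d) attend over image features (M, d); the
    projections Wq/Wk/Wv would be learned end to end, so the association
    weights can adapt around calibration errors rather than relying on a
    fixed point-to-pixel mapping."""
    q, k, v = pts @ Wq, img @ Wk, img @ Wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # (N, M) association weights
    return pts + attn @ v                           # residual keeps LiDAR geometry

rng = np.random.default_rng(0)
d = 8
pts, img = rng.normal(size=(5, d)), rng.normal(size=(12, d))
Wq, Wk, Wv = (rng.normal(size=(d, d)) * 0.1 for _ in range(3))
fused = cross_modal_fuse(pts, img, Wq, Wk, Wv)
print(fused.shape)  # (5, 8)
```

The residual connection means a poorly aligned image contributes little, while a well-aligned one injects semantic context into each point feature.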
Recognize Anything: A Strong Image Tagging Model
We present the Recognize Anything Model (RAM): a strong foundation model for
image tagging. RAM can recognize any common category with high accuracy. RAM
introduces a new paradigm for image tagging, leveraging large-scale image-text
pairs for training instead of manual annotations. The development of RAM
comprises four key steps. Firstly, annotation-free image tags are obtained at
scale through automatic text semantic parsing. Subsequently, a preliminary
model is trained for automatic annotation by unifying the caption and tagging
tasks, supervised by the original texts and parsed tags, respectively. Thirdly,
a data engine is employed to generate additional annotations and clean
incorrect ones. Lastly, the model is retrained with the processed data and
fine-tuned using a smaller but higher-quality dataset. We evaluate the tagging
capabilities of RAM on numerous benchmarks and observe impressive zero-shot
performance, significantly outperforming CLIP and BLIP. Remarkably, RAM even
surpasses fully supervised approaches and performs competitively with the
Google API. We are releasing RAM at
\url{https://recognize-anything.github.io/} to foster the advancement of large
models in computer vision.
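The first step of the pipeline, obtaining annotation-free tags from image-text pairs, can be sketched as matching a tag vocabulary against captions. RAM uses a real semantic parser; this whole-word string match is a simplified stand-in with a made-up vocabulary:

```python
import re

def parse_tags(caption: str, vocabulary: set) -> list:
    """Keep every vocabulary term that appears as a whole word or phrase
    in the lowercased caption; longer phrases are checked first."""
    text = caption.lower()
    found = []
    for tag in sorted(vocabulary, key=len, reverse=True):
        if re.search(r"\b" + re.escape(tag) + r"\b", text):
            found.append(tag)
    return found

vocab = {"dog", "frisbee", "park", "golden retriever"}
print(parse_tags("A golden retriever catches a frisbee in the park.", vocab))
# ['golden retriever', 'frisbee', 'park']
```

Tags harvested this way from large-scale captions supervise the tagging head without any manual annotation, which is the paradigm shift the abstract describes.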