55 research outputs found
TCEIP: Text Condition Embedded Regression Network for Dental Implant Position Prediction
Although deep neural networks have been proposed to assist dentists in designing
the location of dental implants, most of them target simple cases with only a
single missing tooth. As a result, existing methods do not work well when
multiple teeth are missing and easily produce false predictions when the
remaining teeth are sparsely distributed. In this paper, we integrate a weak
supervision text, the target region, into the implant position regression
network to address these issues. We propose a text condition embedded implant
position regression network (TCEIP), which embeds the text condition into an
encoder-decoder framework to improve regression performance. A cross-modal
interaction module, consisting of cross-modal attention (CMA) and a knowledge
alignment module (KAM), is proposed to facilitate interaction between image and
text features. The CMA module performs cross-attention between the image
feature and the text condition, while the KAM mitigates the knowledge gap
between the image feature and the image encoder of CLIP. Extensive experiments
on a dental implant dataset with five-fold cross-validation demonstrate that
the proposed TCEIP achieves superior performance over existing methods.
Comment: MICCAI 202
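The cross-modal attention the abstract describes (image features attending to a text condition) can be sketched as follows; the shapes, residual fusion, and function names are illustrative assumptions, not the paper's exact design:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(img_feat, text_cond):
    """Cross-attention from image features (queries) to the text
    condition (keys/values), fused back with a residual connection.
    Shapes: img_feat (N, D) flattened spatial features, text_cond (T, D)."""
    d_k = img_feat.shape[-1]
    scores = img_feat @ text_cond.T / np.sqrt(d_k)   # (N, T) similarities
    attn = softmax(scores, axis=-1)                  # rows sum to 1
    return img_feat + attn @ text_cond               # residual fusion, (N, D)

rng = np.random.default_rng(0)
img = rng.standard_normal((16, 64))   # 16 spatial positions, 64-dim features
txt = rng.standard_normal((3, 64))    # e.g. target-region conditions
fused = cross_modal_attention(img, txt)
print(fused.shape)  # (16, 64)
```

In the paper the text embeddings would come from CLIP's text encoder; here they are random placeholders.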
BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion
Recent text-to-image diffusion models have demonstrated an astonishing
capacity to generate high-quality images. However, researchers mainly studied
the way of synthesizing images with only text prompts. While some works have
explored using other modalities as conditions, considerable paired data, e.g.,
box/mask-image pairs, and fine-tuning time are required to train such models.
As such paired data is time-consuming and labor-intensive to acquire and
restricted to a closed set, this potentially becomes the bottleneck for
applications in an open world. This paper focuses on the simplest form of
user-provided conditions, e.g., box or scribble. To mitigate the aforementioned
problem, we propose a training-free method to control objects and contexts in
the synthesized images adhering to the given spatial conditions. Specifically,
three spatial constraints, i.e., Inner-Box, Outer-Box, and Corner Constraints,
are designed and seamlessly integrated into the denoising step of diffusion
models, requiring neither additional training nor massive annotated layout data.
Extensive results show that the proposed constraints can control what and where
to present in the images while retaining the ability of the Stable Diffusion
model to synthesize with high fidelity and diverse concept coverage. The code
is publicly available at https://github.com/Sierkinhane/BoxDiff.
Comment: Accepted by ICCV 2023. The paper is still being revised for better
organization and comparison.
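The Inner-Box idea (keeping a token's cross-attention mass inside a user-given box during denoising) can be sketched with a simple differentiable loss; the exact loss form is an illustrative assumption, not BoxDiff's implementation:

```python
import numpy as np

def inner_box_loss(attn_map, box):
    """Inner-Box constraint (sketch): penalize cross-attention mass that
    falls outside the user-given box. In guidance of this style, the
    gradient of such a loss w.r.t. the latent steers each denoising step.
    attn_map: (H, W) non-negative attention map for one text token.
    box: (x0, y0, x1, y1) in attention-map coordinates."""
    x0, y0, x1, y1 = box
    mask = np.zeros_like(attn_map)
    mask[y0:y1, x0:x1] = 1.0
    inside = (attn_map * mask).sum()
    total = attn_map.sum() + 1e-8
    return 1.0 - inside / total  # 0 when all attention lies inside the box

attn = np.zeros((8, 8))
attn[2:4, 2:4] = 1.0                       # all attention mass in one patch
print(inner_box_loss(attn, (2, 2, 4, 4)))  # ~0.0: mass is inside the box
```

The Outer-Box and Corner constraints would follow the same pattern with complementary or corner-focused masks.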
Dynamically Masked Discriminator for Generative Adversarial Networks
Training Generative Adversarial Networks (GANs) remains a challenging
problem. The discriminator trains the generator by learning the distribution of
real/generated data. However, the distribution of generated data changes
throughout the training process, which is difficult for the discriminator to
learn. In this paper, we propose a novel method for GANs from the viewpoint of
online continual learning. We observe that a discriminator trained on
historically generated data is often slow to adapt to changes in newly
generated data, which in turn degrades the quality of generated results. By
treating the generated data in training as a stream, we propose to detect
whether the discriminator has slowed its learning of new knowledge in the
generated data, so that we can explicitly force it to learn new knowledge
quickly. In particular, we propose a new discriminator that automatically
detects this retardation and then dynamically masks its features, allowing the
discriminator to adaptively learn the temporally varying distribution of
generated data. Experimental results show our method outperforms
state-of-the-art approaches.
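The detect-then-mask loop described above can be sketched as follows; the plateau heuristic, window size, and masking rule are illustrative assumptions, not the paper's exact criterion:

```python
import numpy as np

def is_retarded(losses, window=6):
    """Heuristic 'retardation' detector (sketch): learning has slowed if
    the mean discriminator loss over the recent half of the window is no
    lower than over the earlier half."""
    if len(losses) < window:
        return False
    half = window // 2
    return np.mean(losses[-half:]) >= np.mean(losses[-window:-half])

def mask_features(feats, mask_ratio=0.3, rng=None):
    """Randomly zero out a fraction of feature channels so the
    discriminator must relearn them from newly generated data."""
    rng = rng or np.random.default_rng()
    keep = rng.random(feats.shape[-1]) >= mask_ratio
    return feats * keep

plateaued = [0.8, 0.8, 0.8, 0.8, 0.8, 0.8]  # loss stream that stopped improving
feats = np.ones((4, 10))                     # toy discriminator features
if is_retarded(plateaued):
    feats = mask_features(feats, mask_ratio=0.3, rng=np.random.default_rng(1))
```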
Open-World Weakly-Supervised Object Localization
While remarkable success has been achieved in weakly-supervised object
localization (WSOL), current frameworks are not capable of locating objects of
novel categories in open-world settings. To address this issue, we are the
first to introduce a new weakly-supervised object localization task called
OWSOL (Open-World Weakly-Supervised Object Localization). During training, all
labeled data come from known categories, while both known and novel categories
exist in the unlabeled data. To handle such data, we propose a novel paradigm
of contrastive representation co-learning using both labeled and unlabeled data
to generate a complete G-CAM (Generalized Class Activation Map) for object
localization, without requiring bounding box annotations. As no class labels
are available for the unlabeled data, we conduct clustering over the full
training set and design a novel contrastive loss driven by multiple semantic
centroids for representation learning. We re-organize two widely used datasets,
i.e., ImageNet-1K and iNatLoc500, and propose OpenImages150 to serve as
evaluation benchmarks for OWSOL. Extensive experiments demonstrate that the
proposed method surpasses all baselines by a large margin. We believe that
this work can shift closed-set localization towards the open-world setting
and serve as a foundation for subsequent works. Code will be released at
https://github.com/ryylcc/OWSOL
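The centroid-driven contrastive loss can be sketched in an InfoNCE style; the centroids would come from clustering the full (labeled plus unlabeled) training set, and the exact loss form here is an illustrative assumption:

```python
import numpy as np

def multi_centroid_contrastive_loss(feat, centroids, pos_idx, tau=0.1):
    """InfoNCE-style centroid-driven contrastive loss (sketch): pull a
    sample's embedding toward its assigned semantic centroid and push it
    away from all the others.
    feat: (D,) sample embedding; centroids: (K, D); pos_idx: assigned centroid."""
    feat = feat / np.linalg.norm(feat)
    c = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    logits = c @ feat / tau            # similarity to every centroid
    logits = logits - logits.max()     # numerical stability
    p = np.exp(logits) / np.exp(logits).sum()
    return -np.log(p[pos_idx])

centroids = np.eye(3)                 # three orthogonal "semantic centroids"
feat = np.array([1.0, 0.0, 0.0])      # sample aligned with centroid 0
loss_pos = multi_centroid_contrastive_loss(feat, centroids, pos_idx=0)
loss_neg = multi_centroid_contrastive_loss(feat, centroids, pos_idx=1)
print(loss_pos < loss_neg)  # True: the assigned centroid gives lower loss
```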
Landscape of variable domain of heavy-chain-only antibody repertoire from alpaca
Peer Reviewed
http://deepblue.lib.umich.edu/bitstream/2027.42/156425/2/imm13224_am.pdf
http://deepblue.lib.umich.edu/bitstream/2027.42/156425/1/imm13224.pd
VisorGPT: Learning Visual Prior via Generative Pre-Training
Various stuff and things in visual data possess specific traits, which can be
learned by deep neural networks and are implicitly represented as the visual
prior, e.g., object location and shape, in the model. Such prior potentially
impacts many vision tasks. For example, in conditional image synthesis, spatial
conditions failing to adhere to the prior can result in visually inaccurate
synthetic results. This work aims to explicitly learn the visual prior and
enable the customization of sampling. Inspired by advances in language
modeling, we propose to learn Visual prior via Generative Pre-Training, dubbed
VisorGPT. By discretizing visual locations of objects, e.g., bounding boxes,
human pose, and instance masks, into sequences, VisorGPT can model visual prior
through likelihood maximization. In addition, prompt engineering is investigated to
unify various visual locations and enable customized sampling of sequential
outputs from the learned prior. Experimental results demonstrate that VisorGPT
can effectively model the visual prior, which can be employed for many vision
tasks, such as customizing accurate human pose for conditional image synthesis
models like ControlNet. Code will be released at
https://github.com/Sierkinhane/VisorGPT.
Comment: Project web-page: https://sierkinhane.github.io/visor-gpt
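Discretizing visual locations into sequences, as the abstract describes for bounding boxes, can be sketched as follows; the vocabulary, bin count, and token format are illustrative assumptions, not VisorGPT's exact specification:

```python
def box_to_tokens(category, box, num_bins=512, img_size=512):
    """Discretize a bounding box into a token sequence (sketch): each
    coordinate is binned into one of num_bins location tokens, prefixed
    by the category word, so a language model can learn the visual prior
    p(sequence) by likelihood maximization."""
    tokens = [category]
    for v in box:  # (x0, y0, x1, y1) in pixels
        b = min(num_bins - 1, int(v / img_size * num_bins))
        tokens.append(f"<loc_{b}>")
    return tokens

seq = box_to_tokens("person", (64, 32, 256, 480))
print(seq)  # ['person', '<loc_64>', '<loc_32>', '<loc_256>', '<loc_480>']
```

Human poses and instance masks would be serialized the same way, just with more coordinate tokens per instance.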
Global research status and frontiers on microvascular invasion of hepatocellular carcinoma: A bibliometric and visualized analysis
Introduction: Over the past decade, several studies on the microvascular invasion (MVI) of hepatocellular carcinoma (HCC) have been published. However, they have not quantitatively analyzed the remarkable impact of MVI, so a more comprehensive understanding of the field is now needed. This study aims to analyze the evolution of HCC-MVI research and to systematically evaluate the scientific outputs using bibliometric citation analysis.
Methods: A systematic search was conducted on the Web of Science Core Collection on 2 May 2022 to retrieve studies on HCC-MVI published between 2013 and 2022. A bibliometric analysis of the publications was then performed using CiteSpace, VOSviewer, and other visualization tools.
Results: A total of 1,208 articles on HCC-MVI were identified. Of these, China (n = 518) was the most prolific country, and Fudan University (n = 90) was the most notable institution. Lau Wan Yee participated in the most studies (n = 26), and Frontiers in Oncology (IF2020: 6.24) published the highest number of documents (n = 49) on this subject, with 138 publications. The paper "Bray F, 2018, CA-CANCER J CLIN, V68, P394" was the most co-cited reference, with 119 citations. The top three keywords were "survival", "recurrence", and "microvascular invasion". The research hot spots and frontiers of HCC-MVI over the last 3 years include imaging characteristics and transarterial chemoembolization (TACE) therapy studies.
Conclusions: This study comprehensively summarized the most significant HCC-MVI documents in the literature and highlighted key contributions to the advancement of this field over the past decade. The trend of MVI research will gradually shift from risk-factor and prognosis studies to imaging characteristics and TACE therapy studies.
Geochemical Features of Volcanic Rocks from the Shaerbuti Mountain Complex, West Junggar, Xinjiang, China: Implications for Recycling of Materials
In this paper, we focus on the geological features of volcanic edifices and the geochemistry of intermediate-basic volcanic rocks of Shaerbuti Mountain, which offer a new perspective on recycled materials in the study area. The Shaerbuti volcanic rocks consist of calc-alkaline basalt and andesite formed in an arc setting. The porphyroclastic texture of the basalt, the explosive breccia, and the distribution of both breccia and agglomerate provide robust evidence that a volcanic edifice exists at Shaerbuti Mountain. Based on geochemical features, the Shaerbuti volcanic rocks are identified as being of two types. Type I volcanic rocks have light rare earth element (LREE)-enriched patterns, with La/Sm ratios of 2.27-4.03, Th/Yb ratios of 0.50-1.46, and Nb/Yb ratios of 1.11-2.28. Type II volcanic rocks display a flat rare earth element (REE) pattern, with La/Sm ratios of 1.83-2.43, Th/Yb ratios of 0.24-0.45, and Nb/Yb ratios of 0.87-0.93. In the studied rocks, MgO-Cr, MgO-Ni, and MgO-CaO show positive correlations, indicating that clinopyroxene crystallized. The Sr-Nd-Pb isotopic compositions of these basalts are 0.7045-0.7063 ((87Sr/86Sr)i), 6.4-6.6 (εNd(t)), and 17.1300-18.3477 ((206Pb/204Pb)i), respectively. Based on the Sr-Nd-Pb isotope features, we argue that melts of altered oceanic crust and sediments were incorporated into the source. We also evaluate the water content (0.55%-6.72%) of the studied volcanic rocks.