55 research outputs found

    TCEIP: Text Condition Embedded Regression Network for Dental Implant Position Prediction

    Full text link
    While deep neural networks have been proposed to assist dentists in planning the position of dental implants, most target simple cases with only a single missing tooth. As a result, existing methods perform poorly when multiple teeth are missing and easily generate false predictions when the remaining teeth are sparsely distributed. In this paper, we integrate a weak supervision signal, a text description of the target region, into the implant position regression network to address these issues. We propose a text condition embedded implant position regression network (TCEIP), which embeds the text condition into an encoder-decoder framework to improve regression performance. A cross-modal interaction, consisting of cross-modal attention (CMA) and a knowledge alignment module (KAM), is proposed to facilitate interaction between image and text features. The CMA module performs cross-attention between the image feature and the text condition, while the KAM mitigates the knowledge gap between the image feature and the image encoder of CLIP. Extensive experiments on a dental implant dataset with five-fold cross-validation demonstrate that the proposed TCEIP outperforms existing methods.
    Comment: MICCAI 2023
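
    As a rough illustration of the cross-modal attention described above, the following PyTorch sketch lets flattened image features attend to a CLIP-style text condition; all module names, dimensions, and the residual design are illustrative assumptions, not the authors' implementation.

```python
# Minimal PyTorch sketch of a cross-modal attention (CMA) block: flattened
# image features attend to a CLIP-style text condition. Dimensions, names,
# and the residual design are illustrative assumptions.
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    def __init__(self, img_dim=256, txt_dim=512, num_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim=img_dim, num_heads=num_heads,
                                          kdim=txt_dim, vdim=txt_dim,
                                          batch_first=True)
        self.norm = nn.LayerNorm(img_dim)

    def forward(self, img_feat, txt_cond):
        # img_feat: (B, H*W, img_dim) flattened encoder feature map
        # txt_cond: (B, L, txt_dim) text condition tokens (e.g., "left"/"right")
        attended, _ = self.attn(query=img_feat, key=txt_cond, value=txt_cond)
        return self.norm(img_feat + attended)  # residual fusion

# Usage: fuse a text hint into a feature map before position regression.
out = CrossModalAttention()(torch.randn(2, 64 * 64, 256), torch.randn(2, 4, 512))
```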

    BoxDiff: Text-to-Image Synthesis with Training-Free Box-Constrained Diffusion

    Full text link
    Recent text-to-image diffusion models have demonstrated an astonishing capacity to generate high-quality images. However, research has mainly studied synthesizing images from text prompts alone. While some works have explored using other modalities as conditions, they require considerable paired data, e.g., box/mask-image pairs, and fine-tuning time. Because such paired data is time-consuming and labor-intensive to acquire and is restricted to a closed set, this potentially becomes a bottleneck for open-world applications. This paper focuses on the simplest form of user-provided conditions, e.g., a box or scribble. To mitigate the aforementioned problem, we propose a training-free method to control objects and contexts in the synthesized images so that they adhere to the given spatial conditions. Specifically, three spatial constraints, i.e., Inner-Box, Outer-Box, and Corner Constraints, are designed and seamlessly integrated into the denoising step of diffusion models, requiring neither additional training nor massive annotated layout data. Extensive results show that the proposed constraints can control what to present in the images and where, while retaining the ability of the Stable Diffusion model to synthesize with high fidelity and diverse concept coverage. The code is publicly available at https://github.com/Sierkinhane/BoxDiff.
    Comment: Accepted by ICCV 2023. The paper is still being revised for better organization and comparison. Code is available at: https://github.com/Sierkinhane/BoxDiff
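
    The following sketch illustrates one way such a training-free spatial constraint could look: an Inner-Box-style objective that concentrates a text token's cross-attention mass inside a user-provided box. The function name, shapes, and the gradient-nudging step are assumptions for illustration; extracting attention maps from Stable Diffusion is not shown.

```python
# Sketch of a training-free Inner-Box-style constraint: reward cross-attention
# mass inside the user box and penalize it outside. Names and shapes are
# illustrative; attention-map extraction from the diffusion UNet is assumed.
import torch

def inner_box_loss(attn_map: torch.Tensor, box) -> torch.Tensor:
    """attn_map: (H, W) cross-attention map of one text token.
    box: (x0, y0, x1, y1) in the attention map's grid coordinates."""
    x0, y0, x1, y1 = box
    mask = torch.zeros_like(attn_map)
    mask[y0:y1, x0:x1] = 1.0
    inside = (attn_map * mask).sum()
    outside = (attn_map * (1.0 - mask)).sum()
    return outside - inside  # lower = attention concentrated in the box

# At each denoising step one could nudge the latent along the gradient:
#   loss = inner_box_loss(attn_map, box)
#   latent = latent - step_size * torch.autograd.grad(loss, latent)[0]
```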

    Dynamically Masked Discriminator for Generative Adversarial Networks

    Full text link
    Training Generative Adversarial Networks (GANs) remains challenging. The discriminator trains the generator by learning the distribution of real and generated data. However, the distribution of generated data changes throughout training, which is difficult for the discriminator to track. In this paper, we propose a novel method for GANs from the viewpoint of online continual learning. We observe that a discriminator trained on historically generated data often adapts slowly to changes in newly generated data, which in turn degrades the quality of generated results. By treating the generated data in training as a stream, we propose to detect whether the discriminator has slowed its learning of new knowledge in the generated data, so that we can explicitly force it to learn new knowledge quickly. In particular, we propose a new discriminator that automatically detects its own retardation and then dynamically masks its features, allowing it to adaptively learn the temporally varying distribution of generated data. Experimental results show our method outperforms state-of-the-art approaches.
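
    A toy sketch of the core idea, detecting stalled discriminator learning and dynamically masking features, is given below. The stall heuristic and random masking here are simplified placeholders, not the paper's actual modules.

```python
# Toy sketch of dynamically masking discriminator features once adaptation to
# newly generated data stalls. The stall test and random masking are
# simplified placeholders for the paper's method.
import torch
import torch.nn as nn

class DynamicallyMaskedHead(nn.Module):
    def __init__(self, dim=512, mask_ratio=0.3, window=100):
        super().__init__()
        self.fc = nn.Linear(dim, 1)
        self.mask_ratio = mask_ratio
        self.window = window
        self.recent = []  # discriminator losses on recent fakes

    def observe(self, d_loss_on_fakes: float):
        # Call once per training step with the loss on generated samples.
        self.recent = (self.recent + [d_loss_on_fakes])[-self.window:]

    def stalled(self) -> bool:
        # Heuristic: loss on newly generated data has stopped improving.
        if len(self.recent) < self.window:
            return False
        half = self.window // 2
        return sum(self.recent[half:]) >= sum(self.recent[:half])

    def forward(self, feat):
        # feat: (B, dim) discriminator features before the final classifier.
        if self.training and self.stalled():
            keep = (torch.rand_like(feat) > self.mask_ratio).float()
            feat = feat * keep  # masked features must be re-learned
        return self.fc(feat)
```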

    Open-World Weakly-Supervised Object Localization

    Full text link
    While remarkable success has been achieved in weakly-supervised object localization (WSOL), current frameworks cannot locate objects of novel categories in open-world settings. To address this issue, we are the first to introduce a new weakly-supervised object localization task called OWSOL (Open-World Weakly-Supervised Object Localization). During training, all labeled data come from known categories, while the unlabeled data contain both known and novel categories. To handle such data, we propose a novel paradigm of contrastive representation co-learning using both labeled and unlabeled data to generate a complete G-CAM (Generalized Class Activation Map) for object localization, without requiring bounding box annotation. Since no class labels are available for the unlabeled data, we cluster the full training set and design a novel contrastive loss driven by multiple semantic centroids for representation learning. We re-organize two widely used datasets, i.e., ImageNet-1K and iNatLoc500, and propose OpenImages150 to serve as evaluation benchmarks for OWSOL. Extensive experiments demonstrate that the proposed method surpasses all baselines by a large margin. We believe this work can shift closed-set localization towards the open-world setting and serve as a foundation for subsequent work. Code will be released at https://github.com/ryylcc/OWSOL
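
    The sketch below shows one plausible form of a contrastive loss driven by multiple semantic centroids: features are clustered, then pulled toward their assigned centroid and pushed away from the others. The clustering backend, temperature, and single-assignment simplification are our assumptions, not the paper's exact formulation.

```python
# Sketch of a centroid-driven contrastive loss: cluster all embeddings, then
# treat each sample's own centroid as its positive in an InfoNCE over
# centroids. Clustering backend and temperature are assumptions.
import torch
import torch.nn.functional as F
from sklearn.cluster import KMeans

def centroid_contrastive_loss(feats, centroids, assignments, tau=0.1):
    # feats: (N, D) L2-normalized features; centroids: (K, D) L2-normalized.
    logits = feats @ centroids.t() / tau         # (N, K) cosine similarities
    return F.cross_entropy(logits, assignments)  # positive = own centroid

feats = F.normalize(torch.randn(256, 128), dim=1)
km = KMeans(n_clusters=10, n_init=10).fit(feats.numpy())
centroids = F.normalize(torch.tensor(km.cluster_centers_, dtype=torch.float32), dim=1)
assignments = torch.tensor(km.labels_, dtype=torch.long)
loss = centroid_contrastive_loss(feats, centroids, assignments)
```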

    Landscape of variable domain of heavy-chain-only antibody repertoire from alpaca

    Full text link
    Peer Reviewed
    http://deepblue.lib.umich.edu/bitstream/2027.42/156425/2/imm13224_am.pdf
    http://deepblue.lib.umich.edu/bitstream/2027.42/156425/1/imm13224.pdf

    VisorGPT: Learning Visual Prior via Generative Pre-Training

    Full text link
    Various stuff and things in visual data possess specific traits, which can be learned by deep neural networks and are implicitly represented as a visual prior, e.g., object location and shape, in the model. Such a prior potentially impacts many vision tasks. For example, in conditional image synthesis, spatial conditions that fail to adhere to the prior can result in visually inaccurate synthetic results. This work aims to explicitly learn the visual prior and enable customized sampling. Inspired by advances in language modeling, we propose to learn the Visual prior via Generative Pre-Training, dubbed VisorGPT. By discretizing visual locations of objects, e.g., bounding boxes, human poses, and instance masks, into sequences, VisorGPT can model the visual prior through likelihood maximization. Furthermore, prompt engineering is investigated to unify various visual locations and enable customized sampling of sequential outputs from the learned prior. Experimental results demonstrate that VisorGPT can effectively model the visual prior, which can be employed in many vision tasks, such as customizing accurate human poses for conditional image synthesis models like ControlNet. Code will be released at https://github.com/Sierkinhane/VisorGPT.
    Comment: Project web-page: https://sierkinhane.github.io/visor-gpt
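
    As a hedged illustration of the discretization step, the sketch below serializes bounding boxes into a flat token sequence that a GPT-style model could be trained on by likelihood maximization. The vocabulary layout and binning scheme are assumptions, not VisorGPT's actual format.

```python
# Sketch of serializing visual locations into discrete tokens for
# likelihood-based sequence modeling. Vocabulary and binning are assumptions.
def boxes_to_sequence(boxes, labels, num_bins=512, img_size=512):
    """boxes: list of (x0, y0, x1, y1) in pixels; labels: class names.
    Returns a flat token sequence: [class, x0, y0, x1, y1, class, ...]."""
    tokens = []
    for (x0, y0, x1, y1), name in zip(boxes, labels):
        tokens.append(name)
        for v in (x0, y0, x1, y1):
            b = min(num_bins - 1, int(v / img_size * num_bins))
            tokens.append(f"<bin_{b}>")  # quantized coordinate token
    return tokens

# A GPT-style model trained to maximize the likelihood of such sequences can
# then be sampled to produce layouts that follow the learned prior.
print(boxes_to_sequence([(32, 64, 256, 480)], ["person"]))
# ['person', '<bin_32>', '<bin_64>', '<bin_256>', '<bin_480>']
```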

    Global research status and frontiers on microvascular invasion of hepatocellular carcinoma: A bibliometric and visualized analysis

    Get PDF
    Introduction: Over the past decade, several studies on the microvascular invasion (MVI) of hepatocellular carcinoma (HCC) have been published. However, they have not quantitatively analyzed the remarkable impact of MVI, so a more comprehensive understanding of the field is now needed. This study aims to analyze the evolution of HCC-MVI research and to systematically evaluate the scientific outputs using bibliometric citation analysis.
    Methods: A systematic search was conducted on the Web of Science Core Collection on 2 May 2022 to retrieve studies on HCC-MVI published between 2013 and 2022. A bibliometric analysis of the publications was then performed using CiteSpace, VOSviewer, and other visualization tools.
    Results: A total of 1,208 articles on HCC-MVI were identified. Of these, China (n = 518) was the most prolific country, and Fudan University (n = 90) was the most notable institution. Furthermore, we observed that Lau Wan Yee participated in the most studies (n = 26) and that Frontiers in Oncology (IF2020: 6.24) published the highest number of documents (n = 49) on this subject, with 138 publications. The paper "Bray F, 2018, CA-CANCER J CLIN, V68, P394" was the most frequently co-cited reference, with 119 citations. In addition, the top three keywords were "survival", "recurrence", and "microvascular invasion". Moreover, the research hot spots and frontiers of HCC-MVI for the last 3 years included imaging characteristics and transarterial chemoembolization (TACE) therapy studies.
    Conclusions: This study comprehensively summarized the most significant HCC-MVI documents in the literature and highlighted key contributions to the advancement of this field over the past decade. The trend of MVI research will gradually shift from risk factor and prognosis studies to imaging characteristics and TACE therapy studies.
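
    For readers curious what such co-citation counting looks like outside CiteSpace or VOSviewer, the sketch below tallies cited references from a Web of Science export. The file name and the tab-delimited "CR" (cited references) column are assumptions about the export format, for illustration only.

```python
# Hedged sketch of the reference counting that bibliometric tools automate.
# "savedrecs.txt" and the tab-delimited "CR" column are assumed export details.
import csv
from collections import Counter

counts = Counter()
with open("savedrecs.txt", encoding="utf-8-sig") as f:
    for record in csv.DictReader(f, delimiter="\t"):
        refs = (record.get("CR") or "").split("; ")
        counts.update(r.strip() for r in refs if r.strip())

# Most frequently co-cited references, e.g. "Bray F, 2018, CA-CANCER J CLIN..."
for ref, n in counts.most_common(5):
    print(n, ref)
```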

    Geochemical Features of Volcanic Rocks from the Shaerbuti Mountain Complex, West Junggar, Xinjiang, China: Implications for Recycling of Materials

    No full text
    In this paper, we focus on the geological features of volcanic edifices and the geochemistry of the intermediate–basic volcanic rocks of Shaerbuti Mountain, which offer a new perspective on recycled materials in the study area. The Shaerbuti volcanic rocks consist of calc-alkaline basalt and andesite formed in an arc setting. The porphyroclastic texture of the basalt, the explosive breccia, and the distribution of both breccia and agglomerate provide robust evidence that a volcanic edifice exists at Shaerbuti Mountain. Based on their geochemical features, the Shaerbuti volcanic rocks fall into two types. Type I volcanic rocks have light rare earth element (LREE)-enriched patterns, with La/Sm ratios of 2.27–4.03, Th/Yb ratios of 0.50–1.46, and Nb/Yb ratios of 1.11–2.28. Type II volcanic rocks display flat rare earth element (REE) patterns, with La/Sm ratios ranging from 1.83 to 2.43, Th/Yb ratios ranging from 0.24 to 0.45, and Nb/Yb ratios ranging from 0.87 to 0.93. In the studied rocks, MgO–Cr, MgO–Ni, and MgO–CaO show positive correlations, indicating crystallization of clinopyroxene. The Sr-Nd-Pb isotopic compositions of these basalts give (87Sr/86Sr)i values of 0.7045 to 0.7063, εNd(t) values of 6.4 to 6.6, and (206Pb/204Pb)i values of 17.1300 to 18.3477. Based on these Sr-Nd-Pb isotope features, we argue that melts of altered oceanic crust and sediments were incorporated into the source. We also estimate the water content of the studied volcanic rocks at 0.55%–6.72%.
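
    The ratio ranges above amount to a simple screening rule; the sketch below applies them to hypothetical concentration values. Only the thresholds come from the abstract; the sample values are made up, and Type I is checked first where the ranges overlap.

```python
# Illustrative screening of samples by the trace-element ratios cited in the
# text (La/Sm, Th/Yb, Nb/Yb). Thresholds from the abstract; inputs are made up.
def classify(la, sm, th, yb, nb):
    la_sm, th_yb, nb_yb = la / sm, th / yb, nb / yb
    if 2.27 <= la_sm <= 4.03 and 0.50 <= th_yb <= 1.46 and 1.11 <= nb_yb <= 2.28:
        return "Type I (LREE-enriched)"
    if 1.83 <= la_sm <= 2.43 and 0.24 <= th_yb <= 0.45 and 0.87 <= nb_yb <= 0.93:
        return "Type II (flat REE)"
    return "unclassified"

print(classify(la=12.0, sm=4.0, th=1.0, yb=1.0, nb=2.0))  # Type I
```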

    CuCl2

    No full text