Global regularity to the Navier-Stokes equations for a class of large initial data
In [5], Chemin, Gallagher and Paicu proved the global regularity of solutions to the classical Navier-Stokes equations with a class of large initial data on T^2 × R. This data varies slowly in the vertical variable and has a norm that blows up as the small parameter (denoted by ε) tends to zero. However, to the best of our knowledge, the result remains open on the whole space R^3. In this paper, we consider the generalized Navier-Stokes equations on R^n (n ≥ 3):
∂_t u + u · ∇u + D^s u + ∇P = 0, div u = 0.
For a suitable exponent s, we prove that the Cauchy problem with initial data of the form u_0^ε(x) = (v_0^h(x_ε), ε^{-1} v_0^n(x_ε))^T, where x_ε = (x_h, ε x_n)^T, is globally well-posed for all small ε > 0, provided that the initial velocity profile v_0 is analytic in x_n and a certain norm of v_0 is sufficiently small but independent of ε. In particular, our result holds for the n-dimensional classical Navier-Stokes equations with n ≥ 4 and for the fractional Navier-Stokes equations with 1 ≤ s < 2 in 3D.
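In display form, the system and the slowly varying initial data read as follows (a transcription of the formulas above into LaTeX; here D^s denotes the fractional dissipation operator, typically Λ^s = (−Δ)^{s/2}):

```latex
% Generalized Navier-Stokes system with fractional dissipation D^s.
\[
\partial_t u + u \cdot \nabla u + D^{s} u + \nabla P = 0,
\qquad \operatorname{div} u = 0 .
\]
% Slowly varying (in x_n) initial data, whose norm blows up as \epsilon \to 0.
\[
u_0^{\epsilon}(x) = \bigl( v_0^{h}(x_\epsilon),\; \epsilon^{-1} v_0^{n}(x_\epsilon) \bigr)^{T},
\qquad x_\epsilon = (x_h,\; \epsilon x_n)^{T} .
\]
```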
Mobile Platform for livestock monitoring and inspection
Livestock keepers acquire and manage information (e.g., identification numbers and images) to identify and track their animals, using systems capable of extracting such information. Examples are Radio Frequency Identification (RFID) systems, which collect and transmit livestock information to host devices. Sophisticated RFID readers are expensive but more functional than cheap ones, whose use is mostly limited to reading and transmitting tag IDs. Cross-platform mobile applications allow livestock monitoring regardless of the platform on which the mobile device runs. Farmers' secure access to records via web services is not tied to a single device: they can log in on any mobile device with the application installed. In this work, a mobile platform consisting of a cross-platform mobile application, a web service, and a database is developed to cost-effectively manage and exploit livestock records acquired with a cheap RFID reader. The mobile application was developed using the Xamarin.Forms framework; the programming language and development environment used are C# and Visual Studio, respectively. Livestock records were acquired, posted, updated, deleted, and retrieved from the database via the web service. Additional advantages offered by the implemented solution include exporting animals' records via email and SMS, viewing an animal's record by scanning its tag or the QR code on its passport, and a login system to sign users in and out of the application. Development of RFID readers with sensors that acquire health-related parameters for health monitoring is recommended.
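To make the record lifecycle concrete, here is a minimal sketch of the CRUD calls the mobile application performs against the web service. The endpoint, field names, and the returned `id` are hypothetical, and the actual application was written in C# with Xamarin.Forms; this Python version only illustrates the protocol.

```python
# Hypothetical sketch of the livestock-record CRUD flow over the web service.
# The base URL, fields, and returned "id" are assumptions for illustration.
import requests

BASE = "https://example-farm-api.test/api/livestock"  # hypothetical endpoint

record = {"tag_id": "RFID-0001", "breed": "Boran", "image": None}
created = requests.post(BASE, json=record).json()         # create (post) a record
fetched = requests.get(f"{BASE}/{created['id']}").json()  # retrieve it by id
requests.put(f"{BASE}/{created['id']}",
             json={**fetched, "breed": "Nguni"})          # update the record
requests.delete(f"{BASE}/{created['id']}")                # delete the record
```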
Mask-Attention-Free Transformer for 3D Instance Segmentation
Recently, transformer-based methods have dominated 3D instance segmentation,
where mask attention is commonly involved. Specifically, object queries are
guided by the initial instance masks in the first cross-attention, and then
iteratively refine themselves in a similar manner. However, we observe that the
mask-attention pipeline usually leads to slow convergence due to low-recall
initial instance masks. Therefore, we abandon the mask attention design and
resort to an auxiliary center regression task instead. Through center
regression, we effectively overcome the low-recall issue and perform
cross-attention by imposing a positional prior. To reach this goal, we develop a
series of position-aware designs. First, we learn a spatial distribution of 3D
locations as the initial position queries. They spread over the 3D space
densely, and thus can easily capture the objects in a scene with a high recall.
Moreover, we present relative position encoding for the cross-attention and
iterative refinement for more accurate position queries. Experiments show that
our approach converges 4x faster than existing work, sets a new state of the
art on the ScanNetv2 3D instance segmentation benchmark, and demonstrates
superior performance across various datasets. Code and models are available at
https://github.com/dvlab-research/Mask-Attention-Free-Transformer.
Comment: Accepted to ICCV 2023.
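As a concrete illustration of the position-aware design, below is a minimal PyTorch sketch, assuming illustrative names, shapes, and a single attention layer rather than the authors' implementation: queries carry explicit 3D positions, cross-attention is biased by query-to-point relative positions instead of being restricted by instance masks, and a small head regresses center offsets for iterative refinement.

```python
# Minimal sketch (not the paper's code): cross-attention steered by a
# positional prior from query-to-point relative positions, plus a center
# regression head that refines query positions, replacing mask attention.
import torch
import torch.nn as nn

class PositionAwareCrossAttention(nn.Module):
    def __init__(self, dim: int, num_queries: int = 256):
        super().__init__()
        # Learned spatial distribution of initial 3D position queries, spread
        # over normalized scene space so objects are captured with high recall.
        self.query_pos = nn.Parameter(torch.rand(num_queries, 3))
        self.query_feat = nn.Parameter(torch.zeros(num_queries, dim))
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        # Maps a relative offset (3,) to a scalar attention bias.
        self.rel_bias = nn.Sequential(nn.Linear(3, dim), nn.ReLU(), nn.Linear(dim, 1))
        # Auxiliary center regression: predicts an offset for each query.
        self.center_head = nn.Linear(dim, 3)

    def forward(self, point_feats: torch.Tensor, point_xyz: torch.Tensor):
        # point_feats: (N, dim) per-point features; point_xyz: (N, 3) coordinates.
        q = self.q_proj(self.query_feat)                 # (Q, dim)
        k = self.k_proj(point_feats)                     # (N, dim)
        v = self.v_proj(point_feats)                     # (N, dim)
        logits = q @ k.t() / k.shape[-1] ** 0.5          # (Q, N) content term
        rel = self.query_pos[:, None, :] - point_xyz[None, :, :]  # (Q, N, 3)
        logits = logits + self.rel_bias(rel).squeeze(-1)          # positional prior
        out = logits.softmax(dim=-1) @ v                 # (Q, dim) refined queries
        new_pos = self.query_pos + self.center_head(out) # iterative position refinement
        return out, new_pos
```

The point of the sketch is that attention is guided by where queries sit in 3D space, so no low-recall initial instance masks are needed.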
Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors
Reconstructing 3D objects from a single image guided by pretrained diffusion
models has demonstrated promising outcomes. However, because these methods rely
on a case-agnostic, rigid strategy, their generalization to arbitrary cases and
the 3D consistency of their reconstructions remain poor. In this work, we
propose Consistent123, a case-aware two-stage method for highly consistent 3D
asset reconstruction from one image with both 2D and 3D diffusion priors. In
the first stage, Consistent123 utilizes only 3D structural priors for
sufficient geometry exploitation, with a CLIP-based case-aware adaptive
detection mechanism embedded within this process. In the second stage, 2D
texture priors are introduced and progressively take on a dominant guiding
role, delicately sculpting the details of the 3D model. Consistent123 aligns
more closely with the evolving trends in guidance requirements, adaptively
providing adequate 3D geometric initialization and suitable 2D texture
refinement for different objects. Consistent123 obtains highly 3D-consistent
reconstructions and exhibits strong generalization across various
objects. Qualitative and quantitative experiments show that our method
significantly outperforms state-of-the-art image-to-3D methods. See
https://Consistent123.github.io for a more comprehensive exploration of our
generated 3D assets.
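To make the two-stage prior scheduling concrete, here is a minimal sketch, assuming a simple linear handover schedule; the stage boundary and weights are illustrative, not taken from the paper. Stage 1 uses only the 3D structural prior for geometry, after which the 2D texture prior progressively takes over.

```python
# Illustrative sketch (not the released code): two-stage guidance where the
# 3D structural prior dominates early and the 2D texture prior progressively
# takes on the dominant guiding role. Schedule and names are assumptions.
def guidance_weights(step: int, total_steps: int, stage1_frac: float = 0.3):
    """Return (w_3d, w_2d) mixing weights for the current optimization step."""
    boundary = int(total_steps * stage1_frac)
    if step < boundary:
        return 1.0, 0.0            # stage 1: 3D structural prior only (geometry)
    t = (step - boundary) / max(total_steps - boundary, 1)
    return 1.0 - t, t              # stage 2: 2D texture prior gradually dominates

# Usage: blend the two score-distillation gradients at each step, e.g.
# grad = w3 * grad_3d_prior + w2 * grad_2d_prior
for step in range(0, 1000, 250):
    w3, w2 = guidance_weights(step, 1000)
    print(step, round(w3, 2), round(w2, 2))
```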
DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models
We present DreamAvatar, a text-and-shape guided framework for generating
high-quality 3D human avatars with controllable poses. While encouraging
results have been reported by recent methods on text-guided 3D common object
generation, generating high-quality human avatars remains an open challenge due
to the complexity of the human body's shape, pose, and appearance. We propose
DreamAvatar to tackle this challenge, which utilizes a trainable NeRF for
predicting density and color for 3D points and pretrained text-to-image
diffusion models for providing 2D self-supervision. Specifically, we leverage
the SMPL model to provide shape and pose guidance for the generation. We
introduce a dual-observation-space design that involves the joint optimization
of a canonical space and a posed space that are related by a learnable
deformation field. This facilitates the generation of more complete textures
and geometry faithful to the target pose. We also jointly optimize the losses
computed from the full body and from the zoomed-in 3D head to alleviate the
common multi-face "Janus" problem and improve facial details in the generated
avatars. Extensive evaluations demonstrate that DreamAvatar significantly
outperforms existing methods, establishing a new state-of-the-art for
text-and-shape guided 3D human avatar generation.
Comment: Project page: https://yukangcao.github.io/DreamAvatar
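A minimal sketch of the joint objective described above, assuming hypothetical `render` and score-distillation (`sds`) callables rather than DreamAvatar's actual API: losses are accumulated over the canonical space, the posed space (related by the learnable deformation field), and a zoomed-in head view.

```python
# Hedged sketch (names are illustrative, not DreamAvatar's API) of the
# dual-observation-space objective: score-distillation losses from the
# canonical space, the posed space, and a zoomed-in head render are summed.
def dream_avatar_loss(sds, render, w_head: float = 0.5):
    """sds(image, prompt) -> scalar; render(space, zoom=None) -> image."""
    loss = sds(render("canonical"), "full-body prompt")   # canonical space
    loss += sds(render("posed"), "full-body prompt")      # posed space, via deformation field
    # Zoomed-in 3D head loss mitigates the multi-face "Janus" problem.
    loss += w_head * sds(render("posed", zoom="head"), "head prompt")
    return loss
```

Optimizing both spaces jointly is what encourages complete textures and geometry faithful to the target pose.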
GlyphControl: Glyph Conditional Control for Visual Text Generation
Recently, there has been a growing interest in developing diffusion-based
text-to-image generative models capable of generating coherent and well-formed
visual text. In this paper, we propose a novel and efficient approach called
GlyphControl to address this task. Unlike existing methods that rely on
character-aware text encoders like ByT5 and require retraining of text-to-image
models, our approach leverages additional glyph conditional information to
enhance the performance of the off-the-shelf Stable-Diffusion model in
generating accurate visual text. By incorporating glyph instructions, users can
customize the content, location, and size of the generated text according to
their specific requirements. To facilitate further research in visual text
generation, we construct a training benchmark dataset called LAION-Glyph. We
evaluate the effectiveness of our approach by measuring OCR-based metrics and
CLIP scores of the generated visual text. Our empirical evaluations demonstrate
that GlyphControl outperforms the recent DeepFloyd IF approach in terms of OCR
accuracy and CLIP scores, highlighting the efficacy of our method.
Comment: Technical report. The code will be released at
https://github.com/AIGText/GlyphControl-releas
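To illustrate what a glyph instruction could look like in practice, here is a hedged sketch, assuming a hypothetical helper that rasterizes the requested text at a given location and size into a conditioning image for a ControlNet-style branch; the function, endpoint into the model, and font choice are illustrative, not GlyphControl's released code.

```python
# Illustrative sketch (not GlyphControl's code): render a glyph instruction
# (content, location, size) into a conditioning image that a ControlNet-style
# branch of an off-the-shelf Stable Diffusion model could consume.
from PIL import Image, ImageDraw, ImageFont

def render_glyph_condition(text: str, box: tuple, size: int = 512,
                           font_path: str = "DejaVuSans.ttf") -> Image.Image:
    """box = (x, y, font_size) in pixels; returns a white canvas with black glyphs.
    The font path is an assumption; any available TrueType font works."""
    canvas = Image.new("RGB", (size, size), "white")
    draw = ImageDraw.Draw(canvas)
    x, y, font_size = box
    font = ImageFont.truetype(font_path, font_size)
    draw.text((x, y), text, fill="black", font=font)
    return canvas

# Usage: the canvas is passed as the glyph condition alongside the text prompt,
# letting users control the content, location, and size of the generated text.
cond = render_glyph_condition("OPEN", (120, 200, 64))
```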
HeadSculpt: Crafting 3D Head Avatars with Text
Recently, text-guided 3D generative methods have made remarkable advancements
in producing high-quality textures and geometry, capitalizing on the
proliferation of large vision-language and image diffusion models. However,
existing methods still struggle to create high-fidelity 3D head avatars in two
aspects: (1) They rely mostly on a pre-trained text-to-image diffusion model
whilst missing the necessary 3D awareness and head priors. This makes them
prone to inconsistency and geometric distortions in the generated avatars. (2)
They fall short in fine-grained editing. This is primarily due to the inherited
limitations from the pre-trained 2D image diffusion models, which become more
pronounced when it comes to 3D head avatars. In this work, we address these
challenges by introducing a versatile coarse-to-fine pipeline dubbed HeadSculpt
for crafting (i.e., generating and editing) 3D head avatars from textual
prompts. Specifically, we first equip the diffusion model with 3D awareness by
leveraging landmark-based control and a learned textual embedding representing
the back view appearance of heads, enabling 3D-consistent head avatar
generations. We further propose a novel identity-aware editing score
distillation strategy to optimize a textured mesh with a high-resolution
differentiable rendering technique. This enables identity preservation while
following the editing instruction. We showcase HeadSculpt's superior fidelity
and editing capabilities through comprehensive experiments and comparisons with
existing methods.
Comment: Webpage: https://brandonhan.uk/HeadSculpt
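A minimal sketch of the identity-aware editing idea, assuming a generic score-distillation gradient function; the blending scheme, weight, and names are illustrative, not HeadSculpt's exact formulation.

```python
# Hedged sketch: blend an editing-instruction gradient with an
# identity-preserving gradient during score distillation on the textured mesh.
def identity_aware_sds_grad(score, x, edit_prompt, identity_prompt, lam: float = 0.7):
    """score(x, prompt) -> score-distillation gradient for render x."""
    g_edit = score(x, edit_prompt)       # follow the editing instruction
    g_id = score(x, identity_prompt)     # anchor to the original identity
    return lam * g_edit + (1.0 - lam) * g_id  # lam trades edit strength vs. identity
```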
Rank-DETR for High Quality Object Detection
Modern detection transformers (DETRs) use a set of object queries to predict
a list of bounding boxes, sort them by their classification confidence scores,
and select the top-ranked predictions as the final detection results for the
given input image. A highly performant object detector requires accurate
ranking for the bounding box predictions. For DETR-based detectors, the
top-ranked bounding boxes suffer from less accurate localization quality due to
the misalignment between classification scores and localization accuracy, thus
impeding the construction of high-quality detectors. In this work, we introduce
a simple and highly performant DETR-based object detector by proposing a series
of rank-oriented designs, collectively called Rank-DETR. Our key contributions
include: (i) a rank-oriented architecture design that promotes positive
predictions and suppresses the negative ones to ensure lower false positive
rates, as well as (ii) a rank-oriented loss function and matching cost design
that prioritizes predictions with more accurate localization during
ranking to boost the AP under high IoU thresholds. We apply our method to
improve the recent SOTA methods (e.g., H-DETR and DINO-DETR) and report strong
COCO object detection results when using different backbones such as
ResNet, Swin-T, and Swin-L, demonstrating the effectiveness of our
approach. Code is available at https://github.com/LeapLabTHU/Rank-DETR.
Comment: NeurIPS 202
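To illustrate the rank-oriented loss idea, here is a minimal sketch, assuming a generic IoU-aware classification target rather than Rank-DETR's exact formulation: matched predictions are supervised toward their IoU, so classification scores track localization quality and well-localized boxes rank higher.

```python
# Minimal sketch in the spirit described above (a generic IoU-aware target,
# not Rank-DETR's exact loss): using IoU as the soft classification target
# aligns score ranking with localization accuracy, boosting AP at high IoU.
import torch
import torch.nn.functional as F

def rank_oriented_loss(cls_logits: torch.Tensor, ious: torch.Tensor) -> torch.Tensor:
    """cls_logits: (N,) logits of matched predictions; ious: (N,) box IoUs in [0, 1]."""
    return F.binary_cross_entropy_with_logits(cls_logits, ious)

# Example: a well-localized box (IoU 0.9) should carry a high score.
logits = torch.tensor([2.0, 0.5, -1.0])
ious = torch.tensor([0.9, 0.6, 0.2])
print(rank_oriented_loss(logits, ious))
```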
Time2Stop: Adaptive and Explainable Human-AI Loop for Smartphone Overuse Intervention
Despite a rich history of investigating smartphone overuse intervention
techniques, AI-based just-in-time adaptive intervention (JITAI) methods for
overuse reduction are lacking. We develop Time2Stop, an intelligent, adaptive,
and explainable JITAI system that leverages machine learning to identify
optimal intervention timings, introduces interventions with transparent AI
explanations, and collects user feedback to establish a human-AI loop and adapt
the intervention model over time. We conducted an 8-week field experiment
(N=71) to evaluate the effectiveness of both the adaptation and explanation
aspects of Time2Stop. Our results indicate that our adaptive models
significantly outperform the baseline methods on intervention accuracy (a relative
improvement of more than 32.8%) and receptivity (more than 8.0%). In addition,
incorporating explanations further enhances effectiveness by 53.8% and 11.4% on
accuracy and receptivity, respectively. Moreover, Time2Stop significantly reduces
overuse, decreasing app visit frequency by 7.0% to 8.9%. Our subjective data also
echoed these quantitative measures. Participants preferred the adaptive
interventions and rated the system highly on intervention time accuracy,
effectiveness, and level of trust. We envision that our work can inspire future
research on JITAI systems that use a human-AI loop to evolve with their users.
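As a rough illustration of the human-AI loop, below is a hedged sketch assuming a toy scoring rule and threshold adaptation; the abstract does not specify Time2Stop's models or features, so every name here is hypothetical.

```python
# Hypothetical sketch of a JITAI human-AI loop: score the current moment,
# intervene with a transparent explanation when warranted, and adapt the
# model from user feedback over time. All rules here are toy placeholders.
from dataclasses import dataclass, field

@dataclass
class JITAIModel:
    threshold: float = 0.7
    feedback: list = field(default_factory=list)

    def score(self, features: dict) -> float:
        # Placeholder heuristic; a real system would use a learned classifier.
        return min(1.0, 0.02 * features.get("session_minutes", 0))

    def maybe_intervene(self, features: dict):
        s = self.score(features)
        if s >= self.threshold:
            # Transparent explanation accompanies the intervention.
            return f"Intervening (score={s:.2f}): long session detected."
        return None

    def adapt(self, accepted: bool):
        # Human-AI loop: lower the bar if interventions are welcomed, raise it if not.
        self.feedback.append(accepted)
        self.threshold += -0.02 if accepted else 0.02

model = JITAIModel()
message = model.maybe_intervene({"session_minutes": 45})
if message:
    model.adapt(accepted=True)   # user feedback updates future timing
```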