Global regularity to the Navier-Stokes equations for a class of large initial data
In [5], Chemin, Gallagher and Paicu proved the global regularity of solutions to the classical Navier-Stokes equations with a class of large initial data on T^2 × R. This data varies slowly in the vertical variable and has a norm that blows up as the small parameter (denoted by ε) tends to zero. However, to the best of our knowledge, the result remains open on the whole space R^3. In this paper, we consider the generalized Navier-Stokes equations on R^n (n ≥ 3):
∂_t u + u · ∇u + D^s u + ∇P = 0, div u = 0.
For a suitable exponent s, we prove that the Cauchy problem with initial data of the form u_0^ε(x) = (v_0^h(x_ε), ε^{-1} v_0^n(x_ε))^T, where x_ε = (x_h, ε x_n)^T, is globally well-posed for all small ε > 0, provided that the initial velocity profile v_0 is analytic in x_n and a certain norm of v_0 is sufficiently small but independent of ε. In particular, our result holds for the n-dimensional classical Navier-Stokes equations with n ≥ 4 and for the fractional Navier-Stokes equations with 1 ≤ s < 2 in 3D.
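In display form, the system and the slowly varying initial data read as follows (a transcription of the formulas above into LaTeX; here D^s denotes the fractional dissipation operator, typically Λ^s = (−Δ)^{s/2}):

```latex
% Generalized Navier-Stokes system with fractional dissipation D^s.
\[
\partial_t u + u \cdot \nabla u + D^{s} u + \nabla P = 0,
\qquad \operatorname{div} u = 0 .
\]
% Slowly varying (in x_n) initial data, whose norm blows up as \epsilon \to 0.
\[
u_0^{\epsilon}(x) = \bigl( v_0^{h}(x_\epsilon),\; \epsilon^{-1} v_0^{n}(x_\epsilon) \bigr)^{T},
\qquad x_\epsilon = (x_h,\; \epsilon x_n)^{T} .
\]
```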
Mobile Platform for livestock monitoring and inspection
Livestock keepers acquire and manage information (e.g., identification numbers and images) to identify and track their animals, using systems capable of extracting such information. Examples are Radio Frequency Identification (RFID) systems, which collect and transmit livestock information to host devices. Sophisticated RFID readers are expensive but more functional than cheap ones, whose use is mostly limited to reading and transmitting tag IDs. Cross-platform mobile applications allow livestock monitoring regardless of the platform on which the mobile device runs. Farmers' secure access to records via web services is not tied to a single device: they can log in on any mobile device with the application installed. In this work, a mobile platform consisting of a cross-platform mobile application, a web service, and a database is developed to cost-effectively manage and exploit livestock records acquired with a cheap RFID reader. The mobile application was developed using the Xamarin.Forms framework; the programming language and development environment used are C# and Visual Studio, respectively. Livestock records were acquired, posted, updated, deleted, and retrieved from the database via the web service. Additional advantages offered by the implemented solution include exporting animals' records via email and SMS, viewing an animal's record by scanning its tag or the QR code on its passport, and a login system to sign users in and out of the application. Development of RFID readers with sensors that acquire health-related parameters for health monitoring is recommended.
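To make the record lifecycle concrete, here is a minimal sketch of the CRUD calls the mobile application performs against the web service. The endpoint, field names, and the returned `id` are hypothetical, and the actual application was written in C# with Xamarin.Forms; this Python version only illustrates the protocol.

```python
# Hypothetical sketch of the livestock-record CRUD flow over the web service.
# The base URL, fields, and returned "id" are assumptions for illustration.
import requests

BASE = "https://example-farm-api.test/api/livestock"  # hypothetical endpoint

record = {"tag_id": "RFID-0001", "breed": "Boran", "image": None}
created = requests.post(BASE, json=record).json()         # create (post) a record
fetched = requests.get(f"{BASE}/{created['id']}").json()  # retrieve it by id
requests.put(f"{BASE}/{created['id']}",
             json={**fetched, "breed": "Nguni"})          # update the record
requests.delete(f"{BASE}/{created['id']}")                # delete the record
```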
Mask-Attention-Free Transformer for 3D Instance Segmentation
Recently, transformer-based methods have dominated 3D instance segmentation,
where mask attention is commonly involved. Specifically, object queries are
guided by the initial instance masks in the first cross-attention, and then
iteratively refine themselves in a similar manner. However, we observe that the
mask-attention pipeline usually leads to slow convergence due to low-recall
initial instance masks. Therefore, we abandon the mask attention design and
resort to an auxiliary center regression task instead. Through center
regression, we effectively overcome the low-recall issue and perform
cross-attention by imposing a positional prior. To reach this goal, we develop a
series of position-aware designs. First, we learn a spatial distribution of 3D
locations as the initial position queries. They spread over the 3D space
densely, and thus can easily capture the objects in a scene with a high recall.
Moreover, we present relative position encoding for the cross-attention and
iterative refinement for more accurate position queries. Experiments show that
our approach converges 4x faster than existing work, sets a new state of the
art on the ScanNetv2 3D instance segmentation benchmark, and demonstrates
superior performance across various datasets. Code and models are available at
https://github.com/dvlab-research/Mask-Attention-Free-Transformer.
Comment: Accepted to ICCV 2023.
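As a concrete illustration of the position-aware design, below is a minimal PyTorch sketch, assuming illustrative names, shapes, and a single attention layer rather than the authors' implementation: queries carry explicit 3D positions, cross-attention is biased by query-to-point relative positions instead of being restricted by instance masks, and a small head regresses center offsets for iterative refinement.

```python
# Minimal sketch (not the paper's code): cross-attention steered by a
# positional prior from query-to-point relative positions, plus a center
# regression head that refines query positions, replacing mask attention.
import torch
import torch.nn as nn

class PositionAwareCrossAttention(nn.Module):
    def __init__(self, dim: int, num_queries: int = 256):
        super().__init__()
        # Learned spatial distribution of initial 3D position queries, spread
        # over normalized scene space so objects are captured with high recall.
        self.query_pos = nn.Parameter(torch.rand(num_queries, 3))
        self.query_feat = nn.Parameter(torch.zeros(num_queries, dim))
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        # Maps a relative offset (3,) to a scalar attention bias.
        self.rel_bias = nn.Sequential(nn.Linear(3, dim), nn.ReLU(), nn.Linear(dim, 1))
        # Auxiliary center regression: predicts an offset for each query.
        self.center_head = nn.Linear(dim, 3)

    def forward(self, point_feats: torch.Tensor, point_xyz: torch.Tensor):
        # point_feats: (N, dim) per-point features; point_xyz: (N, 3) coordinates.
        q = self.q_proj(self.query_feat)                 # (Q, dim)
        k = self.k_proj(point_feats)                     # (N, dim)
        v = self.v_proj(point_feats)                     # (N, dim)
        logits = q @ k.t() / k.shape[-1] ** 0.5          # (Q, N) content term
        rel = self.query_pos[:, None, :] - point_xyz[None, :, :]  # (Q, N, 3)
        logits = logits + self.rel_bias(rel).squeeze(-1)          # positional prior
        out = logits.softmax(dim=-1) @ v                 # (Q, dim) refined queries
        new_pos = self.query_pos + self.center_head(out) # iterative position refinement
        return out, new_pos
```

The point of the sketch is that attention is guided by where queries sit in 3D space, so no low-recall initial instance masks are needed.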
Consistent123: One Image to Highly Consistent 3D Asset Using Case-Aware Diffusion Priors
Reconstructing 3D objects from a single image guided by pretrained diffusion
models has demonstrated promising outcomes. However, because these methods rely
on a case-agnostic, rigid strategy, their generalization to arbitrary cases and
the 3D consistency of their reconstructions remain poor. In this work, we
propose Consistent123, a case-aware two-stage method for highly consistent 3D
asset reconstruction from one image with both 2D and 3D diffusion priors. In
the first stage, Consistent123 utilizes only 3D structural priors for
sufficient geometry exploitation, with a CLIP-based case-aware adaptive
detection mechanism embedded within this process. In the second stage, 2D
texture priors are introduced and progressively take on a dominant guiding
role, delicately sculpting the details of the 3D model. Consistent123 aligns
more closely with the evolving trends in guidance requirements, adaptively
providing adequate 3D geometric initialization and suitable 2D texture
refinement for different objects. Consistent123 obtains highly 3D-consistent
reconstructions and exhibits strong generalization across various
objects. Qualitative and quantitative experiments show that our method
significantly outperforms state-of-the-art image-to-3D methods. See
https://Consistent123.github.io for a more comprehensive exploration of our
generated 3D assets.
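To make the two-stage prior scheduling concrete, here is a minimal sketch, assuming a simple linear handover schedule; the stage boundary and weights are illustrative, not taken from the paper. Stage 1 uses only the 3D structural prior for geometry, after which the 2D texture prior progressively takes over.

```python
# Illustrative sketch (not the released code): two-stage guidance where the
# 3D structural prior dominates early and the 2D texture prior progressively
# takes on the dominant guiding role. Schedule and names are assumptions.
def guidance_weights(step: int, total_steps: int, stage1_frac: float = 0.3):
    """Return (w_3d, w_2d) mixing weights for the current optimization step."""
    boundary = int(total_steps * stage1_frac)
    if step < boundary:
        return 1.0, 0.0            # stage 1: 3D structural prior only (geometry)
    t = (step - boundary) / max(total_steps - boundary, 1)
    return 1.0 - t, t              # stage 2: 2D texture prior gradually dominates

# Usage: blend the two score-distillation gradients at each step, e.g.
# grad = w3 * grad_3d_prior + w2 * grad_2d_prior
for step in range(0, 1000, 250):
    w3, w2 = guidance_weights(step, 1000)
    print(step, round(w3, 2), round(w2, 2))
```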
DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models
We present DreamAvatar, a text-and-shape guided framework for generating
high-quality 3D human avatars with controllable poses. While encouraging
results have been reported by recent methods on text-guided 3D common object
generation, generating high-quality human avatars remains an open challenge due
to the complexity of the human body's shape, pose, and appearance. We propose
DreamAvatar to tackle this challenge, which utilizes a trainable NeRF for
predicting density and color for 3D points and pretrained text-to-image
diffusion models for providing 2D self-supervision. Specifically, we leverage
the SMPL model to provide shape and pose guidance for the generation. We
introduce a dual-observation-space design that involves the joint optimization
of a canonical space and a posed space that are related by a learnable
deformation field. This facilitates the generation of more complete textures
and geometry faithful to the target pose. We also jointly optimize the losses
computed from the full body and from the zoomed-in 3D head to alleviate the
common multi-face "Janus" problem and improve facial details in the generated
avatars. Extensive evaluations demonstrate that DreamAvatar significantly
outperforms existing methods, establishing a new state-of-the-art for
text-and-shape guided 3D human avatar generation.
Comment: Project page: https://yukangcao.github.io/DreamAvatar
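A minimal sketch of the joint objective described above, assuming hypothetical `render` and score-distillation (`sds`) callables rather than DreamAvatar's actual API: losses are accumulated over the canonical space, the posed space (related by the learnable deformation field), and a zoomed-in head view.

```python
# Hedged sketch (names are illustrative, not DreamAvatar's API) of the
# dual-observation-space objective: score-distillation losses from the
# canonical space, the posed space, and a zoomed-in head render are summed.
def dream_avatar_loss(sds, render, w_head: float = 0.5):
    """sds(image, prompt) -> scalar; render(space, zoom=None) -> image."""
    loss = sds(render("canonical"), "full-body prompt")   # canonical space
    loss += sds(render("posed"), "full-body prompt")      # posed space, via deformation field
    # Zoomed-in 3D head loss mitigates the multi-face "Janus" problem.
    loss += w_head * sds(render("posed", zoom="head"), "head prompt")
    return loss
```

Optimizing both spaces jointly is what encourages complete textures and geometry faithful to the target pose.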
GlyphControl: Glyph Conditional Control for Visual Text Generation
Recently, there has been a growing interest in developing diffusion-based
text-to-image generative models capable of generating coherent and well-formed
visual text. In this paper, we propose a novel and efficient approach called
GlyphControl to address this task. Unlike existing methods that rely on
character-aware text encoders like ByT5 and require retraining of text-to-image
models, our approach leverages additional glyph conditional information to
enhance the performance of the off-the-shelf Stable-Diffusion model in
generating accurate visual text. By incorporating glyph instructions, users can
customize the content, location, and size of the generated text according to
their specific requirements. To facilitate further research in visual text
generation, we construct a training benchmark dataset called LAION-Glyph. We
evaluate the effectiveness of our approach by measuring OCR-based metrics and
CLIP scores of the generated visual text. Our empirical evaluations demonstrate
that GlyphControl outperforms the recent DeepFloyd IF approach in terms of OCR
accuracy and CLIP scores, highlighting the efficacy of our method.
Comment: Technical report. The code will be released at
https://github.com/AIGText/GlyphControl-releas
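To illustrate what a glyph instruction could look like in practice, here is a hedged sketch, assuming a hypothetical helper that rasterizes the requested text at a given location and size into a conditioning image for a ControlNet-style branch; the function, endpoint into the model, and font choice are illustrative, not GlyphControl's released code.

```python
# Illustrative sketch (not GlyphControl's code): render a glyph instruction
# (content, location, size) into a conditioning image that a ControlNet-style
# branch of an off-the-shelf Stable Diffusion model could consume.
from PIL import Image, ImageDraw, ImageFont

def render_glyph_condition(text: str, box: tuple, size: int = 512,
                           font_path: str = "DejaVuSans.ttf") -> Image.Image:
    """box = (x, y, font_size) in pixels; returns a white canvas with black glyphs.
    The font path is an assumption; any available TrueType font works."""
    canvas = Image.new("RGB", (size, size), "white")
    draw = ImageDraw.Draw(canvas)
    x, y, font_size = box
    font = ImageFont.truetype(font_path, font_size)
    draw.text((x, y), text, fill="black", font=font)
    return canvas

# Usage: the canvas is passed as the glyph condition alongside the text prompt,
# letting users control the content, location, and size of the generated text.
cond = render_glyph_condition("OPEN", (120, 200, 64))
```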
HeadSculpt: Crafting 3D Head Avatars with Text
Recently, text-guided 3D generative methods have made remarkable advancements
in producing high-quality textures and geometry, capitalizing on the
proliferation of large vision-language and image diffusion models. However,
existing methods still struggle to create high-fidelity 3D head avatars in two
aspects: (1) They rely mostly on a pre-trained text-to-image diffusion model
whilst missing the necessary 3D awareness and head priors. This makes them
prone to inconsistency and geometric distortions in the generated avatars. (2)
They fall short in fine-grained editing. This is primarily due to the inherited
limitations from the pre-trained 2D image diffusion models, which become more
pronounced when it comes to 3D head avatars. In this work, we address these
challenges by introducing a versatile coarse-to-fine pipeline dubbed HeadSculpt
for crafting (i.e., generating and editing) 3D head avatars from textual
prompts. Specifically, we first equip the diffusion model with 3D awareness by
leveraging landmark-based control and a learned textual embedding representing
the back view appearance of heads, enabling 3D-consistent head avatar
generations. We further propose a novel identity-aware editing score
distillation strategy to optimize a textured mesh with a high-resolution
differentiable rendering technique. This enables identity preservation while
following the editing instruction. We showcase HeadSculpt's superior fidelity
and editing capabilities through comprehensive experiments and comparisons with
existing methods.
Comment: Webpage: https://brandonhan.uk/HeadSculpt
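A minimal sketch of the identity-aware editing idea, assuming a generic score-distillation gradient function; the blending scheme, weight, and names are illustrative, not HeadSculpt's exact formulation.

```python
# Hedged sketch: blend an editing-instruction gradient with an
# identity-preserving gradient during score distillation on the textured mesh.
def identity_aware_sds_grad(score, x, edit_prompt, identity_prompt, lam: float = 0.7):
    """score(x, prompt) -> score-distillation gradient for render x."""
    g_edit = score(x, edit_prompt)       # follow the editing instruction
    g_id = score(x, identity_prompt)     # anchor to the original identity
    return lam * g_edit + (1.0 - lam) * g_id  # lam trades edit strength vs. identity
```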
Rank-DETR for High Quality Object Detection
Modern detection transformers (DETRs) use a set of object queries to predict
a list of bounding boxes, sort them by their classification confidence scores,
and select the top-ranked predictions as the final detection results for the
given input image. A highly performant object detector requires accurate
ranking for the bounding box predictions. For DETR-based detectors, the
top-ranked bounding boxes suffer from less accurate localization quality due to
the misalignment between classification scores and localization accuracy, thus
impeding the construction of high-quality detectors. In this work, we introduce
a simple and highly performant DETR-based object detector by proposing a series
of rank-oriented designs, collectively called Rank-DETR. Our key contributions
include: (i) a rank-oriented architecture design that promotes positive
predictions and suppresses the negative ones to ensure lower false positive
rates, as well as (ii) a rank-oriented loss function and matching cost design
that prioritizes predictions with more accurate localization during
ranking to boost the AP under high IoU thresholds. We apply our method to
improve the recent SOTA methods (e.g., H-DETR and DINO-DETR) and report strong
COCO object detection results when using different backbones such as
ResNet, Swin-T, and Swin-L, demonstrating the effectiveness of our
approach. Code is available at https://github.com/LeapLabTHU/Rank-DETR.
Comment: NeurIPS 202
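To illustrate the rank-oriented loss idea, here is a minimal sketch, assuming a generic IoU-aware classification target rather than Rank-DETR's exact formulation: matched predictions are supervised toward their IoU, so classification scores track localization quality and well-localized boxes rank higher.

```python
# Minimal sketch in the spirit described above (a generic IoU-aware target,
# not Rank-DETR's exact loss): using IoU as the soft classification target
# aligns score ranking with localization accuracy, boosting AP at high IoU.
import torch
import torch.nn.functional as F

def rank_oriented_loss(cls_logits: torch.Tensor, ious: torch.Tensor) -> torch.Tensor:
    """cls_logits: (N,) logits of matched predictions; ious: (N,) box IoUs in [0, 1]."""
    return F.binary_cross_entropy_with_logits(cls_logits, ious)

# Example: a well-localized box (IoU 0.9) should carry a high score.
logits = torch.tensor([2.0, 0.5, -1.0])
ious = torch.tensor([0.9, 0.6, 0.2])
print(rank_oriented_loss(logits, ious))
```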
Time2Stop: Adaptive and Explainable Human-AI Loop for Smartphone Overuse Intervention
Despite a rich history of investigating smartphone overuse intervention
techniques, AI-based just-in-time adaptive intervention (JITAI) methods for
overuse reduction are lacking. We develop Time2Stop, an intelligent, adaptive,
and explainable JITAI system that leverages machine learning to identify
optimal intervention timings, introduces interventions with transparent AI
explanations, and collects user feedback to establish a human-AI loop and adapt
the intervention model over time. We conducted an 8-week field experiment
(N=71) to evaluate the effectiveness of both the adaptation and explanation
aspects of Time2Stop. Our results indicate that our adaptive models
significantly outperform the baseline methods on intervention accuracy (a relative
improvement of more than 32.8%) and receptivity (more than 8.0%). In addition,
incorporating explanations further enhances effectiveness by 53.8% and 11.4% on
accuracy and receptivity, respectively. Moreover, Time2Stop significantly reduces
overuse, decreasing app visit frequency by 7.0% to 8.9%. Our subjective data also
echoed these quantitative measures. Participants preferred the adaptive
interventions and rated the system highly on intervention time accuracy,
effectiveness, and level of trust. We envision that our work can inspire future
research on JITAI systems that use a human-AI loop to evolve with their users.
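As a rough illustration of the human-AI loop, below is a hedged sketch assuming a toy scoring rule and threshold adaptation; the abstract does not specify Time2Stop's models or features, so every name here is hypothetical.

```python
# Hypothetical sketch of a JITAI human-AI loop: score the current moment,
# intervene with a transparent explanation when warranted, and adapt the
# model from user feedback over time. All rules here are toy placeholders.
from dataclasses import dataclass, field

@dataclass
class JITAIModel:
    threshold: float = 0.7
    feedback: list = field(default_factory=list)

    def score(self, features: dict) -> float:
        # Placeholder heuristic; a real system would use a learned classifier.
        return min(1.0, 0.02 * features.get("session_minutes", 0))

    def maybe_intervene(self, features: dict):
        s = self.score(features)
        if s >= self.threshold:
            # Transparent explanation accompanies the intervention.
            return f"Intervening (score={s:.2f}): long session detected."
        return None

    def adapt(self, accepted: bool):
        # Human-AI loop: lower the bar if interventions are welcomed, raise it if not.
        self.feedback.append(accepted)
        self.threshold += -0.02 if accepted else 0.02

model = JITAIModel()
message = model.maybe_intervene({"session_minutes": 45})
if message:
    model.adapt(accepted=True)   # user feedback updates future timing
```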