79 research outputs found
Global regularity to the Navier-Stokes equations for a class of large initial data
In [5], Chemin, Gallagher and Paicu proved the global regularity of solutions to the classical Navier-Stokes equations for a class of large initial data on $\mathbb{T}^2 \times \mathbb{R}$. These data vary slowly in the vertical variable and have a norm that blows up as the small parameter (denoted by $\epsilon$ in the paper) tends to zero. However, to the best of our knowledge, the result is still unclear for the whole space $\mathbb{R}^3$. In this paper, we consider the generalized Navier-Stokes equations on $\mathbb{R}^n$ ($n \ge 3$):
$\partial_t u + u \cdot \nabla u + D^s u + \nabla P = 0, \quad \operatorname{div} u = 0.$
For a suitable number $s$, we prove that the Cauchy problem with initial data of the form $u_0^\epsilon(x) = (v_0^h(x_\epsilon), \epsilon^{-1} v_0^n(x_\epsilon))^T$, $x_\epsilon = (x_h, \epsilon x_n)^T$, is globally well-posed for all small $\epsilon > 0$, provided that the initial velocity profile $v_0$ is analytic in $x_n$ and a certain norm of $v_0$ is sufficiently small but independent of $\epsilon$. In particular, our result holds for the $n$-dimensional classical Navier-Stokes equations with $n \ge 4$ and for the fractional Navier-Stokes equations with $1 \le s < 2$ in 3D.
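The anisotropic scaling of the initial data can be checked against the constraint div u = 0 by a short computation. The following is a sketch, not taken from the paper, under the assumption (not stated in the abstract) that the profile $v_0 = (v_0^h, v_0^n)$ is itself divergence free; the component index $j$ and the notation $\operatorname{div}_h$ for the horizontal divergence are introduced here only for illustration.

```latex
\begin{aligned}
\operatorname{div} u_0^\epsilon(x)
  &= \sum_{j=1}^{n-1} \partial_{x_j}\bigl[v_0^{h,j}(x_h, \epsilon x_n)\bigr]
   + \partial_{x_n}\bigl[\epsilon^{-1} v_0^{n}(x_h, \epsilon x_n)\bigr] \\
  &= (\operatorname{div}_h v_0^h)(x_\epsilon)
   + \epsilon^{-1} \cdot \epsilon \, (\partial_n v_0^n)(x_\epsilon) \\
  &= (\operatorname{div} v_0)(x_\epsilon).
\end{aligned}
```

Under that assumption, the factor $\epsilon^{-1}$ in the last component exactly compensates the slow variation in $x_n$, so the rescaled data is divergence free whenever $v_0$ is. The same factor is what makes the last component of the data large as $\epsilon$ tends to zero.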
Mask-Attention-Free Transformer for 3D Instance Segmentation
Recently, transformer-based methods have dominated 3D instance segmentation,
where mask attention is commonly involved. Specifically, object queries are
guided by the initial instance masks in the first cross-attention, and then
iteratively refine themselves in a similar manner. However, we observe that the
mask-attention pipeline usually leads to slow convergence due to low-recall
initial instance masks. Therefore, we abandon the mask attention design and
resort to an auxiliary center regression task instead. Through center
regression, we effectively overcome the low-recall issue and perform
cross-attention by imposing positional prior. To reach this goal, we develop a
series of position-aware designs. First, we learn a spatial distribution of 3D
locations as the initial position queries. They spread over the 3D space
densely, and thus can easily capture the objects in a scene with a high recall.
Moreover, we present relative position encoding for the cross-attention and
iterative refinement for more accurate position queries. Experiments show that
our approach converges 4x faster than existing work, sets a new state of the
art on the ScanNetv2 3D instance segmentation benchmark, and also demonstrates
superior performance across various datasets. Code and models are available at
https://github.com/dvlab-research/Mask-Attention-Free-Transformer.
Comment: Accepted to ICCV 2023.
GlyphControl: Glyph Conditional Control for Visual Text Generation
Recently, there has been a growing interest in developing diffusion-based
text-to-image generative models capable of generating coherent and well-formed
visual text. In this paper, we propose a novel and efficient approach called
GlyphControl to address this task. Unlike existing methods that rely on
character-aware text encoders like ByT5 and require retraining of text-to-image
models, our approach leverages additional glyph conditional information to
enhance the performance of the off-the-shelf Stable-Diffusion model in
generating accurate visual text. By incorporating glyph instructions, users can
customize the content, location, and size of the generated text according to
their specific requirements. To facilitate further research in visual text
generation, we construct a training benchmark dataset called LAION-Glyph. We
evaluate the effectiveness of our approach by measuring OCR-based metrics and
CLIP scores of the generated visual text. Our empirical evaluations demonstrate
that GlyphControl outperforms the recent DeepFloyd IF approach in terms of OCR
accuracy and CLIP scores, highlighting the efficacy of our method.
Comment: Technical report. The code will be released at
https://github.com/AIGText/GlyphControl-releas
Data Pruning via Moving-one-Sample-out
In this paper, we propose a novel data-pruning approach called
moving-one-sample-out (MoSo), which aims to identify and remove the least
informative samples from the training set. The core insight behind MoSo is to
determine the importance of each sample by assessing its impact on the optimal
empirical risk. This is achieved by measuring the extent to which the empirical
risk changes when a particular sample is excluded from the training set.
Instead of using the computationally expensive leaving-one-out-retraining
procedure, we propose an efficient first-order approximator that only requires
gradient information from different training stages. The key idea behind our
approximation is that samples with gradients that are consistently aligned with
the average gradient of the training set are more informative and should
receive higher scores, which could be intuitively understood as follows: if the
gradient from a specific sample is consistent with the average gradient vector,
it implies that optimizing the network using the sample will yield a similar
effect on all remaining samples. Experimental results demonstrate that MoSo
effectively mitigates severe performance degradation at high pruning ratios and
achieves satisfactory performance across various settings.
Comment: Accepted by the Thirty-seventh Conference on Neural Information
Processing Systems (NeurIPS 2023).
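The gradient-alignment intuition in this abstract can be illustrated with a small NumPy sketch. This is a simplified illustration, not the paper's actual MoSo estimator: it assumes per-sample gradients have already been collected at a few training stages, and the names `alignment_scores`, `keep_indices`, and the toy array `toy_grads` are hypothetical.

```python
import numpy as np


def alignment_scores(per_sample_grads):
    """Score samples by how well their gradients align with the mean gradient.

    per_sample_grads: array of shape (T, N, D) holding, for each of T training
    stages, the gradient of each of N samples over D parameters.
    Returns an array of N scores (higher means more consistently aligned).
    NOTE: simplified stand-in for the paper's first-order approximator.
    """
    grads = np.asarray(per_sample_grads, dtype=float)
    if grads.ndim != 3:
        raise ValueError("expected an array of shape (T, N, D)")
    # Average gradient of the training set at each stage: shape (T, 1, D).
    mean_grad = grads.mean(axis=1, keepdims=True)
    # Inner product of every sample gradient with the stage mean: shape (T, N).
    dots = (grads * mean_grad).sum(axis=2)
    # Average the alignment over training stages: shape (N,).
    return dots.mean(axis=0)


def keep_indices(scores, prune_ratio):
    """Return sorted indices of the samples kept after pruning the lowest scores."""
    scores = np.asarray(scores, dtype=float)
    n_prune = int(round(prune_ratio * len(scores)))
    n_keep = len(scores) - n_prune
    order = np.argsort(-scores, kind="stable")  # highest score first
    return np.sort(order[:n_keep])


# Toy data: 2 stages, 3 samples, 2 parameters. Sample 2 opposes the others.
toy_grads = np.array([
    [[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]],
    [[0.0, 2.0], [0.0, 2.0], [0.0, -1.0]],
])
scores = alignment_scores(toy_grads)
kept = keep_indices(scores, prune_ratio=0.34)
print(scores, kept)
```

In this toy case the third sample points against the average gradient at both stages, so it receives the lowest score and is the one removed. A real pipeline would need to obtain the per-sample gradients from the model being trained, which this sketch leaves out.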
- …