Comparative studies on control systems for a two-blade variable-speed wind turbine with a speed exclusion zone
… and mechanisms of resveratrol on the amelioration of oxidative stress and hepatic steatosis in KKAy mice
Global and Individualized Community Detection in Inhomogeneous Multilayer Networks
In network applications, it has become increasingly common to obtain datasets
in the form of multiple networks observed on the same set of subjects, where
each network is obtained in a related but different experiment condition or
application scenario. Such datasets can be modeled by multilayer networks where
each layer is a separate network itself while different layers are associated
and share some common information. The present paper studies community
detection in a stylized yet informative inhomogeneous multilayer network model.
In our model, layers are generated by different stochastic block models, the
community structures of which are (random) perturbations of a common global
structure while the connecting probabilities in different layers are not
related. Focusing on the symmetric two block case, we establish minimax rates
for both \emph{global estimation} of the common structure and
\emph{individualized estimation} of layer-wise community structures. Both
minimax rates have sharp exponents. In addition, we provide an efficient
algorithm that is simultaneously asymptotic minimax optimal for both estimation
tasks under mild conditions. The optimal rates depend on the \emph{parity} of
the number of most informative layers, a phenomenon that is caused by
inhomogeneity across layers.
Comment: Corrected a few typos. 96 pages (main manuscript: 27 pages,
appendices: 69 pages), 5 figures
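The generative model described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's exact specification: labels are taken in {-1, +1}, each layer's labels are obtained by flipping each global label independently with a hypothetical probability `flip_prob`, and each layer has its own unrelated pair of within/between connection probabilities. The function name `sample_multilayer_sbm` and all parameter names are made up for this sketch.

```python
import random


def sample_multilayer_sbm(n, probs, flip_prob, seed=0):
    """Sample a two-block multilayer SBM (illustrative assumptions only).

    n         -- number of nodes shared by all layers
    probs     -- list of (p_within, q_between) pairs, one pair per layer
    flip_prob -- assumed chance that a node's layer label differs from
                 its global label
    """
    rng = random.Random(seed)
    # Common global community structure, labels in {-1, +1}.
    global_labels = [rng.choice((-1, 1)) for _ in range(n)]
    layers = []
    for p, q in probs:
        # Layer-wise structure: a random perturbation of the global one.
        labels = [-z if rng.random() < flip_prob else z for z in global_labels]
        edges = set()
        for i in range(n):
            for j in range(i + 1, n):
                # Connection probability depends only on this layer's labels.
                prob = p if labels[i] == labels[j] else q
                if rng.random() < prob:
                    edges.add((i, j))
        layers.append((labels, edges))
    return global_labels, layers


if __name__ == "__main__":
    g, layers = sample_multilayer_sbm(30, [(0.6, 0.1), (0.2, 0.5)], flip_prob=0.1)
    for labels, edges in layers:
        flipped = sum(a != b for a, b in zip(g, labels))
        print(f"{len(edges)} edges, {flipped} labels differ from global")
```

Global estimation would target `global_labels`, while individualized estimation would target each layer's own `labels`.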
Warp-centric GPU meta-meshing and fast triangulation of billion-scale lattice structures
Lattice structures have been widely used in applications due to their
superior mechanical properties. To fabricate such structures, a geometric
processing step called triangulation is often employed to transform them into
the STL format before sending them to 3D printers. Because lattice structures
tend to have high geometric complexity, this step usually generates a large
number of triangles, making it a memory- and compute-intensive task. This problem
manifests itself clearly through large-scale lattice structures that have
millions or billions of struts. To address this problem, this paper proposes to
transform a lattice structure into an intermediate model called meta-mesh
before undergoing real triangulation. Compared to triangular meshes,
meta-meshes are very lightweight and much less compute-demanding. The meta-mesh
can also work as a base mesh reusable for conveniently and efficiently
triangulating lattice structures with arbitrary resolutions. A CPU+GPU
asynchronous meta-meshing pipeline has been developed to efficiently generate
meta-meshes from lattice structures. It shifts from the thread-centric GPU
algorithm design paradigm commonly used in CAD to the recent warp-centric
design paradigm to achieve high performance. This is achieved by a new data
compression method, a GPU cache-aware data structure, and a workload-balanced
scheduling method that can significantly reduce memory divergence and branch
divergence. In experiments on various billion-scale lattice structures, the
proposed method is two orders of magnitude faster than previously
achievable.
I-LLM: Efficient Integer-Only Inference for Fully-Quantized Low-Bit Large Language Models
Post-training quantization (PTQ) serves as a potent technique to accelerate
the inference of large language models (LLMs). Nonetheless, existing works
still necessitate a considerable number of floating-point (FP) operations
during inference, including additional quantization and de-quantization, as
well as non-linear operators such as RMSNorm and Softmax. This limitation
hinders the deployment of LLMs on edge and cloud devices. In this paper, we
identify that the primary obstacle to integer-only quantization for LLMs lies in the
large fluctuation of activations across channels and tokens in both linear and
non-linear operations. To address this issue, we propose I-LLM, a novel
integer-only fully-quantized PTQ framework tailored for LLMs. Specifically, (1)
we develop Fully-Smooth Block-Reconstruction (FSBR) to aggressively smooth
inter-channel variations of all activations and weights. (2) To alleviate
degradation caused by inter-token variations, we introduce a novel approach
called Dynamic Integer-only MatMul (DI-MatMul). This method enables dynamic
quantization in full-integer matrix multiplication by dynamically quantizing
the inputs and outputs with integer-only operations. (3) We design
DI-ClippedSoftmax, DI-Exp, and DI-Normalization, which utilize bit shift to
execute non-linear operators efficiently while maintaining accuracy.
Experiments show that I-LLM achieves accuracy comparable to the FP baseline
and outperforms non-integer quantization methods. For example, I-LLM can
operate at W4A4 with negligible loss of accuracy. To our knowledge, we are the
first to bridge the gap between integer-only quantization and LLMs. We've
published our code on anonymous.4open.science, aiming to contribute to the
advancement of this field.
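To make the idea of integer-only inference with bit shifts concrete, here is a minimal sketch of the generic dyadic requantization trick: a real-valued rescaling factor is approximated offline as m / 2^s, so that at inference time the rescale needs only an integer multiply and a right shift. This is not the paper's DI-MatMul, FSBR, or any other I-LLM component; the function names, the fixed shift of 16, and the signed 4-bit output range are assumptions chosen for illustration.

```python
def to_dyadic(scale, shift=16):
    """Offline step: approximate a positive scale as m / 2**shift.

    Floating point is used only here, before inference starts.
    """
    return round(scale * (1 << shift)), shift


def int_matvec(weights, x):
    """Integer matrix-vector product; the accumulator stays an integer."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def requantize(acc, m, shift, lo=-8, hi=7):
    """Rescale integer accumulators with a multiply and a right shift.

    Adding half of the divisor before shifting rounds to nearest.
    The result is clamped to an assumed signed 4-bit range [lo, hi].
    """
    half = 1 << (shift - 1)
    return [max(lo, min(hi, (a * m + half) >> shift)) for a in acc]


if __name__ == "__main__":
    W = [[1, 2], [-3, 4]]          # already-quantized integer weights
    x = [5, -6]                    # already-quantized integer activations
    m, s = to_dyadic(0.125)        # combined scale, fixed offline
    acc = int_matvec(W, x)         # [-7, -39]
    print(requantize(acc, m, s))   # [-1, -5]
```

A dynamic scheme such as the one the abstract describes would additionally derive the rescaling parameters from the data at run time using integer operations; this sketch keeps them static.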
Joint-Motion Mutual Learning for Pose Estimation in Videos
Human pose estimation in videos has long been a compelling yet challenging
task within the realm of computer vision. It remains
difficult because of complex video conditions, such as defocus and
self-occlusion. Recent methods strive to integrate multi-frame visual features
generated by a backbone network for pose estimation. However, they often ignore
the useful joint information encoded in the initial heatmap, which is a
by-product of the backbone generation. Comparatively, methods that attempt to
refine the initial heatmap fail to consider any spatio-temporal motion
features. As a result, the performance of existing methods for pose estimation
falls short due to the lack of ability to leverage both local joint (heatmap)
information and global motion (feature) dynamics.
To address this problem, we propose a novel joint-motion mutual learning
framework for pose estimation, which effectively concentrates on both local
joint dependency and global pixel-level motion dynamics. Specifically, we
introduce a context-aware joint learner that adaptively leverages initial
heatmaps and motion flow to retrieve robust local joint features. Given that
local joint features and global motion flow are complementary, we further
propose a progressive joint-motion mutual learning scheme that synergistically
exchanges information and interactively learns between joint features and motion
flow to improve the capability of the model. More importantly, to capture more
diverse joint and motion cues, we theoretically analyze and propose an
information orthogonality objective to avoid learning redundant information
from multiple cues. Experiments show that our method outperforms prior art
on three challenging benchmarks.
Comment: 10 pages, 5 figures
TPMS2STEP: error-controlled and C2 continuity-preserving translation of TPMS models to STEP files based on constrained-PIA
Triply periodic minimal surface (TPMS) is emerging as an important way of
designing microstructures. However, there has been limited use of commercial
CAD/CAM/CAE software packages for TPMS design and manufacturing. This is mainly
because TPMS is consistently described in the functional representation (F-rep)
format, while modern CAD/CAM/CAE tools are built upon the boundary
representation (B-rep) format. One possible solution to this gap is translating
TPMS to STEP, which is the standard data exchange format of CAD/CAM/CAE.
Following this direction, this paper proposes a new translation method with
error-controlling and continuity-preserving features. It is based on an
approximation error-driven TPMS sampling algorithm and a constrained-PIA
algorithm. The sampling algorithm controls the deviation between the original
and translated models. With it, an error bound on the deviation
can be ensured if two conditions, a density condition and an approximation
condition, are satisfied. The constrained-PIA algorithm enforces C2
continuity constraints during TPMS approximation while attaining
high efficiency. A theoretical convergence proof of this algorithm is also
given. The effectiveness of the translation method has been demonstrated by a
series of examples and comparisons.
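As a small illustration of what a functional representation (F-rep) looks like, the sketch below evaluates the widely used trigonometric level-set approximation of the gyroid, one common TPMS: the surface is the zero set of sin x cos y + sin y cos z + sin z cos x. The grid sampler is a naive uniform one added for illustration only; it is not the paper's error-driven sampling algorithm, and the names `gyroid` and `sample_grid` are assumptions.

```python
import math


def gyroid(x, y, z):
    """Level-set approximation of the gyroid; the surface is where this is 0."""
    return (math.sin(x) * math.cos(y)
            + math.sin(y) * math.cos(z)
            + math.sin(z) * math.cos(x))


def sample_grid(f, n, period=2 * math.pi):
    """Evaluate f on a uniform n x n x n grid covering one period cell."""
    step = period / n
    return [[[f(i * step, j * step, k * step) for k in range(n)]
             for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    grid = sample_grid(gyroid, 8)
    # Count samples on each side of the surface (sign of the field value).
    inside = sum(v < 0 for plane in grid for row in plane for v in row)
    print(f"{inside} of {8 ** 3} samples have a negative field value")
```

An F-rep like this answers "which side of the surface is this point on?" directly, whereas a B-rep stores the surface itself as explicit patches, which is the gap the translation method addresses.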
Autophagy regulates the maturation of hematopoietic precursors in the embryo
An understanding of the mechanisms regulating embryonic hematopoietic stem cell (HSC) development would facilitate their regeneration. The aorta-gonad-mesonephros region is the site for HSC production from hemogenic endothelial cells (HEC). While several distinct regulators are involved in this process, it is not yet known whether macroautophagy (autophagy) plays a role in hematopoiesis in the pre-liver stage. Here, we show that different states of autophagy exist in hematopoietic precursors and correlate with hematopoietic potential based on the LC3-RFP-EGFP mouse model. Deficiency of autophagy-related gene 5 (Atg5) specifically in endothelial cells disrupts endothelial to hematopoietic transition (EHT), by blocking the autophagic process. Using combined approaches, including single-cell RNA-sequencing (scRNA-seq), we have confirmed that Atg5 deletion interrupts developmental temporal order of EHT to further affect the pre-HSC I maturation, and that autophagy influences hemogenic potential of HEC and the formation of pre-HSC I likely via the nucleolin pathway. These findings demonstrate a role for autophagy in the formation/maturation of hematopoietic precursors.
