27 research outputs found
InceptionNeXt: When Inception Meets ConvNeXt
Inspired by the long-range modeling ability of ViTs, large-kernel
convolutions are widely studied and adopted recently to enlarge the receptive
field and improve model performance, like the remarkable work ConvNeXt which
employs 7x7 depthwise convolution. Although such a depthwise operator consumes
only a few FLOPs, it largely harms model efficiency on powerful
computing devices due to high memory access costs. For example, ConvNeXt-T
has FLOPs similar to ResNet-50 but achieves only 60% of its throughput when trained
on A100 GPUs with full precision. Although reducing the kernel size of ConvNeXt
can improve speed, it results in significant performance degradation. It is
still unclear how to speed up large-kernel-based CNN models while preserving
their performance. To tackle this issue, inspired by Inceptions, we propose to
decompose large-kernel depthwise convolution into four parallel branches along
the channel dimension, i.e., a small square kernel, two orthogonal band kernels, and
an identity mapping. With this new Inception depthwise convolution, we build a
series of networks, namely InceptionNeXt, which not only enjoy high throughput
but also maintain competitive performance. For instance, InceptionNeXt-T
achieves 1.6x higher training throughput than ConvNeXt-T and attains a
0.2% top-1 accuracy improvement on ImageNet-1K. We anticipate InceptionNeXt can
serve as an economical baseline for future architecture design to reduce carbon
footprint. Code is available at https://github.com/sail-sg/inceptionnext.
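The four-branch decomposition described in this abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the channel ordering (identity channels first), the group size `g`, and the kernel sizes in the example are all assumptions chosen for illustration, and odd kernel sizes are assumed so that padding is symmetric.

```python
def conv_same(channel, kernel):
    """Zero-padded 'same' 2D cross-correlation of one channel with one kernel."""
    H, W = len(channel), len(channel[0])
    kh, kw = len(kernel), len(kernel[0])
    ph, pw = kh // 2, kw // 2  # symmetric padding, assumes odd kernel sizes
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            acc = 0.0
            for a in range(kh):
                for b in range(kw):
                    y, x = i + a - ph, j + b - pw
                    if 0 <= y < H and 0 <= x < W:
                        acc += channel[y][x] * kernel[a][b]
            out[i][j] = acc
    return out


def inception_dwconv(x, k_square, k_band_w, k_band_h, g):
    """Split channels into four branches (hypothetical layout):
    the first C - 3*g channels pass through unchanged (identity branch);
    the next three groups of g channels are filtered depthwise by the
    square kernel, the horizontal band kernel and the vertical band kernel."""
    c_id = len(x) - 3 * g
    if c_id < 0:
        raise ValueError("need at least 3*g channels")
    out = [[row[:] for row in ch] for ch in x[:c_id]]  # identity branch
    for idx, kernel in enumerate((k_square, k_band_w, k_band_h)):
        start = c_id + idx * g
        out += [conv_same(ch, kernel) for ch in x[start:start + g]]
    return out


# Example: 8 channels of size 4x4, one channel per convolution branch (g = 1).
x = [[[float(c + i + j) for j in range(4)] for i in range(4)] for c in range(8)]
k_square = [[1 / 9] * 3 for _ in range(3)]  # 3x3 averaging kernel
k_band_w = [[1 / 5] * 5]                    # 1x5 horizontal band kernel
k_band_h = [[1 / 5] for _ in range(5)]      # 5x1 vertical band kernel
y = inception_dwconv(x, k_square, k_band_w, k_band_h, g=1)
```

In this sketch only 3 of the 8 channels are convolved at all, and the band kernels touch 5 values per output instead of 25, which is the kind of saving the abstract attributes to the decomposition.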
MetaFormer Is Actually What You Need for Vision
Transformers have shown great potential in computer vision tasks. A common
belief is that their attention-based token mixer module contributes most to their
competence. However, recent works show the attention-based module in
Transformers can be replaced by spatial MLPs and the resulting models still
perform quite well. Based on this observation, we hypothesize that the general
architecture of the Transformers, instead of the specific token mixer module,
is more essential to the model's performance. To verify this, we deliberately
replace the attention module in Transformers with an embarrassingly simple
spatial pooling operator to conduct only basic token mixing. Surprisingly, we
observe that the derived model, termed PoolFormer, achieves competitive
performance on multiple computer vision tasks. For example, on ImageNet-1K,
PoolFormer achieves 82.1% top-1 accuracy, surpassing well-tuned Vision
Transformer/MLP-like baselines DeiT-B/ResMLP-B24 by 0.3%/1.1% accuracy with
35%/52% fewer parameters and 50%/62% fewer MACs. The effectiveness of
PoolFormer verifies our hypothesis and urges us to initiate the concept of
"MetaFormer", a general architecture abstracted from Transformers without
specifying the token mixer. Based on the extensive experiments, we argue that
MetaFormer is the key player in achieving superior results for recent
Transformer and MLP-like models on vision tasks. This work calls for more
future research dedicated to improving MetaFormer instead of focusing on the
token mixer modules. Additionally, our proposed PoolFormer could serve as a
starting baseline for future MetaFormer architecture design. Code is available
at https://github.com/sail-sg/poolformer. Comment: CVPR 2022 (Oral).
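The idea of a general block with a pluggable token mixer can be sketched in a few lines of pure Python. This is an illustrative sketch only, not the released PoolFormer code: the tensor layout (`[channel][row][col]` nested lists), the 3x3 pooling window, the border handling, the identity default for `norm`, and the placeholder `zero_mlp` are all assumptions.

```python
def avg_pool_3x3(grid):
    """3x3 average pooling with stride 1, averaging only in-bounds neighbours."""
    H, W = len(grid), len(grid[0])
    out = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            vals = [grid[y][x]
                    for y in range(max(0, i - 1), min(H, i + 2))
                    for x in range(max(0, j - 1), min(W, j + 2))]
            out[i][j] = sum(vals) / len(vals)
    return out


def add(a, b):
    """Element-wise sum of two tensors stored as [channel][row][col] lists."""
    return [[[p + q for p, q in zip(ra, rb)] for ra, rb in zip(ca, cb)]
            for ca, cb in zip(a, b)]


def metaformer_block(x, token_mixer, channel_mlp, norm=lambda t: t):
    """Two residual sub-blocks; the token mixer is left unspecified."""
    x = add(x, token_mixer(norm(x)))     # spatial (token) mixing
    return add(x, channel_mlp(norm(x)))  # per-position channel mixing


def pool_mixer(x):
    """Token mixer with no learned parameters: pool each channel spatially."""
    return [avg_pool_3x3(ch) for ch in x]


def zero_mlp(x):
    """Placeholder that outputs zeros; stands in for a learned channel MLP."""
    return [[[0.0 for _ in row] for row in ch] for ch in x]


# Example: 3 channels of a constant 4x4 feature map.
x = [[[2.0] * 4 for _ in range(4)] for _ in range(3)]
y = metaformer_block(x, pool_mixer, zero_mlp)
```

Passing a different callable as `token_mixer` (attention, a spatial MLP, or the identity) changes the model family without touching the block structure, which is the abstraction the abstract calls MetaFormer.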
MYC activation cooperates with Vhl and Ink4a/Arf loss to induce clear cell renal cell carcinoma
Renal carcinoma is a common and aggressive malignancy whose histopathogenesis is incompletely understood and that is largely resistant to cytotoxic chemotherapy. We present two mouse models of kidney cancer that recapitulate the genomic alterations found in human papillary (pRCC) and clear cell RCC (ccRCC), the most common RCC subtypes. MYC activation results in highly penetrant pRCC tumours (MYC), while MYC activation combined with Vhl and Cdkn2a (Ink4a/Arf) deletion (VIM) produces kidney tumours that approximate human ccRCC. RNAseq of the mouse tumours demonstrates that MYC tumours resemble Type 2 pRCC, which is known to harbour MYC activation. Furthermore, VIM tumours more closely simulate human ccRCC. Based on their high penetrance, short latency, and histologic fidelity, these models of papillary and clear cell RCC should be significant contributions to the field of kidney cancer research.
Inception Transformer
Recent studies show that the Transformer has a strong capability for building
long-range dependencies, yet is incompetent at capturing high frequencies that
predominantly convey local information. To tackle this issue, we present a
novel and general-purpose Inception Transformer, or iFormer for short, that
effectively learns comprehensive features with both high- and low-frequency
information in visual data. Specifically, we design an Inception mixer to
explicitly graft the advantages of convolution and max-pooling for capturing
the high-frequency information to Transformers. Different from recent hybrid
frameworks, the Inception mixer brings greater efficiency through a channel
splitting mechanism that adopts parallel convolution/max-pooling paths and a
self-attention path as high- and low-frequency mixers, while having the
flexibility to model discriminative information scattered within a wide
frequency range. Considering that bottom layers play a greater role in capturing
high-frequency details while top layers play a greater role in modeling low-frequency global
information, we further introduce a frequency ramp structure, i.e., gradually
decreasing the dimensions fed to the high-frequency mixer and increasing those
fed to the low-frequency mixer, which can effectively trade off high- and
low-frequency components across different layers. We benchmark the iFormer on a
series of vision tasks, and showcase that it achieves impressive performance on
image classification, COCO detection and ADE20K segmentation. For example, our
iFormer-S reaches a top-1 accuracy of 83.4% on ImageNet-1K, 3.6% higher than
DeiT-S and even slightly better than the much bigger model Swin-B (83.3%),
with only 1/4 of the parameters and 1/3 of the FLOPs. Code and models will be released at
https://github.com/sail-sg/iFormer.
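The frequency ramp described in this abstract amounts to a per-layer channel budget that shifts from the high-frequency mixer to the low-frequency mixer with depth. A minimal sketch follows, assuming a linear schedule; the function name, the linear interpolation, and the start and end ratios are illustrative assumptions and are not taken from the paper.

```python
def frequency_ramp(dim, num_layers, high_start=0.75, high_end=0.25):
    """Return one (high_freq_channels, low_freq_channels) pair per layer.

    The share of channels given to the high-frequency mixer moves linearly
    from high_start at the bottom layer to high_end at the top layer
    (hypothetical schedule); the remaining channels go to the low-frequency
    (self-attention) mixer, so each pair sums to dim.
    """
    splits = []
    for layer in range(num_layers):
        t = layer / (num_layers - 1) if num_layers > 1 else 0.0
        ratio = high_start + t * (high_end - high_start)
        high = round(dim * ratio)
        splits.append((high, dim - high))
    return splits


# Example: a 64-channel feature map split across 3 layers.
splits = frequency_ramp(64, 3)
```

Each layer would then slice its input along the channel axis according to its pair, route the first slice through the convolution/max-pooling paths and the second through self-attention, and concatenate the results.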
Bio-inspired computing: 13th international conference, BIC-TA 2018, Beijing, China, November 2-4, 2018, proceedings, part II
Asynchronous mixing of kidney progenitor cells potentiates nephrogenesis in organoids
A fundamental challenge in emulating kidney tissue formation through directed differentiation of human pluripotent stem cells is that kidney development is iterative. To reproduce the asynchronous mix of differentiation states found in the fetal kidney, we combined cells differentiated at different times in the same organoid. Asynchronous mixing promoted nephrogenesis, and heterochronic organoids were well vascularized when engrafted under the kidney capsule. Micro-CT and injection of a circulating vascular marker demonstrated that engrafted kidney tissue was connected to the systemic circulation by 2 weeks after engraftment. Proximal tubule glucose uptake was confirmed, but despite these promising measures of graft function, overgrowth of stromal cells prevented long-term study. We propose that this is a technical feature of the engraftment procedure rather than a specific shortcoming of the directed differentiation, because kidney organoids derived from primary cells and whole embryonic kidneys develop similar stromal overgrowth when engrafted under the kidney capsule.
Aerodynamik eines stumpfen Kegels in reagierender Ueberschallstroemung
Aerodynamics of a blunted cone in reacting hypersonic flows. The flow around a planetary probe under hypervelocity re-entry conditions has two idiosyncrasies not present in the (cold) hypersonic flows of conventional test facilities: the strong dissociation reactions occurring behind the bow shock wave and the freezing of the chemical reactions of the flow by the rapid expansion at the shoulder of the probe. The aim of the present study was to understand the relative importance of the two phenomena for the total heat and pressure loads on a planetary probe and its possible payload. For the experimental study, an instrumented blunted 140° cone was tested in the High Enthalpy Shock Tunnel (HEG) of the DLR in Goettingen. Numerical calculations were performed with a Thin-Layer Navier-Stokes code capable of simulating flows in chemical and thermal nonequilibrium. The density contours of the flowfield obtained with the holographic interferometer demonstrate the fast chemical freezing of the flow at the shoulder; nevertheless, the extent of the nonequilibrium region behind the bow shock wave was overpredicted by the numerical calculations. For the forebody loads the prediction methods were reliable, whereas in the wake of the model considerable discrepancies between experimental and numerical results were observed. (orig.) 102 figs., 21 tabs., 146 refs. Available from FIZ Karlsruhe / FIZ - Fachinformationszentrum Karlsruhe / TIB - Technische Informationsbibliothek. SIGLE. DE. German.