194 research outputs found
Prismer: A Vision-Language Model with An Ensemble of Experts
Recent vision-language models have shown impressive multi-modal generation
capabilities. However, they typically require training huge models on massive
datasets. As a more scalable alternative, we introduce Prismer, a data- and
parameter-efficient vision-language model that leverages an ensemble of domain
experts. Prismer only requires training of a small number of components, with
the majority of network weights inherited from readily-available, pre-trained
domain experts, and kept frozen during training. By leveraging experts from a
wide range of domains, we show that Prismer can efficiently pool this expert
knowledge and adapt it to various vision-language reasoning tasks. In our
experiments, we show that Prismer achieves fine-tuned and few-shot learning
performance which is competitive with current state-of-the-art models, whilst
requiring up to two orders of magnitude less training data. Code is available
at https://github.com/NVlabs/prismer.
Comment: Tech Report. Project Page: https://shikun.io/projects/prismer Code:
https://github.com/NVlabs/prismer v2: fixed incorrect training cost estimate
and zero-shot NoCaps performance of SimVL
Imbalanced Gradients: A Subtle Cause of Overestimated Adversarial Robustness
Evaluating the robustness of a defense model is a challenging task in
adversarial robustness research. Obfuscated gradients, a type of gradient
masking, have previously been found to exist in many defense methods and cause
a false signal of robustness. In this paper, we identify a more subtle
situation called Imbalanced Gradients that can also cause overestimated
adversarial robustness. The phenomenon of imbalanced gradients occurs when the
gradient of one term of the margin loss dominates and pushes the attack towards
a suboptimal direction. To exploit imbalanced gradients, we formulate a
Margin Decomposition (MD) attack that decomposes a margin loss into individual
terms and then explores the attackability of these terms separately via a
two-stage process. We also propose a MultiTargeted and an ensemble version of
our MD attack. By investigating 17 defense models proposed since 2018, we find
that 6 models are susceptible to imbalanced gradients and our MD attack can
decrease their robustness evaluated by the best baseline standalone attack by
another 2%. We also provide an in-depth analysis of the likely causes of
imbalanced gradients and effective countermeasures.
Comment: 19 pages, 7 figue
Demonstration of MaskSearch: Efficiently Querying Image Masks for Machine Learning Workflows
We demonstrate MaskSearch, a system designed to accelerate queries over
databases of image masks generated by machine learning models. MaskSearch
formalizes and accelerates a new category of queries for retrieving images and
their corresponding masks based on mask properties, which support various
applications, from identifying spurious correlations learned by models to
exploring discrepancies between model saliency and human attention. This
demonstration makes the following contributions: (1) the introduction of
MaskSearch's graphical user interface (GUI), which enables interactive
exploration of image databases through mask properties, (2) hands-on
opportunities for users to explore MaskSearch's capabilities and constraints
within machine learning workflows, and (3) an opportunity for conference
attendees to understand how MaskSearch accelerates queries over image masks
Re-ViLM: Retrieval-Augmented Visual Language Model for Zero and Few-Shot Image Captioning
Augmenting pretrained language models (LMs) with a vision encoder (e.g.,
Flamingo) has achieved state-of-the-art results in image-to-text
generation. However, these models store all the knowledge within their
parameters, thus often requiring an enormous number of parameters to model the
abundant visual concepts and very rich textual descriptions. Additionally, they
are inefficient in incorporating new data, requiring a computationally expensive
fine-tuning process. In this work, we introduce a Retrieval-augmented Visual
Language Model, Re-ViLM, built upon Flamingo, that supports retrieving
relevant knowledge from an external database for zero and in-context few-shot
image-to-text generations. By storing certain knowledge explicitly in the
external database, our approach reduces the number of model parameters and can
easily accommodate new data during evaluation by simply updating the database.
We also construct interleaved image and text data that facilitates
in-context few-shot learning capabilities. We demonstrate that Re-ViLM
significantly boosts performance for image-to-text generation tasks, especially
for zero-shot and few-shot generation in out-of-domain settings with 4 times
fewer parameters compared with baseline methods.
Comment: Findings of EMNLP 202
A novel high-DPI and monodisperse droplet inkjet printhead with the piezoelectric cutter
High dots per inch (DPI) is the core performance index of an inkjet printer, but achieving it is hindered by satellite ink droplets. Herein, we propose a novel high-DPI, monodisperse-droplet inkjet printhead with a piezoelectric cutter. The established model is used to optimize the printhead's structural parameters and the actuating and cutting signal waveforms. The cutter element moves the break-up point to the middle of the ink column, reducing the tail length and generating a monodisperse droplet. Additionally, the cutter consistently reduces the droplet length across different ink properties, including viscosity, density, surface tension, and contact angle, demonstrating broad applicability. These results provide an in-depth study of the design of high-DPI, monodisperse inkjet printheads, offering an efficient approach to improving inkjet printhead performance
SRT1720 Alleviates ANIT-Induced Cholestasis in a Mouse Model
Intrahepatic cholestasis is a clinical syndrome with hepatotoxicity caused by intrahepatic and systemic accumulation of bile acids. Several crucial factors contribute to the pathogenesis of cholestasis, such as inflammation, dysregulation of bile acid transporters, and oxidative stress. SIRT1 is a class III histone deacetylase (HDAC). According to several studies, SIRT1 is one of the most important regulators of hepatic bile acid metabolism. SRT1720 is a SIRT1 activator that is 1000 times more potent than resveratrol, and this paper aims to study its protective effect against hepatotoxicity and cholestasis induced by alpha-naphthylisothiocyanate (ANIT) in mice. The findings revealed that SRT1720 treatment increased FXR and Nrf2 gene expression, protecting against ANIT-induced hepatotoxicity and cholestasis. The mRNA levels of hepatic bile acid transporters were also altered by SRT1720. Furthermore, SRT1720 enhanced the antioxidative system by increasing Nrf2, SOD, GCLc, GCLm, Nqo1, and HO-1 gene expression. In conclusion, SRT1720 protected against ANIT-induced hepatotoxicity and cholestasis, partly through FXR and Nrf2 activation. These results indicate that SIRT1 could be regarded as a therapeutic target for cholestasis
