1,048 research outputs found
Dye-Sensitized Solar Cells Using Mesocarbon Microbead-Based Counter Electrodes
Dye-sensitized solar cells (DSCs) equipped with mesocarbon microbead (MCMB)-based counter electrodes were explored to examine their cell performance. Three types of nanosized additives, platinum (Pt), carbon nanotubes (CNTs), and carbon black (CB), were well dispersed and coated over microscaled MCMB powders. In the design of the counter electrodes, the MCMB graphite offers an excellent medium that allows charge transfer from the ITO substrate to the dye molecules. The active materials, i.e., Pt, CNTs, and nanosized CB, serve as active-site providers for the redox reaction. Among these counter electrodes, the DSCs fabricated with the CB electrode exhibit the highest power conversion efficiency. This improvement can be attributed to the fact that the CB nanoparticles not only offer a large number of catalytic sites but also a low charge-transfer resistance, facilitating rapid reaction kinetics. Such a carbon counter-electrode design has been confirmed to be a promising candidate for replacing Pt electrodes.
An Early Evaluation of GPT-4V(ision)
In this paper, we evaluate different abilities of GPT-4V including visual
understanding, language understanding, visual puzzle solving, and understanding
of other modalities such as depth, thermal, video, and audio. To estimate
GPT-4V's performance, we manually construct 656 test instances and carefully
evaluate the results of GPT-4V. The highlights of our findings are as follows:
(1) GPT-4V exhibits impressive performance on English visual-centric benchmarks
but fails to recognize simple Chinese texts in the images; (2) GPT-4V shows
inconsistent refusal behavior when answering questions related to sensitive
traits such as gender, race, and age; (3) GPT-4V obtains worse results than
GPT-4 (API) on language understanding tasks including general language
understanding benchmarks and visual commonsense knowledge evaluation
benchmarks; (4) Few-shot prompting can improve GPT-4V's performance on both
visual understanding and language understanding; (5) GPT-4V struggles to spot
the nuances between two similar images and to solve easy math picture puzzles;
(6) GPT-4V shows non-trivial performance on the tasks of similar modalities to
image, such as video and thermal. Our experimental results reveal the ability
and limitations of GPT-4V and we hope our paper can provide some insights into
the application and research of GPT-4V.Comment: Technical Report. Data are available at
https://github.com/albertwy/GPT-4V-Evaluatio
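Finding (4), that few-shot prompting improves GPT-4V on both visual and language understanding, can be illustrated with a minimal sketch of how a few-shot multimodal prompt might be assembled. The message schema below follows the widely used OpenAI-style chat format, and the image URLs, questions, and answers are hypothetical placeholders, not material from the paper:

```python
# Minimal sketch: assembling a few-shot prompt for a vision-language chat API.
# The dict schema mirrors the common OpenAI-style chat format; the demonstration
# examples and URLs below are hypothetical.

def image_part(url):
    """Wrap an image reference in the chat-content schema."""
    return {"type": "image_url", "image_url": {"url": url}}

def text_part(text):
    """Wrap plain text in the chat-content schema."""
    return {"type": "text", "text": text}

def build_few_shot_messages(examples, query_image_url, question):
    """examples: list of (image_url, question, answer) demonstrations.
    Returns a message list: system prompt, then one user/assistant pair
    per demonstration, then the final user query."""
    messages = [{"role": "system",
                 "content": "Answer questions about the given image."}]
    for img, q, a in examples:
        messages.append({"role": "user",
                         "content": [image_part(img), text_part(q)]})
        messages.append({"role": "assistant", "content": a})
    messages.append({"role": "user",
                     "content": [image_part(query_image_url),
                                 text_part(question)]})
    return messages

demos = [("https://example.com/cat.jpg", "What animal is shown?", "A cat.")]
msgs = build_few_shot_messages(demos, "https://example.com/dog.jpg",
                               "What animal is shown?")
print(len(msgs))  # system + 2 per demonstration + 1 query = 4
```

Each demonstration contributes a user turn (image plus question) and an assistant turn (the answer), so the model sees worked examples before the actual query.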
Sentiment Word Aware Multimodal Refinement for Multimodal Sentiment Analysis with ASR Errors
Multimodal sentiment analysis has attracted increasing attention and lots of
models have been proposed. However, the performance of the state-of-the-art
models decreases sharply when they are deployed in the real world. We find that
the main reason is that real-world applications can only access the text
output by automatic speech recognition (ASR) models, which may contain
errors due to limited model capacity. Through further analysis of
the ASR outputs, we find that in some cases the sentiment words, the key
sentiment elements in the textual modality, are recognized as other words,
which makes the sentiment of the text change and hurts the performance of
multimodal sentiment models directly. To address this problem, we propose the
sentiment word aware multimodal refinement model (SWRM), which can dynamically
refine the erroneous sentiment words by leveraging multimodal sentiment clues.
Specifically, we first use the sentiment word position detection module to
obtain the most probable position of the sentiment word in the text and then
utilize the multimodal sentiment word refinement module to dynamically refine
the sentiment word embeddings. The refined embeddings are taken as the textual
inputs of the multimodal feature fusion module to predict the sentiment labels.
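The three-stage pipeline just described (sentiment word position detection, multimodal refinement of the word embedding, then feature fusion) can be sketched as follows. The module internals here are deliberately simplified stand-ins: the real SWRM uses learned modules and attention over multimodal clues, whereas this sketch substitutes a lexicon lookup, a convex combination, and mean pooling:

```python
# Simplified sketch of the SWRM-style pipeline described above. The scoring
# and refinement rules are illustrative stand-ins for the paper's learned
# modules, not its actual architecture.

def detect_sentiment_position(tokens, lexicon):
    """Stage 1: return the index of the most likely sentiment word.
    Stand-in: the first token found in a small sentiment lexicon."""
    for i, tok in enumerate(tokens):
        if tok.lower() in lexicon:
            return i
    return None

def refine_embedding(text_emb, audio_emb, video_emb, alpha=0.5):
    """Stage 2: refine a (possibly ASR-corrupted) text embedding using
    multimodal sentiment clues. Stand-in: convex combination of the text
    vector and the mean of the audio/video vectors."""
    clue = [(a + v) / 2 for a, v in zip(audio_emb, video_emb)]
    return [alpha * t + (1 - alpha) * c for t, c in zip(text_emb, clue)]

def fuse_and_predict(embeddings):
    """Stage 3: fuse token embeddings and predict a sentiment score.
    Stand-in: mean-pool over tokens, then sum the pooled vector."""
    dim = len(embeddings[0])
    pooled = [sum(e[d] for e in embeddings) / len(embeddings)
              for d in range(dim)]
    return sum(pooled)

# Toy run: ASR misrecognized "great" as "grate"; a homophone entry in the
# lexicon still flags the position, and the embedding is pulled toward the
# positive audio/video clues before fusion.
tokens = ["the", "movie", "was", "grate"]
lexicon = {"great", "grate", "terrible"}
pos = detect_sentiment_position(tokens, lexicon)

embs = [[0.1, 0.2], [0.0, 0.1], [0.0, 0.0], [-0.5, -0.4]]  # toy text embeddings
if pos is not None:
    embs[pos] = refine_embedding(embs[pos], audio_emb=[0.8, 0.7],
                                 video_emb=[0.6, 0.5])
score = fuse_and_predict(embs)
print(pos, round(score, 3))  # -> 3 0.15
```

The point of the toy run is the failure mode the abstract describes: the misrecognized word carries a negative text embedding, and the multimodal clues shift it back toward the sentiment the speaker actually expressed before fusion.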
We conduct extensive experiments on the real-world datasets including
MOSI-Speechbrain, MOSI-IBM, and MOSI-iFlytek and the results demonstrate the
effectiveness of our model, which surpasses the current state-of-the-art models
on all three datasets. Furthermore, our approach can be easily adapted to
other multimodal feature fusion models. Data and code are available at
https://github.com/albertwy/SWRM.
Comment: Findings of ACL 202