12 research outputs found
A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise
The surge of interest towards Multi-modal Large Language Models (MLLMs),
e.g., GPT-4V(ision) from OpenAI, has marked a significant trend in both
academia and industry. They endow Large Language Models (LLMs) with powerful
capabilities in visual understanding, enabling them to tackle diverse
multi-modal tasks. Very recently, Google released Gemini, its newest and most
capable MLLM built from the ground up for multi-modality. In light of the
superior reasoning capabilities, can Gemini challenge GPT-4V's leading position
in multi-modal learning? In this paper, we present a preliminary exploration of
Gemini Pro's visual understanding proficiency, which comprehensively covers
four domains: fundamental perception, advanced cognition, challenging vision
tasks, and various expert capacities. We compare Gemini Pro with the
state-of-the-art GPT-4V to evaluate its upper limits, along with the latest
open-sourced MLLM, Sphinx, which reveals the gap between manual efforts and
black-box systems. The qualitative samples indicate that, while GPT-4V and
Gemini showcase different answering styles and preferences, they can exhibit
comparable visual reasoning capabilities, and Sphinx still trails behind them
concerning domain generalizability. Specifically, GPT-4V tends to elaborate
detailed explanations and intermediate steps, and Gemini prefers to output a
direct and concise answer. The quantitative evaluation on the popular MME
benchmark also demonstrates the potential of Gemini to be a strong challenger
to GPT-4V. Our early investigation of Gemini also observes some common issues
of MLLMs, indicating that there still remains a considerable distance towards
artificial general intelligence. Our project for tracking the progress of MLLM
is released at
https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models.
Comment: Total 120 pages.
Fourth-order coherence-function theory of laser-induced molecular reorientational grating and population grating
We have employed fourth-order coherence-function theory to study the influence of the partial-coherence properties of pump beams on laser-induced gratings. First, we examine the formation of the molecular reorientational grating and point out the different roles of phase fluctuation and amplitude fluctuation. A time-delayed method is proposed to distinguish the molecular reorientational grating from the thermal grating. We then apply the fourth-order theory to study Bragg reflection from a population grating. We obtain an analytic solution that enables an extensive investigation of the temporal behaviour of the Bragg reflection signal. This study is especially helpful for elucidating the generation mechanism of the population grating.
Population-specific GSTM1 copy number variation
As one of the major glutathione conjugation enzymes, GSTM1 detoxifies a number of drugs and xenobiotics. Its expression and activity have been shown to correlate with both cancer risk and drug resistance. Through a genome-wide association study, we identified a significant association between HapMap SNP rs366631 and GSTM1 expression. In this study, utilizing lymphoblastoid cell lines derived from International HapMap Consortium CEU and YRI populations, we designed and performed site-specific genotyping assays for both rs366631 and a highly homologous GSTM1 upstream site. Copy number variation (CNV) assays were performed for three different regions of the GSTM1 gene. We demonstrated that HapMap SNP rs366631 is a non-polymorphic site. The false genotyping call arises from sequence homology, a common GSTM1 region deletion, and a non-specific genotyping platform used to identify the SNP. However, the HapMap call for the rs366631 genotype is an indicator of GSTM1 upstream region deletion. Furthermore, this upstream deletion can be used as a marker of GSTM1 gene deletion. Using a novel GSTM1 CNV assay, we showed a population-specific CNV in this region upstream of the gene. More than 75% of the Caucasian (CEU) samples exhibit GSTM1 deletion and none contain two copies of GSTM1. In contrast, up to 25% of African (YRI) samples were found to have two copies of GSTM1. In conclusion, HapMap rs366631 is a pseudo-SNP that can be used as a GSTM1 deletion marker. Both the pseudo-SNP allele frequency and the GSTM1 upstream region CNV show population-specific patterns between CEU and YRI samples.