58 research outputs found
MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models
Multimodal Large Language Model (MLLM) relies on the powerful LLM to perform
multimodal tasks, showing amazing emergent abilities in recent studies, such as
writing poems based on an image. However, it is difficult for these case
studies to fully reflect the performance of MLLM, lacking a comprehensive
evaluation. In this paper, we fill in this blank, presenting the first MLLM
Evaluation benchmark MME. It measures both perception and cognition abilities
on a total of 14 subtasks. In order to avoid data leakage that may arise from
direct use of public datasets for evaluation, the annotations of
instruction-answer pairs are all manually designed. The concise instruction
design allows us to fairly compare MLLMs, instead of struggling in prompt
engineering. Besides, with such an instruction, we can also easily carry out
quantitative statistics. A total of 10 advanced MLLMs are comprehensively
evaluated on our MME, which not only suggests that existing MLLMs still have a
large room for improvement, but also reveals the potential directions for the
subsequent model optimization.
Comment: https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Model
A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise
The surge of interest towards Multi-modal Large Language Models (MLLMs),
e.g., GPT-4V(ision) from OpenAI, has marked a significant trend in both
academia and industry. They endow Large Language Models (LLMs) with powerful
capabilities in visual understanding, enabling them to tackle diverse
multi-modal tasks. Very recently, Google released Gemini, its newest and most
capable MLLM built from the ground up for multi-modality. In light of the
superior reasoning capabilities, can Gemini challenge GPT-4V's leading position
in multi-modal learning? In this paper, we present a preliminary exploration of
Gemini Pro's visual understanding proficiency, which comprehensively covers
four domains: fundamental perception, advanced cognition, challenging vision
tasks, and various expert capacities. We compare Gemini Pro with the
state-of-the-art GPT-4V to evaluate its upper limits, along with the latest
open-sourced MLLM, Sphinx, which reveals the gap between manual efforts and
black-box systems. The qualitative samples indicate that, while GPT-4V and
Gemini showcase different answering styles and preferences, they can exhibit
comparable visual reasoning capabilities, and Sphinx still trails behind them
concerning domain generalizability. Specifically, GPT-4V tends to elaborate
detailed explanations and intermediate steps, and Gemini prefers to output a
direct and concise answer. The quantitative evaluation on the popular MME
benchmark also demonstrates the potential of Gemini to be a strong challenger
to GPT-4V. Our early investigation of Gemini also observes some common issues
of MLLMs, indicating that there still remains a considerable distance towards
artificial general intelligence. Our project for tracking the progress of MLLM
is released at
https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models.
Comment: Total 120 pages. See our project at
https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Model
CDK5-dependent BAG3 degradation modulates synaptic protein turnover
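Alzheimer's disease (AD) is a major neurological disorder that seriously threatens human health. Its onset and progression are closely linked to aging, and current clinical treatments are very limited. There is therefore an urgent need to start from the early stages of AD pathogenesis and to discover and identify the mechanisms and targets that lead to neuronal dysfunction in AD, providing a basis for early prevention and treatment. Starting from high-throughput phosphoproteomics, Professor Zhang Jie (张杰) and colleagues systematically studied the phosphorylation substrates of CDK5 in neuronal cells and identified BAG3, a protein with an important function in protein quality control, as a novel CDK5 substrate. The group discovered and elucidated the mechanism by which cyclin-dependent kinase 5 (CDK5), through its regulation of BAG3, maintains synaptic protein levels, as well as the role of this mechanism in the onset and progression of AD.
The study was completed over 8 years as a collaboration among several teams. Participants included Professor Zhou Xiwen (周熙文) of the Chinese University of Hong Kong; Professor Karl Herrup of the University of Pittsburgh, USA; Professor Xu Huaxi (许华曦) of the Sanford-Burnham institute, USA; Professor Bu Guojun (卜国军) of the Mayo Clinic, USA; Professors Wen Lei (文磊), Zhang Yunwu (张云武), Zhao Yingjun (赵颖俊), and Xue Maoqiang (薛茂强) of the Xiamen University School of Medicine; and Professor Yuan Zengqiang (袁增强) of the Academy of Military Medical Sciences, among others.
Zhou Jiechao (周杰超), a doctoral student of the 2012 cohort at the Xiamen University School of Medicine, and colleagues are the first authors of the article; Professor Zhang Jie is the corresponding author.
Background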
Synaptic protein dyshomeostasis and functional loss is an early invariant feature of
Alzheimer’s disease (AD), yet the unifying etiological pathway remains largely unknown.
Knowing that cyclin-dependent kinase 5 (CDK5) plays critical roles in synaptic formation
and degeneration, its phosphorylation targets were re-examined in search for candidates with
direct global impacts on synaptic protein dynamics, and the associated regulatory network
was also analyzed.
Methods
Quantitative phospho-proteomics and bioinformatics analyses were performed to identify
top-ranked candidates. A series of biochemical assays were used to investigate the associated
regulatory signaling networks. Histological, electrochemical and behavioral assays were
performed in conditional knockout, shRNA-mediated knockdown and AD-related mice
models to evaluate its relevance to synaptic homeostasis and functions.
Results
Among candidates with known implications in synaptic modulations, BCL2-associated
athanogene-3 (BAG3) ranked the highest. CDK5-mediated phosphorylation on
Ser297/Ser291 (Mouse/Human) destabilized BAG3. Loss of BAG3 unleashed the selective
protein degradative function of the HSP70 machinery. In neurons, this resulted in enhanced
degradation of a number of glutamatergic synaptic proteins. Conditional neuronal knockout of
Bag3 in vivo led to impairment of learning and memory functions. In human AD and
related-mouse models, aberrant CDK5-mediated loss of BAG3 yielded similar effects on
synaptic homeostasis. Detrimental effects of BAG3 loss on learning and memory functions
were confirmed in these mice, and such were reversed by ectopic BAG3 re-expression.
Conclusions
Our results highlight that neuronal CDK5-BAG3-HSP70 signaling axis plays a critical
role in modulating synaptic homeostasis. Dysregulation of the signaling pathway directly
contributes to synaptic dysfunction and AD pathogenesis.
This work was supported by the National Science Foundation in China (Grant: 31571055, 81522016, 81271421 to J.Z.; 81801337 to L.L; 81774377 and 81373999 to L.W.); Fundamental Research Funds for the Central Universities of China-Xiamen University (Grant: 20720150062, 20720180049 and 20720160075 to J.Z.); Fundamental Research Funds for Fujian Province University Leading Talents (Grant JAT170003 to L.L); Hong Kong Research Grants Council (HKUST12/CRF/13G, GRF660813, GRF16101315, AoE/M-05/12 to K.H.; GRF16103317, GRF16100718 and GRF16100219 to H.-M,C.); Offices of Provost, VPRG and Dean of Science, HKUST (VPRGO12SC02 to K.H.); Chinese University of Hong Kong (CUHK) Improvement on Competitiveness in Hiring New Faculty Funding Scheme (Ref. 133), CUHK Faculty Startup Fund and Alzheimer’s Association Research Fellowship (AARF-17-531566) to H.-M, C.
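This research was funded by the National Natural Science Foundation of China, the Xiamen University President's Fund, and the Fujian Provincial Health and Education Joint Research Fund, among others.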
Mengdan Sun's Quick Files
The Quick Files feature was discontinued and its files were migrated into this Project on March 11, 2022. The file URLs will still resolve properly, and the Quick Files logs are available in the Project’s Recent Activity.
Competitive dendrite growth during directional solidification of a transparent alloy: Modeling and experiment
A two-dimensional (2-D) cellular automaton-finite difference method (CA-FDM) model and in situ observation experiments of directional solidification using a transparent alloy of SCN-2wt.% ACE are employed to investigate various microstructural evolution of columnar dendrites during directional solidification. In the present model, the growth of columnar dendrites is simulated using a CA technique. The solute diffusion is solved using the FDM. The model is capable of visualizing the interaction between the formation of dendrite arrays with identical or different growth orientations, and the evolving solute concentration field. Several dendritic competitive growth modes between two converging and diverging dendrite arrays are reproduced. The simulation results agree well with the experimental observations. The simulations are also performed to study the effects of temperature gradient and cooling rate on the growth morphology of diverging dendrites. It is found that with the increase of temperature gradient and cooling rate, the tertiary branches produced from the well-developed side branches of the unfavorably oriented grain at the divergent grain boundaries are more likely to become the new primary dendrite arms
A More Realistic Markov Process Model for Explaining the Disjunction Effect in One-Shot Prisoner’s Dilemma Game
The quantum model has been considered to be advantageous over the Markov model in explaining irrational behaviors (e.g., the disjunction effect) during decision making. Here, we reviewed and re-examined the ability of the quantum belief–action entanglement (BAE) model and the Markov belief–action (BA) model to explain the disjunction effect in a more realistic setting. The results indicate that neither of the two models can truly represent the underlying cognitive mechanism. Thus, we proposed a more realistic Markov model to explain the disjunction effect in the prisoner’s dilemma game. In this model, the probability transition pattern of a decision maker (DM) depends on the information about the opponent’s action. Also, the relationship between the cognitive components in the evolution dynamics is moderated by the DM’s degree of subjective uncertainty (DSN). The results show that the disjunction effect can be well predicted by the more realistic Markov model. Model comparison suggests the superiority of the proposed Markov model over the quantum BAE model in terms of absolute model performance, relative model performance, and model flexibility. Therefore, we suggest that the key to successfully explaining the disjunction effect is to consider the underlying cognitive mechanism properly.
Providing depth information in the display for pursuit and compensatory tracking and optimization in 3‐D space
Objective The formats of tracking displays exert important influences on tracking performance. Few previous studies have explored 3‐D tracking display formats. The present study aimed to construct 3‐D formats for manual pursuit and compensatory tracking displays by adding depth information. Based on the tracking performance results, we further optimized the preferable tracking format. Method Three experiments were conducted. Experiment 1 was a confirmatory experiment that compared the effects of the two display formats on 2‐D manual tracking performance against previous studies. Experiment 2 extended the investigation to a 3‐D display by adding a depth cue indicating the relative size of the control marker and target. Experiment 3 was an optimisation experiment in which the 3‐D tracking display was improved, i.e., an extra depth cue was added to clearly signify the relative position of the target and the control marker. Results Pursuit tracking performance was better than compensatory tracking performance in both 2‐D (Experiment 1) and 3‐D space (Experiment 2). It was also found that the extra depth cue significantly improved the tracking success rate and the subjective satisfaction of the pursuit display format in 3‐D space (Experiment 3). Conclusions These findings indicate that depth cues can be used in tracking displays in 3‐D space and have important implications for the design of some motor training and tracking systems.
The applicability of eye‐controlled highlighting to the field of visual searching
Objective With the increasing amount of information presented on current human–computer interfaces, eye‐controlled highlighting has been proposed as a new display technique to optimise users’ task performance. However, it is unknown to what extent the eye‐controlled highlighting display facilitates visual search performance. The current study examined the facilitative effect of the eye‐controlled highlighting display technique on visual search with two major attributes of visual stimuli: stimulus type and the visual similarity between targets and distractors. Method In Experiment 1, we used digits and Chinese words as materials to explore the generalisation of the facilitative effect of eye‐controlled highlighting. In Experiment 2, we used Chinese words to examine the effect of target‐distractor similarity on the facilitation of the eye‐controlled highlighting display. Results The eye‐controlled highlighting display improved visual search performance when words were used as the search target and when the target‐distractor similarity was high. No facilitative effect was found when digits were used as the search target or when target‐distractor similarity was low. Conclusions The effectiveness of eye‐controlled highlighting on a visual task was influenced by both stimulus type and target‐distractor similarity. These findings provide guidelines for modern interface design with eye‐based displays implemented.
- …