Analyzing the Profitability Determinants: Evidence from Chinese Banks, 2016 to 2020
The reform of the Chinese banking industry has had a profound effect on banks' profitability over the recent decade. Among these reforms, we believe interest rate liberalization has been one of the most important of the past five years. In this paper, we study the determinants of profitability for the Chinese banking industry over the years 2016 to 2020. Two methods are applied: fixed-effect estimation and two-step system Generalized Method of Moments (S-GMM). We find that smaller banks with a lower equity-to-assets ratio, lower credit risk, and better cost management tend to outperform banks with a higher equity-to-assets ratio, higher credit risk, and poor cost management. We also find a positive and significant correlation between the z-score and bank profitability; taxation is also positively correlated with bank performance, but only with a minor effect. Most importantly, we find evidence that joint-stock commercial banks (JSCBs) tend to outperform other types of banks, and that COVID-19 brought a negative shock to banks' profitability. This shock is especially pronounced in 2020.
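To make the fixed-effect (within) estimator concrete, here is a minimal sketch that demeans each variable within banks and fits OLS on the demeaned data. The column names and toy numbers are hypothetical stand-ins, not the paper's actual specification.

```python
# Minimal sketch of a within (fixed-effect) panel estimator.
import numpy as np
import pandas as pd

def within_estimator(df, dep, regressors, entity="bank"):
    """Demean within each entity, then fit OLS on the demeaned data."""
    cols = [dep] + regressors
    demeaned = df[cols] - df.groupby(entity)[cols].transform("mean")
    beta, *_ = np.linalg.lstsq(
        demeaned[regressors].to_numpy(), demeaned[dep].to_numpy(), rcond=None
    )
    return dict(zip(regressors, beta))

# Toy usage with made-up bank-year data:
df = pd.DataFrame({
    "bank": ["A", "A", "B", "B"],
    "year": [2019, 2020, 2019, 2020],
    "roa": [0.9, 0.7, 1.1, 0.8],
    "size": [5.1, 5.2, 6.0, 6.1],
    "equity_ratio": [0.08, 0.10, 0.11, 0.115],
})
print(within_estimator(df, "roa", ["size", "equity_ratio"]))
```

The S-GMM step addresses the dynamic-panel bias that the within estimator ignores; in practice one would reach for a dedicated panel-econometrics package rather than this bare-bones version.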
Stretching Sentence-pair NLI Models to Reason over Long Documents and Clusters
Natural Language Inference (NLI) has been extensively studied by the NLP
community as a framework for estimating the semantic relation between sentence
pairs. While early work identified certain biases in NLI models, recent
advancements in modeling and datasets demonstrated promising performance. In
this work, we further explore the direct zero-shot applicability of NLI models
to real applications, beyond the sentence-pair setting they were trained on.
First, we analyze the robustness of these models to longer and out-of-domain
inputs. Then, we develop new aggregation methods to allow operating over full
documents, reaching state-of-the-art performance on the ContractNLI dataset.
Interestingly, we find NLI scores to provide strong retrieval signals, leading
to more relevant evidence extractions compared to common similarity-based
methods. Finally, we go further and investigate whole document clusters to
identify both discrepancies and consensus among sources. In a test case, we
find real inconsistencies between Wikipedia pages in different languages about
the same topic.
Comment: Findings of EMNLP 2022
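A minimal sketch of the aggregation step such a setup requires, assuming an off-the-shelf sentence-pair NLI scorer is passed in as a callable; the max-over-sentences rule here is one simple choice, not necessarily the paper's exact method.

```python
# Sketch: stretch a sentence-pair NLI scorer over a long document by
# scoring every premise sentence against the hypothesis and keeping
# the strongest one. `nli_entail_prob(premise, hypothesis)` is an
# assumed callable returning an entailment probability.

def document_entailment(doc_sentences, hypothesis, nli_entail_prob):
    """Returns the best score plus the sentence that produced it, so
    the same NLI scores double as an evidence-retrieval signal."""
    scored = [(nli_entail_prob(s, hypothesis), s) for s in doc_sentences]
    return max(scored, key=lambda pair: pair[0])
```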
Interpretable by Design Visual Question Answering
Model interpretability has long been a hard problem for the AI community
especially in the multimodal setting, where vision and language need to be
aligned and reasoned at the same time. In this paper, we specifically focus on
the problem of Visual Question Answering (VQA). While previous research tries
to probe the network structures of black-box multimodal models, we propose
to tackle the problem from a different angle -- to treat interpretability as an
explicit additional goal.
Given an image and a question, we argue that an interpretable VQA model should
be able to tell what conclusions it can draw from which part of the image, and
show how each statement helps to arrive at an answer. We introduce InterVQA:
Interpretable-by-design VQA, where we design an explicit intermediate dynamic
reasoning structure for VQA problems and enforce symbolic reasoning that uses
only this structure for final answer prediction. InterVQA produces
high-quality explicit intermediate reasoning steps while maintaining end-task
performance similar to the state of the art.
Comment: Multimodal, Vision and Language
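As a purely hypothetical illustration of what an explicit intermediate reasoning structure might look like (the field names are illustrative, not InterVQA's actual schema):

```python
# Hypothetical sketch of an explicit, inspectable reasoning structure
# for VQA; not the paper's actual design.
from dataclasses import dataclass

@dataclass
class ReasoningStep:
    statement: str        # intermediate conclusion, e.g. "the plate is empty"
    image_region: tuple   # (x, y, w, h) box the statement is grounded in
    confidence: float     # how strongly the model asserts the statement

def answer_from_steps(steps, symbolic_rule):
    """The final answer may use only the explicit steps, so every
    prediction is traceable to grounded statements."""
    return symbolic_rule([s.statement for s in steps if s.confidence > 0.5])
```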
Theoretical Model Construction of Deformation-Force for Soft Grippers Part I: Co-rotational Modeling and Force Control for Design Optimization
Compliant grippers, owing to adaptivity and safety, have attracted
considerable attention for unstructured grasping in real applications, such as
industrial or logistic scenarios. However, accurately modeling the
bidirectional relationship between shape deformation and contact force for such
grippers, taking Fin-Ray grippers as an example, remains an open problem to date. To
address this research gap, this article devises, presents, and experimentally
validates a universal bidirectional force-displacement mathematical model for
compliant grippers based on the co-rotational concept, which endows such
grippers with an intrinsic force-sensing capability and offers better insight
into design optimization. In Part I of the article, we introduce the
fundamental theory of the co-rotational approach, where arbitrary large
deformation of beam elements can be modeled. Its intrinsic principle allows
taking materials with varying stiffness, various connection types, and key
design parameters into consideration with few assumptions. Further, the
force-displacement relationship is numerically derived, providing accurate
displacement estimations of the gripper under external forces with minor
computational loads. The performance of the proposed method is experimentally
verified through comparison with Finite Element Analysis (FEA) in simulation,
obtaining a fair degree of accuracy (6%), and design optimization of Fin-Ray
grippers is systematically investigated. Part II of this article, which
demonstrates the force-sensing capabilities and the effects of representative
co-rotational modeling parameters on model accuracy, is released on arXiv.
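For orientation, a standard 2D co-rotational beam formulation (textbook notation, not necessarily the authors' exact model) splits each element's motion into a rigid rotation of its chord plus small local deformations:

```latex
% Local (deformational) DOFs: axial stretch and nodal rotations measured
% relative to the rotated chord, with l_0, l_n the initial and current
% chord lengths and \alpha the rigid rotation of the chord.
\bar{u} = l_n - l_0, \qquad \bar{\theta}_i = \theta_i - \alpha \quad (i = 1, 2)

% Linear beam theory suffices locally; the matrix B maps variations of
% the global DOFs d to the local DOFs \bar{d}:
\bar{\mathbf{f}} = \bar{\mathbf{K}} \, \bar{\mathbf{d}}, \qquad
\mathbf{f}_{\mathrm{int}} = \mathbf{B}^{\mathsf{T}} \bar{\mathbf{f}}, \qquad
\mathbf{K}_t = \mathbf{B}^{\mathsf{T}} \bar{\mathbf{K}} \mathbf{B}
             + \text{geometric terms}
```

Because the local deformations stay small even under arbitrarily large global motion, the linear local stiffness remains valid, which is what lets a model of this kind handle large deflections of soft fingers at low computational cost.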
ExpertQA: Expert-Curated Questions and Attributed Answers
As language models are adopted by a more sophisticated and diverse set of
users, the importance of guaranteeing that they provide factually correct
information supported by verifiable sources is critical across fields of study
and professions. This is especially the case for high-stakes fields, such as
medicine and law, where the risk of propagating false information is high and
can lead to undesirable societal consequences. Previous work studying
factuality and attribution has not focused on analyzing these characteristics
of language model outputs in domain-specific scenarios. In this work, we
present an evaluation study analyzing various axes of factuality and
attribution provided in responses from a few systems, by bringing domain
experts in the loop. Specifically, we first collect expert-curated questions
from 484 participants across 32 fields of study, and then ask the same experts
to evaluate generated responses to their own questions. We also ask experts to
revise answers produced by language models, which leads to ExpertQA, a
high-quality long-form QA dataset with 2177 questions spanning 32 fields, along
with verified answers and attributions for claims in the answers.
Comment: Dataset and code are available at
https://github.com/chaitanyamalaviya/expertq
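A sketch of the kind of per-field aggregation such an evaluation study involves; the JSONL path and field names below are hypothetical, so check the repository for the dataset's real schema.

```python
# Hypothetical: fraction of claims judged factual, grouped by field.
import json
from collections import defaultdict

def factuality_by_field(path):
    totals = defaultdict(lambda: [0, 0])  # field -> [factual, total]
    with open(path) as f:
        for line in f:
            ex = json.loads(line)
            for claim in ex["claims"]:               # assumed key
                totals[ex["field"]][1] += 1          # assumed key
                totals[ex["field"]][0] += int(claim["judged_factual"])
    return {field: fact / total for field, (fact, total) in totals.items()}
```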
Theoretical Model Construction of Deformation-Force for Soft Grippers Part II: Displacement Control Based Intrinsic Force Sensing
Force-aware grasping is an essential capability for most robots in practical
applications. Especially for compliant grippers, such as Fin-Ray grippers, it
still remains challenging to build a bidirectional mathematical model that
mutually maps the shape deformation and contact force. Part I of this article
has constructed the force-displacement relationship for design optimization
through the co-rotational theory. In Part II, we further devise a
displacement-force mathematical model, enabling the compliant gripper to
precisely estimate contact forces from deformations without dedicated sensors.
The presented displacement-force model elaborately characterizes contact forces
and provides force feedback for a force control system of the gripper, where
deformation appears as displacements at contact points. Afterward, simulation
experiments
are conducted to evaluate the performance of the proposed model through
comparisons with the finite-element analysis (FEA) in Ansys. Simulation results
reveal that the proposed model accurately estimates contact forces, with
average errors of around 3% and 4% for single- and multiple-node cases,
respectively, regardless of design parameters. (Part I of this article is
released on arXiv.)
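A generic sketch of displacement-based force estimation under an assumed forward model: given u = g(f) mapping contact forces to displacements, recover f from measured displacements by least squares. The linear toy model below stands in for the paper's co-rotational model.

```python
import numpy as np
from scipy.optimize import least_squares

K = np.array([[8.0, 1.0],
              [1.0, 6.0]])          # toy stiffness matrix (N/mm)
C = np.linalg.inv(K)                # compliance: u = C @ f

def forward_model(f):
    return C @ f                    # swap in the co-rotational model here

u_measured = np.array([0.30, 0.45]) # mm, e.g. tracked at contact points

fit = least_squares(lambda f: forward_model(f) - u_measured, x0=np.zeros(2))
print("estimated contact force (N):", fit.x)
```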
Dense X Retrieval: What Retrieval Granularity Should We Use?
Dense retrieval has become a prominent method to obtain relevant context or
world knowledge in open-domain NLP tasks. When we use a learned dense retriever
on a retrieval corpus at inference time, an often-overlooked design choice is
the retrieval unit in which the corpus is indexed, e.g., document, passage, or
sentence. We discover that the retrieval unit choice significantly impacts the
performance of both retrieval and downstream tasks. Distinct from the typical
approach of using passages or sentences, we introduce a novel retrieval unit,
proposition, for dense retrieval. Propositions are defined as atomic
expressions within text, each encapsulating a distinct factoid and presented in
a concise, self-contained natural language format. We conduct an empirical
comparison of different retrieval granularities. Our results reveal that
proposition-based retrieval significantly outperforms traditional passage or
sentence-based methods in dense retrieval. Moreover, retrieval by proposition
also enhances the performance of downstream QA tasks, since the retrieved texts
are denser in question-relevant information, reducing the need for lengthy
input tokens and minimizing the inclusion of extraneous, irrelevant
information.
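A minimal sketch of proposition-level indexing and retrieval; `encode` is an assumed bi-encoder callable returning unit-normalized vectors, and `split_into_propositions` is a hypothetical decomposition step (in practice a learned model).

```python
# Index a corpus at proposition granularity, then retrieve by dot product.
import numpy as np

def build_index(passages, split_into_propositions, encode):
    props = [p for passage in passages
             for p in split_into_propositions(passage)]
    vectors = np.stack([encode(p) for p in props])
    return props, vectors

def retrieve(query, props, vectors, encode, k=5):
    scores = vectors @ encode(query)   # cosine similarity if normalized
    top = np.argsort(-scores)[:k]
    return [(props[i], float(scores[i])) for i in top]
```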
The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts
As the influence of large language models (LLMs) spans across global
communities, their safety challenges in multilingual settings become paramount
for alignment research. This paper examines the variations in safety challenges
faced by LLMs across different languages and discusses approaches to
alleviating such concerns. By comparing how state-of-the-art LLMs respond to
the same set of malicious prompts written in higher- vs. lower-resource
languages, we observe that (1) LLMs tend to generate unsafe responses much more
often when a malicious prompt is written in a lower-resource language, and (2)
LLMs tend to generate more irrelevant responses to malicious prompts in
lower-resource languages. To understand what the discrepancy can be
attributed to, we study the effect of instruction tuning with reinforcement
learning from human feedback (RLHF) or supervised finetuning (SFT) on the
HH-RLHF dataset. Surprisingly, while training with high-resource languages
improves model alignment, training in lower-resource languages yields minimal
improvement. This suggests that the bottleneck of cross-lingual alignment is
rooted in the pretraining stage. Our findings highlight the challenges in
cross-lingual LLM safety, and we hope they inform future research in this
direction.
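The comparison boils down to tabulations like the sketch below: unsafe- and irrelevant-response rates per language, given responses already labeled by some safety judge. The labels and structure are illustrative.

```python
# Per-language rates of unsafe and irrelevant responses.
from collections import Counter

def rates_by_language(labeled):
    """labeled: iterable of (language, label) pairs,
    with label in {"safe", "unsafe", "irrelevant"}."""
    counts = Counter(labeled)
    languages = {lang for lang, _ in counts}
    rates = {}
    for lang in languages:
        total = sum(n for (l, _), n in counts.items() if l == lang)
        rates[lang] = {
            "unsafe": counts[(lang, "unsafe")] / total,
            "irrelevant": counts[(lang, "irrelevant")] / total,
        }
    return rates
```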
The Trickle-down Impact of Reward (In-)consistency on RLHF
Standard practice within Reinforcement Learning from Human Feedback (RLHF)
involves optimizing against a Reward Model (RM), which itself is trained to
reflect human preferences for desirable generations. A notable but
understudied subject is the (in-)consistency of RMs -- whether they can
recognize semantic changes to different prompts and appropriately adapt their
reward assignments -- and their impact on the downstream RLHF model.
In this paper, we visit a series of research questions relevant to RM
inconsistency: (1) How can we measure the consistency of reward models? (2) How
consistent are the existing RMs and how can we improve them? (3) In what ways
does reward inconsistency influence the chatbots resulting from the RLHF model
training?
We propose Contrast Instructions -- a benchmarking strategy for the
consistency of RM. Each example in Contrast Instructions features a pair of
lexically similar instructions with different ground truth responses. A
consistent RM is expected to rank the corresponding instruction and response
higher than other combinations. We observe that current RMs trained with the
standard ranking objective fail miserably on Contrast Instructions compared to
average humans. To show that RM consistency can be improved efficiently without
extra training budget, we propose two techniques, ConvexDA and RewardFusion,
which enhance reward consistency through extrapolation during the RM training
and inference stages, respectively. We show that RLHF models trained with a
more consistent RM yield more useful responses, suggesting that reward
inconsistency exhibits a trickle-down effect on the downstream RLHF process.
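A sketch of the ranking check Contrast Instructions implies: for each pair of lexically similar instructions with different ground-truth responses, a consistent RM should score every instruction higher with its own response than with its counterpart's. `reward(instruction, response)` is an assumed scalar reward-model callable.

```python
def consistency_accuracy(pairs, reward):
    """pairs: iterable of (instr1, resp1, instr2, resp2) tuples."""
    hits = total = 0
    for i1, r1, i2, r2 in pairs:
        hits += reward(i1, r1) > reward(i1, r2)   # own response wins
        hits += reward(i2, r2) > reward(i2, r1)
        total += 2
    return hits / total
```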
Conceptual and Unbiased Reasoning in Language Models
Conceptual reasoning, the ability to reason in abstract and high-level
perspectives, is key to generalization in human cognition. However, little
work has studied large language models' capability to perform conceptual
reasoning. In this work, we bridge this gap and propose a novel
conceptualization framework that forces models to perform conceptual reasoning
on abstract questions and generate solutions in a verifiable symbolic space.
Using this framework as an analytical tool, we show that existing large
language models fall short on conceptual reasoning, dropping 9% to 28% on
various benchmarks compared to direct inference methods. We then discuss how
models can improve since high-level abstract reasoning is key to unbiased and
generalizable decision-making. We propose two techniques to add trustworthy
induction signals by generating familiar questions with similar underlying
reasoning paths and asking models to perform self-refinement. Experiments show
that our proposed techniques improve models' conceptual reasoning performance
by 8% to 11%, achieving a more robust reasoning system that relies less on
inductive biases.
Comment: Preprint under review
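One hypothetical reading of "a verifiable symbolic space" is to have the model emit an executable solution instead of a direct answer, so the result can be checked by running it; `llm(prompt)` below is an assumed text-generation callable and the prompts are illustrative.

```python
def conceptual_solve(question, llm):
    # Step 1 (hypothetical): abstract away surface details.
    abstract = llm(f"State the underlying abstract problem in: {question}")
    # Step 2 (hypothetical): solve in a symbolic, executable form.
    program = llm(
        "Write a self-contained Python function solve() that returns "
        f"the answer to this abstract problem:\n{abstract}"
    )
    scope = {}
    exec(program, scope)   # verifiable: the symbolic solution must run
    return scope["solve"]()
```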