164 research outputs found
Corporate Governance and Firm Performance of Listed Small and Medium-Sized Enterprises in China
The purpose of this dissertation is to explore the relationship between corporate governance and firm performance of small and medium-sized firms in China. Based on data analysis from a sample of 517 listed small and medium-sized firms over a five-year period (2016-2020), this dissertation suggests that corporate governance has a considerable impact on the performance of listed small and medium-sized enterprises in China. According to the findings of the data analysis, board size, CEO age, CEO compensation, board independence, CEO duality, and insider ownership have a substantial impact on the performance of listed small and medium-sized firms in China, whereas CEO tenure does not appear to be a significant determinant of firm performance. This dissertation advances our understanding of the relationship between corporate governance and firm performance and has significant consequences for business executives, policymakers, and academics who prescribe corporate governance for small and medium-sized enterprises.
Keywords: Financial performance, Corporate governance, Listed small and medium-sized enterprises, China
VeRi3D: Generative Vertex-based Radiance Fields for 3D Controllable Human Image Synthesis
Unsupervised learning of 3D-aware generative adversarial networks has lately
made much progress. Some recent work demonstrates promising results of learning
human generative models using neural articulated radiance fields, yet their
generalization ability and controllability lag behind parametric human models,
i.e., they do not perform well when generalizing to novel poses/shapes and are
not part controllable. To solve these problems, we propose VeRi3D, a generative
human vertex-based radiance field parameterized by vertices of the parametric
human template, SMPL. We map each 3D point to the local coordinate system
defined on its neighboring vertices, and use the corresponding vertex feature
and local coordinates for mapping it to color and density values. We
demonstrate that our simple approach allows for generating photorealistic human
images with free control over camera pose, human pose, and shape, while also
enabling part-level editing
Towards Omni-supervised Referring Expression Segmentation
Referring Expression Segmentation (RES) is an emerging task in computer
vision, which segments the target instances in images based on text
descriptions. However, its development is plagued by the expensive segmentation
labels. To address this issue, we propose a new learning task for RES called
Omni-supervised Referring Expression Segmentation (Omni-RES), which aims to
make full use of unlabeled, fully labeled and weakly labeled data, e.g.,
referring points or grounding boxes, for efficient RES training. To accomplish
this task, we also propose a novel yet strong baseline method for Omni-RES
based on the recently popular teacher-student learning, where the weak labels
are not directly transformed into supervision signals but used as a yardstick
to select and refine high-quality pseudo-masks for teacher-student learning. To
validate the proposed Omni-RES method, we apply it to a set of state-of-the-art
RES models and conduct extensive experiments on several RES datasets. The
experimental results show clear advantages of Omni-RES over the
fully-supervised and semi-supervised training schemes. For instance, with only
10% fully labeled data, Omni-RES can help the base model achieve 100% of the fully
supervised performance, and it also outperforms the semi-supervised alternative
by a large margin, e.g., +14.93% on RefCOCO and +14.95% on RefCOCO+,
respectively. More importantly, Omni-RES also enables the use of large-scale
vision-language datasets like Visual Genome to facilitate low-cost RES training, and
achieves new SOTA performance of RES, e.g., 80.66 on RefCOCO
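The "yardstick" idea described in this abstract can be illustrated with a minimal sketch. The consistency rules used here (a referring point must fall inside the pseudo-mask; the mask's bounding box must overlap a grounding box above an IoU threshold), the threshold value, and all function names are assumptions made for illustration, not the paper's actual selection criteria.

```python
def mask_bbox(mask):
    """Tight bounding box (x0, y0, x1, y1) of a binary mask, or None if empty."""
    points = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not points:
        return None
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return (min(xs), min(ys), max(xs), max(ys))


def box_iou(a, b):
    """IoU of two inclusive pixel boxes given as (x0, y0, x1, y1)."""
    inter_w = min(a[2], b[2]) - max(a[0], b[0]) + 1
    inter_h = min(a[3], b[3]) - max(a[1], b[1]) + 1
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    inter = inter_w * inter_h
    area_a = (a[2] - a[0] + 1) * (a[3] - a[1] + 1)
    area_b = (b[2] - b[0] + 1) * (b[3] - b[1] + 1)
    return inter / (area_a + area_b - inter)


def keep_pseudo_mask(mask, point=None, box=None, iou_threshold=0.5):
    """Hypothetical filter: keep a teacher pseudo-mask only if it agrees
    with whatever weak label (point and/or box) is available."""
    if point is not None:
        x, y = point
        if not mask[y][x]:
            return False  # referring point lies outside the predicted mask
    if box is not None:
        bbox = mask_bbox(mask)
        if bbox is None or box_iou(bbox, box) < iou_threshold:
            return False  # mask extent disagrees with the grounding box
    return True
```

In this sketch the weak label never becomes a training target itself; it only decides whether a pseudo-mask is passed on to the student.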
Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models
With ever-increasing parameters and computation, vision-language pre-trained
(VLP) models incur prohibitive costs in downstream task adaptation.
Recent endeavors mainly focus on parameter efficient transfer learning (PETL)
for VLP models by only updating a small number of parameters. However,
excessive computational overhead still plagues the application of VLPs. In this
paper, we aim at parameter and computation efficient transfer learning (PCETL)
for VLP models. In particular, PCETL not only needs to limit the number of
trainable parameters in VLP models, but also to reduce the computational
redundancy during inference, thus enabling a more efficient transfer. To
approach this target, we propose a novel dynamic architecture skipping (DAS)
approach towards effective PCETL. Instead of directly optimizing the intrinsic
architectures of VLP models, DAS first observes the significance of their
modules to downstream tasks via a reinforcement learning (RL) based process,
and then skips the redundant ones with lightweight networks, i.e., adapters,
according to the obtained rewards. In this case, the VLP model can well
maintain the scale of trainable parameters while speeding up its inference on
downstream tasks. To validate DAS, we apply it to two representative VLP
models, namely ViLT and METER, and conduct extensive experiments on a range of
VL tasks. The experimental results not only show the great advantages of DAS in
reducing computational complexity, e.g., -11.97% FLOPs of METER on VQA2.0, but
also confirm its competitiveness against existing PETL methods in terms of
parameter scale and performance. Our source code is given in our appendix
Dynamic graph learning: A structure-driven approach
The purpose of this paper is to infer a dynamic graph as a global (collective) model of time-varying measurements at a set of network nodes. This model captures both pairwise and higher-order interactions (i.e., among more than two nodes). The motivation of this work lies in the search for a connectome model which properly captures brain functionality across all regions of the brain, and possibly at individual neurons. We formulate it as an optimization problem with a quadratic objective functional and tensor information of observed node signals over short time intervals. The regularization constraints reflect the graph smoothness and other dynamics involving the underlying graph’s Laplacian, as well as the time evolution smoothness of the underlying graph. The resulting joint optimization is solved by a continuous relaxation of the weight parameters and a novel gradient-projection scheme. While the work may be applicable to any time-evolving data set (e.g., fMRI), we apply our algorithm to a real-world dataset comprising recorded activities of individual brain cells. The resulting model is shown to be not only viable but also efficiently computable
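The two smoothness notions mentioned in this abstract can be made concrete with a small sketch. It assumes the standard combinatorial Laplacian L = D - W of a symmetric weighted graph and a squared-difference penalty between consecutive weight matrices; the paper's exact objective, tensor terms, and constraints are not given in the abstract, so the function names and the choice of penalties are illustrative assumptions.

```python
def laplacian(weights):
    """Combinatorial Laplacian L = D - W of a symmetric weight matrix."""
    n = len(weights)
    return [
        [(sum(weights[i]) if i == j else 0.0) - weights[i][j] for j in range(n)]
        for i in range(n)
    ]


def graph_smoothness(signal, weights):
    """Quadratic form x^T L x; small when strongly connected nodes carry similar values."""
    lap = laplacian(weights)
    n = len(signal)
    return sum(signal[i] * lap[i][j] * signal[j] for i in range(n) for j in range(n))


def pairwise_smoothness(signal, weights):
    """Equivalent edge-wise form: 0.5 * sum_ij w_ij * (x_i - x_j)^2."""
    n = len(signal)
    return 0.5 * sum(
        weights[i][j] * (signal[i] - signal[j]) ** 2
        for i in range(n)
        for j in range(n)
    )


def temporal_smoothness(weight_sequence):
    """Assumed penalty on graph evolution: squared change between consecutive graphs."""
    total = 0.0
    for prev, curr in zip(weight_sequence, weight_sequence[1:]):
        total += sum(
            (c - p) ** 2
            for row_p, row_c in zip(prev, curr)
            for p, c in zip(row_p, row_c)
        )
    return total
```

For a symmetric weight matrix with zero diagonal, `graph_smoothness` and `pairwise_smoothness` return the same value, which is why a Laplacian quadratic term is a common way to encode smoothness of node signals over a graph.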
Adapting Pre-trained Language Models to Vision-Language Tasks via Dynamic Visual Prompting
Pre-trained language models (PLMs) have played an increasing role in
multimedia research. In terms of vision-language (VL) tasks, they often serve
as a language encoder and still require an additional fusion network for VL
reasoning, resulting in excessive memory overhead. In this paper, we focus on
exploring PLMs as a stand-alone model for VL reasoning tasks. Inspired by the
recently popular prompt tuning, we first show that the processed visual
features can be also projected onto the semantic space of PLMs and act as
prompt tokens to bridge the gap between single- and multi-modal learning.
However, this solution exhibits obvious redundancy in visual information and
model inference, and the placement of prompt tokens also greatly affects the
final performance. Based on these observations, we further propose a novel
transfer learning approach for PLMs, termed Dynamic Visual Prompting (DVP).
Concretely, DVP first deploys a cross-attention module to obtain text-related
and compact visual prompt tokens, thereby greatly reducing the input length of
PLMs. To obtain the optimal placement, we also equip DVP with a
reinforcement-learning based search algorithm, which can automatically merge
DVP with PLMs for different VL tasks via a very short search process. In
addition, we also combine DVP with the recently popular adapter approach to
keep most parameters of PLMs intact when adapting to VL tasks, helping PLMs
achieve a quick shift between single- and multi-modal tasks. We apply DVP to
two representative PLMs, namely BERT and T5, and conduct extensive experiments
on a set of VL reasoning benchmarks including VQA2.0, GQA and SNLI-VE. The
experimental results not only show the advantage of DVP on efficiency and
performance, but also confirm its superiority in adapting pre-trained language
models to VL tasks
Towards Efficient Visual Adaption via Structural Re-parameterization
Parameter-efficient transfer learning (PETL) is an emerging research area
aimed at inexpensively adapting large-scale pre-trained models to downstream
tasks. Recent advances have achieved great success in saving storage costs for
various vision tasks by updating or injecting a small number of parameters
instead of full fine-tuning. However, we notice that most existing PETL methods
still incur non-negligible latency during inference. In this paper, we propose
a parameter-efficient and computationally friendly adapter for giant vision
models, called RepAdapter. Specifically, we prove that the adaption modules,
even with a complex structure, can be seamlessly integrated into most giant
vision models via structural re-parameterization. This property makes
RepAdapter zero-cost during inference. In addition to computation efficiency,
RepAdapter is more effective and lightweight than existing PETL methods due to
its sparse structure and our careful deployment. To validate RepAdapter, we
conduct extensive experiments on 27 benchmark datasets of three vision tasks,
i.e., image classification, video classification, and semantic segmentation. Experimental
results show the superior performance and efficiency of RepAdapter over
state-of-the-art PETL methods. For instance, by updating only 0.6% of the parameters,
we can improve the performance of ViT from 38.8 to 55.1 on Sun397. Its
generalizability is also well validated on a range of vision models, i.e., ViT,
CLIP, Swin-Transformer and ConvNeXt. Our source code is released at
https://github.com/luogen1996/RepAdapter
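The general principle behind structural re-parameterization can be shown with a short sketch. It assumes a purely linear residual adapter z = x + s(x Wa + ba) placed directly before a linear layer y = z W + b; the names, the scaling factor `s`, and the dense adapter are illustrative assumptions and do not reproduce RepAdapter's actual sparse structure or placement.

```python
def vecmat(x, m):
    """Row vector times matrix."""
    return [sum(x[k] * m[k][j] for k in range(len(x))) for j in range(len(m[0]))]


def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    return [vecmat(row, b) for row in a]


def forward_unmerged(x, wa, ba, s, w, b):
    """Training-time path: linear adapter with residual, then the linear layer."""
    h = vecmat(x, wa)
    z = [xi + s * (hi + bi) for xi, hi, bi in zip(x, h, ba)]
    return [yi + bi for yi, bi in zip(vecmat(z, w), b)]


def merge(wa, ba, s, w, b):
    """Fold the adapter into the layer: W' = (I + s*Wa) W, b' = s*(ba W) + b."""
    d = len(wa)
    a = [[(1.0 if i == j else 0.0) + s * wa[i][j] for j in range(d)] for i in range(d)]
    w_merged = matmul(a, w)
    b_merged = [s * v + bi for v, bi in zip(vecmat(ba, w), b)]
    return w_merged, b_merged


def forward_merged(x, w_merged, b_merged):
    """Inference-time path: a single linear layer, no extra adapter computation."""
    return [yi + bi for yi, bi in zip(vecmat(x, w_merged), b_merged)]
```

Because the adapter contains no nonlinearity, the merged layer produces the same output as the two-step path (up to floating-point rounding), so the adapter adds no inference cost once folded in.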
Approximated Prompt Tuning for Vision-Language Pre-trained Models
Prompt tuning is a parameter-efficient way to deploy large-scale pre-trained
models to downstream tasks by adding task-specific tokens. In terms of
vision-language pre-trained (VLP) models, prompt tuning often requires a large
number of learnable tokens to bridge the gap between the pre-training and
downstream tasks, which greatly exacerbates the already high computational
overhead. In this paper, we revisit the principle of prompt tuning for
Transformer-based VLP models and reveal that the impact of soft prompt tokens
can actually be approximated via independent information diffusion steps,
thereby avoiding the expensive global attention modeling and reducing the
computational complexity to a large extent. Based on this finding, we propose a
novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer
learning. To validate APT, we apply it to two representative VLP models, namely
ViLT and METER, and conduct extensive experiments on a range of downstream
tasks. Meanwhile, the generalization of APT is also validated on CLIP for image
classification. The experimental results not only show the superior performance
gains and computational efficiency of APT over conventional prompt tuning
methods, e.g., +6.6% accuracy and -64.62% additional computation overhead on
METER, but also confirm its merits over other parameter-efficient transfer
learning approaches
Quantifying Causes of Arctic Amplification via Deep Learning based Time-series Causal Inference
The warming of the Arctic, also known as Arctic amplification, is driven by
several atmospheric and oceanic drivers. However, the details of its underlying
thermodynamic causes are still unknown. Inferring the causal effects of
atmospheric processes on sea ice melt using fixed treatment effect strategies
leads to unrealistic counterfactual estimations. Such models are also prone to
bias due to time-varying confoundedness. Further, the complex non-linearity in
Earth science data makes it infeasible to perform causal inference using
existing marginal structural techniques. In order to tackle these challenges,
we propose TCINet, a time-series causal inference model that infers causation under
continuous treatment using recurrent neural networks and a novel probabilistic
balancing technique. Through experiments on synthetic and observational data,
we show how our research can substantially improve the ability to quantify
leading causes of Arctic sea ice melt, further paving paths for causal
inference in observational Earth science