197 research outputs found
The Co-Evolution of Global Legitimation and Technology Upgrading: The Case of Huawei
This study explores the underlying relationship between the acquisition of global legitimacy and the search for technology upgrading by Chinese multinational enterprises (MNEs). Using Huawei’s investments in Russia, Kenya, the United Kingdom, and Canada as an in-depth case study, we observe that through corporate social responsibility (CSR) activities in foreign markets and engagement with local communities, Chinese MNEs can acquire global legitimacy and gradually catch up with industry leaders. However, the process of global legitimation and innovation continues to evolve. We find that, together with engaging in CSR activities, the acquisition of sophisticated knowledge and the creation of innovation bring more legitimacy challenges to these firms. Thus, we suggest that Chinese MNEs’ global legitimation and innovation processes are closely coupled and mutually influential, resulting in co-evolution.
DORec: Decomposed Object Reconstruction Utilizing 2D Self-Supervised Features
Decomposing a target object from a complex background while reconstructing it is
challenging. Most approaches obtain object-instance perception through manual
labels, but the annotation procedure is costly. The
recent advancements in 2D self-supervised learning have brought new prospects
to object-aware representation, yet it remains unclear how to leverage such
noisy 2D features for clean decomposition. In this paper, we propose a
Decomposed Object Reconstruction (DORec) network based on neural implicit
representations. Our key idea is to transfer 2D self-supervised features into
masks of two levels of granularity to supervise the decomposition, including a
binary mask to indicate the foreground regions and a K-cluster mask to indicate
the semantically similar regions. These two masks are complementary to each
other and lead to robust decomposition. Experimental results show the
superiority of DORec in segmenting and reconstructing the foreground object on
various datasets.
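The two-level mask supervision this abstract describes can be sketched in a few lines. The sketch below is purely illustrative: the function names, the per-pixel feature layout (a list of float vectors), the choice of a tiny k-means for the K-cluster mask, and the cosine-similarity threshold for the binary foreground mask are all our assumptions, not the paper's actual implementation.

```python
import random

def kmeans(features, k, iters=20, seed=0):
    """Tiny k-means over per-pixel feature vectors (lists of floats)."""
    rng = random.Random(seed)
    centers = rng.sample(features, k)
    assign = [0] * len(features)
    for _ in range(iters):
        # assign each pixel feature to its nearest center (squared distance)
        for i, f in enumerate(features):
            assign[i] = min(range(k),
                            key=lambda c: sum((a - b) ** 2
                                              for a, b in zip(f, centers[c])))
        # recompute each center as the mean of its cluster members
        for c in range(k):
            members = [features[i] for i in range(len(features)) if assign[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign, centers

def masks_from_features(features, fg_feature, k=3, thresh=0.5):
    """Build the two complementary masks from noisy 2D features:
    a K-cluster mask (k-means labels over semantically similar regions) and
    a binary foreground mask (cosine similarity to a reference foreground
    feature above a threshold)."""
    kcluster_mask, _ = kmeans(features, k)

    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = sum(x * x for x in a) ** 0.5
        nb = sum(y * y for y in b) ** 0.5
        return dot / (na * nb + 1e-8)

    binary_mask = [1 if cos(f, fg_feature) > thresh else 0 for f in features]
    return binary_mask, kcluster_mask
```

In the real method both masks would supervise a neural implicit representation; here they are simply returned for inspection.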
The Insulation Properties of Oil-Impregnated Insulation Paper Reinforced with Nano-TiO2
Oil-impregnated insulation paper has been widely used in transformers because of its low cost and desirable physical and electrical properties. However, research on improving the insulation properties of oil-impregnated insulation paper is rare. In this paper, nano-TiO2 was attached to the surface of the cellulose used to make the insulation paper. After the nano-TiO2-reinforced oil-impregnated insulation paper was prepared, its tensile strength, breakdown strength, and dielectric properties were investigated to determine whether the modified paper had better insulation performance. The results show that there were no major changes in tensile strength, while the breakdown strength was greatly improved, from 51.13 kV/mm to 61.78 kV/mm. The values of the relative dielectric constant, the dielectric loss, and the conductivity also declined. The discussion reveals that nano-TiO2 plays a major role in this phenomenon: because of the presence of nano-TiO2, the contact interface between the cellulose and the oil was changed, and a large number of shallow traps were produced. These shallow traps changed the insulation properties of the oil-impregnated insulation paper. The results show that the proposed approach offers a new method for improving the properties of oil-impregnated insulation paper.
Parameter and Computation Efficient Transfer Learning for Vision-Language Pre-trained Models
With ever-increasing parameters and computation, vision-language pre-trained
(VLP) models incur prohibitive costs in downstream task adaptation.
Recent endeavors mainly focus on parameter efficient transfer learning (PETL)
for VLP models by only updating a small number of parameters. However,
excessive computational overhead still plagues the application of VLPs. In this
paper, we aim at parameter and computation efficient transfer learning (PCETL)
for VLP models. In particular, PCETL not only needs to limit the number of
trainable parameters in VLP models, but also to reduce the computational
redundancy during inference, thus enabling a more efficient transfer. To
approach this target, we propose a novel dynamic architecture skipping (DAS)
approach towards effective PCETL. Instead of directly optimizing the intrinsic
architectures of VLP models, DAS first estimates the significance of their
modules to downstream tasks via a reinforcement learning (RL) based process,
and then skips the redundant ones with lightweight networks, i.e., adapters,
according to the obtained rewards. In this case, the VLP model can well
maintain the scale of trainable parameters while speeding up its inference on
downstream tasks. To validate DAS, we apply it to two representative VLP
models, namely ViLT and METER, and conduct extensive experiments on a range of
VL tasks. The experimental results not only show the great advantages of DAS in
reducing computational complexity, e.g., -11.97% FLOPs for METER on VQA2.0, but
also confirm its competitiveness against existing PETL methods in terms of
parameter scale and performance. Our source code is provided in the appendix.
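The skipping step can be illustrated with a small sketch. Everything here is hypothetical: we treat the network as a plain list of callables, assume per-module significance scores are already available (in DAS they come from the RL-based process), and replace the lowest-scoring blocks with a lightweight adapter; the real method operates on Transformer modules, not toy lambdas.

```python
def build_skipped_model(blocks, scores, n_skip, make_adapter):
    """Replace the n_skip lowest-scoring blocks with lightweight adapters,
    keeping the remaining blocks of the network intact."""
    order = sorted(range(len(blocks)), key=lambda i: scores[i])
    skip = set(order[:n_skip])  # indices of the least significant blocks
    return [make_adapter() if i in skip else blocks[i] for i in range(len(blocks))]

def run(model, x):
    """Run the (possibly skipped) model as a simple sequential pipeline."""
    for layer in model:
        x = layer(x)
    return x
```

Here the adapter is an identity function purely for illustration; in practice it would be a small trainable bottleneck network that is much cheaper than the block it replaces.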
Older adults’ experiences with using information and communication technology and tech support services in New York City: findings and recommendations for post-pandemic digital pedagogy for older adults
Introduction: Although Information and Communication Technology (ICT) has great potential to help older adults cope with challenges associated with aging, the intended benefits of ICT are not always realized in this population due to access barriers and low digital literacy. During the COVID-19 pandemic, numerous tech support initiatives for older adults got underway; however, evaluations of the effectiveness of these initiatives are less common. This research partnered with a large, multi-service organization in New York City that gave some groups of its clients ICT devices, unlimited broadband, and access to technology training in response to COVID-19 lockdowns. This study investigates older adults’ experiences with ICT and ICT support services to better inform existing and emerging tech support for older adults during and beyond the pandemic. Methods: Data were obtained from interviewer-administered surveys of 35 older adult recipients of ICT devices, connectivity, and training in New York City. The average age was 74 years (range = 55–90 years). The group was diverse regarding race/ethnicity (Black 29%, Latino 19%, White 43%). All had low incomes. Surveys consisted of multiple-choice items and open-ended responses. Results: The study found that one size does not fit all when it comes to ICT training and support for older adults. While connection to devices and services and tech support led to a degree of ICT adoption, the newly learned skills did not always lead to expanded device usage. Readily available tech training and support do not guarantee service utilization, as success with tech services is related to one’s pre-existing ICT competence. Discussion: The study concludes that customized training based on individuals’ skills rather than age is needed. Tech support training should start by understanding an individual’s interests and incorporate tech education to help users identify a wide range of existing and emerging online services that can meet their needs. Service organizations should consider including an assessment of ICT access, use, and skills in their standard intake protocols to ensure effective service delivery.
Adapting Pre-trained Language Models to Vision-Language Tasks via Dynamic Visual Prompting
Pre-trained language models (PLMs) have played an increasing role in
multimedia research. In terms of vision-language (VL) tasks, they often serve
as a language encoder and still require an additional fusion network for VL
reasoning, resulting in excessive memory overhead. In this paper, we focus on
exploring PLMs as a stand-alone model for VL reasoning tasks. Inspired by the
recently popular prompt tuning, we first prove that the processed visual
features can also be projected onto the semantic space of PLMs and act as
prompt tokens to bridge the gap between single- and multi-modal learning.
However, this solution exhibits obvious redundancy in visual information and
model inference, and the placement of prompt tokens also greatly affects the
final performance. Based on these observations, we further propose a novel
transfer learning approach for PLMs, termed Dynamic Visual Prompting (DVP).
Concretely, DVP first deploys a cross-attention module to obtain text-related
and compact visual prompt tokens, thereby greatly reducing the input length of
PLMs. To obtain the optimal placement, we also equip DVP with a
reinforcement-learning based search algorithm, which can automatically merge
DVP with PLMs for different VL tasks via a very short search process. In
addition, we also combine DVP with the recently popular adapter approach to
keep most parameters of PLMs intact when adapting to VL tasks, helping PLMs
achieve a quick shift between single- and multi-modal tasks. We apply DVP to
two representative PLMs, namely BERT and T5, and conduct extensive experiments
on a set of VL reasoning benchmarks, including VQA2.0, GQA, and SNLI-VE. The
experimental results not only show the advantage of DVP on efficiency and
performance, but also confirm its superiority in adapting pre-trained language
models to VL tasks.
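The cross-attention step that compresses many visual patch features into a few prompt tokens can be sketched as below. This is a simplified, single-head, projection-free version with hypothetical names; the actual DVP module, its learned query vectors, and its projection matrices are not shown.

```python
import math

def cross_attention_prompts(queries, visual_feats):
    """Single-head cross-attention: a few query vectors attend over many
    visual patch features, yielding compact visual prompt tokens."""
    d = len(queries[0])
    prompts = []
    for q in queries:
        # scaled dot-product scores against every visual patch feature
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in visual_feats]
        # numerically stable softmax over the patches
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        # weighted average of the visual features = one prompt token
        prompts.append([sum(w * v[j] for w, v in zip(weights, visual_feats))
                        for j in range(d)])
    return prompts

def prepend_prompts(prompts, text_tokens):
    """Place the compact visual prompt tokens before the text embeddings,
    shortening the PLM input relative to feeding all patches directly."""
    return prompts + text_tokens
```

Because only a handful of queries attend over the patches, the PLM sees a short prompt sequence instead of the full set of visual features, which is the input-length reduction the abstract describes.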
Approximated Prompt Tuning for Vision-Language Pre-trained Models
Prompt tuning is a parameter-efficient way to deploy large-scale pre-trained
models to downstream tasks by adding task-specific tokens. In terms of
vision-language pre-trained (VLP) models, prompt tuning often requires a large
number of learnable tokens to bridge the gap between the pre-training and
downstream tasks, which greatly exacerbates the already high computational
overhead. In this paper, we revisit the principle of prompt tuning for
Transformer-based VLP models and reveal that the impact of soft prompt tokens
can be actually approximated via independent information diffusion steps,
thereby avoiding the expensive global attention modeling and reducing the
computational complexity to a large extent. Based on this finding, we propose a
novel Approximated Prompt Tuning (APT) approach towards efficient VL transfer
learning. To validate APT, we apply it to two representative VLP models, namely
ViLT and METER, and conduct extensive experiments on a range of downstream
tasks. Meanwhile, the generalization of APT is also validated on CLIP for image
classification. The experimental results not only show the superior performance
gains and computation efficiency of APT against the conventional prompt tuning
methods, e.g., +6.6% accuracy and -64.62% additional computation overhead on
METER, but also confirm its merits over other parameter-efficient transfer
learning approaches.
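The computational argument can be illustrated with a back-of-the-envelope cost model. The two functions below are our own rough accounting, not the paper's analysis: they count only attention-score multiplications, with `joint_attention_cost` for prompts concatenated into the sequence (every position attends over tokens and prompts jointly) and `approximated_cost` for a token-token pass plus an independent token-to-prompt diffusion pass.

```python
def joint_attention_cost(n_tokens, n_prompts, dim):
    """Multiplications for attention scores when soft prompts are concatenated
    into the input sequence: the sequence length grows to tokens + prompts."""
    seq = n_tokens + n_prompts
    return seq * seq * dim

def approximated_cost(n_tokens, n_prompts, dim):
    """Multiplications when the prompt contribution is diffused independently:
    a token-token attention pass plus one token-to-prompt aggregation pass,
    avoiding the joint global attention over the extended sequence."""
    return n_tokens * n_tokens * dim + n_tokens * n_prompts * dim
```

Under this toy accounting the approximated form is always cheaper whenever there is more than zero prompt tokens, and the gap widens as the prompt count grows, which matches the direction of the overhead reduction reported in the abstract.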
Towards Language-guided Visual Recognition via Dynamic Convolutions
In this paper, we aim to establish a unified, end-to-end multi-modal network
by exploring language-guided visual recognition. To
approach this target, we first propose a novel multi-modal convolution module
called Language-dependent Convolution (LaConv). Its convolution kernels are
dynamically generated based on natural language information, which can help
extract differentiated visual features for different multi-modal examples.
Based on the LaConv module, we further build the first fully language-driven
convolution network, termed LaConvNet, which can unify the visual
recognition and multi-modal reasoning in one forward structure. To validate
LaConv and LaConvNet, we conduct extensive experiments on four benchmark
datasets of two vision-and-language tasks, i.e., visual question answering
(VQA) and referring expression comprehension (REC). The experimental results
not only show the performance gains of LaConv compared to existing
multi-modal modules, but also demonstrate the merits of LaConvNet as a unified
network, including a compact architecture, high generalization ability, and
excellent performance, e.g., +4.7% on RefCOCO+.
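The core idea of LaConv, convolution kernels generated from language, can be sketched in miniature. The sketch below is our own simplification: a single-channel, valid-mode 1D convolution whose kernel is a linear function of a text embedding; the names `generate_kernel` and `laconv1d`, and the weight layout, are hypothetical.

```python
def generate_kernel(text_vec, weight):
    """Generate a convolution kernel as a linear function of a text embedding.
    `weight` is a (kernel_size x embedding_dim) matrix of learned parameters."""
    return [sum(w * t for w, t in zip(row, text_vec)) for row in weight]

def laconv1d(signal, kernel):
    """Valid-mode 1D convolution of a visual feature sequence with the
    language-generated kernel."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]
```

With a different text embedding the generated kernel changes, so the same visual signal yields different features, which is the language-dependent behavior the abstract describes.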
What Goes beyond Multi-modal Fusion in One-stage Referring Expression Comprehension: An Empirical Study
Most of the existing work in one-stage referring expression comprehension
(REC) mainly focuses on multi-modal fusion and reasoning, while the influence
of other factors in this task lacks in-depth exploration. To fill this gap, we
conduct an empirical study in this paper. Concretely, we first build a very
simple REC network called SimREC, and ablate 42 candidate designs/settings,
covering the entire process of one-stage REC from network design to model
training. Afterwards, we conduct over 100 experimental trials on three
benchmark datasets of REC. The extensive experimental results not only show the
key factors that affect REC performance in addition to multi-modal fusion,
e.g., multi-scale features and data augmentation, but also yield some findings
that run counter to conventional understanding. For example, as a vision and
language (V&L) task, REC is less impacted by language priors. In addition,
with a proper combination of these findings, we can improve the performance of
SimREC by a large margin, e.g., +27.12% on RefCOCO+, which outperforms all
existing REC methods. But the most encouraging finding is that with much less
training overhead and parameters, SimREC can still achieve better performance
than a set of large-scale pre-trained models, e.g., UNITER and VILLA,
highlighting the special role of REC in existing V&L research.