710 research outputs found
Industrial Agglomeration, Production Networks and FDI Promotion The Case Study of China
China's industrial clustering has been a distinctive economic phenomenon over the last 20 years. It entered its fast track in the mid-1990s and has developed rapidly in recent years. Both market-driven and government-driven forces contribute to Chinese industrial clusters. Opening-up and stable macroeconomic policies create a favorable climate for industrial clustering, and local governments have contributed to building both the hardware and software environments for industrial clusters. The major contribution of FDI to local industrial clustering lies in helping integrate Chinese domestic industries into the international division of labor while forging a relatively integrated production chain for them. At present, China has stepped into a new phase of industrial cluster upgrading, and the Chinese government is gradually improving the local software infrastructure for industrial clustering.
Keywords: Industrial Agglomeration, China, Production Networks, FDI, foreign direct investment
AutoML from Software Engineering Perspective: Landscapes and Challenges
Machine learning (ML) has been widely adopted in modern software, but the manual configuration of ML (e.g., hyper-parameter configuration) poses a significant challenge to software developers. Therefore, automated ML (AutoML), which seeks the optimal configuration of ML automatically, has received increasing attention from the software engineering community. However, to date, there is no comprehensive understanding of how AutoML is used by developers and what challenges developers encounter in using AutoML for software development. To fill this knowledge gap, we conduct the first study on understanding the use and challenges of AutoML from software developers' perspective. We collect and analyze 1,554 AutoML downstream repositories, 769 AutoML-related Stack Overflow questions, and 1,437 relevant GitHub issues. The results suggest the increasing popularity of AutoML in a wide range of topics, but also a lack of relevant expertise. We manually identify specific challenges faced by developers of AutoML-enabled software. Based on the results, we derive a series of implications for AutoML framework selection, framework development, and research.
Characterizing Deep Learning Package Supply Chains in PyPI: Domains, Clusters, and Disengagement
Deep learning (DL) package supply chains (SCs) are critical for DL frameworks to remain competitive. However, vital knowledge on the nature of DL package SCs is still lacking. In this paper, we explore the domains, clusters, and disengagement of packages in two representative PyPI DL package SCs to bridge this knowledge gap. We analyze the metadata of nearly six million PyPI package distributions and construct version-sensitive SCs for two popular DL frameworks: TensorFlow and PyTorch. We find that popular packages (measured by the number of monthly downloads) in the two SCs cover 34 domains belonging to eight categories. The Applications, Infrastructure, and Sciences categories account for over 85% of popular packages in either SC, and the TensorFlow and PyTorch SCs have developed specializations in Infrastructure and Applications packages, respectively. We employ the Leiden community detection algorithm and detect 131 and 100 clusters in the two SCs. The clusters mainly exhibit four shapes: Arrow, Star, Tree, and Forest, with increasing dependency complexity. Most clusters are Arrow or Star, but Tree and Forest clusters account for most packages (TensorFlow SC: 70%, PyTorch SC: 90%). We identify three groups of reasons why packages disengage from the SC (i.e., remove the DL framework and its dependents from their installation dependencies): dependency issues, functional improvements, and ease of installation. The most common disengagement reasons in the two SCs are different. Our study provides rich implications for the maintenance and dependency management practices of PyPI DL SCs.
Comment: Manuscript submitted to ACM Transactions on Software Engineering and Methodology
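As an illustration of the four cluster shapes named above, here is a minimal Python sketch that classifies a small dependency cluster. The shape criteria (single chain, single hub, branching tree, disconnected components) are assumptions made for illustration; the paper's actual definitions are not reproduced here.

```python
from collections import defaultdict

def classify_cluster(edges):
    """Classify a dependency cluster by shape. `edges` is a list of
    (package, dependency) pairs. The rules below are illustrative
    assumptions, not the paper's actual criteria:
      Arrow  - a single dependency chain
      Star   - one hub that all other packages depend on
      Tree   - one connected, branching, acyclic cluster
      Forest - several disconnected components
    """
    nodes = {n for e in edges for n in e}
    out_deg = defaultdict(int)   # dependencies per package
    in_deg = defaultdict(int)    # dependents per package
    adj = defaultdict(set)       # undirected adjacency, for connectivity
    for a, b in edges:
        out_deg[a] += 1
        in_deg[b] += 1
        adj[a].add(b)
        adj[b].add(a)

    # Count connected components via undirected depth-first search.
    seen, components = set(), 0
    for n in nodes:
        if n in seen:
            continue
        components += 1
        stack = [n]
        while stack:
            cur = stack.pop()
            if cur in seen:
                continue
            seen.add(cur)
            stack.extend(adj[cur] - seen)

    if components > 1:
        return "Forest"
    if all(out_deg[n] <= 1 and in_deg[n] <= 1 for n in nodes):
        return "Arrow"
    if any(in_deg[h] == len(nodes) - 1 and len(edges) == len(nodes) - 1
           for h in nodes):
        return "Star"
    return "Tree"
```

Under these toy rules, a chain like A depends on B depends on C is an Arrow, while three packages all depending on one hub form a Star.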
Limit and infinitesimal analysis
The limit is a fundamental concept in calculus. The definition of the limit is a simple logical proposition that involves problems of existence, arbitrariness, and inequality. In this paper, we give some explanations of the limit, demonstrate the relationship between the limit and the infinitesimal, and approach an understanding of the limit through the order of the infinitesimal.
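The "simple logical proposition" referred to in the abstract is the standard epsilon-delta definition of a finite limit, which exhibits exactly the existence (of delta), arbitrariness (of epsilon), and inequality elements mentioned:

```latex
% Standard epsilon-delta definition of a finite limit at a point:
\[
\lim_{x \to a} f(x) = L
\iff
\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x \;
\big( 0 < |x - a| < \delta \implies |f(x) - L| < \varepsilon \big)
\]
```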
An approach to syndrome differentiation in traditional chinese medicine based on neural network
Although traditional rule-based knowledge representation is simple and explicit, it is not effective in the field of syndrome differentiation in Traditional Chinese Medicine (TCM), which involves many uncertain concepts. To represent the uncertain knowledge of syndrome differentiation in TCM, two methods were presented, based respectively on certainty factors and certainty intervals. Exploiting these two methods, an approach to syndrome differentiation in TCM was proposed based on neural networks, avoiding some limitations of other approaches. The main advantage of the approach is that it can realize uncertain inference of syndrome differentiation in TCM without requiring experts to provide certainty degrees for all possible combinations of symptoms and syndromes. A modification of the Back Propagation (BP) algorithm, rather than the standard algorithm, was employed to improve the generalization capability of the neural networks. First, the standard feedforward multilayer BP neural network and its modification were introduced. Next, the two knowledge representation methods, based respectively on certainty factors and certainty intervals, were presented. Then, an algorithm based on neural networks was proposed for the uncertain inference of syndrome differentiation in TCM. Finally, an example was demonstrated to illustrate the algorithm.
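The core idea of mapping symptom certainty degrees to a syndrome certainty degree through a feedforward unit can be sketched as below. This is a rough illustration assuming a single sigmoid unit and certainty factors scaled to [0, 1]; the paper's actual network topology and modified BP algorithm are not reproduced here.

```python
import math

def syndrome_certainty(symptom_cfs, weights, bias):
    """Map symptom certainty factors (each in [0, 1]) to a syndrome
    certainty degree via one sigmoid unit. Purely illustrative: the
    function name, single-unit topology, and sigmoid activation are
    assumptions, not the paper's actual model."""
    z = sum(w * x for w, x in zip(weights, symptom_cfs)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid keeps output in (0, 1)
```

Because the sigmoid output always lies in (0, 1), it can be read directly as a certainty degree, which is what makes this family of models attractive for uncertain inference.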
How Early Participation Determines Long-Term Sustained Activity in GitHub Projects?
Although the open source model bears many advantages in software development, open source projects are always hard to sustain. Previous research on open source sustainability mainly focuses on projects that have already reached a certain level of maturity (e.g., with communities, releases, and downstream projects). However, limited attention is paid to the development of (sustainable) open source projects in their infancy, and we believe an understanding of early sustainability determinants is crucial for project initiators, incubators, newcomers, and users.
In this paper, we aim to explore the relationship between early participation factors and long-term project sustainability. We leverage a novel methodology combining the Blumberg model of performance and machine learning to predict the sustainability of 290,255 GitHub projects. Specifically, we train an XGBoost model based on early participation (first three months of activity) in these projects and interpret the model using LIME. We quantitatively show that early participants have a positive effect on a project's future sustained activity if they have prior experience in OSS project incubation and demonstrate concentrated focus and steady commitment. Participation from non-code contributors and detailed contribution documentation also promote a project's sustained activity. Compared with individual projects, building a community that consists of more experienced core developers and more active peripheral developers is important for organizational projects. This study provides unique insights into the incubation and recognition of sustainable open source projects, and our interpretable prediction approach can also offer guidance to open source project initiators and newcomers.
Comment: The 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2023)
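The "first three months of activity" window from which the early-participation features are computed can be sketched as follows. The event representation and the 90-day cutoff are assumptions for illustration, not the paper's actual data model.

```python
from datetime import datetime, timedelta

def early_window(events, window_days=90):
    """Return only the events inside the first `window_days` of a
    project's recorded activity -- the early-participation window from
    which predictive features would be computed. `events` is a list of
    (timestamp, actor) tuples; this representation is an illustrative
    assumption."""
    if not events:
        return []
    start = min(t for t, _ in events)          # project's first recorded event
    cutoff = start + timedelta(days=window_days)
    return [(t, a) for t, a in events if t < cutoff]
```

Features such as contributor counts or commit cadence would then be derived from the filtered events only, so that the prediction never sees activity beyond the early window.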