710 research outputs found
Industrial Agglomeration, Production Networks and FDI Promotion The Case Study of China
China's industrial clustering has been a distinctive economic phenomenon over the last 20 years. It entered its fast track in the mid-1990s and has developed rapidly in recent years. Both market-driven and government-driven forces contribute to Chinese industrial clusters. Opening-up and stable macroeconomic policies create a favorable climate for industrial clustering, and local governments have contributed to building both the hardware and software environments for industrial clusters. The major contribution of FDI to local industrial clustering lies in helping integrate Chinese domestic industries into the international division of labor while forging a relatively integrated production chain for them. At present, China has stepped into a new phase of industrial cluster upgrading, and the Chinese government is gradually improving the local software infrastructure for industrial clustering.
Keywords: Industrial Agglomeration, China, Production Networks, FDI, foreign direct investment
AutoML from Software Engineering Perspective: Landscapes and Challenges
Machine learning (ML) has been widely adopted in modern software, but the manual configuration of ML (e.g., hyper-parameter configuration) poses a significant challenge to software developers. Therefore, automated ML (AutoML), which seeks the optimal configuration of ML automatically, has received increasing attention from the software engineering community. However, to date, there is no comprehensive understanding of how AutoML is used by developers and what challenges developers encounter in using AutoML for software development. To fill this knowledge gap, we conduct the first study on understanding the use and challenges of AutoML from software developers' perspective. We collect and analyze 1,554 AutoML downstream repositories, 769 AutoML-related Stack Overflow questions, and 1,437 relevant GitHub issues. The results suggest the increasing popularity of AutoML in a wide range of topics, but also a lack of relevant expertise. We manually identify specific challenges faced by developers of AutoML-enabled software. Based on the results, we derive a series of implications for AutoML framework selection, framework development, and research.
Characterizing Deep Learning Package Supply Chains in PyPI: Domains, Clusters, and Disengagement
Deep learning (DL) package supply chains (SCs) are critical for DL frameworks to remain competitive. However, vital knowledge on the nature of DL package SCs is still lacking. In this paper, we explore the domains, clusters, and disengagement of packages in two representative PyPI DL package SCs to bridge this knowledge gap. We analyze the metadata of nearly six million PyPI package distributions and construct version-sensitive SCs for two popular DL frameworks: TensorFlow and PyTorch. We find that popular packages (measured by the number of monthly downloads) in the two SCs cover 34 domains belonging to eight categories. The Applications, Infrastructure, and Sciences categories account for over 85% of popular packages in either SC, and the TensorFlow and PyTorch SCs have developed specializations in Infrastructure and Applications packages, respectively. We employ the Leiden community detection algorithm and detect 131 and 100 clusters in the two SCs. The clusters mainly exhibit four shapes: Arrow, Star, Tree, and Forest, with increasing dependency complexity. Most clusters are Arrow or Star, but Tree and Forest clusters account for most packages (TensorFlow SC: 70%, PyTorch SC: 90%). We identify three groups of reasons why packages disengage from the SC (i.e., remove the DL framework and its dependents from their installation dependencies): dependency issues, functional improvements, and ease of installation. The most common disengagement reasons in the two SCs are different. Our study provides rich implications for the maintenance and dependency management practices of PyPI DL SCs.
Comment: Manuscript submitted to ACM Transactions on Software Engineering and Methodology
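As an illustration of the four cluster shapes named above, here is a minimal Python sketch that classifies a small dependency cluster. The shape criteria (single chain, single hub, branching tree, disconnected components) are assumptions made for illustration; the paper's actual definitions are not reproduced here.

```python
from collections import defaultdict

def classify_cluster(edges):
    """Classify a dependency cluster by shape. `edges` is a list of
    (package, dependency) pairs. The rules below are illustrative
    assumptions, not the paper's actual criteria:
      Arrow  - a single dependency chain
      Star   - one hub that all other packages depend on
      Tree   - one connected, branching, acyclic cluster
      Forest - several disconnected components
    """
    nodes = {n for e in edges for n in e}
    out_deg = defaultdict(int)   # dependencies per package
    in_deg = defaultdict(int)    # dependents per package
    adj = defaultdict(set)       # undirected adjacency, for connectivity
    for a, b in edges:
        out_deg[a] += 1
        in_deg[b] += 1
        adj[a].add(b)
        adj[b].add(a)

    # Count connected components via undirected depth-first search.
    seen, components = set(), 0
    for n in nodes:
        if n in seen:
            continue
        components += 1
        stack = [n]
        while stack:
            cur = stack.pop()
            if cur in seen:
                continue
            seen.add(cur)
            stack.extend(adj[cur] - seen)

    if components > 1:
        return "Forest"
    if all(out_deg[n] <= 1 and in_deg[n] <= 1 for n in nodes):
        return "Arrow"
    if any(in_deg[h] == len(nodes) - 1 and len(edges) == len(nodes) - 1
           for h in nodes):
        return "Star"
    return "Tree"
```

Under these toy rules, a chain like A depends on B depends on C is an Arrow, while three packages all depending on one hub form a Star.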
Limit and infinitesimal analysis
The limit is a fundamental concept in calculus. The definition of the limit is a simple logical proposition that involves problems of existence, arbitrariness, and inequality. In this paper, we give some explanations of the limit, demonstrate the relationship between the limit and the infinitesimal, and approach an understanding of the limit through the order of the infinitesimal.
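The "simple logical proposition" referred to in the abstract is the standard epsilon-delta definition of a finite limit, which exhibits exactly the existence (of delta), arbitrariness (of epsilon), and inequality elements mentioned:

```latex
% Standard epsilon-delta definition of a finite limit at a point:
\[
\lim_{x \to a} f(x) = L
\iff
\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\; \forall x \;
\big( 0 < |x - a| < \delta \implies |f(x) - L| < \varepsilon \big)
\]
```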
An approach to syndrome differentiation in traditional chinese medicine based on neural network
Although traditional rule-based knowledge representation is simple and explicit, it is not effective in the field of syndrome differentiation in Traditional Chinese Medicine (TCM), which involves many uncertain concepts. To represent the uncertain knowledge of syndrome differentiation in TCM, two methods were presented, based respectively on certainty factors and certainty intervals. Exploiting these two methods, an approach to syndrome differentiation in TCM was proposed based on neural networks, avoiding some limitations of other approaches. The main advantage of the approach is that it can realize uncertain inference of syndrome differentiation in TCM without requiring experts to provide certainty degrees for all possible combinations of symptoms and syndromes. A modification of the Back Propagation (BP) algorithm, rather than the standard algorithm, was employed to improve the generalization capability of the neural networks. First, the standard feedforward multilayer BP neural network and its modification were introduced. Next, the two knowledge representation methods, based respectively on certainty factors and certainty intervals, were presented. Then, an algorithm based on neural networks was proposed for the uncertain inference of syndrome differentiation in TCM. Finally, an example was demonstrated to illustrate the algorithm.
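The core idea of mapping symptom certainty degrees to a syndrome certainty degree through a feedforward unit can be sketched as below. This is a rough illustration assuming a single sigmoid unit and certainty factors scaled to [0, 1]; the paper's actual network topology and modified BP algorithm are not reproduced here.

```python
import math

def syndrome_certainty(symptom_cfs, weights, bias):
    """Map symptom certainty factors (each in [0, 1]) to a syndrome
    certainty degree via one sigmoid unit. Purely illustrative: the
    function name, single-unit topology, and sigmoid activation are
    assumptions, not the paper's actual model."""
    z = sum(w * x for w, x in zip(weights, symptom_cfs)) + bias
    return 1.0 / (1.0 + math.exp(-z))  # sigmoid keeps output in (0, 1)
```

Because the sigmoid output always lies in (0, 1), it can be read directly as a certainty degree, which is what makes this family of models attractive for uncertain inference.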
How Early Participation Determines Long-Term Sustained Activity in GitHub Projects?
Although the open source model bears many advantages in software development, open source projects are always hard to sustain. Previous research on open source sustainability mainly focuses on projects that have already reached a certain level of maturity (e.g., with communities, releases, and downstream projects). However, limited attention is paid to the development of (sustainable) open source projects in their infancy, and we believe an understanding of early sustainability determinants is crucial for project initiators, incubators, newcomers, and users.
In this paper, we aim to explore the relationship between early participation factors and long-term project sustainability. We leverage a novel methodology combining the Blumberg model of performance and machine learning to predict the sustainability of 290,255 GitHub projects. Specifically, we train an XGBoost model based on early participation (first three months of activity) in these projects and interpret the model using LIME. We quantitatively show that early participants have a positive effect on a project's future sustained activity if they have prior experience in OSS project incubation and demonstrate concentrated focus and steady commitment. Participation from non-code contributors and detailed contribution documentation also promote a project's sustained activity. Compared with individual projects, building a community that consists of more experienced core developers and more active peripheral developers is important for organizational projects. This study provides unique insights into the incubation and recognition of sustainable open source projects, and our interpretable prediction approach can also offer guidance to open source project initiators and newcomers.
Comment: The 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2023)
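The "first three months of activity" window from which the early-participation features are computed can be sketched as follows. The event representation and the 90-day cutoff are assumptions for illustration, not the paper's actual data model.

```python
from datetime import datetime, timedelta

def early_window(events, window_days=90):
    """Return only the events inside the first `window_days` of a
    project's recorded activity -- the early-participation window from
    which predictive features would be computed. `events` is a list of
    (timestamp, actor) tuples; this representation is an illustrative
    assumption."""
    if not events:
        return []
    start = min(t for t, _ in events)          # project's first recorded event
    cutoff = start + timedelta(days=window_days)
    return [(t, a) for t, a in events if t < cutoff]
```

Features such as contributor counts or commit cadence would then be derived from the filtered events only, so that the prediction never sees activity beyond the early window.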