git2net - Mining Time-Stamped Co-Editing Networks from Large git Repositories
Data from software repositories have become an important foundation for the
empirical study of software engineering processes. A recurring theme in the
repository mining literature is the inference of developer networks capturing
e.g. collaboration, coordination, or communication from the commit history of
projects. Most of the studied networks are based on the co-authorship of
software artefacts defined at the level of files, modules, or packages. While
this approach has led to insights into the social aspects of software
development, it neglects detailed information on code changes and code
ownership, e.g. which exact lines of code have been authored by which
developers, that is contained in the commit log of software projects.
Addressing this issue, we introduce git2net, a scalable Python tool that
facilitates the extraction of fine-grained co-editing networks in large git
repositories. It uses text mining techniques to analyse the detailed history of
textual modifications within files. This information allows us to construct
directed, weighted, and time-stamped networks, where a link signifies that one
developer has edited a block of source code originally written by another
developer. Our tool is applied in case studies of an Open Source and a
commercial software project. We argue that it opens up a massive new source of
high-resolution data on human collaboration patterns.
Comment: MSR 2019, 12 pages, 10 figures
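The network construction described in the abstract above can be illustrated with a minimal sketch. This is not git2net's actual API: the `Edit` record, its fields, and `build_coediting_network` are hypothetical names introduced here, and the sketch assumes that the per-line authorship information has already been extracted from the commit history. Skipping edits that a developer makes to their own code is also an assumption of this sketch.

```python
from collections import defaultdict
from typing import NamedTuple


class Edit(NamedTuple):
    """Hypothetical record: one developer changed lines first written by another."""
    editor: str           # developer making the change
    original_author: str  # developer who originally wrote the changed lines
    timestamp: int        # commit time as a Unix timestamp
    lines_changed: int    # number of lines touched in this edit


def build_coediting_network(edits):
    """Aggregate edits into directed, weighted, time-stamped links.

    A link (editor, original_author) means the editor modified code
    originally written by original_author.
    """
    edges = defaultdict(lambda: {"weight": 0, "timestamps": []})
    for e in edits:
        if e.editor == e.original_author:
            continue  # assumption: self-edits do not create a link
        edge = edges[(e.editor, e.original_author)]
        edge["weight"] += e.lines_changed
        edge["timestamps"].append(e.timestamp)
    return dict(edges)


# Usage with made-up data
edits = [
    Edit("alice", "bob", 1546300800, 3),
    Edit("alice", "bob", 1546387200, 2),
    Edit("bob", "bob", 1546473600, 5),
    Edit("carol", "alice", 1546560000, 1),
]
network = build_coediting_network(edits)
```

Keeping the list of timestamps on each link, rather than only the summed weight, preserves the temporal ordering that a time-stamped network analysis would need.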
An Overview of the Use of Neural Networks for Data Mining Tasks
In recent years, the area of data mining has experienced considerable demand for technologies that extract knowledge from large and complex data sources. There is substantial commercial interest, as well as research activity, aimed at developing new and improved approaches for extracting information, relationships, and patterns from datasets. Artificial Neural Networks (NN) are popular biologically inspired intelligent methodologies, whose classification, prediction, and pattern recognition capabilities have been utilised successfully in many areas, including science, engineering, medicine, business, banking, telecommunication, and many other fields. This paper highlights, from a data mining perspective, the implementation of NN using supervised and unsupervised learning for pattern recognition, classification, prediction, and cluster analysis, and focuses the discussion on their usage in bioinformatics and financial data analysis tasks.
Performance Analysis of Blockchain Platforms
Blockchain technologies have drawn massive attention worldwide over the past few years, mostly because of the surge of cryptocurrencies such as Bitcoin, Ethereum, Ripple, and many others. Blockchain, also known as distributed ledger technology, has demonstrated huge potential for saving time and costs. This open-source technology, which generates a decentralized public ledger of transactions, is widely appreciated for ensuring a high level of privacy through encryption, sharing the transaction details only amongst the participants involved in the transactions. Blockchain is used not only for cryptocurrency but also by various companies to meet their business ends, such as efficient management of supply chains and logistics. The rise and fall of numerous cryptocurrencies based on blockchain technology have generated debate among tech giants and regulatory bodies. Various groups are working on standardizing blockchain technology. At the same time, numerous groups are actively developing and fine-tuning their own blockchain platforms. Platforms such as Ethereum, Hyperledger, Parity, etc. have their own pros and cons. This research focuses on the performance analysis of blockchain platforms, which gives a comparative understanding of these platforms.
Modeling Financial Time Series with Artificial Neural Networks
Financial time series convey the decisions and actions of a population of human actors over time. Econometric and regressive models have been developed in the past decades for analyzing these time series. More recently, biologically inspired artificial neural network models have been shown to overcome some of the main challenges of traditional techniques by better exploiting the non-linear, non-stationary, and oscillatory nature of noisy, chaotic human interactions. This review paper explores the options, benefits, and weaknesses of the various forms of artificial neural networks as compared with regression techniques in the field of financial time series analysis.
CELEST, a National Science Foundation Science of Learning Center (SBE-0354378); SyNAPSE program of the Defense Advanced Research Projects Agency (HR001109-03-0001)