Measuring Copying of Java Archives

Abstract

Copying an entire library is one of the major forms of reuse in software development. In Java, a single library archive file often contains other libraries it depends on, but users of the library rarely know about these inner libraries. Because library reuse is a black-box method, developers may combine libraries without knowing that each of them independently contains the same library. As a result, a library may contain several copies of a library it reuses. In this research, we measured copying of jar archives in the Maven Central Repository, a collection of open source Java libraries. Our results show that about 14% of top-level jar files are reused in other jar files, and some of them are duplicated within a single jar file. We also found that some libraries contain two or more different versions of the same library.
