An Introduction to Software Ecosystems
This chapter defines and presents different kinds of software ecosystems. The
focus is on the development, tooling and analytics aspects of software
ecosystems, i.e., communities of software developers and the interconnected
software components (e.g., projects, libraries, packages, repositories,
plug-ins, apps) they are developing and maintaining. The technical and social
dependencies between these developers and software components form a
socio-technical dependency network, and the dynamics of this network change
over time. We classify and provide several examples of such ecosystems. The
chapter also introduces and clarifies the relevant terms needed to understand
and analyse these ecosystems, as well as the techniques and research methods
that can be used to analyse different aspects of these ecosystems.
Comment: Preprint of chapter "An Introduction to Software Ecosystems" by Tom
Mens and Coen De Roover, published in the book "Software Ecosystems: Tooling
and Analytics" (eds. T. Mens, C. De Roover, A. Cleve), 2023, ISBN
978-3-031-36059-6, reproduced with permission of Springer. The final
authenticated version of the book and this chapter is available online at:
https://doi.org/10.1007/978-3-031-36060-
Automatic Specialization of Third-Party Java Dependencies
Modern software systems rely on a multitude of third-party dependencies. This
large-scale code reuse reduces development costs and time, but it also poses new
challenges with respect to maintenance and security. Techniques such as tree
shaking or shading can remove dependencies that are completely unused by a
project, which partly addresses these challenges. Yet, the remaining dependencies
are likely to be used only partially, leaving room for further reduction of
third-party code. In this paper, we propose a novel technique to specialize
dependencies of Java projects, based on their actual usage. For each
dependency, we systematically identify the subset of its functionalities that
is necessary to build the project, and remove the rest. Each specialized
dependency is repackaged. Then, we generate specialized dependency trees where
the original dependencies are replaced by the specialized versions and we
rebuild the project. We implement our technique in a tool called DepTrim, which
we evaluate with 30 notable open-source Java projects. DepTrim specializes a
total of 343 (86.6%) dependencies across these projects, and successfully
rebuilds each project with a specialized dependency tree. Moreover, through
this specialization, DepTrim removes a total of 60,962 (47.0%) classes from the
dependencies, reducing the ratio of dependency classes to project classes from
8.7x in the original projects to 4.4x after specialization. These results
indicate the relevance of dependency specialization to significantly reduce the
share of third-party code in Java projects.
Comment: 17 pages, 2 figures, 4 tables, 1 algorithm, 2 code listings, 3
equations
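The idea of keeping only the part of a dependency that a project actually uses can be illustrated with a small reachability sketch. This is not DepTrim's implementation: the class names and the `references` map are hypothetical, and a real tool would derive references from bytecode rather than from a hand-written dictionary.

```python
from collections import deque


def reachable_classes(entry_points, references):
    """Return every class reachable from the project's entry points.

    `references` maps a class name to the classes it refers to
    (hypothetical data; a real tool would derive this from bytecode).
    """
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        current = queue.popleft()
        for target in references.get(current, ()):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


def specialize(dependency_classes, entry_points, references):
    """Split a dependency's classes into those to keep and those to remove."""
    used = reachable_classes(entry_points, references)
    kept = sorted(c for c in dependency_classes if c in used)
    removed = sorted(c for c in dependency_classes if c not in used)
    return kept, removed


# Hypothetical project and dependency.
references = {
    "app.Main": ["lib.Json", "app.Util"],
    "app.Util": [],
    "lib.Json": ["lib.Parser"],
    "lib.Parser": [],
    "lib.Xml": ["lib.Parser"],
}
dependency = ["lib.Json", "lib.Parser", "lib.Xml", "lib.Yaml"]
kept, removed = specialize(dependency, ["app.Main"], references)
print(kept)     # ['lib.Json', 'lib.Parser']
print(removed)  # ['lib.Xml', 'lib.Yaml']
```

In this toy example, half of the dependency's classes are never reached from the project and could be dropped before repackaging.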
Coverage-Based Debloating for Java Bytecode
Software bloat is code that is packaged in an application but is actually not
necessary to run the application. The presence of software bloat is an issue
for security, for performance, and for maintenance. In this paper, we introduce
a novel technique for debloating Java bytecode, which we call coverage-based
debloating. We leverage a combination of state-of-the-art Java bytecode
coverage tools to precisely capture what parts of a project and its
dependencies are used at runtime. Then, we automatically remove the parts that
are not covered to generate a debloated version of the compiled project. We
successfully generate debloated versions of 220 open-source Java libraries,
which are syntactically correct and preserve their original behavior according
to the workload. Our results indicate that 68.3% of the libraries' bytecode and
20.5% of their total dependencies can be removed through coverage-based
debloating. Meanwhile, we present the first experiment that assesses the
utility of debloated libraries with respect to client applications that reuse
them. We show that 80.9% of the clients with at least one test that uses the
library successfully compile and pass their test suite when the original
library is replaced by its debloated version.
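As a rough illustration of the coverage-based idea, the sketch below merges the reports of two coverage tools and keeps only the classes that at least one of them saw executed. Taking the union of reports is an assumption made here for illustration, and the report format and class names are hypothetical; the abstract does not specify how the tools are combined.

```python
def merge_coverage(*reports):
    """Union the covered classes reported by several coverage tools."""
    covered = set()
    for report in reports:
        covered.update(name for name, hit in report.items() if hit)
    return covered


def debloat(all_classes, covered):
    """Keep only the classes that at least one tool saw executed."""
    kept = [c for c in all_classes if c in covered]
    removed_share = 1 - len(kept) / len(all_classes)
    return kept, removed_share


# Hypothetical reports: class name -> whether the tool saw it executed.
tool_a = {"lib.Core": True, "lib.Cache": False, "lib.Legacy": False}
tool_b = {"lib.Core": True, "lib.Cache": True, "lib.Debug": False}
classes = ["lib.Core", "lib.Cache", "lib.Legacy", "lib.Debug"]

kept, removed_share = debloat(classes, merge_coverage(tool_a, tool_b))
print(kept)           # ['lib.Core', 'lib.Cache']
print(removed_share)  # 0.5
```

The result is only as trustworthy as the workload that produced the coverage data, which is why the abstract qualifies behavior preservation as "according to the workload".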
Jolie Microservices: An Experiment
Microservices are increasingly present in the world of information technology, as they provide a new way to build more scalable, agile, and flexible systems. Despite this, they bring with them the problem of communication complexity between microservices, making the system difficult to maintain and understand. Microservices-specific programming languages like Jolie come into play to try to solve this problem and simplify the construction of systems with microservices architectures.
This work provides a broad view of the state of the art of the Jolie programming language. It first details why microservices-specific languages emerged and how Jolie is designed to match microservices architectures through native features.
To demonstrate the advantages of using this language compared to more mainstream approaches, an experiment is designed to develop a microservices system within an e-commerce application. This system is built identically on two technological foundations: Jolie and Spring Boot. Spring Boot is considered the most widely used technology for developing microservices systems, making it an ideal candidate for comparison. The analysis and design of this experiment are presented in full.
Then the implementation of the solution is detailed, covering system configurations, architectural choices, and how they are implemented. This includes components such as the API gateway, message brokers, databases, microservices orchestration, and containerization for each microservice and other components of the system.
Finally, the solutions are compared and analyzed using the Goals, Questions, Metrics (GQM) approach with respect to quality attributes such as maintainability, scalability, performance, and testability. The analysis concludes that the Jolie solution is significantly superior to the Spring Boot solution in maintainability and slightly inferior in performance. The work ends with an indication of the achievements, difficulties, threats to validity, possible future work, and final observations.
Analyzing 2.3 Million Maven Dependencies to Reveal an Essential Core in APIs
This paper addresses the following question: does a small, essential core
set of API members emerge from the actual usage of the API by client
applications? To investigate this question, we study the 99 most popular
libraries available in Maven Central and the 865,560 client programs that
declare dependencies towards them, summing up to 2.3M dependencies. Our key
findings are as follows: 43.5% of the dependencies declared by the clients are
not used in the bytecode; all APIs contain a large part of rarely used types
and a few frequently used types, and the ratio varies according to the nature
of the API, its size and its design; we can systematically extract a reuse-core
from APIs that is sufficient to serve most clients: the median size of
this subset is 17% of the API, and it can serve 83% of the clients. This study is
novel both in its scale and its findings about unused dependencies and the
reuse-core of APIs. Our results provide concrete insights to improve Maven's
build process with a mechanism to detect unused dependencies. They also support
the need to reduce the size of APIs to facilitate API learning and maintenance.
Comment: 15 pages, 13 figures, 3 tables, 2 listings
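One way to picture a reuse-core is as the smallest popularity-ranked set of API types that fully serves a target share of clients. The sketch below is an illustrative greedy procedure under that assumption, not the extraction method used in the paper; the client names, type names, and the `target_share` parameter are hypothetical.

```python
from collections import Counter


def reuse_core(client_usage, target_share):
    """Grow a core of API types, most-used first, until it serves enough clients.

    A client counts as served when every type it uses is in the core.
    """
    popularity = Counter(t for used in client_usage.values() for t in used)
    # Sort by descending popularity, then by name for a deterministic order.
    ranked = sorted(popularity, key=lambda t: (-popularity[t], t))
    core = set()
    for api_type in ranked:
        core.add(api_type)
        served = sum(1 for used in client_usage.values() if used <= core)
        if served / len(client_usage) >= target_share:
            break
    return core


# Hypothetical clients and the API types each one uses.
clients = {
    "c1": {"List"},
    "c2": {"List", "Map"},
    "c3": {"List"},
    "c4": {"Map"},
    "c5": {"List", "Map", "Trie"},
}
core = reuse_core(clients, target_share=0.8)
print(sorted(core))  # ['List', 'Map']
```

Here two of the three types are enough to serve four of the five clients; only the client that also needs `Trie` falls outside the core.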
Simplified vector-thread architectures for flexible and efficient data-parallel accelerators
Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student submitted PDF version of thesis. Includes bibliographical references (p. 165-170).
This thesis explores a new approach to building data-parallel accelerators that is based on simplifying the instruction set, microarchitecture, and programming methodology for a vector-thread architecture. The thesis begins by categorizing regular and irregular data-level parallelism (DLP), before presenting several architectural design patterns for data-parallel accelerators including the multiple-instruction multiple-data (MIMD) pattern, the vector single-instruction multiple-data (vector-SIMD) pattern, the single-instruction multiple-thread (SIMT) pattern, and the vector-thread (VT) pattern. Our recently proposed VT pattern includes many control threads that each manage their own array of microthreads. The control thread uses vector memory instructions to efficiently move data and vector fetch instructions to broadcast scalar instructions to all microthreads. These vector mechanisms are complemented by the ability for each microthread to direct its own control flow. In this thesis, I introduce various techniques for building simplified instances of the VT pattern. I propose unifying the VT control-thread and microthread scalar instruction sets to simplify the microarchitecture and programming methodology. I propose a new single-lane VT microarchitecture based on minimal changes to the vector-SIMD pattern. Single-lane cores are simpler to implement than multi-lane cores and can achieve similar energy efficiency. This new microarchitecture uses control processor embedding to mitigate the area overhead of single-lane cores, and uses vector fragments to more efficiently handle both regular and irregular DLP as compared to previous VT architectures. I also propose an explicitly data-parallel VT programming methodology that is based on a slightly modified scalar compiler. This methodology is easier to use than assembly programming, yet simpler to implement than an automatically vectorizing compiler. To evaluate these ideas, we have begun implementing the Maven data-parallel accelerator. This thesis compares a simplified Maven VT core to MIMD, vector-SIMD, and SIMT cores. We have implemented these cores with an ASIC methodology, and I use the resulting gate-level models to evaluate the area, performance, and energy of several compiled microbenchmarks. This work is the first detailed quantitative comparison of the VT pattern to other patterns. My results suggest that future data-parallel accelerators based on simplified VT architectures should be able to combine the energy efficiency of vector-SIMD accelerators with the flexibility of MIMD accelerators.
by Christopher Francis Batten. Ph.D.
Tracing Community Genealogy: How New Communities Emerge from the Old
The process by which new communities emerge is a central research issue in
the social sciences. While a growing body of research analyzes the formation of
a single community by examining social networks between individuals, we
introduce a novel community-centered perspective. We highlight the fact that
the context in which a new community emerges contains numerous existing
communities. We reveal the emerging process of communities by tracing their
early members' previous community memberships.
Our testbed is Reddit, a website that consists of tens of thousands of
user-created communities. We analyze a dataset that spans over a decade and
includes the posting history of users on Reddit from its inception to April
2017. We first propose a computational framework for building genealogy graphs
between communities. We present the first large-scale characterization of such
genealogy graphs. Surprisingly, basic graph properties, such as the number of
parents and max parent weight, converge quickly despite the fact that the
number of communities increases rapidly over time. Furthermore, we investigate
the connection between a community's origin and its future growth. Our results
show that strong parent connections are associated with future community
growth, confirming the importance of existing community structures in which a
new community emerges. Finally, we turn to the individual level and examine the
characteristics of early members. We find that a diverse portfolio across
existing communities is the most important predictor for becoming an early
member in a new community.
Comment: 10 pages, 7 figures, to appear in Proceedings of ICWSM 2018, data and
more at https://chenhaot.com/papers/community-genealogy.htm
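The notion of tracing a new community's parents through its early members can be sketched as follows. The weighting used here (the share of early members who were previously active in a given community) is an assumption chosen for illustration, and the user names, community names, and input format are hypothetical; the paper's framework may define parent weights differently.

```python
from collections import Counter


def parent_weights(early_members, prior_memberships):
    """Weight each existing community by the share of early members it supplied.

    `prior_memberships` maps a user to the communities they posted in
    before joining the new community (hypothetical input format).
    """
    counts = Counter()
    for user in early_members:
        counts.update(set(prior_memberships.get(user, ())))
    return {parent: n / len(early_members) for parent, n in counts.items()}


# Hypothetical early members of a new community and their earlier activity.
early = ["ann", "bob", "cy", "dee"]
history = {
    "ann": ["r/python", "r/learnprogramming"],
    "bob": ["r/python"],
    "cy": ["r/python", "r/datascience"],
    "dee": [],
}
weights = parent_weights(early, history)
print(len(weights))           # 3
print(max(weights.values()))  # 0.75
```

The two printed values correspond to the kinds of graph properties mentioned in the abstract: the number of parents and the maximum parent weight.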