Toward a Code-Clone Search through the Entire Lifecycle of a Software Product
This paper presents a clone-detection method and tool currently under development. The tool is intended to support code-clone search through the entire lifecycle of a software product: it searches for code examples and analyzes code clones in both preventive and postmortem ways [LRHK10]. The approach is based on a sequence equivalence on execution paths [Kam13] and extends the equivalence to include gaps, thus supporting type-3 [BKA+07] clone detection. Each detected clone is a sub-sequence of an execution path of a given program, in other words, a set of code fragments from multiple procedures (methods) that can be executed in a single run of the program. The approach is relaxed in terms of adaptability to incomplete (not-yet-finished) code, but also makes use of concrete information such as types (including hierarchy) and dynamic dispatch when such information is available.
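The idea of extending sequence equivalence with gaps can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes an execution path is already available as a flat list of hypothetical call names, and the function name `find_gapped_match` and the `max_gap` parameter are invented for illustration.

```python
from typing import List, Optional, Sequence


def find_gapped_match(
    path: Sequence[str], query: Sequence[str], max_gap: int = 1
) -> Optional[List[int]]:
    """Return the indices in `path` where `query` occurs in order, allowing at
    most `max_gap` unmatched items between consecutive matched items.
    Returns None when no such occurrence exists."""

    def extend(start: int, qi: int) -> Optional[List[int]]:
        # All query items matched: nothing more to add.
        if qi == len(query):
            return []
        # The next match may skip at most `max_gap` items after the previous one.
        for pos in range(start, min(start + max_gap + 1, len(path))):
            if path[pos] == query[qi]:
                rest = extend(pos + 1, qi + 1)
                if rest is not None:
                    return [pos] + rest
        return None

    if not query:
        return []
    # Try every position where the first query item occurs.
    for first in range(len(path)):
        if path[first] == query[0]:
            rest = extend(first + 1, 1)
            if rest is not None:
                return [first] + rest
    return None


# Hypothetical execution path and query, for illustration only.
path = ["open", "read", "log", "parse", "close"]
query = ["read", "parse", "close"]
with_gap = find_gapped_match(path, query, max_gap=1)
without_gap = find_gapped_match(path, query, max_gap=0)
```

With `max_gap=0` the search degenerates to exact contiguous matching, which is why the interleaved `log` call prevents a match in that setting.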
Using mobility and exception handling to achieve mobile agents that survive server crash failures
Mobile agent technology, when designed and used effectively, can minimize bandwidth consumption and autonomously provide a snapshot of the current context of a distributed system. Protecting mobile agents from server crashes is a challenging issue, since developers normally have no control over remote servers. Server crash failures can leave replicas, held in stable storage, unavailable for an unknown period of time. Furthermore, few systems have considered the need for a fault-tolerant protocol among a group of collaborating mobile agents. This thesis uses exception handling to protect mobile agents from server crash failures. An exception model is proposed for mobile agents, and two exception handler designs are investigated. The first exists at the server that created the mobile agent and uses a timeout mechanism. The second, the mobile shadow scheme, migrates with the mobile agent and operates at the previous server visited by the mobile agent. A case study application has been developed to compare the performance of the two exception handler designs. Performance results demonstrate that, although the second design is slower, it offers the smaller trip time when handling a server crash. Furthermore, no modification of the server environment is necessary. This thesis shows that the mobile shadow exception handling scheme reduces the complexity of enabling a group of mobile agents to survive server crashes. At each stage of the itinerary, the scheme deploys a replica that monitors the server occupied by the master. The replica resides at the previous server visited in the itinerary. Consequently, each group member is a single fault-tolerant entity with respect to server crash failures. Other schemes introduce greater complexity and performance overheads since, for each stage of the itinerary, a group of replicas is sent to servers that offer an equivalent service. In addition, directions for future research on fault tolerance in groups of collaborating mobile agents are identified.
A Semester Long Classroom Course Mimicking a Software Company and a New Hire Experience for Computer Science Students Preparing to Enter the Software Industry
Students in Computer Science degree programs must learn to code before they can be taught Software Engineering skills. This core skill set covers the constructs of various languages, how to create short programs or applications, how to complete independent assignments, and how to arrive at solutions that use the skills being covered in the language for that course (Chatley & Field, 2017). As upperclassmen, students are often allowed to apply these skills in newer ways and have the opportunity to work on longer, more involved assignments, although frequently still independently or in small groups of two to three students. Once these students graduate and enter the software industry, they will find that most companies follow specific development methodologies, ranging from one of the many forms of Agile to Waterfall. They will also work in large groups or teams where each developer is responsible for specific pieces of functionality, participate in design meetings and code reviews, and use code versioning systems such as git and a program management system such as Jira, all in a very collaborative environment. This study will develop a course that allows students to apply these skills in a more realistic setting while remaining on campus, and will monitor the students' beliefs about their preparedness for the world outside of the computer science building.
Leveraging Evolutionary Changes for Software Process Quality
Real-world software applications must constantly evolve to remain relevant.
This evolution occurs when developing new applications or adapting existing
ones to meet new requirements, make corrections, or incorporate future
functionality. Traditional methods of software quality control involve software
quality models and continuous code inspection tools. These measures focus on
directly assessing the quality of the software. However, there is a strong
correlation and causation between the quality of the development process and
the resulting software product. Therefore, improving the development process
indirectly improves the software product, too. To achieve this, effective
learning from past processes is necessary, often embraced through post mortem
organizational learning. While qualitative evaluation of large artifacts is
common, smaller quantitative changes captured by application lifecycle
management are often overlooked. In addition to software metrics, these smaller
changes can reveal complex phenomena related to project culture and management.
Leveraging these changes can help detect and address such complex issues.
Software evolution was previously measured by the size of changes, but the
lack of consensus on a reliable and versatile quantification method prevents
its use as a dependable metric. Different size classifications fail to reliably
describe the nature of evolution. While application lifecycle management data
is rich, identifying which artifacts can model detrimental managerial practices
remains uncertain. Approaches such as simulation modeling, discrete events
simulation, or Bayesian networks have only limited ability to exploit
continuous-time process models of such phenomena. Even worse, the accessibility
and mechanistic insight into such gray- or black-box models are typically very
low. To address these challenges, we suggest leveraging objectively [...]
Comment: Ph.D. thesis without appended papers, 102 pages
Pitfalls in Language Models for Code Intelligence: A Taxonomy and Survey
Modern language models (LMs) have been successfully employed in source code
generation and understanding, leading to a significant increase in research
focused on learning-based code intelligence, such as automated bug repair, and
test case generation. Despite their great potential, language models for code
intelligence (LM4Code) are susceptible to potential pitfalls, which hinder
realistic performance and further impact their reliability and applicability in
real-world deployment. Such challenges drive the need for a comprehensive
understanding - not just identifying these issues but delving into their
possible implications and existing solutions to build more reliable language
models tailored to code intelligence. Based on a well-defined systematic
research approach, we conducted an extensive literature review to uncover the
pitfalls inherent in LM4Code. Finally, 67 primary studies from top-tier venues
have been identified. After carefully examining these studies, we designed a
taxonomy of pitfalls in LM4Code research and conducted a systematic study to
summarize the issues, implications, current solutions, and challenges of
different pitfalls for LM4Code systems. We developed a comprehensive
classification scheme that dissects pitfalls across four crucial aspects: data
collection and labeling, system design and learning, performance evaluation,
and deployment and maintenance. Through this study, we aim to provide a roadmap
for researchers and practitioners, facilitating their understanding and
utilization of LM4Code in reliable and trustworthy ways.
Exploiting Similarity Patterns to Build Generic Test Case Templates for Software Product Line Testing
Ph.D., Doctor of Philosophy
Automated Refactoring in Software Automation Platforms
Software Automation Platforms (SAPs) enable faster development and reduce the need
to use code to construct applications. SAPs provide abstraction and automation,
resulting in a low entry barrier for users with fewer programming skills to become
proficient developers. An unfortunate consequence of using SAPs is the production of
code with higher technical debt, since such developers are less familiar with software
development best practices. Hence, SAPs should aim to produce a simpler software
construction and evolution pipeline beyond providing a rapid software development
environment.
One simple example of such high technical debt is the Unlimited Records anti-pattern,
which occurs whenever queries are unbounded, i.e. the maximum number of records to be
fetched is not explicitly limited. Limiting the number of records retrieved may, in many
cases, improve the performance of applications by reducing screen-loading time, thus
making applications faster and more responsive, which is a top priority for developers. A
second example is the Duplicated Code anti-pattern that severely affects code readability
and maintainability, and can even be the cause of bug propagation. To overcome this
problem we will resort to automated refactoring as it accelerates the refactoring process
and provides provably correct modifications.
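The Unlimited Records anti-pattern can be illustrated outside OutSystems with a minimal sketch in plain SQL. The assumptions are loud ones: the in-memory SQLite database, the `customer` table, and the `fetch_records` helper are all hypothetical and are not part of the dissertation or of the OutSystems platform.

```python
import sqlite3


def fetch_records(conn, max_records=None):
    """Fetch customers; when `max_records` is None the query is unbounded,
    which is the situation the anti-pattern describes."""
    sql = "SELECT id, name FROM customer ORDER BY id"
    params = ()
    if max_records is not None:
        # Explicitly bound the number of records that can be fetched.
        sql += " LIMIT ?"
        params = (max_records,)
    return conn.execute(sql, params).fetchall()


# Hypothetical data set, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customer (name) VALUES (?)",
    [(f"customer-{i}",) for i in range(1000)],
)

unbounded = fetch_records(conn)                 # every row is fetched
bounded = fetch_records(conn, max_records=20)   # only what a screen needs
```

A refactoring for this anti-pattern would, in this simplified picture, turn the first call into the second by supplying an explicit bound.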
This dissertation aims to study and develop a solution for automated refactorings in
the context of OutSystems (an industry-leading SAP). This was carried out by
implementing techniques for automatically refactoring a set of selected anti-patterns
in OutSystems logic that are currently detected by the OutSystems technical debt
monitoring tool.
ASSESSING THE QUALITY OF SOFTWARE DEVELOPMENT TUTORIALS AVAILABLE ON THE WEB
Both expert and novice software developers frequently access software development resources available on the Web in order to look up or learn new APIs, tools, and techniques. Software quality is affected negatively when developers fail to find high-quality information relevant to their problem. While a substantial amount of freely available resources can be accessed online, some of them contain information that is error-prone, infringes copyright, raises security concerns, or targets incompatible versions. Use of such toxic information can have a strong negative effect on developers' efficacy. This dissertation focuses specifically on software tutorials, aiming to automatically evaluate the quality of such documents available on the Web. To achieve this goal, we present two contributions: 1) scalable detection of duplicated code snippets; 2) automatic identification of valid version ranges.
Software tutorials consist of a combination of source code snippets and natural language text. The code snippets in a tutorial can originate from different sources, perhaps carrying stringent licensing requirements or known security vulnerabilities. Developers, typically unaware of this, can reuse these code snippets in their projects. First, in this thesis, we present our work on a Web-scale code clone search technique that is able to detect duplicate code snippets between large-scale document and source code corpora in order to trace toxic code snippets.
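Tracing a tutorial snippet back to a source corpus can be sketched with a simple fingerprint index. This is a generic illustration, not the thesis's Web-scale technique: it assumes whitespace-only normalization and hashes sliding windows of lines, and the names `fingerprints`, `build_index`, and `find_sources` are hypothetical.

```python
import hashlib
import re
from collections import defaultdict


def normalize(snippet):
    """Collapse runs of whitespace and drop blank lines."""
    lines = [re.sub(r"\s+", " ", line).strip() for line in snippet.splitlines()]
    return [line for line in lines if line]


def fingerprints(snippet, window=3):
    """Hash every window of `window` consecutive normalized lines."""
    lines = normalize(snippet)
    result = set()
    for i in range(max(len(lines) - window + 1, 0)):
        chunk = "\n".join(lines[i:i + window])
        result.add(hashlib.sha1(chunk.encode("utf-8")).hexdigest())
    return result


def build_index(corpus, window=3):
    """Map each fingerprint to the names of the files that contain it."""
    index = defaultdict(set)
    for name, code in corpus.items():
        for fp in fingerprints(code, window):
            index[fp].add(name)
    return index


def find_sources(snippet, index, window=3):
    """Return the corpus files sharing at least one fingerprint with `snippet`."""
    hits = set()
    for fp in fingerprints(snippet, window):
        hits |= index.get(fp, set())
    return hits


# Hypothetical corpus and tutorial snippet, for illustration only.
corpus = {
    "a.py": "x = 1\ny = 2\nz = x + y\nprint(z)",
    "b.py": "import os\nprint(os.getcwd())\n",
}
index = build_index(corpus)
tutorial_snippet = "  y = 2\n\n z  =  x + y\n print(z)"
sources = find_sources(tutorial_snippet, index)
```

Because lookups are hash-based, the cost of a query depends on the size of the snippet rather than on the size of the corpus, which is the property that makes index-based approaches attractive at scale.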
As software libraries and APIs evolve over time, existing software development tutorials can become outdated. It is difficult for software developers, and especially novices, to determine the expected version of the software implicit in a specific tutorial in order to decide whether the tutorial is applicable to their software development environment. To overcome this challenge, in this thesis we present a novel technique for automatic identification of the valid version range of software development tutorials on the Web.
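One way to picture a "valid version range" is as the set of library versions in which every API symbol used by a tutorial exists. The sketch below works under that assumption only; the abstract does not describe the thesis's actual technique, and the `valid_versions` function, the availability table, and the version numbers are all hypothetical.

```python
def valid_versions(symbols_used, availability, all_versions):
    """Return the versions, in release order, in which every symbol used by
    the tutorial is available. Symbols missing from `availability` are
    treated as available in no version."""
    return [
        version
        for version in all_versions
        if all(version in availability.get(symbol, set()) for symbol in symbols_used)
    ]


# Hypothetical release history and API availability, for illustration only.
all_versions = ["1.0", "1.1", "2.0", "2.1"]
availability = {
    "connect": {"1.0", "1.1", "2.0", "2.1"},
    "fetch_all": {"1.1", "2.0"},  # added in 1.1, removed in 2.1
}
tutorial_range = valid_versions({"connect", "fetch_all"}, availability, all_versions)
```

A tutorial whose range does not include the reader's installed version could then be flagged as potentially outdated.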