Software Maintenance At Commit-Time
Software maintenance activities such as debugging and feature enhancement are known to be challenging and costly, which explains an ever-growing line of research in software maintenance areas including mining software repositories, defect prevention, clone detection, and bug reproduction. The main goal is to improve the productivity of software developers as they undertake maintenance tasks. Existing tools, however, operate in an offline fashion, i.e., after the changes to the system have been made. Studies have shown that software developers tend to be reluctant to use these tools as part of a continuous development process, because they require installation and training, hindering their integration with developers' workflow, which in turn limits their adoption. In this thesis, we propose novel approaches to support software developers at commit-time. As part of the developer's workflow, a commit marks the end of a given task. We show how commits can be used to catch unwanted modifications to the system and to prevent the introduction of clones and bugs before these modifications reach the central code repository. We also propose a bug reproduction technique based on model checking and crash traces. Furthermore, we propose a new way of classifying bugs based on the location of fixes, which can serve as the basis for future research in this field of study. The techniques proposed in this thesis have been tested on over 400 open and closed (industrial) systems, resulting in high levels of precision and recall. They are also scalable and non-intrusive.
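To make the commit-time idea concrete, the sketch below shows a hypothetical client-side pre-commit hook: it scans the staged diff and aborts the commit when an added line matches an obviously unwanted pattern. The script, the patterns, and the heuristic are illustrative assumptions, not the thesis's actual detectors.

```python
#!/usr/bin/env python3
# Minimal sketch of a commit-time check, installable as .git/hooks/pre-commit.
# The heuristics below are illustrative placeholders, not the thesis's tooling.
import re
import subprocess
import sys

# Patterns that often indicate unwanted modifications slipping into a commit.
BLOCKED_PATTERNS = [
    (re.compile(r"\bTODO\b.*remove before commit", re.I), "leftover temporary code"),
    (re.compile(r"\bprint\(.*debug", re.I), "debug print statement"),
    (re.compile(r"password\s*=\s*['\"]\w+['\"]", re.I), "hard-coded credential"),
]

def staged_diff() -> str:
    """Return the diff of what is about to be committed."""
    return subprocess.run(
        ["git", "diff", "--cached", "--unified=0"],
        capture_output=True, text=True, check=True,
    ).stdout

def main() -> int:
    problems = []
    for line in staged_diff().splitlines():
        if not line.startswith("+") or line.startswith("+++"):
            continue  # only inspect lines being added, skip diff headers
        for pattern, reason in BLOCKED_PATTERNS:
            if pattern.search(line):
                problems.append(f"{reason}: {line[1:].strip()}")
    if problems:
        print("Commit blocked by pre-commit check:")
        for problem in problems:
            print("  -", problem)
        return 1  # non-zero exit aborts the commit
    return 0

if __name__ == "__main__":
    sys.exit(main())
```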
Emergent Forms of Online Sociality in Disasters Arising from Natural Hazards
Disasters arising from natural hazards are associated with the breakdown of existing structures, but they also result in the creation of new social ties in the process of self-organization and problem solving by those affected. This dissertation focuses on emergent forms of sociality that arise in the context of crisis. Specifically, it considers collaborative work practices, social network structures, and organizational forms that emerge on social media during disasters arising from natural hazards. Social media platforms support highly distributed social environments, and the forms of sociality that emerge in these contexts are affected by the affordances of their technical features, especially those that more or less successfully facilitate the creation of a shared information space. Thus, this dissertation is organized around two important aspects of social media spaces: the availability of an explicitly shared site of work and the availability of a visible, legible record of activity.

This dissertation investigates the forms of sociality that emerge during disasters in three social media activities: retweeting, crisis mapping in OpenStreetMap (OSM), and Twitter reply conversations. These three activities differ in the availability of an explicitly shared site of work and of a visible record of activity. The studies of retweeting and reply conversations investigate the Twitter activity in response to the 2012 Hurricane Sandy—the second costliest hurricane in US history and the most tweeted-about event at the time. The analysis of crisis mapping in OpenStreetMap—an open, editable, volunteer-based map of the world—focuses on the OSM activity after the 2010 Haiti earthquake, which was the first major disaster event supported by OpenStreetMap.

For these investigations, the dissertation elaborates and develops human-centered data science methods—a set of methodological approaches that both harness the power of computational techniques and account for the highly situated nature of social activity in crisis. Finally, the dissertation positions the findings from the three studies within the larger context of high-tempo, high-volume social media activity and highlights how the framework of the two intersecting dimensions of the shared information space reveals larger patterns within the emergent forms of sociality across contexts.
Ubiquitous Semantic Applications
As Semantic Web technology evolves, many open areas emerge that attract increasing research focus. In addition to the quickly expanding Linked Open Data (LOD) cloud, various embeddable metadata formats (e.g. RDFa, microdata) are becoming more common. Corporations are already using the existing Web of Data to create new technologies that were not possible before. IBM's Watson, an artificial-intelligence system capable of answering questions posed in natural language, is a prominent example.
On the other hand, ubiquitous devices with a large number of sensors and integrated components are becoming increasingly powerful and fully featured computing platforms in our pockets and homes. For many people, smartphones and tablet computers have already replaced traditional computers as their window to the Internet and to the Web. Hence, the management and presentation of information that is useful to the user is a main requirement for today's smartphones, and it is becoming increasingly important to provide access to the emerging Web of Data from ubiquitous devices.
In this thesis we investigate how ubiquitous devices can interact with the Semantic Web. We identified five different approaches for bringing the Semantic Web to ubiquitous devices, and we outline and discuss in detail the challenges of implementing these approaches in section 1.2. We describe a conceptual framework for ubiquitous semantic applications in chapter 4. We distinguish three client approaches for accessing semantic data from ubiquitous devices, depending on how much of the semantic data processing is performed on the device itself (thin, hybrid and fat clients); these are discussed in chapter 5, along with solutions to the related challenges. Two provider approaches (fat and hybrid) can be distinguished for exposing data from ubiquitous devices on the Semantic Web; these are discussed in chapter 6, along with solutions to the related challenges. We conclude with a discussion of each of the contributions of the thesis and propose future work for each of the discussed approaches in chapter 7.
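To make the thin-client approach concrete, the following sketch has the device delegate all semantic processing to a remote SPARQL endpoint over plain HTTP and merely parse the JSON results; the endpoint URL and the query are illustrative examples, not part of the thesis.

```python
# Thin-client sketch: the device sends a SPARQL query to a remote endpoint
# and only parses the JSON result bindings, doing no RDF processing locally.
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://dbpedia.org/sparql"  # example public endpoint
QUERY = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  <http://dbpedia.org/resource/Semantic_Web> rdfs:label ?label .
  FILTER (lang(?label) = "en")
}
"""

def run_query(endpoint: str, query: str) -> list[dict]:
    """Send a SPARQL query over HTTP and return the result bindings."""
    params = urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        f"{endpoint}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        results = json.load(response)
    return results["results"]["bindings"]

for binding in run_query(ENDPOINT, QUERY):
    print(binding["label"]["value"])
```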
Aide-mémoire: Improving a Project’s Collective Memory via Pull Request–Issue Links
Links between pull requests and the issues they address document and accelerate the development of a software project, but they are often omitted. We present a new tool, Aide-mémoire, to suggest such links when a developer submits a pull request or closes an issue, smoothly integrating into existing workflows. In contrast to previous state-of-the-art approaches that repair related commit histories, Aide-mémoire is designed for continuous, real-time, and long-term use, employing a Mondrian forest to adapt over a project's lifetime and continuously improve traceability. Aide-mémoire is tailored for two specific instances of the general traceability problem—namely, commit-to-issue and pull-request-to-issue links, with a focus on the latter—and exploits data inherent to these two problems to outperform tools for general-purpose link recovery. Our approach is online, language-agnostic, and scalable. We evaluate it over a corpus of 213 projects and six programming languages, achieving a mean average precision of 0.95. Adopting Aide-mémoire is both efficient and effective: a programmer need only evaluate a single suggested link 94% of the time, and 16% of all discovered links were originally missed by developers.
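The link-suggestion workflow can be pictured with a toy ranker that scores open issues by textual similarity to an incoming pull request. Aide-mémoire itself uses a Mondrian forest over richer features; the Jaccard score below is a deliberately simple stand-in, so this illustrates the workflow rather than the tool.

```python
# Toy sketch of suggesting issue links for a new pull request.
# Aide-mémoire uses a Mondrian forest over richer features; here a plain
# Jaccard similarity over title/description tokens stands in for that model.
import re

def tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def suggest_links(pull_request: str, issues: dict[int, str], top_k: int = 3) -> list[tuple[int, float]]:
    """Return the top_k issue ids ranked by token overlap with the PR text."""
    pr_tokens = tokens(pull_request)
    scored = []
    for issue_id, issue_text in issues.items():
        issue_tokens = tokens(issue_text)
        union = pr_tokens | issue_tokens
        score = len(pr_tokens & issue_tokens) / len(union) if union else 0.0
        scored.append((issue_id, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]

# Example usage with made-up data.
issues = {
    101: "Crash when parsing empty configuration file",
    102: "Add dark mode to the settings page",
    103: "Parser raises IndexError on empty input",
}
print(suggest_links("Fix IndexError in config parser for empty files", issues))
```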
Melhoria das práticas de construção de software: um caso de estudo
Master's in Computer and Telematics Engineering.
Many software development projects do not use explicit processes and practices to ensure the quality of the final product. In those cases, the organization of the construction environment arises from the pressing day-to-day needs of the development team, in a non-structured and non-scalable way.
In the context of research projects that involve software development, in which teams are markedly mutable, defining strategies for the software construction process is essential to streamline development, increase productivity, and control the evolution of the product.
This work analyzes and defines strategies for software construction, using as a case study the Rede Telemática Saúde (RTS) project of the Institute of Electronics and Telematics Engineering of Aveiro (IEETA), and their implementation, by introducing best practices and tools that help improve the evolution of the system.
The implementation of these strategies includes configuration-management disciplines, which ensure the consistency of project versions and their dependencies, and a continuous-integration environment that validates all the source code produced by the development team using automated tests. Each version is composed of a set of tasks or topics assigned to individual team members and managed by priority criteria, leveraging the agility of the development process. The whole development cycle is represented on a task-management platform, which is essential for high-level management.
Additionally, a study was carried out to characterize current practices in the software construction process, through a survey of the Portuguese software industry.
The proposed and implemented strategies made it possible to redefine the construction process in the RTS project, introducing greater control over the production line, especially in the early identification of defects and in version control. These results are aligned with the priority needs identified in the industry survey.
Efficient Extraction and Query Benchmarking of Wikipedia Data
Knowledge bases are playing an increasingly important role for integrating information between systems and over the Web. Today, most knowledge bases cover only specific domains, they are created by relatively small groups of knowledge engineers, and it is very cost intensive to keep them up-to-date as domains change. In parallel, Wikipedia has grown into one of the central knowledge sources of mankind and is maintained by thousands of contributors. The DBpedia (http://dbpedia.org) project makes use of this large collaboratively edited knowledge source by extracting structured content from it, interlinking it with other knowledge bases, and making the result publicly available. DBpedia had and has a great effect on the Web of Data and became a crystallization point for it. Furthermore, many companies and researchers use DBpedia and its public services to improve their applications and research approaches.
However, the DBpedia release process is heavyweight and the releases are sometimes based on data that is several months old. Hence, a strategy to keep DBpedia in synchronization with Wikipedia at all times is highly desirable. In this thesis we propose the DBpedia Live framework, which reads a continuous stream of updated Wikipedia articles and processes it on-the-fly to obtain RDF data, updating the DBpedia knowledge base with the newly extracted facts. DBpedia Live also publishes the newly added and deleted facts in files, in order to enable synchronization between our DBpedia endpoint and other DBpedia mirrors. Moreover, the new DBpedia Live framework incorporates several significant features, e.g. abstract extraction, ontology changes, and changeset publication.
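The synchronization loop behind such a live framework can be pictured roughly as follows; the feed, the extraction function, and the changeset file layout are placeholders and do not reproduce DBpedia Live's actual interfaces.

```python
# Rough sketch of a live-synchronization loop: poll for recently changed
# articles, re-extract their triples, and record added/deleted facts as
# changeset files. All names below are illustrative placeholders.
import json
import time

def fetch_changed_articles() -> list[str]:
    """Placeholder: would read the stream of recently edited article titles."""
    return []

def extract_triples(article: str) -> set[tuple[str, str, str]]:
    """Placeholder: would run the extraction framework on one article."""
    return set()

knowledge_base: dict[str, set[tuple[str, str, str]]] = {}

def synchronize_once(sequence: int) -> None:
    deltas = []
    for article in fetch_changed_articles():
        new_triples = extract_triples(article)
        old_triples = knowledge_base.get(article, set())
        deltas.append({"article": article,
                       "added": sorted(new_triples - old_triples),
                       "deleted": sorted(old_triples - new_triples)})
        knowledge_base[article] = new_triples
    # Publish the delta so mirrors can stay in sync without re-importing everything.
    with open(f"changeset-{sequence:06d}.json", "w") as changeset_file:
        json.dump(deltas, changeset_file)

if __name__ == "__main__":
    for sequence in range(3):  # a real service would loop indefinitely
        synchronize_once(sequence)
        time.sleep(5)
```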
Knowledge bases, including DBpedia, are typically stored in triplestores in order to facilitate accessing and querying their respective data. Furthermore, triplestores constitute the backbone of an increasing number of Data Web applications. It is thus evident that the performance of those stores is mission-critical for individual projects as well as for data integration on the Data Web in general.
Consequently, it is of central importance during the implementation of any of these applications to have a clear picture of the weaknesses and strengths of current triplestore implementations. We introduce a generic SPARQL benchmark creation procedure, which we apply to the DBpedia knowledge base. Previous approaches often compared relational databases and triplestores and thus settled on measuring performance against a relational database that had been converted to RDF, using SQL-like queries. In contrast to those approaches, our benchmark is based on queries that were actually issued by humans and applications against existing RDF data that does not resemble a relational schema. Our generic procedure for benchmark creation is based on query-log mining, clustering, and SPARQL feature analysis. We argue that a pure SPARQL benchmark is better suited to comparing existing triplestores, and we provide results for the popular triplestore implementations Virtuoso, Sesame, Apache Jena-TDB, and BigOWLIM. The subsequent comparison of our results with other benchmark results indicates that the performance of triplestores is far less homogeneous than suggested by previous benchmarks.
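The benchmark-creation procedure (mine the query log, describe each query by its SPARQL features, then cluster) can be sketched as below; the feature list and the tiny example log are illustrative, not the actual DBpedia query log or feature set.

```python
# Sketch of query-log mining for benchmark creation: turn each logged SPARQL
# query into a binary feature vector and group queries with the same profile.
# The feature set and log entries are illustrative only.
from collections import defaultdict

FEATURES = ["OPTIONAL", "FILTER", "UNION", "ORDER BY", "DISTINCT", "REGEX"]

def feature_vector(query: str) -> tuple[int, ...]:
    """Binary vector marking which SPARQL features a query uses."""
    upper = query.upper()
    return tuple(int(feature in upper) for feature in FEATURES)

def cluster_by_features(query_log: list[str]) -> dict[tuple[int, ...], list[str]]:
    """Group queries by identical feature profile (a crude clustering step)."""
    clusters: dict[tuple[int, ...], list[str]] = defaultdict(list)
    for query in query_log:
        clusters[feature_vector(query)].append(query)
    return clusters

query_log = [
    "SELECT DISTINCT ?p WHERE { ?s ?p ?o } ORDER BY ?p",
    "SELECT ?o WHERE { ?s ?p ?o FILTER regex(?o, 'Berlin') }",
    "SELECT ?s WHERE { { ?s a ?c } UNION { ?s ?p ?o } }",
]
for profile, queries in cluster_by_features(query_log).items():
    print(dict(zip(FEATURES, profile)), len(queries), "queries")
```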
Further, one of the crucial tasks when creating and maintaining knowledge bases is validating their facts and maintaining the quality of their inherent data. This task includes several subtasks, and in this thesis we address two major ones: fact validation and provenance, and data quality. The fact validation and provenance subtask aims at providing sources for facts in order to ensure the correctness and traceability of the provided knowledge. This subtask is often addressed by human curators in a three-step process: issuing appropriate keyword queries for the statement to check using standard search engines, retrieving potentially relevant documents, and screening those documents for relevant content. The drawbacks of this process are manifold. Most importantly, it is very time-consuming, as the experts have to carry out several search processes and must often read several documents. We present DeFacto (Deep Fact Validation), an algorithm for validating facts by finding trustworthy sources for them on the Web. DeFacto aims to provide an effective way of validating facts by supplying the user with relevant excerpts of webpages as well as useful additional information, including a score for the confidence DeFacto has in the correctness of the input fact.
The data quality maintenance subtask, on the other hand, aims at evaluating and continuously improving the quality of the data in knowledge bases. We present a methodology for assessing the quality of knowledge bases' data, which comprises a manual and a semi-automatic process. The first phase includes the detection of common quality problems and their representation in a quality problem taxonomy. In the manual process, the second phase comprises the evaluation of a large number of individual resources, according to the quality problem taxonomy, via crowdsourcing. This process is accompanied by a tool in which a user assesses an individual resource and evaluates each fact for correctness. The semi-automatic process involves the generation and verification of schema axioms. We report the results obtained by applying this methodology to DBpedia.
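The three-step validation process that DeFacto automates can be mimicked in a few lines: verbalize the fact into keyword queries, retrieve candidate documents, and screen them for co-occurrence of the fact's parts. The retrieval stub and the naive scoring rule below are placeholders, not DeFacto's actual algorithm.

```python
# Sketch of keyword-based fact checking in the spirit of the manual process
# DeFacto automates: build queries from a triple, retrieve documents, and
# screen them for evidence. Retrieval is stubbed out; the scoring is naive.
def keyword_queries(subject: str, predicate: str, obj: str) -> list[str]:
    """Turn a fact into a few search-engine style queries."""
    return [f'"{subject}" {predicate} "{obj}"',
            f'"{subject}" "{obj}"']

def retrieve_documents(query: str) -> list[str]:
    """Placeholder: a real system would call a web search API here."""
    return []

def evidence_score(fact: tuple[str, str, str], documents: list[str]) -> float:
    """Fraction of documents that mention both subject and object."""
    subject, _, obj = fact
    if not documents:
        return 0.0
    hits = sum(1 for doc in documents
               if subject.lower() in doc.lower() and obj.lower() in doc.lower())
    return hits / len(documents)

fact = ("Albert Einstein", "birth place", "Ulm")
documents = [doc for query in keyword_queries(*fact) for doc in retrieve_documents(query)]
print("confidence:", evidence_score(fact, documents))
```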
A survey on software coupling relations and tools
Context
Coupling relations reflect the dependencies between software entities and can be used to assess the quality of a program. For this reason, a vast number of them have been developed, together with tools to compute their related metrics. However, this variety makes it challenging to find the coupling measures best suited to a given application.
Goals
The first objective of this work is to provide a classification of the different kinds of coupling relations, together with the metrics that measure them. The second is to present an overview of the tools proposed so far by the software engineering research community to extract these metrics.
Method
This work constitutes a systematic literature review in software engineering. To retrieve the referenced publications, publicly available scientific research databases were used. These sources were queried using keywords pertaining to software coupling. We included publications from the period 2002 to 2017, as well as highly cited earlier publications. A snowballing technique was used to retrieve further related material.
Results
Four groups of coupling relations were found: structural, dynamic, semantic and logical. A fifth set of coupling relations includes approaches too recent to be considered an independent group and measures developed for specific environments. The investigation also retrieved tools that extract the metrics belonging to each coupling group.
Conclusion
This study shows the directions followed by research on software coupling, e.g., developing metrics for specific environments. Concerning the metric tools, three trends have emerged in recent years: the use of visualization techniques, extensibility, and scalability. Finally, some applications of coupling metrics were presented (e.g., code smell detection), indicating possible future research directions. Public preprint: https://doi.org/10.5281/zenodo.2002001
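To make the notion of a structural coupling metric concrete, the sketch below counts, for each Python module in a directory, how many distinct other modules it imports (a crude efferent-coupling style measure); the metrics surveyed above are far richer, so this is illustration only.

```python
# Crude structural coupling measure: for each Python file, count how many
# distinct modules it imports (an efferent-coupling style metric).
import ast
from pathlib import Path

def imported_modules(source: str) -> set[str]:
    """Collect the top-level names of all modules imported by the source."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            modules.add(node.module.split(".")[0])
    return modules

def coupling_report(project_dir: str) -> dict[str, int]:
    """Map each .py file under project_dir to its import-coupling count."""
    report = {}
    for path in Path(project_dir).rglob("*.py"):
        try:
            report[str(path)] = len(imported_modules(path.read_text(encoding="utf-8")))
        except SyntaxError:
            continue  # skip files that do not parse
    return report

if __name__ == "__main__":
    for module, coupling in sorted(coupling_report(".").items(), key=lambda item: -item[1]):
        print(f"{coupling:3d}  {module}")
```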