Search CORE

659 research outputs found

Dealing with clones in software : a practical approach from detection towards management

Author: Uddin Md Sharif
Publication venue: 'University of Saskatchewan Library'
Publication date
Field of study

Despite the fact that duplicated fragments of code also called code clones are considered one of the prominent code smells that may exist in software, cloning is widely practiced in industrial development. The larger the system, the more people involved in its development and the more parts developed by different teams result in an increased possibility of having cloned code in the system. While there are particular benefits of code cloning in software development, research shows that it might be a source of various troubles in evolving software. Therefore, investigating and understanding clones in a software system is important to manage the clones efficiently. However, when the system is fairly large, it is challenging to identify and manage those clones properly. Among the various types of clones that may exist in software, research shows detection of near-miss clones where there might be minor to significant differences (e.g., renaming of identifiers and additions/deletions/modifications of statements) among the cloned fragments is costly in terms of time and memory. Thus, there is a great demand of state-of-the-art technologies in dealing with clones in software. Over the years, several tools have been developed to detect and visualize exact and similar clones. However, usually the tools are standalone and do not integrate well with a software developer's workflow. In this thesis, first, a study is presented on the effectiveness of a fingerprint based data similarity measurement technique named 'simhash' in detecting clones in large scale code-base. Based on the positive outcome of the study, a time efficient detection approach is proposed to find exact and near-miss clones in software, especially in large scale software systems. The novel detection approach has been made available as a highly configurable and fully fledged standalone clone detection tool named 'SimCad', which can be configured for detection of clones in both source code and non-source code based data. Second, we show a robust use of the clone detection approach studied earlier by assembling its detection service as a portable library named 'SimLib'. This library can provide tightly coupled (integrated) clone detection functionality to other applications as opposed to loosely coupled service provided by a typical standalone tool. Because of being highly configurable and easily extensible, this library allows the user to customize its clone detection process for detecting clones in data having diverse characteristics. We performed a user study to get some feedback on installation and use of the 'SimLib' API (Application Programming Interface) and to uncover its potential use as a third-party clone detection library. Third, we investigated on what tools and techniques are currently in use to detect and manage clones and understand their evolution. The goal was to find how those tools and techniques can be made available to a developer's own software development platform for convenient identification, tracking and management of clones in the software. Based on that, we developed a clone-aware software development platform named 'SimEclipse' to promote the practical use of code clone research and to provide better support for clone management in software. Finally, we evaluated 'SimEclipse' by conducting a user study on its effectiveness, usability and information management. We believe that both researchers and developers would enjoy and utilize the benefit of using these tools in different aspect of code clone research and manage cloned code in software systems

eCommons@USASK

University of Saskatchewan Research Archive

BamView: visualizing and interpretation of next-generation sequencing read alignments.

Author: Berriman Matthew
Carver Tim
Harris Simon R.
McQuillan Jacqueline A.
Otto Thomas D.
Parkhill Julian
Publication venue: 'Oxford University Press (OUP)'
Publication date: 16/01/2012
Field of study

So-called next-generation sequencing (NGS) has provided the ability to sequence on a massive scale at low cost, enabling biologists to perform powerful experiments and gain insight into biological processes. BamView has been developed to visualize and analyse sequence reads from NGS platforms, which have been aligned to a reference sequence. It is a desktop application for browsing the aligned or mapped reads [Ruffalo, M, LaFramboise, T, Koyutürk, M. Comparative analysis of algorithms for next-generation sequencing read alignment. Bioinformatics 2011;27:2790-6] at different levels of magnification, from nucleotide level, where the base qualities can be seen, to genome or chromosome level where overall coverage is shown. To enable in-depth investigation of NGS data, various views are provided that can be configured to highlight interesting aspects of the data. Multiple read alignment files can be overlaid to compare results from different experiments, and filters can be applied to facilitate the interpretation of the aligned reads. As well as being a standalone application it can be used as an integrated part of the Artemis genome browser, BamView allows the user to study NGS data in the context of the sequence and annotation of the reference genome. Single nucleotide polymorphism (SNP) density and candidate SNP sites can be highlighted and investigated, and read-pair information can be used to discover large structural insertions and deletions. The application will also calculate simple analyses of the read mapping, including reporting the read counts and reads per kilobase per million mapped reads (RPKM) for genes selected by the user

PubMed Central

Enlighten

Deteksi Similarity Source Code Menggunakan Metode Deteksi Abstract Syntax Tree

Author: Dzil Jalal Ahmad Fadly
Prasetya E. B. (Eka)
Publication venue: 'Faculty of Islamic Studies - University of Muhammadiyah Jakarta'
Publication date: 01/11/2014
Field of study

Laboratorium Fakultas Teknik Infomatika Universitas Muhammadiyah Jakarta (FT–InformatikaUMJ) sebagai tempat pembelajaran bagi para mahasiswa informatika yang mengikuti kelaspemrograman selalu memberikan tugas-tugas sebagai salah satu media pengukur tingkatpemahaman mahasiswa. Banyaknya tugas source code menggunakan bahasa Java yang harusdiperiksa oleh Assisten Laboratorium mengakibatkan sulitnya melakukan pemeriksaan apabiladilakukan satu per satu serta sulitnya mengukur kredibilitas masing-masing tugas milik mahasiswa.Tugas-tugas terperiksa yang memiliki tingkat similarity (kemiripan) yang cukup tinggi antar codedapat dijadikan acuan adanya tindakan-tindakan kecuranganseperti melakukan tindakan plagiatcode terhadap tugas mahasiswa lain. Metode deteksi kemiripan code menggunakan Abstract SyntaxTree dapat digunakan untuk merubah code menjadi node ataupun token unik masing-masing codeterperiksa. Semakin besar kemiripan maka semakin besar kemungkinan code tersebut merupakanhasil plagiat. Aplikasi Java\u27s Source Code Similarity Detector (JSC-SD) yang diusulkan akanmendeteksi kemiripan code melalui beberapa proses, yaitu proses parsing code menjadi AST yangkemudian akan diukur kemiripan tingkat kemiripannya menggunakan algoritma LevenstehinDistance dan Smith-Waterman dan pada proses terakhir adalah pendeteksian code clone dari sourcecode terperiksa. Hasil akhir yang didapat adalah grafik persentase kemiripan antar code serta linecode yang dicurigai similar

Neliti

Prosiding Semnastek

Wireless and Physical Security via Embedded Sensor Networks

Author: Bestavros Azer
Ocean Michael J.
Publication venue: Boston University Computer Science Department
Publication date: 15/01/2008
Field of study

Wireless Intrusion Detection Systems (WIDS) monitor 802.11 wireless frames (Layer-2) in an attempt to detect misuse. What distinguishes a WIDS from a traditional Network IDS is the ability to utilize the broadcast nature of the medium to reconstruct the physical location of the offending party, as opposed to its possibly spoofed (MAC addresses) identity in cyber space. Traditional Wireless Network Security Systems are still heavily anchored in the digital plane of "cyber space" and hence cannot be used reliably or effectively to derive the physical identity of an intruder in order to prevent further malicious wireless broadcasts, for example by escorting an intruder off the premises based on physical evidence. In this paper, we argue that Embedded Sensor Networks could be used effectively to bridge the gap between digital and physical security planes, and thus could be leveraged to provide reciprocal benefit to surveillance and security tasks on both planes. Toward that end, we present our recent experience integrating wireless networking security services into the SNBENCH (Sensor Network workBench). The SNBENCH provides an extensible framework that enables the rapid development and automated deployment of Sensor Network applications on a shared, embedded sensing and actuation infrastructure. The SNBENCH's extensible architecture allows an engineer to quickly integrate new sensing and response capabilities into the SNBENCH framework, while high-level languages and compilers allow novice SN programmers to compose SN service logic, unaware of the lower-level implementation details of tools on which their services rely. In this paper we convey the simplicity of the service composition through concrete examples that illustrate the power and potential of Wireless Security Services that span both the physical and digital plane.National Science Foundation (CISE/CSR 0720604, ENG/EFRI 0735974, CIES/CNS 0520166, CNS/ITR 0205294, CISE/ERA RI 0202067

Boston University Institutional Repository (OpenBU)

Generic code clone detection model for java applications

Author: Mubarak-Ali Al-Fahim
Sulaiman Shahida
Publication venue: 'IOP Publishing'
Publication date: 01/01/2020
Field of study

Code clone is a common term used for codes that are repeated multiple times in a program. There are Type 1, Type 2, Type 3 and Type 4 code clones. Various code clone detection approaches and models have been used to detect a code clone. However, a major challenge faced in detecting code clone using these models is the lack of generality in detecting all clone types. To address this problem, Generic Code Clone Detection (GCCD) model that consists of five processes which are Preprocessing, Transformation, Parameterization, Categorization and Match Detection process is proposed. Initially, a pre-processing process produces source units through the application of five combinatorial rules. This is followed by the transformation process to produce transformed source units based on the letter to number substitution concept. Next, a parameterization process produces parameters used in categorization and match detection process. Next, a categorization process groups the source units into pools. Finally, a match detection process uses a hybrid exact matching with Euclidean distance to detect the clones. Based on these processes, a prototype of the GCCD was developed using Netbeans 8.0. The model was compared with the Generic Pipeline Model (GPM). The comparisons showed that the GCCD was able to detect clone pairs of Type-1 until Type-4 while the GPM was able to detect clone pair for Type-1 only. Furthermore, the GCCD prototype was empirically tested with Bellons benchmark data and it was able to detect clones in Java applications with up to 203,000 line of codes. As a conclusion, the GCCD model is able to overcome the lack of generality in detecting all code clone types by detecting Type 1, Type 2, Type 3 and Type 4 clones

Universiti Teknologi Malaysia Institutional Repository

ICD Management (ICDM) tool for embedded systems on aircrafts

Author: Fernández De La Hoz Carlos
Lafoz Pastor Ismael
Murciano Díaz Carlos
Ángel Mozas Pajares Miguel
Publication venue: HAL CCSD
Publication date: 19/05/2010
Field of study

International audienceIn the scope of on-board systems development and integration for aircrafts, ICDs (Interface Control Document/Description) are fundamental sources of information at same level that the functional requirements. ICD description must include feasible (bus capacity not exceeded, processing load under specific limits) and accurate (data precision, data retransmission rate, data consistency) inter-connectivity mechanisms, to support the application requirements needs, as well as the information must be easy to update and to maintain.On the other hand, DOORS (IBM) is a tool specialized on requirements management. It faces with aspects as traceability, version control, security (access control), changes control, data persistency and accessibility. DOORS is widely used and it is recognized as a reference tool in the scope of aeronautic developments.Nevertheless, ICD information usually is not managed with DOORS. The reason can be found in the specificities of the nature of ICD information, which suggests other approaches more suitable than DOORS.The alternative approaches are very diverse but most of them focus on handling information typified according to specific meta-models for ICD: databases, customized spreadsheets, and tabular descriptions. But whatever is the solution, it must address also, by its own means, the features of ICD derived of its “requirement nature”.This paper presents a solution based on the integration of DOORS with a user front-end developed with Model Driven Architecture (MDA) and Open Source technologies. It is intended to provide means to manage ICD data in a robust way (guaranteeing the correctness -syntactic, completeness, non ambiguity- of data, changes, versions, dependencies, ...) and additionally, to provide mechanisms to focus the effort on the data transformation rather than on the manual elaboration of derived artefacts

Tool support for continuous quality controlling

Author: Deissenboeck Florian
Hummel Benjamin
Juergens Elmar
Mas y Parareda Benedikt
Pizka Markus
Wagner Stefan
Publication venue
Publication date: 01/01/2008
Field of study

Over time, software systems suffer gradual quality decay and therefore costs can rise if organizations fail to take proactive countermeasures. Quality control is the first step to avoiding this cost trap. Continuous quality assessments help users identify quality problems early, when their removal is still inexpensive; they also aid decision making by providing an integrated view of a software system's current status. As a side effect, continuous and timely feedback helps developers and maintenance personnel improve their skills and thereby decreases the likelihood of future quality defects. To make regular quality control feasible, it must be highly automated, and assessment results must be presented in an aggregated manner to avoid overwhelming users with data. This article offers an overview of tools that aim to address these issues. The authors also discuss their own flexible, open-source toolkit, which supports the creation of dashboards for quality control

A Framework for Seamless Variant Management and Incremental Migration to a Software Product-Line

Author: Mahmood Wardah
Publication venue
Publication date: 01/01/2022
Field of study

Context: Software systems often need to exist in many variants in order to satisfy varying customer requirements and operate under varying software and hardware environments. These variant-rich systems are most commonly realized using cloning, a convenient approach to create new variants by reusing existing ones. Cloning is readily available, however, the non-systematic reuse leads to difficult maintenance. An alternative strategy is adopting platform-oriented development approaches, such as Software Product-Line Engineering (SPLE). SPLE offers systematic reuse, and provides centralized control, and thus, easier maintenance. However, adopting SPLE is a risky and expensive endeavor, often relying on significant developer intervention. Researchers have attempted to devise strategies to synchronize variants (change propagation) and migrate from clone&own to an SPL, however, they are limited in accuracy and applicability. Additionally, the process models for SPLE in literature, as we will discuss, are obsolete, and only partially reflect how adoption is approached in industry. Despite many agile practices prescribing feature-oriented software development, features are still rarely documented and incorporated during actual development, making SPL-migration risky and error-prone.Objective: The overarching goal of this PhD is to bridge the gap between clone&own and software product-line engineering in a risk-free, smooth, and accurate manner. Consequently, in the first part of the PhD, we focus on the conceptualization, formalization, and implementation of a framework for migrating from a lean architecture to a platform-based one.Method: Our objectives are met by means of (i) understanding the literature relevant to variant-management and product-line migration and determining the research gaps (ii) surveying the dominant process models for SPLE and comparing them against the contemporary industrial practices, (iii) devising a framework for incremental SPL adoption, and (iv) investigating the benefit of using features beyond PL migration; facilitating model comprehension.Results: Four main results emerge from this thesis. First, we present a qualitative analysis of the state-of-the-art frameworks for change propagation and product-line migration. Second, we compare the contemporary industrial practices with the ones prescribed in the process models for SPL adoption, and provide an updated process model that unifies the two to accurately reflect the real practices and guide future practitioners. Third, we devise a framework for incremental migration of variants into a fully integrated platform by exploiting explicitly recorded metadata pertaining to clone and feature-to-asset traceability. Last, we investigate the impact of using different variability mechanisms on the comprehensibility of various model-related tasks.Future work: As ongoing and future work, we aim to integrate our framework with existing IDEs and conduct a developer study to determine the efficiency and effectiveness of using our framework. We also aim to incorporate safe-evolution in our operators

Chalmers Research