3 research outputs found

Graphs Resemblance based Software Birthmarks through Data Mining for Piracy Control

Author: Iqbal M.
Mahmood Y.
Safyab M.
Sarwar S.
Ul Qayyum Z.
Publication venue: 'Pleiades Publishing Ltd'
Publication date: 01/01/2019

The proliferation of software artifacts underscores the need to protect intellectual property rights (IPR), which are undermined by software piracy, and calls for effective piracy control measures. Software birthmarking aims to counter software ownership theft by identifying similarity in the origins of programs. This paper proposes a novel birthmarking approach based on a hybrid of text-mining and graph-mining techniques. The code elements of a program and their relations with other elements are identified through their properties (i.e., code constructs) and transformed into Graph Manipulation Language (GML). The software birthmarks, generated by exploiting graph-theoretic properties (through the clustering coefficient), are used to classify two programs as similar or dissimilar. The proposed technique has been evaluated on the metrics of credibility, resilience, method theft, modified code detection and self-copy detection, and the results support its effectiveness against software ownership theft. A comparative analysis with contemporary approaches shows better results, attributed to capturing the properties and relations of program nodes and to employing dynamic graph-mining techniques without adding overhead (such as increased program size and processing cost).
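The role of the clustering coefficient in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the edge-list representation, the use of the average local clustering coefficient as the birthmark value, the function names, and the `tolerance` threshold are all assumptions made for illustration.

```python
from itertools import combinations


def clustering_coefficients(edges):
    """Local clustering coefficient of each node in an undirected graph.

    `edges` is a list of (u, v) pairs, e.g. relations between code elements
    (a hypothetical representation, not the paper's GML encoding).
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    coeffs = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            coeffs[node] = 0.0  # fewer than two neighbours: no triangle possible
            continue
        # Count pairs of neighbours that are themselves connected.
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        coeffs[node] = 2.0 * links / (k * (k - 1))
    return coeffs


def birthmark(edges):
    """Assumed birthmark value: the average local clustering coefficient."""
    coeffs = clustering_coefficients(edges)
    return sum(coeffs.values()) / len(coeffs) if coeffs else 0.0


def similar(edges_a, edges_b, tolerance=0.05):
    """Flag two program graphs as similar when their birthmarks are close.

    The tolerance is an arbitrary illustrative threshold.
    """
    return abs(birthmark(edges_a) - birthmark(edges_b)) <= tolerance


if __name__ == "__main__":
    triangle = [("a", "b"), ("b", "c"), ("a", "c")]
    chain = [("a", "b"), ("b", "c")]
    print(birthmark(triangle), birthmark(chain), similar(triangle, chain))
```

A single scalar is a very coarse signature; a practical birthmark would likely compare richer per-node or per-cluster information, which the abstract does not detail.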

LSBU Research Open

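Teaching in information access systems: plagiarism detection, use of advanced technologies for software development, and bringing industry experience to the classroom (original title: Docencia en sistemas de acceso á información: detección de plaxios, emprego de tecnoloxías avanzadas para desenvolvemento software e achegamento da experiencia na industria á aula)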

Author: Barreiro Álvaro
López-Otero Paula
Parapar Javier
Valcarce Daniel
Publication venue: 'Universidade da Coruna'
Publication date: 01/01/2019

[Abstract] This paper presents the activities carried out by the educational innovation group in Information Access Systems during the 2017/2018 academic year. The group, which teaches at the Faculty of Informatics of the University of A Coruña, took action along three different lines. The first, aimed at improving the quality of assessment methods, consisted of applying a protocol for detecting plagiarism in programming assignments. The second aimed to improve the employability of students and consisted of using a project-based learning methodology together with a set of advanced software development tools, recreating the work that students will carry out once they enter the job market. Lastly, to broaden students' knowledge of the professional paths open to them, a series of seminars and talks was organized, given by professionals from an international company and a local multidisciplinary company, and by a researcher from academia. The experience gained from these activities was satisfactory and enriching for both students and teaching staff, who are already considering improvements for the coming academic years.

Repositorio da Universidade da Coruña

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

CroLSSim: Cross‐language software similarity detector using hybrid approach of LSA‐based AST‐MDrep features and CNN‐LSTM model

Author: Alazab M.
Cheng X.
Naeem H.
Naeem M.
Ullah F.
Publication venue: Wiley
Publication date: 01/01/2022

Software similarity across different programming languages is a rapidly evolving field because of its numerous applications in software development, software cloning, software plagiarism, and software forensics. Currently, software researchers and developers search cross-language open-source repositories for similar applications for a variety of reasons, such as reusing code, analyzing different implementations, and looking for a better application. However, this is a challenging task because each programming language has a unique syntax and semantic structure. In this paper, a novel tool called Cross-Language Software Similarity (CroLSSim) is designed to detect similar software applications written in different programming languages. First, Abstract Syntax Tree (AST) features are collected from the source code in each language. These are high-quality features that give an abstract view of each program. Then, Methods Description (MDrep) is used in combination with the AST to examine the relationships among different method calls. Second, the Term Frequency-Inverse Document Frequency (TF-IDF) approach is used to retrieve the local and global weights from the AST-MDrep features. Third, a Latent Semantic Analysis (LSA)-based feature extraction and selection method is proposed to extract the semantic anchors in a reduced-dimensional space. Fourth, a Convolutional Neural Network (CNN)-based feature extraction method is proposed to mine the deep features. Finally, a hybrid deep learning model combining CNN and Long Short-Term Memory (LSTM) is designed to detect semantically similar software applications from these latent variables. The data set contains approximately 9.5K Java, 8.8K C#, and 7.4K C++ software applications obtained from GitHub. The proposed approach outperforms state-of-the-art methods.
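The TF-IDF weighting step mentioned in the abstract above can be sketched in isolation as follows. This is not the CroLSSim implementation: the token lists stand in for AST-MDrep features, the smoothed IDF formula is one common variant chosen here as an assumption, and the LSA and CNN-LSTM stages are omitted entirely. Programs are compared with plain cosine similarity only to show that the weights behave sensibly.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: token -> weight) per document.

    Each document is a list of tokens, here hypothetical AST node types.
    Local weight: term frequency within the document.
    Global weight: smoothed inverse document frequency (an assumed variant).
    """
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each token once per document

    vectors = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        vectors.append({
            tok: (count / total) * (math.log((1 + n) / (1 + df[tok])) + 1.0)
            for tok, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


if __name__ == "__main__":
    # Hypothetical token sequences for three programs.
    programs = [
        ["MethodDecl", "ForStmt", "Call", "Call"],
        ["MethodDecl", "ForStmt", "Call"],
        ["ClassDecl", "IfStmt", "Return"],
    ]
    vecs = tfidf_vectors(programs)
    print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

In the pipeline described by the abstract, vectors like these would next be projected into a lower-dimensional space by LSA before being passed to the neural model.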

Middlesex University Research Repository