867 research outputs found

    On the real world practice of Behaviour Driven Development

    Surveys of industry practice over the last decade suggest that Behaviour Driven Development (BDD) is a popular Agile practice. For example, 19% of respondents to the 14th State of Agile annual survey reported using BDD, placing it among the top 13 practices reported. Alongside its potential benefits, adopting BDD necessarily involves the additional cost of writing and maintaining Gherkin features and scenarios and, if BDD is used for acceptance testing, the associated step functions. Yet there is a lack of published literature exploring how BDD is used in practice and the challenges experienced by real-world software development efforts. This gap is significant because, without understanding current real-world practice, it is hard to identify opportunities to address and mitigate challenges. To address this research gap, this thesis reports on a research project that explored: (a) the challenges of applying agile methods and undertaking requirements engineering in a real-world context; (b) the challenges of applying BDD specifically; and (c) the application of BDD in open-source projects, to understand challenges in this different context. For this purpose, we progressively conducted two case studies, two series of interviews, four iterations of action research, and an empirical study. The first case study was conducted in an avionics company to discover the challenges of using an agile process in a large-scale safety-critical project environment. Since requirements management was found to be one of the biggest challenges during the case study, we decided to investigate BDD because of its reputation for requirements management. The second case study was conducted in the same company with the aim of discovering the challenges of using BDD in real life. The case study was complemented by an empirical study of the practice of BDD in open-source projects, taking a study sample from the GitHub open-source collaboration site.
As a result of this Ph.D. research, we were able to discover: (i) the challenges of using an agile process in a large-scale safety-critical organisation, (ii) the current state of BDD in practice, (iii) the technical limitations of Gherkin (i.e., the language for writing requirements in BDD), (iv) the challenges of using BDD in a real project, and (v) bad smells in the Gherkin specifications of open-source projects on GitHub. We also present a brief comparison between the theoretical description of BDD and BDD in practice. This research therefore presents the lessons learned from BDD in practice and serves as a guide for software practitioners planning to use BDD in their projects.
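The core BDD mechanic the thesis studies — Gherkin scenario steps bound to executable step functions — can be illustrated with a minimal sketch. The scenario text, step patterns, and handler names below are invented for illustration, and this does not use a real BDD framework such as Cucumber or behave:

```python
import re

# A hypothetical Gherkin scenario, one step per line.
SCENARIO = [
    "Given a stock of 5 items",
    "When a customer orders 2 items",
    "Then the stock is 3 items",
]

STEPS = []  # (compiled pattern, step function) pairs

def step(pattern):
    # Decorator registering a step function for a Gherkin step pattern.
    def register(fn):
        STEPS.append((re.compile(pattern), fn))
        return fn
    return register

@step(r"Given a stock of (\d+) items")
def set_stock(ctx, n):
    ctx["stock"] = int(n)

@step(r"When a customer orders (\d+) items")
def order(ctx, n):
    ctx["stock"] -= int(n)

@step(r"Then the stock is (\d+) items")
def check_stock(ctx, n):
    assert ctx["stock"] == int(n)

def run(scenario):
    # Match each step line against the registered patterns and execute it.
    ctx = {}
    for line in scenario:
        for pattern, fn in STEPS:
            m = pattern.fullmatch(line)
            if m:
                fn(ctx, *m.groups())
                break
        else:
            raise ValueError(f"no step function matches: {line}")
    return ctx

run(SCENARIO)  # passes silently when the Then-assertion holds
```

The maintenance cost the abstract mentions lives in exactly this glue layer: every new Gherkin phrasing needs a matching step function.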

    A systematic literature review on source code similarity measurement and clone detection: techniques, applications, and challenges

    Measuring and evaluating source code similarity is a fundamental software engineering activity that supports a broad range of applications, including code recommendation and the detection of duplicate code, plagiarism, malware, and code smells. This paper presents a systematic literature review and meta-analysis of code similarity measurement and evaluation techniques to shed light on the existing approaches and their characteristics in different applications. We initially found over 10,000 articles by querying four digital libraries and ended up with 136 primary studies in the field. The studies were classified according to their methodology, programming languages, datasets, tools, and applications. A deeper investigation reveals 80 software tools, working with eight different techniques on five application domains. Nearly 49% of the tools work on Java programs and 37% support C and C++, while many programming languages have no support at all. A noteworthy finding was the existence of 12 datasets related to source code similarity measurement and duplicate code, of which only eight are publicly accessible. The lack of reliable datasets, empirical evaluations, and hybrid methods, and the limited focus on multi-paradigm languages, are the main challenges in the field. Emerging applications of code similarity measurement concentrate on the development phase in addition to maintenance.
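To make the surveyed task concrete, here is a small, hypothetical sketch of one common technique family: token-based similarity, using the Jaccard index over token sets. Optionally renaming identifiers catches clones that differ only in naming (so-called Type-2 clones); the crude tokenizer and keyword list are simplifications, not a tool from the review:

```python
import re

# Keywords are kept verbatim so renaming only affects user identifiers.
KEYWORDS = {"for", "in", "if", "else", "while", "return", "def"}

def tokens(code, rename_identifiers=False):
    # Crude lexer: identifiers, integer literals, single operator characters.
    toks = re.findall(r"[A-Za-z_]\w*|\d+|[^\s\w]", code)
    if rename_identifiers:
        toks = ["ID" if re.fullmatch(r"[A-Za-z_]\w*", t) and t not in KEYWORDS
                else t for t in toks]
    return set(toks)

def similarity(a, b, rename_identifiers=False):
    # Jaccard index: |intersection| / |union| of the two token sets.
    ta, tb = tokens(a, rename_identifiers), tokens(b, rename_identifiers)
    return len(ta & tb) / len(ta | tb)

snippet1 = "total = 0\nfor x in xs:\n    total += x"
snippet2 = "sum_ = 0\nfor v in values:\n    sum_ += v"

similarity(snippet1, snippet2)                           # 0.5
similarity(snippet1, snippet2, rename_identifiers=True)  # 1.0: a Type-2 clone
```

The jump from 0.5 to 1.0 after identifier normalization illustrates why the surveyed tools differ so much: the preprocessing choices largely determine which clones a technique can see.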

    Evaluation Methodologies in Software Protection Research

    Man-at-the-end (MATE) attackers have full control over the system on which the attacked software runs, and try to break the confidentiality or integrity of assets embedded in the software. Both companies and malware authors want to prevent such attacks. This has driven an arms race between attackers and defenders, resulting in a plethora of different protection and analysis methods. However, it remains difficult to measure the strength of protections, because MATE attackers can reach their goals in many different ways and no universally accepted evaluation methodology exists. This survey systematically reviews the evaluation methodologies of papers on obfuscation, a major class of protections against MATE attacks. For 572 papers, we collected 113 aspects of their evaluation methodologies, ranging from sample set types and sizes, through sample treatment, to the measurements performed. We provide detailed insights into how the academic state of the art evaluates both the protections and the analyses applied to them. In summary, there is a clear need for better evaluation methodologies. We identify nine challenges for software protection evaluations, which represent threats to the validity, reproducibility, and interpretation of research results in the context of MATE attacks.

    Securing web applications through vulnerability detection and runtime defenses

    Social networks, eCommerce, and online news attract billions of daily users. The PHP interpreter powers a host of web applications, including messaging, development environments, news, and video games. The abundance of personal, financial, and other sensitive information held by these applications makes them prime targets for cyber attacks. Considering the significance of safeguarding online platforms against cyber attacks, researchers investigated different approaches to protect web applications. However, regardless of the community’s achievements in improving the security of web applications, new vulnerabilities and cyber attacks occur on a daily basis (CISA, 2021; Bekerman and Yerushalmi, 2020). In general, cyber security threat mitigation techniques are divided into two categories: prevention and detection. In this thesis, I focus on tackling challenges in both prevention and detection scenarios and propose novel contributions to improve the security of PHP applications. Specifically, I propose methods for holistic analyses of both the web applications and the PHP interpreter to prevent cyber attacks and detect security vulnerabilities in PHP web applications. For prevention techniques, I propose three approaches called Saphire, SQLBlock, and Minimalist. I first present Saphire, an integrated analysis of both the PHP interpreter and web applications to defend against remote code execution (RCE) attacks by creating a system call sandbox. The evaluation of Saphire shows that, unlike prior work, Saphire protects web applications against RCE attacks in our dataset. Next, I present SQLBlock, which generates SQL profiles for PHP web applications through a hybrid static-dynamic analysis to prevent SQL injection attacks. My third contribution is Minimalist, which removes unnecessary code from PHP web applications according to prior user interaction. 
My results demonstrate that, on average, Minimalist debloats 17.78% of the source code in PHP web applications while removing up to 38% of security vulnerabilities. Finally, as a contribution to vulnerability detection, I present Argus, a hybrid static-dynamic analysis over the PHP interpreter that identifies a comprehensive set of PHP built-in functions through which an attacker can inject malicious input into web applications (i.e., injection-sink APIs). Using Argus, I discovered more than 300 injection-sink APIs in PHP 7.2, an order of magnitude more than the most exhaustive list used in prior work. Furthermore, I integrated Argus' results with existing program analysis tools, which identified 13 previously unknown XSS and insecure deserialization vulnerabilities in PHP web applications. In summary, I improve the security of PHP web applications through a holistic analysis of both the PHP interpreter and the web applications. I further apply hybrid static-dynamic analysis techniques to the PHP interpreter as well as PHP web applications to prevent cyber attacks and to detect previously unknown security vulnerabilities. These achievements are only possible due to the holistic analysis of the web stack put forth in my research.
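SQLBlock's central idea — learn the structure of benign queries issued by an application and reject queries whose structure deviates — can be sketched in a few lines. This is an illustrative approximation in Python, not SQLBlock's actual PHP implementation; the `skeleton` normalization and the class name are invented:

```python
import re

def skeleton(query):
    # Strip literals so only the query's structure remains.
    q = re.sub(r"'[^']*'", "?", query)   # string literals
    q = re.sub(r"\b\d+\b", "?", q)       # numeric literals
    return re.sub(r"\s+", " ", q).strip().lower()

class SQLProfile:
    """Hypothetical per-call-site whitelist of benign query skeletons."""

    def __init__(self):
        self.allowed = set()

    def learn(self, query):
        # Profiling phase: record skeletons of known-benign queries.
        self.allowed.add(skeleton(query))

    def check(self, query):
        # Enforcement phase: reject structurally unseen queries.
        return skeleton(query) in self.allowed

profile = SQLProfile()
profile.learn("SELECT * FROM users WHERE id = 42")
profile.check("SELECT * FROM users WHERE id = 7")           # True
profile.check("SELECT * FROM users WHERE id = 42 OR 1=1")   # False: extra clause
```

An injected `OR 1=1` changes the query's skeleton, so it is rejected even though every literal value has been abstracted away — which is why profiling structure rather than values is the key design choice.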

    A Survey of GPT-3 Family Large Language Models Including ChatGPT and GPT-4

    Large language models (LLMs) are a special class of pretrained language models obtained by scaling model size, pretraining corpus, and computation. Because of their large size and pretraining on large volumes of text data, LLMs exhibit special abilities that allow them to achieve remarkable performance in many natural language processing tasks without any task-specific training. The era of LLMs started with the OpenAI GPT-3 model, and the popularity of LLMs has increased rapidly since the introduction of models like ChatGPT and GPT-4. We refer to GPT-3 and its successor OpenAI models, including ChatGPT and GPT-4, as GPT-3 family large language models (GLLMs). With the ever-rising popularity of GLLMs, especially in the research community, there is a strong need for a comprehensive survey that summarizes recent research progress in multiple dimensions and can guide the research community with insightful future research directions. We start the survey with foundational concepts like transformers, transfer learning, self-supervised learning, pretrained language models, and large language models. We then present a brief overview of GLLMs and discuss their performance in various downstream tasks, specific domains, and multiple languages. We also discuss the data labelling and data augmentation abilities of GLLMs, their robustness, their effectiveness as evaluators, and finally conclude with multiple insightful future research directions. To summarize, this comprehensive survey will serve as a good resource for both academia and industry to stay updated on the latest research related to GPT-3 family large language models.

    Value Creation with Extended Reality Technologies - A Methodological Approach for Holistic Deployments

    As the computing capacity and transmission performance of information technologies increase, the number of possible application scenarios for Extended Reality (XR) technologies in companies grows. XR technologies are hardware systems, software tools, and content-creation methods for producing Virtual Reality, Augmented Reality, and Mixed Reality. By conveying content to users in an immersive, interactive, and intelligent way, XR technologies can increase productivity in companies and open up opportunities for growth. Although industrial XR applications have been the subject of scientific research for more than 25 years, they are still considered immature. The main reasons for this are the underlying complexity, research's focus on investigating specific application scenarios, the insufficient economic viability of deployment scenarios, and the lack of suitable implementation models for XR technologies. Fundamentally, the added value of technologies is unlocked by integrating them into the value-creation architecture of business models. This thesis therefore presents a methodology for deploying XR technologies in value creation. The main goal of the methodology is to enable the identification of suitable deployment scenarios and to master the complexity of implementation with a structured procedure. To allow holistic applicability, the methodology is based on an industry- and business-process-independent value-creation reference model. It further draws on a holistic morphology of XR technologies and follows an iterative introduction sequence. The value-creation model is represented by an existing potential, a value chain, a value-creation network, physical and digital resources, and the added value realized through the use of XR technologies.
XR technologies are represented by a morphological structure with application characteristics and the required technological resources. Implementation follows an iterative sequence that describes agile software development methods applicable to the underlying context and takes relevant stakeholders into account. The methodology focuses on a systematic approach that is universally applicable and considers the end user and the ecosystem of the value creation under consideration. To validate the methodology, XR technologies are deployed in two industrial use cases under real economic conditions. The use cases come from different industries, with different XR technology characteristics and different forms of value chains, in order to demonstrate the universal applicability of the methodology and to highlight relevant challenges in carrying out an XR technology deployment. With the presented methodology, companies can deploy XR technologies in their value creation in a targeted manner. It enables detailed implementation planning, a well-founded selection of application scenarios, the assessment of possible challenges and obstacles, and the targeted involvement of the relevant stakeholders. As a result, value creation is optimized with economic added value through XR technologies.

    Exploring Automated Code Evaluation Systems and Resources for Code Analysis: A Comprehensive Survey

    The automated code evaluation system (AES) is mainly designed to reliably assess user-submitted code. Owing to their extensive range of applications and the accumulation of valuable resources, AESs are becoming increasingly popular, yet research on their application and the exploration of their real-world resources for diverse coding tasks is still lacking. In this study, we conducted a comprehensive survey of AESs and their resources. The survey explores the application areas of AESs, their available resources, and resource utilization for coding tasks. Depending on their application, AESs are categorized into programming contests, programming learning and education, recruitment, online compilers, and additional modules. We explore the available datasets and other resources of these systems for research, analysis, and coding tasks. Moreover, we provide an overview of machine learning-driven coding tasks, such as bug detection, code review, comprehension, refactoring, search, representation, and repair; these tasks are performed using real-life datasets. In addition, we briefly discuss the Aizu Online Judge (AOJ) platform as a real example of an AES from the perspectives of system design (hardware and software), operation (competition and education), and research. We chose AOJ because of its scalability (programming education, competitions, and practice), open internal features (hardware and software), attention from the research community, open data (e.g., solution codes and submission documents), and transparency. We also analyze the overall performance of this system and the challenges perceived over the years.
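The core loop of an AES can be sketched as: run a submitted solution against input/expected-output pairs and emit a verdict per case. The sketch below is a toy illustration — real judges additionally sandbox execution and enforce time and memory limits — and the function name is an assumption, though the verdict codes (AC, WA, RE) follow common judge conventions:

```python
def evaluate(solution, test_cases):
    """Run a submitted solution against (args, expected) pairs and return
    one verdict per case: AC (accepted), WA (wrong answer), RE (runtime error).
    """
    verdicts = []
    for args, expected in test_cases:
        try:
            got = solution(*args)
        except Exception:
            verdicts.append("RE")   # submission crashed on this case
            continue
        verdicts.append("AC" if got == expected else "WA")
    return verdicts

# A sample "submission" judged against three cases:
evaluate(lambda a, b: a + b, [((1, 2), 3), ((0, 0), 0), ((2, 2), 5)])
# ["AC", "AC", "WA"]
```

The per-case verdict list is also what makes judge data useful for the machine learning tasks surveyed above: each submission comes with a fine-grained correctness label.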

    Development of a Software Module for Working with PDF Files Using Qt Framework

    This thesis is devoted to the development of a software module for working with PDF files. It covers the main concepts of developing software modules using the C++ programming language and the Qt framework, along with the advantages and disadvantages of this language. It also reviews existing solutions for working with PDF files, describes the module development process, and reports on the testing that was conducted. In addition, the explanatory note describes the supported operating systems and the module's configuration for those systems. The result of the work can be used by any user who is familiar with programming.

    Prácticas y herramientas para el desarrollo ágil de software seguro

    This work applies and relates cybersecurity concepts and procedures used today in the professional field, so that they can bring a new approach to the management of agile software development projects, whose origin dates back to 2001 with the Agile Manifesto. The Master's thesis "Prácticas y herramientas para el desarrollo ágil de software seguro" aims to synthesize and reflect on both the knowledge acquired in the Máster de Dirección de Proyectos Informáticos and the skills and experience gained during the student's working life. First, the concepts needed to understand the context of the work are described. Next, from a theoretical point of view, the current state of development methodologies oriented towards providing agility and security to applications is analyzed. After this, a set of practices, techniques, principles, and tools based on the student's experience is formulated, offering solutions for providing software with security by default in an agile way; these processes are reusable and applicable to any project or team, regardless of its specific nature and situation. Finally, the conclusions of the work are summarized. Máster Universitario en Dirección de Proyectos Informáticos (M133

    Understanding, Analysis, and Handling of Software Architecture Erosion

    Architecture erosion occurs when a software system's implemented architecture diverges from its intended architecture over time. Studies show that erosion impacts development, maintenance, and evolution, since it accumulates imperceptibly. Identifying early symptoms, such as architectural smells, enables erosion to be managed through refactoring. However, research still lacks a comprehensive understanding of erosion: it is unclear which symptoms are most common, and detection methods are scarce. This thesis establishes a landscape of erosion research, investigates its symptoms, and proposes identification approaches. A mapping study covers erosion definitions, symptoms, causes, and consequences. Key findings: 1) "architecture erosion" is the most commonly used term, with four perspectives on definitions and respective symptom types; 2) both technical and non-technical reasons contribute to erosion, which negatively impacts quality attributes, and practitioners can advocate addressing erosion to prevent failures; 3) detection and correction approaches are categorized, with consistency- and evolution-based approaches the most commonly mentioned. An empirical study explores practitioner perspectives through developer communities, surveys, and interviews. The findings reveal that associated practices such as code review, together with tools, identify symptoms, while the collected measures address erosion during implementation. Studying code review comments allows erosion to be analyzed in practice. One study reveals that architectural violations, duplicate functionality, and cyclic dependencies are the most frequent symptoms; symptoms decreased over time, indicating increased stability, and most were addressed after review. A second study explores violation symptoms in four projects, identifying 10 categories; refactoring and removing code address most violations, while some are disregarded. Machine learning classifiers using pre-trained word embeddings identify violation symptoms from code review comments. Key findings: 1) SVM with word2vec achieved the highest performance; 2) fastText embeddings also worked well; 3) 200-dimensional embeddings outperformed 100- and 300-dimensional ones; 4) an ensemble classifier further improved performance; 5) practitioners found the results valuable, confirming the approach's potential. Finally, an automated recommendation system identifies qualified reviewers for violations using similarity detection on file paths and comments. Experiments show that common similarity-detection methods perform well, outperforming a baseline approach, and that sampling techniques impact recommendation performance.
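One of the violation symptoms found most frequently in code reviews — cyclic dependencies between modules — can be detected mechanically once a dependency graph is extracted. The sketch below is a plain depth-first search over a hypothetical dependency map, not a tool from the thesis:

```python
def find_cycle(deps):
    """Return one cyclic dependency path among modules, or None.

    `deps` maps a module name to the modules it depends on. This is a
    standard DFS cycle check: a "gray" module is on the current path,
    so reaching it again means we closed a cycle.
    """
    color = {}   # module -> "gray" (on current path) or "black" (done)
    stack = []   # current DFS path

    def dfs(module):
        color[module] = "gray"
        stack.append(module)
        for dep in deps.get(module, ()):
            state = color.get(dep)
            if state == "gray":                       # back edge: cycle found
                return stack[stack.index(dep):] + [dep]
            if state is None:
                cycle = dfs(dep)
                if cycle:
                    return cycle
        color[module] = "black"
        stack.pop()
        return None

    for module in deps:
        if color.get(module) is None:
            cycle = dfs(module)
            if cycle:
                return cycle
    return None

find_cycle({"ui": ["core"], "core": ["db"], "db": ["ui"]})
# ["ui", "core", "db", "ui"]
```

Running such a check in continuous integration is one way the early-symptom detection advocated above can be automated before erosion accumulates.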