
    Probabilistic SynSet Based Concept Location

    Concept location is a common task in program comprehension, essential in many approaches to software maintenance and software evolution. An important goal of this process is to discover a mapping between source code and human-oriented concepts. Although programs are written in a strict and formal language, natural language terms and sentences, such as identifiers (variable or function names), constant strings, or comments, can still be found embedded in programs. Using concepts from terminology and natural language processing techniques, these terms can be exploited to discover clues about which real-world concepts the source code is addressing. This work extends symbol tables built by compilers with ontology-driven constructs, and extends synonym sets as defined in linguistics with Probabilistic SynSets created automatically from software-domain parallel corpora. Using a relational algebra, it then creates semantic bridges between program elements and human-oriented concepts to enhance concept location tasks.
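
    The abstract gives no code, so the following is only a minimal sketch of the core idea: split an identifier into terms and score candidate real-world concepts by probability-weighted synonym matches. All synsets, probabilities, and concept vocabularies below are invented for illustration, not the paper's actual Probabilistic SynSets.

        # Illustrative sketch only: synsets, probabilities, and concepts are
        # made up, not the paper's actual Probabilistic SynSets.
        import re

        # A probabilistic synset maps a term to weighted synonyms.
        PROB_SYNSETS = {
            "acct": {"account": 0.9, "accounting": 0.1},
            "bal":  {"balance": 0.8, "ball": 0.2},
        }

        # Human-oriented concepts described by their characteristic terms.
        CONCEPTS = {
            "banking": {"account", "balance", "deposit"},
            "sports":  {"ball", "team", "score"},
        }

        def split_identifier(ident):
            """Split camelCase / snake_case identifiers into lowercase terms."""
            parts = re.sub(r"([a-z])([A-Z])", r"\1 \2", ident).replace("_", " ")
            return parts.lower().split()

        def score_concepts(ident):
            """Score each concept by summed synonym probabilities of the identifier's terms."""
            scores = {c: 0.0 for c in CONCEPTS}
            for term in split_identifier(ident):
                for synonym, p in PROB_SYNSETS.get(term, {term: 1.0}).items():
                    for concept, vocab in CONCEPTS.items():
                        if synonym in vocab:
                            scores[concept] += p
            return scores

        print(score_concepts("acct_bal"))  # banking dominates: ~1.7 vs ~0.2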

    REmail - Integrating e-mail Communication in the Eclipse IDE

    Developers of software systems have to communicate about the project they are building. Especially when working in a distributed development team, such as on open source projects, developers must use asynchronous means of communication. Studies tell us that e-mails are by far the most used means of communication during distributed development, as opposed to instant messaging, commit comments, or code comments. We can therefore imagine that archives containing development e-mails enclose essential information concerning various source code entities. Unfortunately, such information gets lost with time, since relevant e-mails are hard to retrieve. We have developed REmail, an Eclipse plug-in that integrates e-mail communication into the IDE. It allows developers to seamlessly handle source code entities and the e-mails discussing them, without ever exiting the IDE. Using lightweight linking techniques, REmail retrieves all the e-mails relevant to the chosen source code entities and makes them available to the developer.
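
    The abstract does not detail REmail's linking techniques; one plausible lightweight approach is whole-word matching of entity names in e-mail text. The sketch below illustrates that idea only, with invented e-mails and entity names.

        # Minimal sketch of lightweight e-mail-to-entity linking; REmail's
        # real techniques are not described in the abstract. Data is made up.
        import re

        emails = [
            {"subject": "Bug in parser", "body": "JsonParser.parse() chokes on nested arrays."},
            {"subject": "Release notes", "body": "Shipping v2.1 next week."},
        ]

        def emails_for_entity(entity, emails):
            """Return e-mails whose subject or body mentions the entity as a whole word."""
            pattern = re.compile(r"\b%s\b" % re.escape(entity))
            return [m for m in emails
                    if pattern.search(m["subject"]) or pattern.search(m["body"])]

        for m in emails_for_entity("JsonParser", emails):
            print(m["subject"])  # -> Bug in parser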

    On construction, performance, and diversification for structured queries on the semantic desktop

    [no abstract]

    ETEASH - An Enhanced Tiny Encryption Algorithm for Secured Smart Home

    The proliferation of the "Internet of Things" (IoT) and its applications has affected every aspect of human endeavor, from smart manufacturing, agriculture, healthcare, and transportation to homes. The smart home is vulnerable to malicious attacks due to memory constraints that inhibit the use of traditional anti-malware and antivirus software and make the application of traditional cryptography for its security impractical. This work aimed at securing smart home devices by developing an enhanced Tiny Encryption Algorithm (TEA). The enhancement removes TEA's vulnerability to related-key attacks and its weakness of predictable keys through entropy shifting, stretching, and mixing techniques, making it usable for securing smart devices. The Enhanced Tiny Encryption Algorithm for Smart Home devices (ETEASH) was benchmarked against the original TEA using the runs test and the avalanche effect. ETEASH passed the runs test at a significance level of 0.05 for the null hypothesis, and it achieved an avalanche effect of 58.44% against 52.50% for TEA. These results show that ETEASH is more secure than standard TEA for securing smart home devices.
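
    The abstract does not specify ETEASH's modified round function, so the sketch below implements only the publicly documented baseline TEA cipher together with the avalanche-effect measurement used for benchmarking; the key and plaintext values are arbitrary.

        # Standard TEA block cipher (64-bit block, 128-bit key) plus an
        # avalanche-effect measurement. The ETEASH enhancements themselves are
        # not given in the abstract and are not reproduced here.
        MASK = 0xFFFFFFFF
        DELTA = 0x9E3779B9

        def tea_encrypt(block, key):
            """Encrypt a 64-bit block (v0, v1) with a 128-bit key (k0..k3) over 32 cycles."""
            v0, v1 = block
            k0, k1, k2, k3 = key
            s = 0
            for _ in range(32):
                s = (s + DELTA) & MASK
                v0 = (v0 + ((((v1 << 4) + k0) & MASK) ^ ((v1 + s) & MASK) ^ (((v1 >> 5) + k1) & MASK))) & MASK
                v1 = (v1 + ((((v0 << 4) + k2) & MASK) ^ ((v0 + s) & MASK) ^ (((v0 >> 5) + k3) & MASK))) & MASK
            return v0, v1

        def avalanche(block, key, bit):
            """Fraction of ciphertext bits that flip when one plaintext bit is flipped."""
            c0, c1 = tea_encrypt(block, key)
            if bit < 32:
                flipped = (block[0] ^ (1 << bit), block[1])
            else:
                flipped = (block[0], block[1] ^ (1 << (bit - 32)))
            d0, d1 = tea_encrypt(flipped, key)
            diff = ((c0 ^ d0) << 32) | (c1 ^ d1)
            return bin(diff).count("1") / 64.0

        key = (0x11111111, 0x22222222, 0x33333333, 0x44444444)
        print(avalanche((0xDEADBEEF, 0x01234567), key, bit=0))  # ideally close to 0.5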

    Discovering Loners and Phantoms in Commit and Issue Data

    The interlinking of commit and issue data has become a de facto standard in software development. Modern issue tracking systems, such as JIRA, automatically interlink commits and issues by extracting identifiers (e.g., issue keys) from commit messages. However, the conventions governing interlinking vary between software projects. For example, some projects enforce the use of identifiers for every commit while others have less restrictive conventions. In this work, we introduce a model called PaLiMod to enable the analysis of interlinking characteristics in commit and issue data. We surveyed 15 Apache projects to investigate differences and commonalities between linked and non-linked commits and issues. Based on the gathered information, we created a set of heuristics to interlink the residual of non-linked commits and issues. We present the characteristics of Loners and Phantoms in commit and issue data. The results of our evaluation indicate that the proposed PaLiMod model and heuristics enable automatic interlinking and can indeed reduce the residual of non-linked commits and issues in software projects.
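
    The identifier-extraction step the abstract mentions typically amounts to matching JIRA-style issue keys in commit messages. The sketch below shows only that baseline step, not the PaLiMod heuristics; all commit data is invented.

        # Minimal sketch of extracting JIRA-style issue keys from commit
        # messages to separate linked from non-linked (residual) commits.
        import re

        ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")  # e.g. HADOOP-1234

        commits = [
            {"sha": "a1b2c3", "message": "HADOOP-1234: fix NPE in scheduler"},
            {"sha": "d4e5f6", "message": "refactor config loading"},  # non-linked
        ]

        def link_commits(commits):
            """Partition commits into linked (with issue keys) and the residual."""
            linked, residual = [], []
            for c in commits:
                keys = ISSUE_KEY.findall(c["message"])
                (linked if keys else residual).append((c["sha"], keys))
            return linked, residual

        linked, residual = link_commits(commits)
        print(linked)    # [('a1b2c3', ['HADOOP-1234'])]
        print(residual)  # [('d4e5f6', [])]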

    SoK: Prudent Evaluation Practices for Fuzzing

    Fuzzing has proven to be a highly effective approach to uncovering software bugs over the past decade. After AFL popularized the groundbreaking concept of lightweight coverage feedback, the field of fuzzing has seen a vast amount of scientific work proposing new techniques, improving methodological aspects of existing strategies, or porting existing methods to new domains. All such work must demonstrate its merit by showing its applicability to a problem, measuring its performance, and often showing its superiority over existing works in a thorough, empirical evaluation. Yet fuzzing is highly sensitive to its target, environment, and circumstances, e.g., randomness in the testing process. After all, relying on randomness is one of the core principles of fuzzing, governing many aspects of a fuzzer's behavior. Combined with an environment that is often highly difficult to control, the reproducibility of experiments is a crucial concern and requires a prudent evaluation setup. To address these threats to validity, several works, most notably "Evaluating Fuzz Testing" by Klees et al., have outlined how a carefully designed evaluation setup should be implemented, but it remains unknown to what extent their recommendations have been adopted in practice. In this work, we systematically analyze the evaluation of 150 fuzzing papers published at top venues between 2018 and 2023. We study how existing guidelines are implemented and observe potential shortcomings and pitfalls. We find a surprising disregard of the existing guidelines regarding statistical tests and systematic errors in fuzzing evaluations. For example, when investigating reported bugs, we find that the search for vulnerabilities in real-world software leads to authors requesting and receiving CVEs of questionable quality. Extending our literature analysis to the practical domain, we attempt to reproduce the claims of eight fuzzing papers. These case studies allow us to assess the practical reproducibility of fuzzing research and identify archetypal pitfalls in evaluation design. Unfortunately, our reproduced results reveal several deficiencies in the studied papers, and we are unable to fully support and reproduce the respective claims. To help the field of fuzzing move toward a scientifically reproducible evaluation strategy, we propose updated guidelines for conducting a fuzzing evaluation that future work should follow.
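
    Among the guidelines from Klees et al. that this paper checks for is the use of statistical tests over repeated trials, typically a Mann-Whitney U test plus the Vargha-Delaney A12 effect size. A minimal sketch of such a comparison follows; the coverage numbers are invented.

        # Minimal sketch of a statistically sound fuzzer comparison:
        # multiple independent trials, Mann-Whitney U test, and the
        # Vargha-Delaney A12 effect size. Trial values below are invented.
        from scipy.stats import mannwhitneyu

        # Branch coverage after 24h, one value per independent trial.
        fuzzer_a = [1510, 1498, 1523, 1507, 1534, 1512, 1501, 1529, 1516, 1520]
        fuzzer_b = [1482, 1495, 1479, 1503, 1488, 1491, 1476, 1500, 1485, 1493]

        def a12(x, y):
            """Vargha-Delaney A12: probability that a random x beats a random y."""
            greater = sum(1 for xi in x for yi in y if xi > yi)
            equal = sum(1 for xi in x for yi in y if xi == yi)
            return (greater + 0.5 * equal) / (len(x) * len(y))

        stat, p = mannwhitneyu(fuzzer_a, fuzzer_b, alternative="two-sided")
        print(f"p-value = {p:.4f}, A12 = {a12(fuzzer_a, fuzzer_b):.2f}")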

    Unsupervised Green Object Tracker (GOT) without Offline Pre-training

    Supervised trackers trained on labeled data dominate the single object tracking field for superior tracking accuracy. The labeling cost and the huge computational complexity hinder their applications on edge devices. Unsupervised learning methods have also been investigated to reduce the labeling cost but their complexity remains high. Aiming at lightweight high-performance tracking, feasibility without offline pre-training, and algorithmic transparency, we propose a new single object tracking method, called the green object tracker (GOT), in this work. GOT conducts an ensemble of three prediction branches for robust box tracking: 1) a global object-based correlator to predict the object location roughly, 2) a local patch-based correlator to build temporal correlations of small spatial units, and 3) a superpixel-based segmentator to exploit the spatial information of the target frame. GOT offers competitive tracking accuracy with state-of-the-art unsupervised trackers, which demand heavy offline pre-training, at a lower computation cost. GOT has a tiny model size (<3k parameters) and low inference complexity (around 58M FLOPs per frame). Since its inference complexity is between 0.1%-10% of DL trackers, it can be easily deployed on mobile and edge devices
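
    The abstract names the three branches but not how their outputs are combined. As a purely illustrative sketch, one simple fusion scheme is a confidence-weighted average of the predicted boxes; the boxes and confidences below are invented, and GOT's actual fusion logic may differ entirely.

        # Purely illustrative fusion of three tracking-branch box predictions
        # by confidence-weighted averaging; not GOT's actual fusion logic.
        # Boxes are (x, y, w, h); confidences are invented.
        def fuse_boxes(predictions):
            """Confidence-weighted average of (box, confidence) predictions."""
            total = sum(conf for _, conf in predictions)
            return tuple(
                sum(box[i] * conf for box, conf in predictions) / total
                for i in range(4)
            )

        branches = [
            ((100, 50, 40, 60), 0.9),  # global object-based correlator
            ((104, 52, 38, 58), 0.7),  # local patch-based correlator
            ((98, 49, 42, 62), 0.5),   # superpixel-based segmentation branch
        ]
        print(fuse_boxes(branches))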

    Improving Software Project Health Using Machine Learning

    In recent years, systems that would previously live on different platforms have been integrated under a single umbrella. The increased use of GitHub, which offers pull requests, issue tracking, and version history, and its integration with other solutions such as Gerrit or Travis, as well as the response from competitors, created development environments that favour agile methodologies by increasingly automating non-coding tasks: automated build systems, automated issue triaging, etc. In essence, source-code hosting platforms shifted to continuous integration/continuous delivery (CI/CD) as a service. This facilitated a shift in development paradigms; adherents of agile methodology can now adopt a CI/CD infrastructure more easily. This has also created large, publicly accessible sources of source code together with related project artefacts: GHTorrent and similar datasets now offer programmatic access to the whole of GitHub. Project health encompasses traceability, documentation, and adherence to coding conventions: tasks that reduce maintenance costs and increase accountability, but may not directly impact features. Overfocus on health can slow velocity (new feature delivery), so the Agile Manifesto suggests developers should travel light, forgoing tasks focused on project health in favour of higher feature velocity. Obviously, injudiciously following this suggestion can undermine a project's chances for success. Simultaneously, this shift to CI/CD has allowed the proliferation of Natural Language, or Natural Language and Formal Language, textual artefacts that are programmatically accessible: GitHub and its competitors allow API access to their infrastructure to enable the creation of CI/CD bots. This suggests that approaches from Natural Language Processing and Machine Learning are now feasible and indeed desirable. This thesis aims to (semi-)automate tasks for this new paradigm and its attendant infrastructure by bringing to the foreground the relevant NLP and ML techniques. Under this umbrella, I focus on three synergistic tasks from this domain: (1) improving issue-pull-request traceability, which can aid existing systems to automatically curate the issue backlog as pull requests are merged; (2) untangling commits in a version history, which can aid the aforementioned traceability task as well as improve the usability of determining a fault-introducing commit, or cherry-picking via tools such as git bisect; (3) mixed-text parsing, to allow better API mining and open new avenues for project-specific code-recommendation tools.
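
    For the first of the three tasks, issue-pull-request traceability, a common textual baseline is TF-IDF cosine similarity between issue and pull-request descriptions. The sketch below shows that baseline only, with invented texts; the thesis's actual ML models are more sophisticated.

        # Minimal baseline for issue-to-pull-request traceability via TF-IDF
        # cosine similarity. All issue and PR texts below are invented.
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.metrics.pairwise import cosine_similarity

        issues = ["Crash when parsing empty config file",
                  "Add dark mode to settings page"]
        pull_requests = ["Fix parser crash on empty configuration input",
                         "Implement dark theme toggle in settings"]

        vectorizer = TfidfVectorizer()
        matrix = vectorizer.fit_transform(issues + pull_requests)
        issue_vecs, pr_vecs = matrix[:len(issues)], matrix[len(issues):]

        # For each issue, rank pull requests by textual similarity.
        similarity = cosine_similarity(issue_vecs, pr_vecs)
        for i, issue in enumerate(issues):
            best = similarity[i].argmax()
            print(f"{issue!r} -> PR {best}: {similarity[i][best]:.2f}")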

    Biologically Inspired Intrusion Prevention and Self-Healing System for Critical Services Network

    With the explosive development of critical services network systems and the Internet, the need for network security systems has become even more critical with the expansion of information technology in everyday life. An Intrusion Prevention System (IPS) provides an in-line mechanism focused on identifying and blocking malicious network activity in real time. This thesis presents a new intrusion prevention and self-healing (SH) system for critical services network security. The design features of the proposed system are inspired by the human immune system, integrated with a nonlinear pattern recognition classification algorithm and machine learning. Firstly, current intrusion prevention systems, biological innate and adaptive immune systems, autonomic computing, and self-healing mechanisms are studied and analyzed. The importance of intrusion prevention suggests that artificial immune systems (AIS) should incorporate abstraction models from the innate and adaptive immune systems, pattern recognition, machine learning, and self-healing mechanisms to provide an autonomous IPS with fast, highly accurate detection and prevention performance and survivability for critical services network systems. Secondly, a specification language, system design, and mathematical and computational models for the IPS and SH system are established, based upon nonlinear classification, prevention predictability trust, analysis, self-adaptation, and self-healing algorithms. Finally, the system is validated through simulation tests, measurement, benchmarking, and comparative studies. New benchmarking metrics for detection capability, prevention predictability trust, and self-healing reliability are introduced as contributions to the measurement and validation of the IPS and SH system. The software system, design theories, AIS features, the new nonlinear classification algorithm, and the self-healing system show how the presented systems can ensure safety for critical services networks and heal the damage caused by intrusions. This autonomous system improves on the performance of current intrusion prevention systems and maintains system continuity through its self-healing mechanism.
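
    The abstract does not give the thesis's concrete algorithms; as an illustration of the immune-system inspiration alone, the classic negative-selection scheme from the AIS literature is sketched below. All patterns, sizes, and thresholds are invented.

        # Classic negative-selection sketch from artificial immune systems:
        # detectors are generated so they do NOT match normal ("self") traffic
        # patterns, then anything a detector matches is flagged as non-self.
        # Illustrates the AIS inspiration only, not the thesis's algorithms.
        import random

        random.seed(42)
        L, THRESHOLD, N_DETECTORS = 12, 9, 50

        def random_pattern():
            return tuple(random.randint(0, 1) for _ in range(L))

        def matches(detector, pattern):
            """Match if detector and pattern agree on at least THRESHOLD bits."""
            return sum(d == p for d, p in zip(detector, pattern)) >= THRESHOLD

        self_set = [random_pattern() for _ in range(20)]  # normal traffic profiles

        # Negative selection: keep only detectors that match no self pattern.
        detectors = []
        while len(detectors) < N_DETECTORS:
            d = random_pattern()
            if not any(matches(d, s) for s in self_set):
                detectors.append(d)

        def is_intrusion(pattern):
            return any(matches(d, pattern) for d in detectors)

        print(is_intrusion(random_pattern()))  # flags patterns unlike the self set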