213 research outputs found
On benchmarking of deep learning systems: software engineering issues and reproducibility challenges
Since AlexNet won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012, Deep Learning (and Machine Learning/AI in general) gained an exponential interest.
Nowadays, their adoption spreads over numerous sectors, like automotive, robotics, healthcare and finance.
The ML advancement goes in pair with the quality improvement delivered by those solutions.
However, those ameliorations are not for free: ML algorithms always require an increasing computational power, which pushes computer engineers to develop new devices capable of coping with this demand for performance.
To foster the evolution of DSAs, and thus ML research, it is key to make it easy to experiment and compare them. This may be challenging since, even if the software built around these devices simplifies their usage, obtaining the best performance is not always straightforward.
The situation gets even worse when the experiments are not conducted in a reproducible way.
Even though the importance of reproducibility for the research is evident, it does not directly translate into reproducible experiments. In fact, as already shown by previous studies regarding other research fields, also ML is facing a reproducibility crisis.
Our work addresses the topic of reproducibility of ML applications. Reproducibility in this context has two aspects: results reproducibility and performance reproducibility. While the reproducibility of the results is mandatory, performance reproducibility cannot be neglected because high-performance device usage causes cost. To understand how the ML situation is regarding reproducibility of performance, we reproduce results published for the MLPerf suite, which seems to be the most used machine learning benchmark.
Because of the wide range of devices and frameworks used in different benchmark submissions, we focus on a subset of accuracy and performance results submitted to the MLPerf Inference benchmark, presenting a detailed analysis of the difficulties a scientist may find when trying to reproduce such a benchmark and a possible solution using our workflow tool for experiment reproducibility: PROVA!.
We designed PROVA! to support the reproducibility in traditional HPC experiments, but we will show how we extended it to be used as a 'driver' for MLPerf benchmark applications.
The PROVA! driver mode allows us to experiment with different versions of the MLPerf Inference benchmark switching among different hardware and software combinations and compare them in a reproducible way.
In the last part, we will present the results of our reproducibility study, demonstrating the importance of having a support tool to reproduce and extend original experiments getting deeper knowledge about performance behaviours
Software-Defined Networking-Based Campus Networks Via Deep Reinforcement Learning Algorithms: The Case of University of Technology
As a consequence of the COVID-19 pandemic, networks need to be adopted to satisfy the new situation. People have been introduced to new modes of working from home, attending teleconferences, and taking part in e-learning. Other technologies, including smart cities, the Internet of Things, and simulation tools, have also seen a rise in demand. In the new situation, the network most affected is the campus network. Fortunately, a powerful and flexible network model called the software-defined network (SDN) is currently being standardized. SDN can significantly improve the performance of campus networks. Consequently, many scholars and experts have focused on enhancing campus networks via SDN technology.
Integrating deep reinforcement learning (DRL) with SDN is pivotal for advancing the quality of service (QoS) of contemporary networks. Their integration enables real-time collaboration, intelligent decision making, and optimized traffic flow and resource allocation.
The system proposed in this research is a DRL algorithm applied to a campus network—the University of Technology—and investigated as a case study. The proposed system employs a two-method approach for optimizing the QoS of a network. First, the system classifies service types and directs TCP traffic by using a deep Q-network (DQN) for intelligent routing; then, UDP traffic is managed using the Dijkstra algorithm for shortest-path selection. This hybrid model leverages the strengths of machine learning and classical algorithms to ensure efficient resource allocation and high-quality data transmission. The system combines the adaptability of DQN with the proven reliability of the Dijkstra algorithm to enhance dynamically the network performance.
The proposed hybrid system, which used DQN for TCP traffic and the Dijkstra algorithm for UDP traffic, was benchmarked against two other algorithms. The first algorithm was an advanced version of the Dijkstra algorithm that was designed specifically for this study. The second algorithm involved a Q-learning (QL)-based approach. The evaluation metrics included throughput and latency. Tests were conducted under various topologies and load conditions.
The research findings revealed a clear advantage of the hybrid system in complex network topologies under heavy-load conditions. The throughput of the proposed system was 30% higher than the advanced Dijkstra and QL algorithms. The latency benefits were pronounced, with a 50% improvement over the competing algorithms
Issues and Challenges for Network Virtualisation
In recent years, network virtualisation has been of great interest to researchers, being a relatively new and major paradigm in networking. This has been reflected in the IT industry where many virtualisation solutions are being marketed as revolutionary and purchased by enterprises to exploit these promised performances. Adversely, there are certain drawbacks like security, isolation and others that have conceded the network virtualisation. In this study, an investigation of the different state-of-the-art virtualisation technologies, their issues and challenges are addressed and besides, a guideline for a quintessential Network Virtualisation Environment (NVE) is been proposed. A systematic review was effectuated on selectively picked research papers and technical reports. Moreover a comparative study is performed on different Network Virtualisation technologies which include features like security, isolation, stability, convergence, outlay, scalability, robustness, manageability, resource management, programmability, flexibility, heterogeneity, legacy Support, and ease of deployment. The virtualisation technologies comprise Virtual Private Network (VPN), Virtual Local Area Network (VLAN), Virtual Extensible Local Area Network (VXLAN), Software Defined Networking (SDN) and Network Function Virtualisation (NFV). Conclusively the results exhibited the disparity as to the gaps of creating an ideal network virtualisation model which can be circumvented using these as a benchmark
Contributions to Securing Software Updates in IoT
The Internet of Things (IoT) is a large network of connected devices. In IoT, devices can communicate with each other or back-end systems to transfer data or perform assigned tasks. Communication protocols used in IoT depend on target applications but usually require low bandwidth. On the other hand, IoT devices are constrained, having limited resources, including memory, power, and computational resources. Considering these limitations in IoT environments, it is difficult to implement best security practices. Consequently, network attacks can threaten devices or the data they transfer. Thus it is crucial to react quickly to emerging vulnerabilities. These vulnerabilities should be mitigated by firmware updates or other necessary updates securely. Since IoT devices usually connect to the network wirelessly, such updates can be performed Over-The-Air (OTA). This dissertation presents contributions to enable secure OTA software updates in IoT. In order to perform secure updates, vulnerabilities must first be identified and assessed. In this dissertation, first, we present our contribution to designing a maturity model for vulnerability handling. Next, we analyze and compare common communication protocols and security practices regarding energy consumption. Finally, we describe our designed lightweight protocol for OTA updates targeting constrained IoT devices. IoT devices and back-end systems often use incompatible protocols that are unable to interoperate securely. This dissertation also includes our contribution to designing a secure protocol translator for IoT. This translation is performed inside a Trusted Execution Environment (TEE) with TLS interception. This dissertation also contains our contribution to key management and key distribution in IoT networks. In performing secure software updates, the IoT devices can be grouped since the updates target a large number of devices. Thus, prior to deploying updates, a group key needs to be established among group members. In this dissertation, we present our designed secure group key establishment scheme. Symmetric key cryptography can help to save IoT device resources at the cost of increased key management complexity. This trade-off can be improved by integrating IoT networks with cloud computing and Software Defined Networking (SDN).In this dissertation, we use SDN in cloud networks to provision symmetric keys efficiently and securely. These pieces together help software developers and maintainers identify vulnerabilities, provision secret keys, and perform lightweight secure OTA updates. Furthermore, they help devices and systems with incompatible protocols to be able to interoperate
Protecting Systems From Exploits Using Language-Theoretic Security
Any computer program processing input from the user or network must validate the input. Input-handling vulnerabilities occur in programs when the software component responsible for filtering malicious input---the parser---does not perform validation adequately. Consequently, parsers are among the most targeted components since they defend the rest of the program from malicious input. This thesis adopts the Language-Theoretic Security (LangSec) principle to understand what tools and research are needed to prevent exploits that target parsers. LangSec proposes specifying the syntactic structure of the input format as a formal grammar. We then build a recognizer for this formal grammar to validate any input before the rest of the program acts on it. To ensure that these recognizers represent the data format, programmers often rely on parser generators or parser combinators tools to build the parsers. This thesis propels several sub-fields in LangSec by proposing new techniques to find bugs in implementations, novel categorizations of vulnerabilities, and new parsing algorithms and tools to handle practical data formats. To this end, this thesis comprises five parts that tackle various tenets of LangSec. First, I categorize various input-handling vulnerabilities and exploits using two frameworks. First, I use the mismorphisms framework to reason about vulnerabilities. This framework helps us reason about the root causes leading to various vulnerabilities. Next, we built a categorization framework using various LangSec anti-patterns, such as parser differentials and insufficient input validation. Finally, we built a catalog of more than 30 popular vulnerabilities to demonstrate the categorization frameworks. Second, I built parsers for various Internet of Things and power grid network protocols and the iccMAX file format using parser combinator libraries. The parsers I built for power grid protocols were deployed and tested on power grid substation networks as an intrusion detection tool. The parser I built for the iccMAX file format led to several corrections and modifications to the iccMAX specifications and reference implementations. Third, I present SPARTA, a novel tool I built that generates Rust code that type checks Portable Data Format (PDF) files. The type checker I helped build strictly enforces the constraints in the PDF specification to find deviations. Our checker has contributed to at least four significant clarifications and corrections to the PDF 2.0 specification and various open-source PDF tools. In addition to our checker, we also built a practical tool, PDFFixer, to dynamically patch type errors in PDF files. Fourth, I present ParseSmith, a tool to build verified parsers for real-world data formats. Most parsing tools available for data formats are insufficient to handle practical formats or have not been verified for their correctness. I built a verified parsing tool in Dafny that builds on ideas from attribute grammars, data-dependent grammars, and parsing expression grammars to tackle various constructs commonly seen in network formats. I prove that our parsers run in linear time and always terminate for well-formed grammars. Finally, I provide the earliest systematic comparison of various data description languages (DDLs) and their parser generation tools. DDLs are used to describe and parse commonly used data formats, such as image formats. Next, I conducted an expert elicitation qualitative study to derive various metrics that I use to compare the DDLs. I also systematically compare these DDLs based on sample data descriptions available with the DDLs---checking for correctness and resilience
Open Source Law, Policy and Practice
This book examines various policies, including the legal and commercial aspects of the Open Source phenomenon. Here, ‘Open Source’ is adopted as convenient shorthand for a collection of diverse users and communities, whose differences can be as great as their similarities. The common thread is their reliance on, and use of, law and legal mechanisms to govern the source code they write, use, and distribute. The central fact of open source is that maintaining control over source code relies on the existence and efficacy of intellectual property (‘IP’) laws, particularly copyright law. Copyright law is the primary statutory tool that achieves the end of openness, although implemented through private law arrangements at varying points within the software supply chain. This dependent relationship is itself a cause of concern for some philosophically in favour of ‘open’, with some predicting (or hoping) that the free software movement will bring about the end of copyright as a means for protecting software
Modeling Deception for Cyber Security
In the era of software-intensive, smart and connected systems, the growing power and so-
phistication of cyber attacks poses increasing challenges to software security. The reactive
posture of traditional security mechanisms, such as anti-virus and intrusion detection
systems, has not been sufficient to combat a wide range of advanced persistent threats
that currently jeopardize systems operation. To mitigate these extant threats, more ac-
tive defensive approaches are necessary. Such approaches rely on the concept of actively
hindering and deceiving attackers. Deceptive techniques allow for additional defense by
thwarting attackers’ advances through the manipulation of their perceptions. Manipu-
lation is achieved through the use of deceitful responses, feints, misdirection, and other
falsehoods in a system. Of course, such deception mechanisms may result in side-effects
that must be handled. Current methods for planning deception chiefly portray attempts
to bridge military deception to cyber deception, providing only high-level instructions
that largely ignore deception as part of the software security development life cycle. Con-
sequently, little practical guidance is provided on how to engineering deception-based
techniques for defense. This PhD thesis contributes with a systematic approach to specify
and design cyber deception requirements, tactics, and strategies. This deception approach
consists of (i) a multi-paradigm modeling for representing deception requirements, tac-
tics, and strategies, (ii) a reference architecture to support the integration of deception
strategies into system operation, and (iii) a method to guide engineers in deception mod-
eling. A tool prototype, a case study, and an experimental evaluation show encouraging
results for the application of the approach in practice. Finally, a conceptual coverage map-
ping was developed to assess the expressivity of the deception modeling language created.Na era digital o crescente poder e sofisticação dos ataques cibernéticos apresenta constan-
tes desafios para a segurança do software. A postura reativa dos mecanismos tradicionais
de segurança, como os sistemas antivírus e de detecção de intrusão, não têm sido suficien-
tes para combater a ampla gama de ameaças que comprometem a operação dos sistemas
de software actuais. Para mitigar estas ameaças são necessárias abordagens ativas de
defesa. Tais abordagens baseiam-se na ideia de adicionar mecanismos para enganar os
adversários (do inglês deception). As técnicas de enganação (em português, "ato ou efeito
de enganar, de induzir em erro; artimanha usada para iludir") contribuem para a defesa
frustrando o avanço dos atacantes por manipulação das suas perceções. A manipula-
ção é conseguida através de respostas enganadoras, de "fintas", ou indicações erróneas
e outras falsidades adicionadas intencionalmente num sistema. É claro que esses meca-
nismos de enganação podem resultar em efeitos colaterais que devem ser tratados. Os
métodos atuais usados para enganar um atacante inspiram-se fundamentalmente nas
técnicas da área militar, fornecendo apenas instruções de alto nível que ignoram, em
grande parte, a enganação como parte do ciclo de vida do desenvolvimento de software
seguro. Consequentemente, há poucas referências práticas em como gerar técnicas de
defesa baseadas em enganação. Esta tese de doutoramento contribui com uma aborda-
gem sistemática para especificar e desenhar requisitos, táticas e estratégias de enganação
cibernéticas. Esta abordagem é composta por (i) uma modelação multi-paradigma para re-
presentar requisitos, táticas e estratégias de enganação, (ii) uma arquitetura de referência
para apoiar a integração de estratégias de enganação na operação dum sistema, e (iii) um
método para orientar os engenheiros na modelação de enganação. Uma ferramenta protó-
tipo, um estudo de caso e uma avaliação experimental mostram resultados encorajadores
para a aplicação da abordagem na prática. Finalmente, a expressividade da linguagem
de modelação de enganação é avaliada por um mapeamento de cobertura de conceitos
- …