
    Application for managing container-based software development environments

    Abstract. Virtualizing the software development process can enhance efficiency through unified, remotely managed environments. Docker containers, a popular technology in software development, are widely used for application testing and deployment. This thesis examines the use of containers as cloud-based development environments, exploring the history and implementation of container-based virtualization before presenting containers as a novel cloud-based software development environment. Virtual containers, like virtual machines, have long been used in software development for code testing, but not as development environments. Containers are also prevalent in the final stages of software production, specifically in the distribution and deployment of completed applications. In the practical part of the thesis, an application is implemented to improve the usability of a container-based development environment, addressing challenges in adopting new work environments. The work was conducted for a private company, and multiple experts provided input. The management application enhanced the container-based development environment’s efficiency by improving user rights management, virtual container management, and the user interface. Additionally, the new management tools reduced training time for new employees by 50%, facilitating their integration into the organization. Container-based development environments with efficient management tools provide a secure, efficient, and unified platform for large-scale software development. Virtual containers also hold potential for future improvements in energy-saving strategies and in harmonizing and integrating organizational work methods.
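    As an illustration of the kind of environment the thesis describes, the sketch below uses the Docker SDK for Python (docker-py) to start a disposable, container-based development environment. It is a minimal example under assumed conventions (the image, container name, mounted workspace, and resource limit are hypothetical) and is not the management application built in the thesis.

```python
# Illustrative sketch only: starting a disposable, container-based development
# environment with the Docker SDK for Python (docker-py). The image, mounted
# workspace path, and resource limit are hypothetical examples, not the
# thesis's actual management application.
import docker

client = docker.from_env()

dev_env = client.containers.run(
    image="python:3.11-slim",              # base toolchain for the developer
    command="sleep infinity",              # keep the environment alive
    name="dev-env-alice",                  # one environment per developer (assumed convention)
    volumes={"/srv/projects/alice": {"bind": "/workspace", "mode": "rw"}},
    working_dir="/workspace",
    mem_limit="2g",                        # cap the resources shared on the host
    detach=True,
)

print(dev_env.short_id, dev_env.status)
```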

    Empowering Machine Learning Development with Service-Oriented Computing Principles

    Despite the software industry’s successful utilization of Service-Oriented Computing (SOC) to streamline software development, machine learning (ML) development has yet to fully integrate these practices. This disparity can be attributed to multiple factors, such as the unique challenges inherent to ML development and the absence of a unified framework for incorporating services into this process. In this paper, we shed light on the disparities between service-oriented computing and machine learning development. To bridge this gap, we propose “Everything as a Module” (XaaM), a framework designed to encapsulate every ML artifact, including models, code, data, and configurations, as an individual module. We also propose a set of additional steps for empowering machine learning development with service-oriented computing, via an architecture that facilitates efficient management and orchestration of complex ML systems. By leveraging the best practices of service-oriented computing, we believe that machine learning development can achieve a higher level of maturity, improve the efficiency of the development process, and ultimately facilitate the more effective creation of machine learning applications.
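    To make the “Everything as a Module” idea more concrete, the sketch below wraps ML artifacts (a model and a configuration) behind one small module interface that a pipeline step can resolve uniformly. All class and method names are illustrative assumptions; the paper’s actual XaaM architecture is not reproduced here.

```python
# A minimal, hypothetical sketch of the "Everything as a Module" (XaaM) idea:
# every ML artifact (model, data, configuration) is wrapped behind the same
# small module interface so it can be versioned, composed, and served
# uniformly. Class and method names are illustrative, not the paper's API.
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass
class Module:
    name: str
    version: str
    metadata: Dict[str, Any] = field(default_factory=dict)

    def materialize(self) -> Any:
        """Return the concrete artifact (model object, dataset, config dict)."""
        raise NotImplementedError


@dataclass
class ModelModule(Module):
    loader: Callable[[], Any] = lambda: None   # e.g. loads weights from a registry

    def materialize(self) -> Any:
        return self.loader()


@dataclass
class ConfigModule(Module):
    values: Dict[str, Any] = field(default_factory=dict)

    def materialize(self) -> Dict[str, Any]:
        return dict(self.values)


def compose(*modules: Module) -> Dict[str, Any]:
    """Resolve a set of modules into the artifacts a pipeline step needs."""
    return {m.name: m.materialize() for m in modules}


# Usage: a training step asks for its model and configuration by module name.
artifacts = compose(
    ModelModule("classifier", "1.2.0", loader=lambda: "stub-model"),
    ConfigModule("train-config", "0.3.1", values={"lr": 1e-3, "epochs": 10}),
)
print(artifacts)
```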

    Drone Zoning

    The growing popularity of small civilian drones has generated a wide array of complex and unprecedented regulatory challenges. Many of these challenges, such as keeping drones away from manned aircraft, are matters that the Federal Aviation Administration (“FAA”) is clearly authorized and well equipped to address. However, several other drone policy challenges relate solely to drones’ potential to disrupt landowners’ privacy and to otherwise interfere with activities on the ground. The nature and severity of these conflicts often vary greatly depending on a drone’s specific location; drone uses that are welcomed in some city neighborhoods may be prohibitively disruptive in others. The FAA, a centralized federal agency, lacks the information and resources necessary to effectively regulate these inherently local drone use issues. Recognizing this fact, cities and states are increasingly crafting their own drone laws. Soon, some municipalities might even find it beneficial to adopt drone zoning ordinances that specifically restrict where, when, and under what conditions civilian drones may fly within their jurisdictions. Unfortunately, the FAA has taken the position that it holds extremely broad regulatory authority over nearly every aspect of civilian drone activity—a position that threatens to preclude the development of valuable state and local drone policies. What aspects of drone activity could be better regulated at the state or local level than at the federal level? And what principles should guide municipal governments as they craft drone policies for their own communities? This Article tackles these questions, highlighting the potential merits of greater state and local involvement in drone law and identifying foundational principles and concepts for the pioneering design of drone zoning ordinances.

    Automated Testing for Provisioning Systems of Complex Cloud Products

    Context: The proliferation of cloud computing enabled companies to shift their approach to infrastructure provisioning. The rise of cloud provisioning, enabled by virtualisation technologies, spawned the Infrastructure as a Service (IaaS) model. OutSystems takes advantage of the IaaS model to spin up infrastructure on demand while abstracting infrastructure management from end users. Problem: OutSystems’ orchestrator system handles the automated orchestration of the clients’ infrastructure, and it must be thoroughly tested. Problems arise because infrastructure provisioning takes a considerable amount of time, which dramatically lengthens the feedback loop for developers. Currently, the duration of the orchestrator tests hinders the ability to develop and deliver new features at a desirable pace. Objectives: The goals of this work include designing an efficient testing strategy that considers a microservices architecture with infrastructure provisioning capabilities, and integrating it into a Continuous Integration (CI)/Continuous Deployment (CD) pipeline. Methods: The solution applies multiple testing techniques that target different portions of the system and follow a pre-determined test distribution to guarantee a balanced test suite. The strategy was tested against a set of prototypes to evaluate its adequacy and efficiency. The strategy definition focuses on mapping the types of errors that each test level should tackle and is, therefore, independent of the employed technologies. Results: The devised strategy is integrated into a CI/CD pipeline and is capable of comprehensively testing the created prototypes while maintaining a short feedback loop. It also provides support for testing against commonly found errors in distributed systems in a deterministic way. Conclusions: The work developed in this dissertation met the outlined objectives, as the developed strategy proved its adequacy against the developed prototypes. Moreover, this work provides a solid starting point for the migration of the orchestrator system to a microservices architecture.
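    A minimal sketch of the central idea, assuming hypothetical `Orchestrator` and `CloudProvisioner` components: the orchestration logic is exercised against a test double instead of real IaaS provisioning, keeping the feedback loop short while still covering a typical distributed-systems failure (a transient timeout) deterministically. Run with pytest.

```python
# Hedged sketch only: test orchestration logic against a test double instead of
# provisioning real infrastructure. `Orchestrator` and `CloudProvisioner` are
# hypothetical stand-ins, not OutSystems' actual components.
from unittest.mock import MagicMock


class CloudProvisioner:
    def create_vm(self, size: str) -> str:
        raise NotImplementedError("talks to the IaaS provider; slow in real life")


class Orchestrator:
    def __init__(self, provisioner: CloudProvisioner):
        self.provisioner = provisioner

    def provision_environment(self, sizes: list[str]) -> list[str]:
        # Retry once per VM to tolerate transient IaaS failures.
        vm_ids = []
        for size in sizes:
            try:
                vm_ids.append(self.provisioner.create_vm(size))
            except TimeoutError:
                vm_ids.append(self.provisioner.create_vm(size))
        return vm_ids


def test_provision_retries_on_transient_timeout():
    provisioner = MagicMock(spec=CloudProvisioner)
    provisioner.create_vm.side_effect = [TimeoutError(), "vm-123"]

    orchestrator = Orchestrator(provisioner)

    assert orchestrator.provision_environment(["small"]) == ["vm-123"]
    assert provisioner.create_vm.call_count == 2
```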

    Contributions to Edge Computing

    Efforts related to Internet of Things (IoT), Cyber-Physical Systems (CPS), Machine to Machine (M2M) technologies, the Industrial Internet, and Smart Cities aim to improve society through the coordination of distributed devices and the analysis of the resulting data. By the year 2020 there will be an estimated 50 billion network-connected devices globally and 43 trillion gigabytes of electronic data. Current practices of moving data directly from end devices to remote and potentially distant cloud computing services will not be sufficient to manage future device and data growth. Edge computing is the migration of computational functionality to the sources of data generation. The importance of edge computing increases with the size and complexity of devices and the resulting data. In addition, the coordination of global edge-to-edge communications, shared resources, high-level application scheduling, monitoring, measurement, and Quality of Service (QoS) enforcement will be critical to addressing the rapid growth of connected devices and associated data. We present a new distributed, agent-based framework designed to address the challenges of edge computing. This actor-model framework implementation is designed to manage large numbers of geographically distributed services, composed from heterogeneous resources and communication protocols, in support of low-latency, real-time streaming applications. As part of this framework, an application description language was developed and implemented. Using the application description language, a number of high-order management modules were implemented, including solutions for resource and workload comparison, performance observation, scheduling, and provisioning. A number of hypothetical and real-world use cases are described to support the framework implementation.
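    The following is a minimal actor-model sketch in plain Python (a thread per actor, a queue as its mailbox), included only to illustrate the style of framework the dissertation describes; it is not the dissertation's implementation, and the edge-aggregation use case is a made-up example.

```python
# Minimal actor-model sketch: each actor owns a mailbox and processes one
# message at a time on its own thread. Names and the use case are hypothetical.
import threading
import queue


class Actor:
    """An actor owns a mailbox and processes one message at a time."""

    def __init__(self, name: str, handler):
        self.name = name
        self.handler = handler            # called for every message
        self.mailbox: queue.Queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message) -> None:
        self.mailbox.put(message)

    def _run(self) -> None:
        while True:
            message = self.mailbox.get()
            if message is None:           # poison pill shuts the actor down
                break
            self.handler(message)

    def stop(self) -> None:
        self.mailbox.put(None)
        self._thread.join()


# Usage: an "edge" actor pre-aggregates sensor readings before forwarding them.
readings = []
edge_aggregator = Actor("edge-aggregator", handler=lambda r: readings.append(r))
for value in (21.5, 21.7, 21.6):
    edge_aggregator.send(value)
edge_aggregator.stop()
print(sum(readings) / len(readings))
```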

    Towards Reproducible and Privacy-preserving Analyses Across Federated Repositories for Omics data

    Even when duly anonymized, health research data has the potential to be disclosive and therefore requires special safeguards under the European General Data Protection Regulation (GDPR). Furthermore, the incorporation of the FAIR principles (Findable, Accessible, Interoperable, Reusable) for a more favorable reuse of existing data calls for an approach where sensitive data is kept locally and only metadata and aggregated results are shared. Additionally, since central pooling is discouraged by ethical, legal, and societal issues, data management frameworks and platforms increasingly adopt a federated approach. Current implementations of privacy-preserving analysis frameworks seem to be limited when data becomes very large (millions of rows, hundreds of variables). Biological sample data collected by high-throughput technologies, such as Next Generation Sequencing (NGS), which allows entire genomes to be sequenced, is an example of this kind of data. The term "genomics" refers to the field of science that studies genomes. The Omics technologies aim to systematically identify all mRNA (transcriptomics), proteins (proteomics), and metabolites (metabolomics), respectively, present in a given biological sample. In the particular case of Omics data, these data are produced by computational workflows known as bioinformatics pipelines. The reproducibility of these pipelines is hard to achieve and is often underestimated. Nevertheless, it is important to generate trust in scientific results, and it is therefore fundamental to know how these Omics data were generated or obtained. This work leverages the promising results of current open-source implementations for distributed privacy-preserving analyses, while aiming to generalize the approach and address some of their shortcomings. To enable the privacy-preserving analysis of Omics data, we introduced the "resource" concept, implemented in one of the studied solutions. The results were promising: the privacy-preserving analysis was effective when using the DataSHIELD framework in conjunction with the "resource" R package. We also concluded that the adoption of specialized DataSHIELD packages for Omics analyses is a viable pathway to leverage privacy-preserving analysis for Omics data. To address the reproducibility challenges, we defined a database model to represent the steps, commands, and operations executed by bioinformatics pipelines. The database model is promising, but to meet all reproducibility requirements, including container support and integration with code-sharing platforms, it is necessary to use other tools, such as Nextflow or Snakemake, which offer dozens of other tested and mature functions.
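    DataSHIELD itself is an R framework, so the Python sketch below only illustrates the federated, privacy-preserving pattern the work builds on: each repository returns aggregates rather than row-level data and refuses to answer when too few records are involved, and the coordinator combines the aggregates. The disclosure threshold and all names are illustrative assumptions.

```python
# Federated, privacy-preserving aggregation sketch (not DataSHIELD's API):
# nodes return only summary statistics, with a disclosure check on small
# counts; the coordinator never sees row-level data. The 5-record threshold
# and all names are illustrative assumptions.

MIN_DISCLOSABLE_COUNT = 5


def local_aggregate(values: list[float]) -> dict:
    """Runs inside one data node; only summary statistics leave the node."""
    if len(values) < MIN_DISCLOSABLE_COUNT:
        raise PermissionError("result would be potentially disclosive")
    return {"n": len(values), "sum": sum(values)}


def federated_mean(node_datasets: dict[str, list[float]]) -> float:
    """Runs at the coordinator; combines per-node aggregates only."""
    total_n, total_sum = 0, 0.0
    for values in node_datasets.values():
        summary = local_aggregate(values)     # in reality a remote call per node
        total_n += summary["n"]
        total_sum += summary["sum"]
    return total_sum / total_n


# Usage: three hypothetical repositories holding gene-expression values.
print(federated_mean({
    "site-a": [2.1, 2.4, 1.9, 2.2, 2.0],
    "site-b": [1.8, 2.5, 2.3, 2.1, 2.2, 1.9],
    "site-c": [2.0, 2.6, 2.4, 2.1, 2.3],
}))
```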

    Scalable processing of aggregate functions for data streams in resource-constrained environments

    The fast evolution of data analytics platforms has resulted in an increasing demand for real-time data stream processing. From Internet of Things applications to the monitoring of telemetry generated in large datacenters, a common demand in currently emerging scenarios is the need to process vast amounts of data with low latency, generally performing the analysis as close to the data source as possible. Devices and sensors generate streams of data across a diversity of locations and protocols. That data usually reaches a central platform that is used to store and process the streams. Processing can be done in real time, with transformations and enrichment happening on the fly, but it can also happen after data is stored and organized in repositories. In the former case, stream processing technologies are required to operate on the data; in the latter, batch analytics and queries are commonly used. Stream processing platforms are required to be malleable and absorb spikes caused by fluctuations in data generation rates. Data is usually produced as time series that have to be aggregated using multiple operators, with sliding windows being one of the most common abstractions used to process data in real time. To satisfy the above-mentioned demands, efficient stream processing techniques that aggregate data with minimal computational cost need to be developed. However, data analytics might require aggregating extensive windows of data. Approximate computing has been a central paradigm in data analytics for decades, used to improve performance and reduce the resources needed, such as memory, computation time, bandwidth, or energy. In exchange for these improvements, the aggregated results suffer from a level of inaccuracy that in some cases can be predicted and constrained. This doctoral thesis aims to demonstrate that it is possible to have constant-time, memory-efficient aggregation functions with approximate computing mechanisms for constrained environments. In order to achieve this goal, the work has been structured around three research challenges. First, we introduce a runtime to dynamically construct data stream processing topologies based on user-supplied code. These dynamic topologies are built on the fly using a data subscription model defined by the applications that consume data. The subscription-based programming model enables multiple users to deploy their own data-processing services. On top of this runtime, we present the Amortized Monoid Tree Aggregator (AMTA), a general sliding window aggregation framework that seamlessly combines the following features: amortized O(1) time complexity with a worst case of O(log n) between insertions; a window aggregation mechanism and a window slide policy that are both user-programmable; enforcement of the window sliding policy with amortized O(1) computational cost for single evictions and support for bulk evictions with cost O(log n); and a local memory footprint of O(log n). The framework can compute aggregations over multiple data dimensions, and has been designed to decouple computation from data storage through the use of distributed key-value stores to keep window elements and partial aggregations. Especially motivated by edge computing scenarios, we contribute the Approximate and Amortized Monoid Tree Aggregator (A2MTA). It is, to our knowledge, the first general-purpose programmable sliding window framework that combines constant-time aggregations with error-bounded approximate computing techniques. A2MTA uses statistical analysis of the stream data in order to perform approximate, error-bounded aggregations, providing a critical reduction of the resources needed for massive stream data aggregation and an improvement in performance.
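    To illustrate the monoid idea behind constant-time sliding window aggregation, the sketch below implements the classic two-stack technique, which gives amortized O(1) insert, evict, and query for any associative operator with an identity element. It is a simplification for illustration only; the AMTA/A2MTA frameworks described above additionally provide programmable slide policies, O(log n) bulk evictions, distributed storage, and error-bounded approximation.

```python
# Simplified illustration of amortized O(1) sliding-window aggregation over a
# monoid (associative operator with identity), using the two-stack technique.
# Not the thesis's AMTA/A2MTA implementation.


class SlidingWindowAggregator:
    def __init__(self, combine, identity):
        self.combine = combine            # associative binary operator
        self.identity = identity
        self._front = []                  # (value, running aggregate) pairs, older elements
        self._back = []                   # (value, running aggregate) pairs, newer elements

    def insert(self, value):
        agg = self.combine(self._back[-1][1], value) if self._back else value
        self._back.append((value, agg))

    def evict(self):
        """Remove the oldest element; amortized O(1)."""
        if not self._front:
            # Flip the back stack so front entries store suffix aggregates.
            while self._back:
                value, _ = self._back.pop()
                agg = self.combine(value, self._front[-1][1]) if self._front else value
                self._front.append((value, agg))
        self._front.pop()

    def query(self):
        front = self._front[-1][1] if self._front else self.identity
        back = self._back[-1][1] if self._back else self.identity
        return self.combine(front, back)


# Usage: a sliding maximum over the most recent readings.
window = SlidingWindowAggregator(combine=max, identity=float("-inf"))
for i, reading in enumerate([3, 1, 4, 1, 5, 9, 2]):
    window.insert(reading)
    if i >= 3:
        window.evict()
    print(window.query())
```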

    Environmental Enforceability

    There are great expectations for a resurgence in federal environmental enforcement in a Biden-led federal government. Indeed, federal environmental enforcement suffered serious blows during the Trump Administration, particularly at the Environmental Protection Agency (EPA), including large cuts in the budget for enforcement and reversals of key enforcement policies. Yet, while repairing the damage is important, truly strengthening federal environmental enforcement will require more. This Article highlights the need for greater attention to the multiple hurdles that plague environmental enforcement. In doing so, it makes three contributions to the literature. First, it asserts that even though environmental statutes, regulations, and guidance documents often contain “enforceable” as an explicit term, in practice the term lacks scope and definition, making the actual enforceability of regulations dubious. Second, it demonstrates the difficulties with actual enforceability by examining key hurdles that become legal defenses for corporate and government defendants in environmental enforcement matters regarding regulatory exceptions, evidentiary standards, and the preemption and preclusion doctrines. Third, it recommends that drafters of environmental laws and regulations consider actual enforceability by addressing, within the documents they are drafting, the likely hurdles for enforcers after the law or regulation becomes effective. Although hurdles in environmental enforcement are important for regulatory flexibility, judicial expediency, and other normative values, they often result in a tradeoff with the enforceability of environmental laws and regulations. Grappling with such tradeoffs, within the law or regulation itself, is essential for meeting the expectations for enforcement held by regulated entities, researchers, environmental advocates, and most of all, local communities. After all, as noted in a March 2021 Grist news article, “laws are only as good as their enforcement.”

    Issue Creator

    With the constant development of the technology industry, software product development is increasingly driven by the analysis of user feedback gathered in issue tracking systems. This is because the ultimate success of any software product, and consequently of any technology-driven company, depends on whether the developed solutions meet the expectations of end users. E-goi is a company that provides a platform for multi-channel marketing automation, integrating channels that range from SMS and voice messages to e-mail and web push. For SaaS companies such as E-goi, user feedback is extremely important for improving their products and creating value for both the user and the company. When managing user feedback, it matters how it is delivered to the development teams, so that the problem at hand is easily understood with as much information as possible, making it feasible to replicate bugs and to create new features for the product. This, of course, must be achieved with minimal impact on the analysis of the issues and the subsequent development. However, gathering this feedback and delivering it to the product development teams at E-goi raises problems with information standardization, duplicate prevention, and the extra costs generated by the issue-tracking tools when the goal is to let the entire company provide such feedback as well. To address this, E-goi decided to create a tool that allows all collaborators to submit issues: the Issue Creator. Nevertheless, the other problems described remained unsolved. This is where this project comes in, by developing a revamp of the platform that enables standardized issue reports, issue duplication prevention, and other features involving the integration of different platforms, so as to simplify the actions that are essential to the product development teams. This report introduces the identified problem, along with the objectives and the methodology followed. It then provides a full contextualization of how E-goi's organizational departments are distributed, with an emphasis on the product development department and its software development processes. Subsequently, it analyses the value of the solution and the requirements gathered during the elicitation phase as part of the requirements engineering practice, followed by a detailed view of the design proposed for the platform. Finally, the developed platform is evaluated both from a technical standpoint, through tests, and from the quality perspective as perceived by users, based on stakeholder answers gathered through questionnaires.
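    As an illustration of one of the features discussed above, the sketch below flags likely duplicate issues before submission by comparing normalized titles with a similarity ratio from Python's standard difflib module. The threshold and the example titles are assumptions, not E-goi's actual Issue Creator implementation.

```python
# Illustrative duplicate-issue check: compare the normalized title of a new
# issue against existing issue titles. The 0.8 threshold and the example
# titles are assumptions, not the actual Issue Creator implementation.
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    return " ".join(title.lower().split())


def find_possible_duplicates(new_title: str, existing_titles: list[str],
                             threshold: float = 0.8) -> list[str]:
    """Return existing issue titles whose similarity exceeds the threshold."""
    candidate = normalize(new_title)
    return [
        title for title in existing_titles
        if SequenceMatcher(None, candidate, normalize(title)).ratio() >= threshold
    ]


# Usage with hypothetical issue titles.
existing = [
    "Email campaign editor crashes when pasting images",
    "SMS credits not updating after purchase",
]
print(find_possible_duplicates("Email campaign editor crash when pasting an image", existing))
```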