Learning from, Understanding, and Supporting DevOps Artifacts for Docker
With the growing use of DevOps tools and frameworks, there is an increased
need for tools and techniques that support more than code. The current
state-of-the-art in static developer assistance for tools like Docker is
limited to shallow syntactic validation. We identify three core challenges in
the realm of learning from, understanding, and supporting developers writing
DevOps artifacts: (i) nested languages in DevOps artifacts, (ii) rule mining,
and (iii) the lack of semantic rule-based analysis. To address these challenges,
we introduce a toolset, binnacle, that enabled us to ingest 900,000 GitHub
repositories.
Focusing on Docker, we extracted approximately 178,000 unique Dockerfiles,
and also identified a Gold Set of Dockerfiles written by Docker experts. We
addressed challenge (i) by reducing the number of effectively uninterpretable
nodes in our ASTs by over 80% via a technique we call phased parsing. To
address challenge (ii), we introduced a novel rule-mining technique capable of
recovering two-thirds of the rules in a benchmark we curated. Through this
automated mining, we were able to recover 16 new rules that were not found
during manual rule collection. To address challenge (iii), we manually
collected a set of rules for Dockerfiles from commits to the files in the Gold
Set. These rules encapsulate best practices, avoid docker build failures, and
improve image size and build latency. We created an analyzer that used these
rules, and found that, on average, Dockerfiles on GitHub violated the rules
five times more frequently than the Dockerfiles in our Gold Set. We also found
that industrial Dockerfiles fared no better than those sourced from GitHub.
The learned rules and analyzer in binnacle can be used to aid developers in
the IDE when creating Dockerfiles, and in a post-hoc fashion to identify issues
in, and to improve, existing Dockerfiles. (Published in ICSE 2020.)
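The semantic, rule-based checks described above can be sketched in a few lines. The two rules below come from Docker's published best-practices guidance (combine `apt-get update` with `apt-get install` in one RUN; pass `-y` to `apt-get install` so builds do not block on a prompt), used here as stand-ins for binnacle's mined rules rather than a reproduction of them:

```python
import re

def check_dockerfile(text):
    """Return (line_number, message) pairs for two illustrative rules:
    1. `apt-get update` in its own RUN can be served from a stale cache
       layer, so it should share a RUN with `apt-get install`.
    2. `apt-get install` without -y hangs `docker build` on a prompt.
    """
    violations = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.upper().startswith("RUN"):
            continue
        body = stripped[3:]
        if "apt-get update" in body and "apt-get install" not in body:
            violations.append((lineno, "apt-get update without install in same RUN"))
        if re.search(r"apt-get\s+install(?!.*(?:-y|--yes|--assume-yes))", body):
            violations.append((lineno, "apt-get install without -y"))
    return violations
```

A real analyzer would work over the parsed AST (as binnacle does after phased parsing) rather than raw lines, but the rule shape, a predicate over a Dockerfile plus a diagnostic, is the same.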
Updating the Portal Delivery Process
The integration and deployment of an application is a regular activity in many organizations, creating value for clients and helping developers validate their work. Once completely manual, this process is now automated: an application can be built, tested, and deployed without any human interaction. Although some organizations may still be weighing the pros and cons, the truth is that Continuous Integration, Deployment, and Delivery are part of many organizations, helping them quickly discover errors or deliver a fix or new version without disrupting users' work. Grasshopper SI, a 17-year-old organization based in Maia, had not implemented these processes in any of its projects. To evaluate the impact of these practices on the organization, different approaches were studied and the best one was applied to a single project, while also searching for a way to reduce the downtime between updates. This dissertation describes the whole process, from research to evaluation, and assesses the benefits and disadvantages of applying these techniques.
Achieving Continuous Delivery of Immutable Containerized Microservices with Mesos/Marathon
In recent years, DevOps methodologies have extended traditional agile principles, bringing about a paradigm shift toward migrating applications to cloud-native architectures. Today, microservices, containers, and Continuous Integration/Continuous Delivery have become critical to any organization's transformation journey: developing lean artifacts and dealing with the growing demand to push new features and iterate rapidly to keep customers happy. Traditionally, applications have been packaged and delivered in virtual machines, but with the adoption of microservices architectures, containerized applications are becoming the standard way to deploy services to production. Thanks to container orchestration tools like Marathon, containers can now be deployed and monitored at scale with ease. Microservices and containers, along with container orchestration tools, disrupt and redefine DevOps, especially the delivery pipeline.
This Master's thesis project focuses on deploying highly scalable microservices, packaged as immutable containers, onto a Mesos cluster using a container orchestration framework called Marathon. This is achieved by implementing a CI/CD pipeline that brings into play current practices and tools such as Docker, Terraform, Jenkins, Consul, Vault, and Prometheus. The thesis aims to show why systems should be designed around a microservices architecture, how cloud-native applications are packaged into containers, and how service discovery and other current trends within the DevOps realm contribute to the continuous delivery pipeline. At BetterDoctor Inc., this project improved the average release cycle, increased team members' productivity and collaboration, and reduced infrastructure costs and deployment failure rates. With the CD pipeline in place, along with container orchestration tools, it was observed that the organization could achieve hyperscale computing as and when business demands.
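As a rough illustration of the deployment step, the sketch below builds a Marathon app definition and posts it to Marathon's documented `POST /v2/apps` endpoint. The app id, image tag, and resource figures are placeholder assumptions, not values from the thesis:

```python
import json
import urllib.request

def marathon_app(app_id, image, instances=2, cpus=0.5, mem=256):
    """Build a minimal Marathon app definition for a Docker container.
    Immutable delivery means every release ships a new image tag, so a
    rollback is simply re-posting the previous tag."""
    return {
        "id": app_id,
        "instances": instances,
        "cpus": cpus,
        "mem": mem,
        "container": {
            "type": "DOCKER",
            "docker": {"image": image, "network": "BRIDGE"},
        },
    }

def deploy(marathon_url, app):
    # POST the app definition; Marathon schedules it on the Mesos cluster.
    req = urllib.request.Request(
        marathon_url + "/v2/apps",
        data=json.dumps(app).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return urllib.request.urlopen(req)
```

Scaling out is then a matter of PUT-ing the same definition with a larger `instances` count; consult the Marathon REST API reference for the full app-definition schema.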
Tackling Version Management and Reproducibility in MLOps
The growing adoption of machine learning solutions requires advancements in applying best practices to maintain artificial intelligence systems in production. Machine Learning Operations (MLOps) incorporates DevOps principles into machine learning development, promoting automation, continuous delivery, monitoring, and training capabilities. Due to multiple factors, such as the experimental nature of the machine learning process or the need for model optimizations derived from changes in business needs, data scientists are expected to create multiple experiments to develop a model or predictor that satisfactorily addresses the main challenges of a given problem.
Since the re-evaluation of models is a constant need, metadata is constantly produced due to multiple experiment runs. This metadata is known as ML artifacts or assets. The proper lineage between these artifacts enables environment recreation, facilitating model reproducibility. Linking information from experiments, models, datasets, configurations, and code changes requires proper organization, tracking, maintenance, and version control of these artifacts.
This work will investigate the best practices, current issues, and open challenges related
to artifact versioning and management and apply this knowledge to develop an ML workflow that supports ML engineering and operationalization, applying MLOps principles that facilitate model reproducibility. Scenarios covering data preparation, model generation, comparison between model versions, deployment, monitoring, debugging, and retraining demonstrated how the selected frameworks and tools could be integrated to achieve that goal.
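One way to make the lineage idea above concrete is to derive an artifact's version identifier from a content hash over its inputs, so that identical dataset, hyperparameters, and code revision always resolve to the same version. This is an illustrative sketch; the field layout is an assumption, not a standard MLOps schema:

```python
import hashlib
import json

def artifact_version(dataset_bytes, params, code_rev):
    """Derive a stable version id from an experiment's inputs.
    Two runs with the same dataset, hyperparameters, and code
    revision produce the same id, enabling reproducibility checks."""
    h = hashlib.sha256()
    h.update(hashlib.sha256(dataset_bytes).digest())
    # Canonical JSON so hyperparameter key order does not change the hash.
    h.update(json.dumps(params, sort_keys=True).encode())
    h.update(code_rev.encode())
    return h.hexdigest()[:12]
```

A stored model tagged with this id can be traced back to exactly the inputs that produced it, which is the lineage property the text describes.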
DevOps, Continuous Integration and Continuous Deployment Methods for Software Deployment Automation
In the fast-paced landscape of software development, the need for efficient, reliable, and rapid deployment processes has become paramount. Manual deployment processes often lead to inefficiencies, errors, and delays, impacting the overall agility and reliability of software delivery. DevOps, as a cultural and collaborative approach, plays a central role in orchestrating the synergy between development and operations teams, fostering a shared responsibility for the entire software delivery lifecycle. Continuous Integration is a fundamental DevOps practice that involves regularly integrating code changes into a shared repository, triggering automated builds and tests. Continuous Deployment complements Continuous Integration by automating the release and deployment of validated code changes into production environments. The purpose of this research is to create a software deployment automation system that makes it easier and more reliable for organizations to deploy software. In conclusion, the results of this research show that by adopting DevOps, Continuous Integration, and Continuous Deployment, organizations can achieve enhanced collaboration, shortened release cycles, increased deployment frequency, consistent deployments, and improved overall software quality.
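The fail-fast behavior that makes such pipelines safe can be sketched minimally: stages run in order and the pipeline stops at the first failure, so a broken build never reaches the deploy stage. The stage names and boolean-returning convention here are illustrative, not part of any specific CI system:

```python
def run_pipeline(stages):
    """Run (name, step) pairs in order; each step returns True on success.
    Stops at the first failure and reports which stage failed."""
    completed = []
    for name, step in stages:
        if not step():
            return completed, name  # stages that ran, failing stage
        completed.append(name)
    return completed, None
```

In a real CI server the steps would shell out to build, test, and deploy commands, but the control flow, sequential stages with early exit, is the essence of the practice described above.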
Investigating Software Engineering Artifacts in DevOps Through the Lens of Boundary Objects
Software engineering artifacts are central to DevOps, enabling the collaboration of teams involved with integrating the development and operations domains. However, collaboration around DevOps artifacts has yet to receive detailed research attention. We apply the sociological concept of Boundary Objects to describe and evaluate the specific software engineering artifacts that enable a cross-disciplinary understanding. Using this focus, we investigate how different DevOps stakeholders can collaborate efficiently using common artifacts. We performed a multiple case study and conducted twelve semi-structured interviews with DevOps practitioners in nine companies. We elicited participants' collaboration practices, focusing on the coordination of stakeholders and the use of engineering artifacts as a means of translation. This paper presents a consolidated overview of four categories of DevOps Boundary Objects and eleven stakeholder groups relevant to DevOps. To help practitioners assess cross-disciplinary knowledge management strategies, we detail how DevOps Boundary Objects contribute to four areas of DevOps knowledge and propose derived dimensions to evaluate their use.
Report from GI-Dagstuhl Seminar 16394: Software Performance Engineering in the DevOps World
This report documents the program and the outcomes of GI-Dagstuhl Seminar
16394 "Software Performance Engineering in the DevOps World".
The seminar addressed the problem of performance-aware DevOps. Both DevOps
and performance engineering have been growing trends over the past one to two
years, in no small part due to the rising importance of identifying
performance anomalies in the operations (Ops) of cloud and big data systems and
feeding these back to development (Dev). However, so far, the research
community has treated software engineering, performance engineering, and cloud
computing mostly as individual research areas. We aimed to identify
opportunities for cross-community collaboration and to set the path for
long-lasting collaborations towards performance-aware DevOps.
The main goal of the seminar was to bring together young researchers (PhD
students in a later stage of their PhD, as well as PostDocs or Junior
Professors) in the areas of (i) software engineering, (ii) performance
engineering, and (iii) cloud computing and big data to present their current
research projects, to exchange experience and expertise, to discuss research
challenges, and to develop ideas for future collaborations.
Container Security in a Continuous Development Environment
The rise of the DevOps movement and the transition from a product economy
to a service economy drove significant changes in the software development
life cycle paradigm, among them the abandonment of the waterfall model in
favor of agile methods. Since DevOps is itself an agile method, it allows us
to monitor current releases, receive constant feedback from clients, and
improve the next software releases. Despite its extraordinary growth, DevOps
still presents limitations concerning security, which needs to be included in
the Continuous Integration/Continuous Deployment (CI/CD) pipelines used in
software development.
The massive adoption of cloud services and open-source software, the
widespread use of containers and related orchestration, and microservice
architectures broke all conventional models of software development. Thanks
to these new technologies, packaging and shipping new software now happens in
short cycles, and releases become almost instantly available to users worldwide.
The usual approach of attaching security at the end of the software
development life cycle (SDLC) is becoming obsolete, pushing the adoption of
DevSecOps or SecDevOps, which injects security into SDLC processes earlier
and prevents security defects or issues from reaching production.
This dissertation aims to reduce the impact of microservices' vulnerabilities
by examining the respective images and containers through a flexible and
adaptable set of analysis tools running in dedicated CI/CD pipelines. This
approach intends to provide a clean and secure collection of microservices
for later release in cloud production environments. To achieve this purpose,
we developed a solution that allows programming and orchestrating a battery
of tests: through a form, we can select several security-analysis tools, and
the solution performs this set of tests in a controlled way according to the
defined dependencies. To demonstrate the solution's effectiveness, we
programmed a battery of tests for different scenarios, defining the
security-analysis pipeline to incorporate various tools. Finally, we show
security tools working locally, which, when subsequently integrated into our
solution, return the same results.
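The abstract describes running the selected security-analysis tools "in a controlled way according to the defined dependencies". One minimal way to realize that is a topological sort over the tool-dependency graph; the sketch below uses Kahn's algorithm, with illustrative tool names rather than the dissertation's actual tool set:

```python
from collections import deque

def execution_order(deps):
    """deps maps each tool to the tools that must finish before it.
    Returns a valid execution order, or raises on a dependency cycle."""
    indegree = {tool: len(reqs) for tool, reqs in deps.items()}
    dependents = {tool: [] for tool in deps}
    for tool, reqs in deps.items():
        for req in reqs:
            dependents[req].append(tool)
    # Kahn's algorithm: repeatedly run tools with no unmet dependencies.
    ready = deque(sorted(t for t, d in indegree.items() if d == 0))
    order = []
    while ready:
        tool = ready.popleft()
        order.append(tool)
        for nxt in dependents[tool]:
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(deps):
        raise ValueError("cyclic tool dependencies")
    return order
```

Tools whose dependencies are already satisfied could equally be launched in parallel; the ordering constraint only requires that no tool starts before everything it depends on has finished.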