512 research outputs found

    LEAN DATA ENGINEERING. COMBINING STATE OF THE ART PRINCIPLES TO PROCESS DATA EFFICIENTLY

    The present work was developed during an internship, under the Erasmus+ Traineeship programme, at Fieldwork Robotics, a Cambridge-based company that develops robots to operate in agricultural fields. The robots collect data from commercial greenhouses with sensors and RealSense cameras, as well as with gripper cameras placed on the robotic arms. This data is recorded mainly in bag files, consisting of unstructured data, such as images, and semi-structured data, such as metadata about both the conditions in which the images were taken and the robot itself. Data was uploaded, extracted, cleaned and labelled manually before being used to train Artificial Intelligence (AI) algorithms to identify raspberries during the harvesting process. The amount of available data quickly escalates with every trip to the fields, which creates an ever-growing need for an automated process. This problem was addressed by creating a data engineering platform encompassing a data lake, a data warehouse and the required processing capabilities. The platform was built following a series of principles entitled Lean Data Engineering Principles (LDEP); systems that follow them are called Lean Data Engineering Systems (LDES). These principles urge one to start with the end in mind: process incoming batch or real-time data without wasting resources, limiting costs to what is strictly necessary to complete the job, in other words, being as lean as possible. The LDEP combine state-of-the-art ideas stemming from several fields, such as data engineering, software engineering and DevOps, with cloud technologies at their core. The proposed custom-made solution enabled the company to scale its data operations, labelling images almost ten times faster while cutting over 99.9% of the associated costs in comparison with the previous process.
In addition, the data lifecycle time was reduced from weeks to hours while maintaining coherent data quality, correctly identifying, for instance, 94% of the labels in comparison with a human counterpart.
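The manual upload–extract–clean–label flow that the platform automates can be pictured as a chained batch pipeline. The sketch below is a minimal illustration under invented assumptions: the stage names, record shape and trivial "labelling" rule are placeholders, not Fieldwork Robotics' actual code or model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class BatchPipeline:
    """Chains extract/clean/label stages so each field trip's data flows with no manual steps."""
    stages: List[Callable] = field(default_factory=list)

    def stage(self, fn: Callable) -> Callable:
        # Register a stage; decorator style keeps the pipeline definition declarative.
        self.stages.append(fn)
        return fn

    def run(self, records: List[Dict]) -> List[Dict]:
        # Apply each stage to every record, in registration order.
        for fn in self.stages:
            records = [fn(r) for r in records]
        return records


pipeline = BatchPipeline()


@pipeline.stage
def extract(record):
    # Pull the image payload and metadata out of a raw bag-file entry (hypothetical shape).
    return {"image": record["raw"], "meta": record.get("meta", {})}


@pipeline.stage
def clean(record):
    # Drop frames with no usable image payload.
    record["valid"] = bool(record["image"])
    return record


@pipeline.stage
def label(record):
    # Stand-in for the AI labelling model described in the abstract.
    record["label"] = "raspberry" if record["valid"] else "discard"
    return record


out = pipeline.run([{"raw": b"\x89img", "meta": {"cam": "wrist"}}, {"raw": b""}])
```

In a real LDES deployment each stage would be a cloud job reading from the data lake and writing to the warehouse; the chaining idea is the same.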

    SLA-Based Continuous Security Assurance in Multi-Cloud DevOps

    Multi-cloud applications, i.e. those deployed over multiple independent cloud providers, pose a number of challenges to security-aware development and operation. Security assurance in such applications is hard due to the lack of insight into the security controls applied by cloud providers and the need to control the security levels of all components and layers at once. This paper presents the MUSA approach to Service Level Agreement (SLA)-based continuous security assurance in multi-cloud applications. The paper details the proposed model for capturing the security controls in the offered application Security SLA and the approach to continuously monitor and assess those controls at the operation phase. This new approach makes it easy to align development security requirements with the controls monitored at operation, as well as to react early to any possible security incident or SLA violation. The MUSA project leading to this paper has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement No. 644429.
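The core monitoring loop of SLA-based assurance amounts to comparing measured values against the targets promised in the Security SLA and flagging violations for reaction. A minimal sketch, with entirely hypothetical control names and thresholds (not MUSA's actual SLA schema):

```python
# Targets promised in a hypothetical application Security SLA.
security_sla = {
    "encryption_at_rest": True,   # boolean control: must hold
    "patch_latency_days": 7,      # upper bound
    "uptime_pct": 99.9,           # lower bound
}


def assess(measured: dict) -> list:
    """Return the list of SLA controls violated in one monitoring cycle."""
    violations = []
    if measured["encryption_at_rest"] != security_sla["encryption_at_rest"]:
        violations.append("encryption_at_rest")
    if measured["patch_latency_days"] > security_sla["patch_latency_days"]:
        violations.append("patch_latency_days")
    if measured["uptime_pct"] < security_sla["uptime_pct"]:
        violations.append("uptime_pct")
    return violations


# One monitoring cycle: patching is 12 days behind, everything else within target.
violations = assess({"encryption_at_rest": True,
                     "patch_latency_days": 12,
                     "uptime_pct": 99.95})
```

Running this assessment continuously, per component and per cloud provider, is what lets operation react early to an SLA violation rather than discovering it after the fact.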

    Adaptive Big Data Pipeline

    Over the past three decades, data has evolved from a simple software by-product into one of companies' most important assets, used to understand their customers and foresee trends. Deep learning has demonstrated that big volumes of clean data generally provide more flexibility and accuracy when modeling a phenomenon. However, handling ever-increasing data volumes entails new challenges: the lack of expertise to select the appropriate big data tools for the processing pipelines, as well as the speed at which engineers can take such pipelines into production reliably, leveraging the cloud. We introduce a system called Adaptive Big Data Pipelines: a platform to automate data pipeline creation. It provides an interface to capture the data sources, transformations, destinations and execution schedule. The system builds up the cloud infrastructure, schedules and fine-tunes the transformations, and creates the data lineage graph. The system has been tested on data sets of 50 gigabytes, processing them in just a few minutes without user intervention. ITESO, A. C.
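The interface the abstract describes — capture sources, transformations, destinations and a schedule, then derive a lineage graph — can be sketched as a small declarative spec. Names, node identifiers and the schedule syntax below are invented for illustration, not the system's real API:

```python
class PipelineSpec:
    """Declarative pipeline spec: nodes (sources, transforms, destinations) plus lineage edges."""

    def __init__(self, schedule: str):
        self.schedule = schedule          # e.g. a cron-like execution schedule
        self.nodes = []
        self.edges = []                   # (parent, child) pairs = the lineage graph

    def add(self, name: str, depends_on=()) -> str:
        # Register a node and record one lineage edge per declared dependency.
        self.nodes.append(name)
        for parent in depends_on:
            self.edges.append((parent, name))
        return name


spec = PipelineSpec(schedule="@daily")
src = spec.add("s3://raw/events")                         # hypothetical source
cleaned = spec.add("clean_events", depends_on=[src])      # transformation
spec.add("warehouse.events", depends_on=[cleaned])        # destination table

lineage = sorted(spec.edges)
```

From such a spec, the platform described in the abstract would provision the cloud infrastructure and schedule the transformations; the lineage edges fall out of the declared dependencies for free.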

    Big data and natural environment. How does different data support different green strategies

    Big data is a growing trend in strategic management. Nevertheless, only a few studies envisage the potential offered by big data to sustain different green strategy typologies. This paper explores how firms can capture value from big data to improve green engagement, providing a conceptual model, built on a comprehensive literature review, that relates big data sources to the adoption of different green strategies. The main finding of the study is that companies that want to implement a Clean Innovation Strategy often turn to external partners to develop the architecture needed to exploit big data's potential.

    Enterprise Composition Architecture for Micro-Granular Digital Services and Products

    The digitization of our society changes the way we live, work, learn, communicate, and collaborate. This defines the strategic context for composing resilient enterprise architectures for micro-granular digital services and products. The change from a closed-world modeling perspective to a more flexible open-world composition and evolution of system architectures defines the moving context for adaptable systems, which are essential to enable the digital transformation. Enterprises are presently transforming their strategy and culture, together with their processes and information systems, to become more digital. The digital transformation deeply disrupts existing enterprises and economies. For years, new business opportunities have appeared that exploit the potential of the Internet and related digital technologies, like the Internet of Things, services computing, cloud computing, big data with analytics, mobile systems, collaboration networks, and cyber-physical systems. Digitization fosters the development of IT systems with many rather small and distributed structures, like the Internet of Things or mobile systems. In this paper, we focus on the continuous bottom-up integration of micro-granular architectures for a huge number of dynamically growing systems and services, like the Internet of Things and microservices, as part of a new digital enterprise architecture. To integrate micro-granular architecture models into living architectural model versions, we extend traditional enterprise architecture reference models with state-of-the-art elements for agile architectural engineering, supporting the digitalization of services with related products and their processes.

    Maturity model for DevOps

    Businesses today need to respond to customer needs at unprecedented speed. Driven by this need for speed, many companies are rushing to the DevOps movement. DevOps, the combination of Development and Operations, is a new way of thinking in the software engineering domain that has recently received much attention. Since DevOps was only recently introduced as a term and concept, no common understanding of what it means has yet been achieved, and its definitions often cover only part of the concept. Observed more closely, DevOps addresses both cultural and technical issues to achieve faster software delivery; it has a broad scope and can be seen as a movement, but it is still young and not yet formally defined. Moreover, no adoption models or fine-grained maturity models showing what to consider when adopting DevOps, and how to mature it, were identified. Consequently, this research attempts to fill these gaps: it presents a Systematic Literature Review to identify the determining factors contributing to the implementation of DevOps, including the main capabilities and areas with which it evolves. This resulted in a list of practices per area and capability, which was used in interviews with DevOps practitioners who, drawing on their experience, helped assign maturity levels to each practice. This combination of factors was used to construct a DevOps maturity model showing the areas and capabilities to be taken into account in the adoption and maturation of DevOps.
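A fine-grained maturity model of the kind described — practices grouped by area, each assigned a level — lends itself to a simple scoring rule: an area's maturity is the highest level at which every practice is adopted. The areas, practices and levels below are invented placeholders, not the model actually built in this research:

```python
# Hypothetical model: area -> level -> practices required at that level.
model = {
    "CI": {1: ["version control"], 2: ["automated build"], 3: ["trunk-based dev"]},
    "Monitoring": {1: ["basic logging"], 2: ["alerting"], 3: ["SLO dashboards"]},
}


def area_maturity(area: str, adopted: set) -> int:
    """Highest level whose practices (and all lower levels') are fully adopted."""
    level = 0
    for lvl in sorted(model[area]):
        if all(p in adopted for p in model[area][lvl]):
            level = lvl
        else:
            break  # levels are cumulative: a gap caps the maturity here
    return level


adopted = {"version control", "automated build", "basic logging"}
ci_level = area_maturity("CI", adopted)
mon_level = area_maturity("Monitoring", adopted)
```

The cumulative rule (a missing practice at level 2 caps the area at level 1, whatever holds at level 3) is one common convention in maturity models; a real assessment interview would populate `adopted` per organisation.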

    A multi-criteria decision-making approach applied to organisations' digital transformation strategy through responsible artificial intelligence in the cloud. A case study in the healthcare sector

    Unpublished thesis, Universidad Complutense de Madrid, Faculty of Statistical Studies, defended on 08-02-2023. Organisations are committed to understanding both the needs of their customers and the capabilities and plans of their competitors and partners, through processes for acquiring and evaluating market information in a systematic and anticipatory manner. Moreover, in recent years most organisations have decided that one of their main strategic objectives is to become a truly data-driven organisation in the current Big Data and Artificial Intelligence (AI) context (Moreno et al., 2019). They are willing to invest heavily in a data and AI strategy and to build enterprise data and AI platforms that enable this market-oriented vision (Moreno et al., 2019). This thesis presents a Multi-Criteria Decision-Making (MCDM) model (Saaty, 1988), an AI digital cloud transformation strategy and a conceptual cloud architecture to help AI leaders and organisations with their Responsible AI journey, capable of helping global organisations move from descriptive to prescriptive use of data and leverage existing cloud services to deliver a truly market-oriented approach in a much shorter time than traditional approaches...
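The cited MCDM foundation (Saaty, 1988) is the Analytic Hierarchy Process, whose central step derives priority weights from a pairwise comparison matrix. Below is a sketch of that step using the standard row geometric-mean approximation; the criteria and judgement values are illustrative, not taken from the thesis:

```python
import math

# Pairwise comparison matrix on Saaty's 1-9 scale (illustrative judgements):
# A[i][j] = how much criterion i is preferred over criterion j.
criteria = ["data quality", "cost", "time to market"]
A = [
    [1,     3,   5],
    [1 / 3, 1,   2],
    [1 / 5, 1 / 2, 1],
]


def ahp_weights(matrix):
    """Priority vector via the row geometric mean, normalised to sum to 1."""
    gm = [math.prod(row) ** (1 / len(row)) for row in matrix]
    total = sum(gm)
    return [g / total for g in gm]


w = ahp_weights(A)  # largest weight goes to "data quality" under these judgements
```

In an exact AHP the priorities are the principal eigenvector of the matrix; the geometric-mean rule is a widely used closed-form approximation that coincides with it for perfectly consistent matrices.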