7 research outputs found

    Distributed Job Scheduler

    Since Cron was released for Unix operating systems in 1975, it has made life easier for developers and system administrators by letting them schedule tasks to be launched autonomously. Although Cron is a simple and powerful tool, it has some drawbacks, such as a lack of visibility and added complexity, because scheduling tasks with crontab's notation can sometimes be difficult. As time went on, new approaches to job scheduling emerged, most of them providing a user-friendly interface for managing job scheduling, along with reports or statistics about job execution. Every day Jumia dispatches millions of marketing campaigns, including emails, newsletters, push notifications, SMS, and other channels, to encourage its customers to visit the e-commerce online store and other Jumia applications. The Jumia Marketing and Digital Services team's systems also use a job scheduler, called Eye Of Sauron (EOS). EOS is very useful; however, it was not designed well when it began and is now considered a problem for Jumia's business because it is not reliable. It is Eye Of Sauron's duty to trigger the dispatch process for all marketing campaigns sent to Jumia's users, so it needs to be well designed and trusted by Jumia's business stakeholders. This project addressed the problems of the original service. A new distributed job scheduler named Eye of Sauron v2 was designed and developed. It is composed of several components that can be scaled horizontally and/or vertically. The new system also uses a message broker for asynchronous communication and a relational database as its storage solution. 
The new job scheduler was considered successful because it achieved a quality score of 87% under a Quantitative Evaluation Framework (QEF) model that considers numerous aspects related not only to functionality but also to user interface and user experience.
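The abstract above notes that crontab's notation can be difficult to read. As a rough illustration only, the following Python sketch expands a single crontab field into the values it matches. The function name `expand_cron_field` and its parameters are hypothetical and are not taken from Cron or from the Eye of Sauron project. The sketch handles only `*`, lists, ranges, and `/step` suffixes.

```python
def expand_cron_field(field: str, low: int, high: int) -> list[int]:
    """Expand one crontab field (e.g. "*/15", "1-5", "0,30") into sorted values.

    `low` and `high` are the inclusive bounds of the field, for example
    0 and 59 for minutes. Hypothetical helper, for illustration only.
    """
    values = set()
    for part in field.split(","):
        # Split off an optional step suffix, as in "*/15" or "10-40/10".
        rng, _, step_text = part.partition("/")
        step = int(step_text) if step_text else 1
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            start_text, end_text = rng.split("-")
            start, end = int(start_text), int(end_text)
        else:
            # A bare number matches exactly one value.
            start = end = int(rng)
        if step < 1 or not (low <= start <= end <= high):
            raise ValueError(f"invalid cron field part: {part!r}")
        values.update(range(start, end + 1, step))
    return sorted(values)


# Minute field of "*/15 * * * *": every quarter of an hour.
print(expand_cron_field("*/15", 0, 59))
# Day-of-week field "1-5": Monday to Friday.
print(expand_cron_field("1-5", 0, 6))
```

Even this reduced subset needs several branches, which hints at why schedulers with a graphical interface became attractive.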

    System support for object replication in distributed systems

    Distributed systems are composed of a collection of cooperating but failure prone system components. The number of components in such systems is often large and, despite low probabilities of any particular component failing, the likelihood that there will be at least a small number of failures within the system at a given time is high. Therefore, distributed systems must be able to withstand partial failures. By being resilient to partial failures, a distributed system becomes more able to offer a dependable service and therefore more useful. Replication is a well known technique used to mask partial failures and increase reliability in distributed computer systems. However, replication management requires sophisticated distributed control algorithms, and is therefore a labour intensive and error prone task. Furthermore, replication is in most cases employed due to applications' non-functional requirements for reliability, as dependability is generally an orthogonal issue to the problem domain of the application. If system level support for replication is provided, the application developer can devote more effort to application specific issues. Distributed systems are inherently more complex than centralised systems. Encapsulation and abstraction of components and services can be of paramount importance in managing their complexity. The use of object oriented techniques and languages, providing support for encapsulation and abstraction, has made development of distributed systems more manageable. In systems where applications are being developed using object-oriented techniques, system support mechanisms must recognise this, and provide support for the object-oriented approach. The architecture presented exploits object-oriented techniques to improve transparency and to reduce the application programmer involvement required to use the replication mechanisms. 
This dissertation describes an approach to implementing system support for object replication, which is distinct from other approaches such as replicated objects in that objects are not specially designed for replication. Additionally, object replication, in contrast to data replication, is a function-shipping approach and deals with the replication of both operations and data. Object replication is complicated by objects' encapsulation of local state and the arbitrary interaction patterns that may exist among objects. Although fully transparent object replication has not been achieved, my thesis is that partial system support for replication of program-level objects is practicable and assists the development of certain classes of reliable distributed applications. I demonstrate the usefulness of this approach by describing a prototype implementation and showing how it supports the development of an example toy application. To increase their flexibility, the system support mechanisms described are tailorable. The approach adopted in this work is to provide partial support for object replication, relying on some assistance from the application developer to supply application-dependent functionality within particular collators that process the results returned by object replicas. Care is taken to make the programming model as simple and concise as possible.
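The abstract above mentions collators that process the results returned by object replicas. The following Python sketch is one way to picture that idea, assuming a simple majority vote. It is not the dissertation's actual interface: `majority_collator` and `invoke_replicated` are hypothetical names, and the voting policy is an assumption chosen for illustration.

```python
from collections import Counter
from typing import Any, Callable, Hashable, Iterable, Sequence


def majority_collator(results: Sequence[Hashable]) -> Hashable:
    """Return the value reported by a strict majority of the replies received."""
    if not results:
        raise ValueError("no replica results to collate")
    value, count = Counter(results).most_common(1)[0]
    if count * 2 <= len(results):
        raise RuntimeError("no majority among replica results")
    return value


def invoke_replicated(
    replicas: Iterable[Any],
    operation: Callable[[Any], Hashable],
    collator: Callable[[Sequence[Hashable]], Hashable] = majority_collator,
) -> Hashable:
    """Ship the same operation to every replica and collate the replies."""
    results = []
    for replica in replicas:
        try:
            results.append(operation(replica))
        except Exception:
            # A failed replica is skipped, so the remaining replicas
            # mask the partial failure (assumed policy for this sketch).
            continue
    return collator(results)


# Three toy "replicas"; the third one fails when the operation runs on it.
print(invoke_replicated([1, 1, 0], lambda r: 10 // r))
```

Passing the collator as a parameter mirrors the tailorable, application-supplied role the abstract describes: an application could substitute a different policy, such as accepting the first reply.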

    Managing Smartphone Testbeds with SmartLab

    The explosive growth in the number of smartphones with ever-growing sensing and computing capabilities has brought a paradigm shift to many traditional domains of the computing field. Re-programming smartphones and instrumenting them for application testing and data gathering at scale is currently a tedious and time-consuming process that poses significant logistical challenges. In this paper, we make three major contributions. First, we propose a comprehensive architecture, coined SmartLab, for managing a cluster of both real and virtual smartphones that are either wired to a private cloud or connected over a wireless link. Second, we propose and describe a number of Android management optimizations (e.g., command pipelining, screen-capturing, file management), which can be useful to the community for building similar functionality into their systems. Third, we conduct extensive experiments and microbenchmarks to support our design choices, providing qualitative evidence on the expected performance of each module comprising our architecture. This paper also gives an overview of experiences of using SmartLab in a research-oriented setting, as well as ongoing and future development efforts.

    Integrated Software Architecture-Based Reliability Prediction for IT Systems

    With the increasing importance of reliability in business and industrial IT systems, new techniques for architecture-based software reliability prediction are becoming an integral part of the development process. This dissertation introduces a novel reliability modelling and prediction technique that considers the software architecture with its component structure, control and data flow, recovery mechanisms, its deployment to distributed hardware resources and the system's usage profile.
