
    MOON: MapReduce On Opportunistic eNvironments

    Abstract—MapReduce offers a flexible programming model for processing and generating large data sets on dedicated resources, where only a small fraction of such resources are ever unavailable at any given time. In contrast, when MapReduce is run on volunteer computing systems, which opportunistically harness idle desktop computers via frameworks like Condor, it delivers poor performance due to the volatility of the resources, in particular the high rate of node unavailability. Specifically, the data and task replication scheme adopted by existing MapReduce implementations is woefully inadequate for resources with high unavailability. To address this, we propose MOON, short for MapReduce On Opportunistic eNvironments. MOON extends Hadoop, an open-source implementation of MapReduce, with adaptive task and data scheduling algorithms in order to offer reliable MapReduce services on a hybrid resource architecture, where volunteer computing systems are supplemented by a small set of dedicated nodes. The adaptive task and data scheduling algorithms in MOON distinguish between (1) different types of MapReduce data and (2) different types of node outages in order to strategically place tasks and data on both volatile and dedicated nodes. Our tests demonstrate that MOON can deliver a 3-fold performance improvement over Hadoop in volatile, volunteer computing environments.
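The hybrid placement idea in the abstract — anchoring critical data on dedicated nodes while replicating more aggressively across volatile ones — can be sketched as a simple availability calculation. This is an illustrative sketch, not MOON's actual algorithm or API: the function names, the 0.6 node-availability figure, and the one-dedicated-copy policy are all assumptions invented for the example.

```python
import math

def replicas_needed(p_avail, target):
    """Smallest k with 1 - (1 - p_avail)**k >= target: enough volatile
    replicas that at least one is reachable with the desired probability,
    assuming nodes fail independently."""
    return math.ceil(math.log(1 - target) / math.log(1 - p_avail))

def place_replicas(block_type, dedicated, volatile, p_avail=0.6, target=0.99):
    """Pick hosts for one data block on a hybrid dedicated/volatile pool.
    (Hypothetical policy for illustration, not MOON's implementation.)"""
    if block_type == 'input':
        # Critical, long-lived data: anchor one copy on a dedicated node,
        # plus a single volatile copy for read locality.
        return [dedicated[0], volatile[0]]
    # Intermediate data lives only as long as the job: spread copies over
    # volatile nodes until the availability target is met.
    return volatile[:replicas_needed(p_avail, target)]
```

With 60 % per-node availability, six volatile copies are needed to reach 99 % block availability, which illustrates why purely opportunistic replication is expensive and why a single dedicated copy changes the picture.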

    Simulation of timber harvesting chains (Puun korjuuketjujen simulointi).


    Frameworks for enhancing temporal interface behaviour through software architectural design

    The work reported in this thesis is concerned with understanding aspects of temporal behaviour. A large part of the thesis is based on analytical studies of temporal properties and of interface and architectural concerns. The main areas covered include: i. analysing long-term human processes and the impact of interruptions and delays; ii. investigating how infrastructures can be designed to support synchronous fast-pace activity; iii. design of the Getting-to-Know (GtK) experimental notification server. The work is motivated by the failure of many collaborative systems to effectively manage temporal behaviour at the interface level, as they often assume that interaction is taking place over fast, reliable local area networks. However, the Web has challenged this assumption, and users are faced with frequent network-related delays. The nature of cooperative work increases the importance of timing issues. Collaborative users require both rapid feedback of their own actions and timely feedthrough of others' actions. Although it may appear that software architectures are about the internals of system design and not a necessary concern for the user interface, internal details do show up at the surface in non-functional aspects, such as timing. The focus of this work is on understanding the behavioural aspects and how they are influenced by the infrastructure.
The thesis has contributed to several areas of research: (a) the study of long-term work processes generated a trigger analysis technique for task decomposition in HCI; (b) the analysis of architectures was later applied to investigate architectural options for mobile interfaces; (c) the framework for notification servers commenced a design vocabulary in CSCW for the implementation of notification services, with the aim of improving design; (d) the impedance matching framework facilitates both goal-directed feedthrough and awareness. In particular, (c) and (d) have been exercised in the development of the GtK separable notification server.

    Performance modelling of replication protocols

    PhD Thesis. This thesis is concerned with the performance modelling of data replication protocols. Data replication is used to provide fault tolerance and to improve the performance of a distributed system. Replication not only needs extra storage but also incurs an extra cost when performing an update. It is not always clear which algorithm will give the best performance in a given scenario, how many copies should be maintained, or where these copies should be located to yield the best performance. The consistency requirements also change with the application. One has to choose these parameters to maximize reliability and speed and minimize cost. A study showing the effect of changes in different parameters on the performance of these protocols would be helpful in making these decisions. With the use of data replication techniques in wide-area systems, where hundreds or even thousands of sites may be involved, it has become important to evaluate the performance of the schemes maintaining copies of data. This thesis evaluates the performance of replication protocols that provide different levels of data consistency, ranging from strong to weak consistency. Protocols that try to integrate strong and weak consistency are also examined. Queueing theory techniques are used to evaluate the performance of these protocols. The performance measures of interest are the response times of read and write jobs. These times are evaluated both when replicas are reliable and when they are subject to random breakdowns and repairs. (Commonwealth Scholarship)
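One of the questions the thesis raises — how many copies to maintain — can be illustrated with a standard reliability calculation for quorum-based replication. The code below is a textbook computation, not a model taken from the thesis; independent failures and a single per-replica availability p are simplifying assumptions.

```python
from math import comb

def quorum_availability(n, q, p):
    """P(at least q of n replicas are up), assuming replicas fail
    independently and each is available with probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(q, n + 1))

def majority(n):
    """Quorum size for majority voting: both reads and writes need
    n // 2 + 1 replicas.  (Read-one/write-all instead needs 1 replica
    for a read but all n for a write.)"""
    return n // 2 + 1
```

For five replicas that are each up 90 % of the time, a read-one operation succeeds with probability 1 - 0.1^5 ≈ 0.99999, a majority quorum with ≈ 0.991, and a write-all operation with only 0.9^5 ≈ 0.59 — a concrete instance of the consistency/availability trade-off the thesis models.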

    Dynamic scheduling in a multi-product manufacturing system

    To remain competitive in the global marketplace, manufacturing companies need to improve their operational practices. One way to increase competitiveness in manufacturing is to implement a proper scheduling system, which is important to enable job orders to be completed on time, minimize waiting time, and maximize utilization of equipment and machinery. The dynamics of a real manufacturing system are very complex in nature. Schedules developed with deterministic algorithms are unable to deal effectively with uncertainties in demand and capacity, and significant differences can be found between planned schedules and their actual implementation. This study attempted to develop a scheduling system that can react quickly and reliably to accommodate changes in product demand and manufacturing capacity. A case study, a 6 by 6 job shop scheduling problem, was adapted with uncertainty elements added to the data sets. A simulation model was designed and implemented using the ARENA simulation package to generate various job shop scheduling scenarios. Their performance was evaluated using three scheduling rules, namely first-in-first-out (FIFO), earliest due date (EDD), and shortest processing time (SPT). An artificial neural network (ANN) model was developed and trained on the scheduling scenarios generated by the ARENA simulation. The experimental results suggest that the ANN scheduling model can provide moderately reliable predictions for limited scenarios when predicting the number of completed jobs, maximum flowtime, average machine utilization, and average queue length. This study has provided a better understanding of the effects of changes in demand and capacity on job shop schedules. Areas for further study include: (i) fine-tuning the proposed ANN scheduling model; (ii) considering a wider variety of job shop environments; (iii) incorporating an expert system for the interpretation of results.
The theoretical framework proposed in this study can be used as a basis for further investigation.
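The three dispatch rules evaluated in the study (FIFO, EDD, SPT) are easy to illustrate on a single machine. This is a deliberately simplified sketch — the study itself simulates a 6 by 6 job shop in ARENA — and the job data and function names below are invented for the example.

```python
from collections import namedtuple

Job = namedtuple('Job', 'name proc due')   # processing time, due date

RULES = {
    'FIFO': lambda jobs: list(jobs),                        # arrival order
    'SPT':  lambda jobs: sorted(jobs, key=lambda j: j.proc),
    'EDD':  lambda jobs: sorted(jobs, key=lambda j: j.due),
}

def evaluate(jobs, rule):
    """Sequence the jobs on one machine (all released at time 0) and
    return (average flow time, maximum lateness)."""
    t, flows, lateness = 0, [], []
    for job in RULES[rule](jobs):
        t += job.proc                 # completion time of this job
        flows.append(t)
        lateness.append(t - job.due)
    return sum(flows) / len(flows), max(lateness)
```

On a single machine, SPT provably minimizes average flow time and EDD minimizes maximum lateness, which is why these rules serve as natural baselines; the study's contribution is handling the uncertainty a deterministic sequence cannot.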

    Performance Modeling of Database Systems: a Survey, Journal of Telecommunications and Information Technology, 2018, nr 4

    This paper presents a systematic survey of existing database system performance evaluation models based on queueing theory. The continuous evolution of the methodologies developed is classified according to the mathematical modeling language used. The survey covers formal models, from queueing systems and queueing networks to queueing Petri nets. Some fundamentals of queueing system theory are presented, and queueing system models are classified according to their service time distribution. The paper introduces queueing networks and considers several classification criteria applicable to such models. The survey singles out methodologies that evaluate database performance at the integrated system level. Finally, queueing Petri nets are introduced, which combine the modeling power of queueing networks and Petri nets; two performance models within this formalism are investigated. We find that an insufficient amount of research effort is directed at NoSQL data stores: the vast majority of the models developed focus on traditional relational systems, and these models should be adapted to evaluate the performance of non-relational data stores.
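The queueing fundamentals such surveys build on can be shown with the simplest model in the family, the M/M/1 queue. Treating a database server as a single M/M/1 station is of course a gross simplification, used here only to show the mechanics of the formulas involved.

```python
def mm1_response_time(lam, mu):
    """Mean response time W = 1 / (mu - lam) for an M/M/1 queue:
    Poisson arrivals at rate lam, exponential service at rate mu."""
    if lam >= mu:
        raise ValueError("unstable: arrival rate must be below service rate")
    return 1.0 / (mu - lam)

def mm1_number_in_system(lam, mu):
    """Mean number of jobs in the system, L = rho / (1 - rho)
    with utilization rho = lam / mu."""
    rho = lam / mu
    return rho / (1 - rho)
```

For a server handling 80 queries/s with a service rate of 100 queries/s, mean response time is 50 ms and the mean number of in-flight queries is 4; the two results are tied together by Little's law, L = lam * W.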

    Understanding the rhythms of email processing strategies in a network of knowledge workers

    Scope and Method of Study: While email has improved the communication effectiveness of knowledge workers, it has also started to negatively impact their productivity. Email has long been known to provide value to the organization, but the influence of the overwhelming amount of information shared through email, and the inefficiencies surrounding its everyday use at work, has remained almost completely unanalyzed so far. Frequent notifications of new email, each followed by the user checking the inbox, escalate the interruption problem, so the overall effectiveness derived from email communication needs to be re-explored. This study uses a computational modeling approach to understand how various combinations of timing-based and frequency-based email processing strategies, adopted within different types of knowledge networks, can influence average email response time, average primary task completion time, and overall effectiveness, comprising value-effectiveness and time-effectiveness, in the presence of interruptions. Earlier research on the topic has focused on individual knowledge workers; this study performs a network-level analysis, comparing different sender-receiver relationships to assess the impact of different overall email policies on the entire network. Computational models of three different email exchange networks were developed, namely homogeneous networks of heavy email users, homogeneous networks of light email users, and heterogeneous networks utilizing various combinations of email strategies. A new method to evaluate and validate model parameters, referred to as the forward and reverse method, is also developed. Findings and Conclusions: Findings suggest the choice of email checking policy can impact time- and value-effectiveness. For example, rhythmic email processing strategies lead to lower value-effectiveness but higher time-effectiveness for all types of networks.
Email response times are generally higher with rhythmic policies than with arrhythmic policies; on the other hand, primary task completion times are usually lower with rhythmic policies. On average, organizations could potentially save 3 to 6 percent of the overall time spent per day by using email strategies that are more time-effective, but could lose 2.5 to 3.5 percent in communication value. These figures accumulate into significant time savings or value losses for large organizations.
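The trade-off between rhythmic (fixed-interval) and arrhythmic (on-arrival) checking can be sketched with a toy calculation. This is not the study's computational model: the arrival times, the interruption count as a proxy for primary-task disruption, and the policy details are all invented for illustration.

```python
def simulate(arrival_times, check_interval=None):
    """Toy comparison of email-checking policies.

    arrival_times:  minutes at which emails arrive during the work day
    check_interval: the rhythmic policy checks the inbox every this many
                    minutes; None means arrhythmic, i.e. every email is
                    handled the moment it arrives.
    Returns (average response time, number of work interruptions)."""
    if check_interval is None:
        # Immediate responses, but every single email interrupts the
        # primary task.
        return 0.0, len(arrival_times)
    # Under the rhythmic policy, an email arriving at time t waits until
    # the next scheduled check.
    responses = [(t // check_interval + 1) * check_interval - t
                 for t in arrival_times]
    checks = int(max(arrival_times) // check_interval) + 1
    return sum(responses) / len(responses), checks
```

Even this toy version reproduces the qualitative finding above: the rhythmic policy raises response times (roughly half the check interval on average) while capping the number of interruptions to the primary task.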

    Improve primary care performance through operations management: An application to emergency care and preventive care

    The main purpose of this thesis is to apply operations management methods to improve the performance of healthcare providers with respect to two major components of primary care: emergency care and preventive care. For many years, operations management (OM) and operations research (OR) techniques have been applied in healthcare to improve the efficiency of health service delivery. Primary care is the core of the healthcare system; its main functions include providing a point of entry, delivering essential medical and preventive care, and helping patients coordinate and integrate their care, all of which are fundamental to improving not only patients' health outcomes but also the cost performance of the entire healthcare system (Starfield 1998). In a study of primary care and health system performance (Schoen et al. 2004), the United States showed a much higher emergency department (ED) utilization rate than the other three countries surveyed, accompanied by a lower percentage of adults with a regular doctor, place, or clinic to visit when sick. For this reason, Chapter 2 of this dissertation addresses improving the emergency department through process redesign. Another key finding of the survey is that Canada has the lowest screening rates in terms of Pap tests and mammograms. Given the importance of preventive care in saving lives and reducing costs, Chapter 3 of this dissertation analyzes how to improve a government-funded preventive care program through network design.
Chapter 2 is set in the emergency department (ED) of a tertiary hospital with an annual census of 55,000 patients, and analyzes how redesigning the process for a specific blood test affects ED congestion. More specifically, we analyze changes in three performance measures after the analysis of patients' blood samples for troponin levels was moved from the central laboratory into the ED (point-of-care testing, POCT). Using priority queueing theory, we generate hypotheses about the following performance measures: waiting time (defined as the time between patient check-in and bed assignment), service time (defined as the time between bed assignment and disposition), and service quality (defined as the 72-hour patient revisit rate). Using a difference-in-differences model, we find that the process redesign is associated with statistically significant improvements in almost all operational performance measures. Specifically, we find that POCT adoption is associated with a 21.6 % reduction in service time among tested patients during peak hours, and with a 5.9 % to 35.5 % reduction in waiting time, depending on the patient's priority category, during those same peak hours. In addition, we find that POCT adoption was associated with improved service quality, as the predicted revisit probability fell by 0.64 % while it was in use. We also find significant system-wide spillover effects on patients who did not receive a POCT (non-tested patients).
In other words, POCT adoption is associated with a 4.73 % reduction in waiting time among these non-tested patients, and with a reduction of up to 11.6 % depending on the patient's priority category during peak hours. By examining the impact of POCT on both patient populations, tested and non-tested, this research is unique in identifying the large system-wide benefits that can be achieved through ED process redesign. The third chapter of this thesis uses a preference-based choice model to analyze customer priorities in preventive care from the perspective of service configuration. We apply the model in the context of a government-funded breast cancer screening program in Montreal, Canada, to identify the trade-offs program participants make when choosing among facilities with different service configurations, based on their true preferences. More specifically, we analyze these preferences with respect to the waiting time for an appointment, the travel time to the screening clinic, the availability of parking at the clinic, the clinic's opening hours, the waiting time inside the clinic on the day of screening, the preparedness of the nursing staff, the screening process itself, and the waiting time for results. We find that nurse preparedness (i.e., whether nurses can answer questions about the screening or about breast cancer) and the waiting time for an appointment are the most decisive factors in clinic choice, followed closely by parking availability.
Using latent class analysis, we can also confirm that, contrary to what other research has suggested, there is no clear heterogeneity among the program's participants. Our Arena simulation model shows that taking customer preferences into account when designing service configurations will noticeably improve both the congestion level and the rescreening participation rate. In conclusion, across both chapters this thesis seeks to generate managerial implications for the configuration of healthcare services that can help improve service quality using an empirical methodological approach. We show that substantial improvements to existing services can be achieved through service process redesign and an understanding of customer preferences, without the need to overhaul the entire healthcare system.
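The priority queueing theory used to generate the Chapter 2 hypotheses can be illustrated with the standard non-preemptive M/M/1 priority formula. The arrival and service rates below are invented, and this is the textbook result, not the thesis's actual model.

```python
def priority_wait_times(lams, mu):
    """Mean queueing delay per class in a non-preemptive M/M/1 priority
    queue (index 0 = highest priority), all classes sharing exponential
    service at rate mu.

    W_k = R / ((1 - sigma_{k-1}) * (1 - sigma_k)), where sigma_k is the
    cumulative load of classes 0..k and R is the mean residual work."""
    rho = [l / mu for l in lams]
    if sum(rho) >= 1:
        raise ValueError("unstable: total load must be below 1")
    R = sum(lams) / mu ** 2    # sum_i lam_i * E[S^2] / 2, E[S^2] = 2 / mu^2
    waits, sigma_prev = [], 0.0
    for r in rho:
        sigma = sigma_prev + r
        waits.append(R / ((1 - sigma_prev) * (1 - sigma)))
        sigma_prev = sigma
    return waits
```

The formula makes the thesis's per-priority-class hypotheses concrete: any reduction in service time (larger mu) shrinks the residual-work term R and hence every class's wait, but lower-priority classes, whose delays are inflated by the load of all classes above them, benefit the most — consistent with the wide 5.9 % to 35.5 % range reported across priority categories.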