
    Evolutionary Multi-objective Scheduling for Anti-Spam Filtering Throughput Optimization

    This paper presents an evolutionary multi-objective optimization formulation of the anti-spam filtering problem, addressing both classification quality criteria (false positive and false negative error rates) and the minimization of email message classification time. The approach is compared to single-objective formulations found in the literature, and its advantages for decision support and for flexible, adaptive anti-spam filter configuration are demonstrated. A study is performed using the Wirebrush4SPAM anti-spam filtering framework and the SpamAssassin email dataset. The NSGA-II evolutionary multi-objective optimization algorithm was applied to validate and demonstrate this novel, multi-objective formulation of the anti-spam filtering optimization problem. The results of the experiments show that this optimization strategy allows the decision maker (the anti-spam filtering system administrator) to select among a set of optimal, flexible filter configuration alternatives with respect to classification quality and classification efficiency.
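
    A minimal sketch of such a three-objective formulation is shown below, assuming the pymoo library (recent versions); the rule-weight decision vector and the synthetic objective functions are illustrative placeholders, not the paper's Wirebrush4SPAM setup or the SpamAssassin corpus.

```python
import numpy as np
from pymoo.algorithms.moo.nsga2 import NSGA2
from pymoo.core.problem import ElementwiseProblem
from pymoo.optimize import minimize


class SpamFilterConfig(ElementwiseProblem):
    """Hypothetical filter model: a vector of rule weights in [0, 1] is scored
    on false-positive rate, false-negative rate and classification cost."""

    def __init__(self, n_rules=20):
        super().__init__(n_var=n_rules, n_obj=3, xl=0.0, xu=1.0)

    def _evaluate(self, x, out, *args, **kwargs):
        # Placeholder objectives: in a real setting these would come from
        # running the filter on a labelled email corpus.
        fp_rate = float(np.mean(x ** 2))          # stricter rules -> more false positives
        fn_rate = float(np.mean((1.0 - x) ** 2))  # laxer rules -> more false negatives
        time_ms = float(np.sum(x > 0.1))          # each active rule adds classification cost
        out["F"] = [fp_rate, fn_rate, time_ms]


res = minimize(SpamFilterConfig(), NSGA2(pop_size=50), ("n_gen", 40), seed=1)
# res.F holds the Pareto front: each row is one trade-off between the two
# error rates and the classification cost for the administrator to choose from.
print(res.F[:5])
```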

    Scheduling Independent Moldable Tasks on Multi-Cores with GPUs

    The number of parallel systems using accelerators is growing. The technology is now mature enough to allow sustained petaflop/s. However, reaching this performance scale requires efficient scheduling algorithms to manage the heterogeneous computing resources. We present a new approach for scheduling independent tasks on multiple CPUs and multiple GPUs. The tasks are assumed to be parallelizable on CPUs using the moldable model: the final number of cores allotted to a task can be decided and set by the scheduler. More precisely, we design an algorithm aiming at minimizing the makespan (the maximum completion time of all tasks) for this scheduling problem. The proposed algorithm combines a dual approximation scheme with a fast integer linear program (ILP). It determines both the partitioning of the tasks, i.e. whether a task should be mapped to CPUs or a GPU, and the number of CPUs allotted to a moldable task if mapped to the CPUs. A worst-case analysis shows that the algorithm has an approximation ratio of 3/2 + ε. However, since the complexity of the ILP-based algorithm could be non-polynomial, we also present a proven polynomial-time algorithm with an approximation ratio of 2 + ε. We complement the theoretical analysis of our two novel algorithms with an experimental study. In these experiments, we compare our algorithms to a modified version of the classical HEFT algorithm, adapted to handle moldable tasks. The experimental results show that our algorithm with the 3/2 + ε approximation ratio produces significantly shorter schedules than the modified HEFT for most of the instances. In addition, the experiments provide evidence that this ILP-based algorithm is also practically able to solve larger problem instances in a reasonable amount of time.
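
    A rough sketch of the dual-approximation idea is given below; it is illustrative only, with a simple greedy feasibility test standing in for the paper's ILP, and the cpu_time/gpu_time inputs are a hypothetical data model, not the authors' benchmark instances.

```python
def feasible(tasks, m, k, target, cpu_time, gpu_time):
    """Greedy feasibility test for a guessed makespan `target` (k >= 1 GPUs,
    m CPU cores). Each task goes to the least-loaded GPU if it fits there,
    otherwise to the smallest core count that keeps its run time under the
    guess, charging its work against the available core-time area."""
    cpu_area = m * target          # total core-time available under the guess
    gpu_load = [0.0] * k           # load per GPU
    for t in tasks:
        g = min(range(k), key=lambda i: gpu_load[i])
        if gpu_load[g] + gpu_time[t] <= target:
            gpu_load[g] += gpu_time[t]
            continue
        p = next((p for p in range(1, m + 1) if cpu_time(t, p) <= target), None)
        if p is None:
            return False
        work = p * cpu_time(t, p)
        if work > cpu_area:
            return False
        cpu_area -= work
    return True


def dual_approximation(tasks, m, k, cpu_time, gpu_time, eps=0.05):
    """Binary search on the makespan guess, shrinking the interval until the
    smallest guess the greedy test accepts is found (up to a factor eps)."""
    lo = 0.0
    hi = sum(gpu_time[t] + cpu_time(t, 1) for t in tasks)  # trivially feasible
    while hi - lo > eps * max(hi, 1.0):
        mid = (lo + hi) / 2
        if feasible(tasks, m, k, mid, cpu_time, gpu_time):
            hi = mid
        else:
            lo = mid
    return hi
```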

    A Survey of Anticipatory Mobile Networking: Context-Based Classification, Prediction Methodologies, and Optimization Techniques

    A growing trend in information technology is to not just react to changes, but to anticipate them as much as possible. This paradigm has made modern solutions, such as recommendation systems, a ubiquitous presence in today's digital transactions. Anticipatory networking extends the idea to communication technologies by studying patterns and periodicity in human behavior and network dynamics to optimize network performance. This survey collects and analyzes recent papers leveraging context information to forecast the evolution of network conditions and, in turn, to improve network performance. In particular, we identify the main prediction and optimization tools adopted in this body of work and link them with the objectives and constraints of typical applications and scenarios. Finally, we consider open challenges and research directions to make anticipatory networking part of next-generation networks.

    NON-DUPLICATIVE APPROACH TO SHARING BETWEEN STREAMED QUERIES

    The push for streaming database systems to handle massive amounts of data and multiple queries necessitates the development of efficient yet adaptive query sharing technology. This project designed an effective solution to this problem, called NASSQ, an elegant hybrid between static and dynamic routing alternatives. Utilizing the adaptive architecture of dynamic routing systems, NASSQ supports adaptive sharing of operators among different queries while refraining from duplicating intermediate data tuples. However, like static routing, NASSQ constructs optimized routes using statistics.

    Self managed virtual machine scheduling in Cloud systems

    In Cloud systems, Virtual Machines (VMs) are scheduled to hosts according to their instant resource usage (e.g. to the hosts with the most available RAM) without considering their overall and long-term utilization. Also, in many cases, the scheduling and placement processes are computationally expensive and affect the performance of deployed VMs. In this work, we present a Cloud VM scheduling algorithm that takes into account the resource usage of already running VMs over time, analyzing past VM utilization levels in order to schedule VMs in a way that optimizes performance. We observe that Cloud management processes, like VM placement, affect already deployed systems (for example, a throughput drop in a database cluster), so we aim to minimize such performance degradation. Moreover, overloaded VMs tend to steal resources (e.g. CPU) from neighbouring VMs, so our work maximizes the VMs' real CPU utilization. Based on this, we provide an experimental analysis comparing our solution with the traditional schedulers used in OpenStack by exploring the behaviour of different NoSQL systems (MongoDB, Apache Cassandra and Elasticsearch). The results show that our solution refines traditional instant-based physical machine selection, as it learns the system behaviour and adapts over time. The analysis is promising: for the selected setting, we reduce performance degradation by approximately 19% and increase real CPU utilization by 2% when using real-world workloads.
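
    A minimal sketch of scheduling on historical rather than instantaneous utilization is shown below; the data model (a per-host list of past CPU samples) and the averaging window are hypothetical illustrations, not the paper's implementation or the OpenStack scheduler API.

```python
from statistics import mean


def pick_host(hosts, history, window=12):
    """Return the host with the lowest average CPU utilization over the last
    `window` samples; `history[host]` is a list of utilization values in
    [0, 1], most recent last. An instant-based scheduler would look only at
    the final sample."""
    def long_term_load(host):
        samples = history[host][-window:]
        return mean(samples) if samples else 0.0
    return min(hosts, key=long_term_load)


# Example: host "b" looks idle right now (last sample 0.10) but is
# historically busier, so the long-term view prefers host "a".
history = {
    "a": [0.30, 0.35, 0.40, 0.35],
    "b": [0.90, 0.85, 0.20, 0.10],
}
print(pick_host(["a", "b"], history))  # -> "a"
```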