
    Automatic network configuration with dynamic churn prediction

    Peer-to-Peer (P2P) systems have been deployed on millions of nodes worldwide, in environments that range from static to very dynamic and therefore exhibit different churn levels. Typically, P2P systems introduce redundancy to cope with the loss of nodes. In distributed hash tables, redundancy is often fixed during development or at initial deployment of the system. This can limit the applicability of the system to stable environments or make it inefficient in such environments. Automatic network configuration can make a system more adaptable to changing environments and reduce manual configuration tasks. This paper therefore proposes a replication mechanism based on churn prediction that automatically adapts its replication configuration to its environment. The mechanism, termed dynamic replication mechanism (dynamic RM), is developed and evaluated in this paper; it uses exponential moving averages to predict churn, and the prediction in turn determines a replication factor that meets a given reliability threshold. Simulations with synthetic data and experiments with data from torrent trackers show that churn behavior can be predicted accurately in any environment, from low churn rates to diurnal and high churn rates.
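The EMA-plus-threshold idea in the abstract can be sketched in a few lines. This is a minimal illustration under simplifying assumptions (independent replica failures, a single smoothing constant), not the paper's actual dynamic RM:

```python
import math

def ema_update(prev_ema, observed_churn, alpha=0.2):
    """One exponential-moving-average step; larger alpha weights recent churn more."""
    return alpha * observed_churn + (1 - alpha) * prev_ema

def replication_factor(failure_prob, reliability=0.999, max_r=32):
    """Smallest r such that the chance of losing all r replicas is below
    1 - reliability, assuming independent node failures (a simplification)."""
    for r in range(1, max_r + 1):
        if failure_prob ** r <= 1 - reliability:
            return r
    return max_r

# Smooth a stream of per-interval churn observations, then derive r
# from the smoothed estimate.
ema = 0.1
for observed in [0.05, 0.20, 0.40, 0.35]:
    ema = ema_update(ema, observed)
r = replication_factor(ema)
```

A real mechanism would also have to pick the smoothing constant and re-evaluate the factor continuously as churn estimates change.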

    Why (and How) Networks Should Run Themselves

    The proliferation of networked devices, systems, and applications that we depend on every day makes managing networks more important than ever. The increasing security, availability, and performance demands of these applications suggest that these increasingly difficult network management problems be solved in real time, across a complex web of interacting protocols and systems. Alas, just as the importance of network management has increased, the network has grown so complex that it is seemingly unmanageable. In this new era, network management requires a fundamentally new approach. Instead of optimizations based on closed-form analysis of individual protocols, network operators need data-driven, machine-learning-based models of end-to-end and application performance based on high-level policy goals and a holistic view of the underlying components. Instead of anomaly detection algorithms that operate on offline analysis of network traces, operators need classification and detection algorithms that can make real-time, closed-loop decisions. Networks should learn to drive themselves. This paper explores this concept, discussing how we might attain this ambitious goal by more closely coupling measurement with real-time control and by relying on learning for inference and prediction about a networked application or system, as opposed to closed-form analysis of individual protocols.

    Proceedings of the 4th Student-STAFF Research Conference 2020 School of Computer Science and Engineering SSRC2020

    This volume contains the proceedings of the 4th Student-Staff Research Conference of the School of Computer Science and Engineering (SSRC2020). This traditional annual forum brings together, for a one-day intensive programme, established and young researchers from different areas, doctoral researchers, and postgraduate and undergraduate alumni; it covers both traditional and emerging topics and disseminates completed results as well as work in progress. During informal discussions at the conference sessions, attendees share their research findings with an open audience of academics and doctoral, postgraduate and undergraduate students. SSRC2020 was held online. A specific feature of this year's conference was the participation of alumni from the Informatics Institute of Technology (IIT, Sri Lanka) and Westminster International University in Tashkent (WIUT, Uzbekistan). The event met with great interest: it had more than 200 online participants, with one session accommodating an audience of 156. The presenters, whether established researchers or just at the start of their careers, not only shared their work but also gained invaluable feedback during the conference sessions. Twenty-one abstracts contributed by the speakers at SSRC2020 are assembled in the order of their presentation at the conference. The abstracts cover a wide spectrum of topics, including the development of online knowledge and learning repositories, data analysis, applications of machine learning in fraud detection, bankruptcy prediction, patient mortality, image synthesis, graph databases, image analysis for medical diagnostics, mobile app development, user experience design, wide-area networking, adaptive agent algorithms, plagiarism detection, process mining techniques for behavioural patterns, data mining for reablement, cloud computing, networking, and linguistic profiling.

    Built to Last or Built Too Fast? Evaluating Prediction Models for Build Times

    Automated builds are integral to the Continuous Integration (CI) software development practice. In CI, developers are encouraged to integrate early and often; however, long build times can be an issue when integrations are frequent. This research focuses on finding a balance between integrating often and keeping developers productive. We propose and analyze models that can predict the build time of a job. Such models can help developers better manage their time and tasks. Project managers can also explore different factors to determine the best setup for a build job that keeps the build wait time at an acceptable level. Software organizations transitioning to CI practices can use the predictive models to anticipate build times before CI is implemented. The research community can modify our predictive models to further understand the factors and relationships affecting build times.
    Comment: 4-page version published in the Proceedings of the IEEE/ACM 14th International Conference on Mining Software Repositories (MSR), pages 487-490, MSR 201
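As a toy illustration of the kind of predictive model the abstract describes, the sketch below fits a least-squares line to a hypothetical build history. The feature (lines changed), the data, and the choice of a single-feature linear model are all assumptions for illustration, not the paper's actual models:

```python
def fit_ols(xs, ys):
    """Fit y = a*x + b for a single feature by ordinary least squares."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var = sum((x - mx) ** 2 for x in xs)
    a = cov / var
    b = my - a * mx
    return a, b

# Hypothetical history: (lines changed in the commit, build minutes).
lines_changed = [100, 400, 800, 1200]
build_minutes = [5.0, 11.0, 19.0, 27.0]

a, b = fit_ols(lines_changed, build_minutes)
predicted = a * 600 + b  # estimated build time for a 600-line change
```

In practice such models would draw on many job features (test count, dependency churn, machine type) rather than a single predictor.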

    Self-management for large-scale distributed systems

    Autonomic computing aims at making computing systems self-managing by using autonomic managers in order to reduce obstacles caused by management complexity. This thesis presents results of research on self-management for large-scale distributed systems. This research was motivated by the increasing complexity of computing systems and their management. In the first part, we present our platform, called Niche, for programming self-managing component-based distributed applications. In our work on Niche, we have faced and addressed the following four challenges in achieving self-management in a dynamic environment characterized by volatile resources and high churn: resource discovery, robust and efficient sensing and actuation, management bottleneck, and scale. We present results of our research on addressing the above challenges. Niche implements the autonomic computing architecture, proposed by IBM, in a fully decentralized way. Niche supports a network-transparent view of the system architecture, simplifying the design of distributed self-management, and provides a concise and expressive API for self-management. The implementation of the platform relies on the scalability and robustness of structured overlay networks. We proceed by presenting a methodology for designing the management part of a distributed self-managing application. We define design steps that include partitioning of management functions and orchestration of multiple autonomic managers. In the second part, we discuss robustness of management and data consistency, which are necessary in a distributed system. Dealing with the effect of churn on management increases the complexity of the management logic and thus makes its development time-consuming and error-prone. We propose the abstraction of Robust Management Elements, which are able to heal themselves under continuous churn.
Our approach is based on replicating a management element using finite state machine replication with a reconfigurable replica set. Our algorithm automates the reconfiguration (migration) of the replica set in order to tolerate continuous churn. For data consistency, we propose a majority-based distributed key-value store supporting multiple consistency levels that is based on a peer-to-peer network. The store enables a tradeoff between high availability and data consistency. Using majorities avoids the potential drawbacks of master-based consistency control, namely a single point of failure and a potential performance bottleneck. In the third part, we investigate self-management for Cloud-based storage systems with a focus on elasticity control using elements of control theory and machine learning. We have conducted research on a number of different designs of an elasticity controller, including a state-space feedback controller and a controller that combines feedback and feedforward control. We describe our experience in designing an elasticity controller for a Cloud-based key-value store using a state-space model that enables trading off performance for cost, and we describe the steps in designing such a controller. We conclude by presenting the design and evaluation of ElastMan, an elasticity controller for Cloud-based elastic key-value stores that combines feedforward and feedback control.
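The majority-based store described in the abstract can be illustrated with a small sketch. This is a simplification under stated assumptions (static replica set, no failures or concurrency, versions supplied by the caller), not the thesis's implementation; it shows only why two overlapping majorities let reads observe the latest write:

```python
class MajorityStore:
    """Toy majority-quorum key-value store over N in-process replicas."""

    def __init__(self, n_replicas=5):
        self.replicas = [{} for _ in range(n_replicas)]
        self.quorum = n_replicas // 2 + 1  # strict majority

    def put(self, key, value, version):
        # Write to one majority (here: the last `quorum` replicas).
        for replica in self.replicas[-self.quorum:]:
            replica[key] = (version, value)

    def get(self, key):
        # Read a different majority (the first `quorum` replicas). Any two
        # majorities intersect, so at least one replica holds the latest
        # write; return the highest-versioned value seen.
        answers = [r[key] for r in self.replicas[:self.quorum] if key in r]
        return max(answers)[1] if answers else None

store = MajorityStore()
store.put("x", "v1", version=1)
store.put("x", "v2", version=2)
latest = store.get("x")
```

A production store would additionally need failure detection, replica reconfiguration, and server-assigned versioning, which is where the thesis's churn-tolerance work comes in.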

    A Prediction Model for Bank Loans Using Agglomerative Hierarchical Clustering with Classification Approach

    Businesses depend on banks for financing and other services. The success or failure of a company depends in large part on the industry's ability to identify credit risk. As a result, banks must analyze whether or not a loan application will default in the future. In the past, financial firms relied on highly skilled personnel to evaluate whether a loan applicant was eligible. Machine learning algorithms and neural networks have since been used to train classifiers that forecast an individual's credit score based on their prior credit history, preventing loans from being granted to individuals who have defaulted on their obligations; however, these machine learning approaches require modification to address difficulties such as class imbalance, noise, and time complexity. Customers leaving a bank for a competitor is known as churn. Predicting in advance which customers will leave gives a firm an edge in client retention and growth. Banks may use machine learning to predict the behavior of trusted customers by assessing past data, and may also introduce special offers to retain those clients' trust. This study employed agglomerative hierarchical clustering, Decision Tree, and Random Forest classification techniques. The data with the decision tree obtained an accuracy of 84%, the data with the random forest obtained an accuracy of 85%, and the clustered data passed through agglomerative hierarchical clustering obtained an accuracy of 98.3% using the random forest classifier and 98.1% using the decision tree classifier.
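The pipeline's first stage, agglomerative hierarchical clustering, can be sketched with the standard library alone. The one-dimensional customer data, single-linkage choice, and two-cluster target below are assumptions for illustration; the study itself clusters real bank data and then trains Random Forest and Decision Tree classifiers on the result:

```python
def single_link_clusters(points, n_clusters):
    """Agglomerative (single-linkage) clustering of 1-D points: start with
    singleton clusters and repeatedly merge the two closest clusters."""
    clusters = [[p] for p in points]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Single linkage: distance between closest member pair.
                d = min(abs(a - b) for a in clusters[i] for b in clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i].extend(clusters.pop(j))  # j > i, so index i is stable
    return clusters

# Hypothetical customer balances forming two obvious groups.
balances = [1.0, 1.2, 0.9, 10.0, 10.5, 9.8]
groups = single_link_clusters(balances, n_clusters=2)
```

The study's gain comes from feeding the resulting cluster structure into the classifiers, rather than classifying the raw data directly.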

    Candoia: A Platform and an Ecosystem for Building and Deploying Versatile Mining Software Repositories Tools

    Research on mining software repositories (MSR) has shown great promise during the last decade in solving many challenging software engineering problems. There exists, however, a ‘valley of death’ between these significant innovations in MSR research and their deployment in practice. The significant cost of converting a prototype to software; the need to support a wide variety of tools and technologies (e.g., CVS, SVN, Git, Bugzilla, Jira, Issues) to improve applicability; and the high cost of customizing tools to practitioner-specific settings are some key hurdles in the transition to practice. We describe Candoia, a platform and an ecosystem aimed at bridging this valley of death between innovations in MSR research and their deployment in practice. We have implemented Candoia and provide facilities to build and publish MSR ideas as Candoia apps. Our evaluation demonstrates that Candoia drastically reduces the cost of converting an idea to an app, thus lowering the barrier to transitioning research findings into practice. We also observe versatility in Candoia apps’ ability to work with the variety of tools and technologies that the platform supports. Finally, we find that customizing a Candoia app to fit project-specific needs is often well within the grasp of developers.

    Distributed Correlation-Based Feature Selection in Spark

    CFS (Correlation-Based Feature Selection) is an FS algorithm that has been successfully applied to classification problems in many domains. We describe Distributed CFS (DiCFS), a completely redesigned, scalable, parallel and distributed version of the CFS algorithm, capable of dealing with the large volumes of data typical of big data applications. Two versions of the algorithm were implemented and compared using the Apache Spark cluster computing model, which is currently gaining popularity due to its much faster processing times than Hadoop's MapReduce model. We tested our algorithms on four publicly available datasets, each with a large number of instances, two of which also have a large number of features. The results show that our algorithms were superior in terms of both time efficiency and scalability. In leveraging a computer cluster, they were able to handle larger datasets than the non-distributed WEKA version while maintaining the quality of the results, i.e., exactly the same features were returned by our algorithms as by the original algorithm available in WEKA.
    Comment: 25 pages, 5 figures
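For context, the subset "merit" heuristic that CFS maximizes (from Hall's original CFS formulation) is easy to compute once the correlations are known; the correlation values below are made-up numbers for illustration, and the distributed work lies in computing such correlations at scale:

```python
import math

def cfs_merit(k, rcf, rff):
    """CFS merit of a feature subset of size k, where rcf is the mean
    feature-class correlation and rff the mean feature-feature correlation:
        merit = k * rcf / sqrt(k + k*(k-1)*rff)
    High rcf (relevance) raises merit; high rff (redundancy) lowers it."""
    return k * rcf / math.sqrt(k + k * (k - 1) * rff)

# A subset of highly class-correlated, weakly inter-correlated features
# outscores an equally relevant but redundant subset.
good = cfs_merit(k=3, rcf=0.6, rff=0.1)
redundant = cfs_merit(k=3, rcf=0.6, rff=0.9)
```

DiCFS's contribution is not this formula but computing the underlying correlation matrix over Spark partitions for datasets too large for a single machine.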