82 research outputs found

    WebMonitoring software system: Finite state machines for monitoring the web

    This paper presents a software system called WebMonitoring. The system is designed to solve certain problems in the process of searching for information on the web. The first problem is improving query entry at search engines and enabling more complex searches than keyword-based ones. The second is providing access to web page content that is inaccessible to common search engines, either because of crawling limitations or because of the delay between the moment a web page is published on the Internet and the moment a crawler finds it. The architecture of the WebMonitoring system relies upon finite state machines and the concept of monitoring the web. We present the system’s architecture and usage. Some modules were developed originally for the WebMonitoring system, while others rely on UNITEX, a linguistically oriented software system. We evaluate the WebMonitoring system and give directions for further development.
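    The finite-state idea at the core of such a system can be illustrated with a minimal sketch. This is an illustration only, not WebMonitoring's or UNITEX's actual code; the whitespace tokenisation and word-sequence pattern format are assumptions:

    ```python
    # Minimal finite state machine that matches a consecutive word pattern
    # in page text. States 0..len(pattern) advance one word at a time.

    def make_fsm(pattern_words):
        """Build a transition table: state i -> state i+1 on pattern_words[i]."""
        return {i: w for i, w in enumerate(pattern_words)}

    def fsm_match(text, pattern_words):
        """Return True if pattern_words appear consecutively in text."""
        transitions = make_fsm(pattern_words)
        accept = len(pattern_words)
        state = 0
        for token in text.lower().split():
            if token == transitions[state]:
                state += 1
                if state == accept:
                    return True
            else:
                # Restart, re-checking the current token against the first word.
                state = 1 if token == transitions[0] else 0
        return False

    print(fsm_match("new release of the webmonitoring system announced",
                    ["webmonitoring", "system"]))  # True
    ```

    A real monitoring system would run many such machines over freshly fetched pages, firing a notification whenever one reaches its accepting state.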

    Middleware Technologies for Cloud of Things - a survey

    The next wave of communication and applications relies on the new services provided by the Internet of Things (IoT), which is becoming an important aspect of the future of humans and machines. IoT services are a key solution for providing smart environments in homes, buildings, and cities. In the era of a massive number of connected things and objects with a high growth rate, several challenges have arisen, such as the management, aggregation, and storage of the large volumes of data produced. To tackle some of these issues, cloud computing was brought to the IoT as the Cloud of Things (CoT), which provides virtually unlimited cloud services to enhance large-scale IoT platforms. Several factors must be considered in the design and implementation of a CoT platform. One of the most important and challenging problems is the heterogeneity of different objects. This problem can be addressed by deploying suitable "middleware", which sits between things and applications and provides a reliable platform for communication among things with different interfaces, operating systems, and architectures. The main aim of this paper is to study middleware technologies for the CoT. Toward this end, we first present the main features and characteristics of middleware. Next, we study different architectural styles and service domains. Then we present several middleware systems that are suitable for CoT-based platforms, and lastly we discuss current challenges and issues in the design of CoT-based middleware. Comment: published in Digital Communications and Networks, Elsevier (2017), http://www.sciencedirect.com/science/article/pii/S2352864817301268
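    The role middleware plays between heterogeneous things and applications can be sketched as a uniform adapter interface. The device types, protocols, and readings below are invented for illustration; this is the general pattern, not any particular CoT middleware:

    ```python
    # Hedged sketch of the middleware idea: applications program against one
    # interface, while subclasses hide each device's protocol and data format.

    class DeviceAdapter:
        """Uniform read() interface over heterogeneous devices."""
        def read(self):
            raise NotImplementedError

    class ZigbeeSensor(DeviceAdapter):   # hypothetical low-power device
        def read(self):
            return {"proto": "zigbee", "value": 21.5}

    class HttpSensor(DeviceAdapter):     # hypothetical IP-connected device
        def read(self):
            return {"proto": "http", "value": 42}

    def collect(devices):
        # The application never sees protocol details, only uniform readings.
        return [d.read()["value"] for d in devices]

    print(collect([ZigbeeSensor(), HttpSensor()]))  # [21.5, 42]
    ```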

    Volunteer computing

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001. Includes bibliographical references (p. 205-216). This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. This thesis presents the idea of volunteer computing, which allows high-performance parallel computing networks to be formed easily, quickly, and inexpensively by enabling ordinary Internet users to share their computers' idle processing power without needing expert help. In recent years, projects such as SETI@home have demonstrated the great potential power of volunteer computing. In this thesis, we identify volunteer computing's further potentials, and show how these can be achieved. We present the Bayanihan system for web-based volunteer computing. Using Java applets, Bayanihan enables users to volunteer their computers by simply visiting a web page. This makes it possible to set up parallel computing networks in a matter of minutes compared to the hours, days, or weeks required by traditional NOW and metacomputing systems. At the same time, Bayanihan provides a flexible object-oriented software framework that makes it easy for programmers to write various applications, and for researchers to address issues such as adaptive parallelism, fault-tolerance, and scalability. Using Bayanihan, we develop a general-purpose runtime system and APIs, and show how volunteer computing's usefulness extends beyond solving esoteric mathematical problems to other, more practical, master-worker applications such as image rendering, distributed web-crawling, genetic algorithms, parametric analysis, and Monte Carlo simulations.
    By presenting a new API using the bulk synchronous parallel (BSP) model, we further show that contrary to popular belief and practice, volunteer computing need not be limited to master-worker applications, but can be used for coarse-grain message-passing programs as well. Finally, we address the new problem of maintaining reliability in the presence of malicious volunteers. We present and analyze traditional techniques such as voting, and new ones such as spot-checking, encrypted computation, and periodic obfuscation. Then, we show how these can be integrated in a new idea called credibility-based fault-tolerance, which uses probability estimates to limit and direct the use of redundancy. We validate this new idea with parallel Monte Carlo simulations, and show how it can achieve error rates several orders of magnitude smaller than traditional voting for the same slowdown. By Luis F.G. Sarmenta, Ph.D.
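    The voting technique discussed in the thesis can be sketched in a few lines. This is a toy illustration, not Bayanihan's actual verification code; the quorum parameter and result values are assumptions:

    ```python
    # Result verification by voting: a work unit's result is accepted only
    # once a quorum of volunteers return the same answer. Spot-checking and
    # credibility weighting (per the thesis) would refine which volunteers count.
    from collections import Counter

    def vote(results, quorum=2):
        """Return the majority result if it reaches the quorum, else None."""
        counts = Counter(results)
        answer, n = counts.most_common(1)[0]
        return answer if n >= quorum else None

    # Three volunteers compute the same work unit; one returns a bad result.
    print(vote([13, 13, 99]))  # 13
    print(vote([13, 99]))      # None -- no quorum yet; redistribute the unit
    ```

    Credibility-based fault tolerance replaces the fixed quorum with per-volunteer probability estimates, so redundancy is spent only where the error bound demands it.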

    Understanding and Detecting Malicious Cyber Infrastructures

    Malware (e.g., trojans, bots, and spyware) is still a pervasive threat on the Internet. It is able to infect computer systems to further launch a variety of malicious activities such as sending spam, stealing sensitive information and launching distributed denial-of-service (DDoS) attacks. In order to continue malevolent activities without being detected and to improve the efficiency of malicious activities, cyber-criminals tend to build malicious cyber infrastructures to communicate with their malware and to exploit benign users. In these infrastructures, multiple servers are set up to be efficient and anonymous in (i) malware distribution (using redirectors and exploit servers), (ii) control (using C&C servers), (iii) monetization (using payment servers), and (iv) robustness against server takedowns (using multiple backups for each type of server). The most straightforward way to counteract the malware threat is to detect malware directly on infected hosts. However, this is difficult, since packing and obfuscation techniques are frequently used by malware to evade state-of-the-art anti-virus tools. An alternative solution, therefore, is to detect and disrupt the malicious cyber infrastructures used by malware. In this dissertation, we take an important step in this direction and focus on identifying malicious servers behind those malicious cyber infrastructures. We present a comprehensive inference framework to identify servers involved in malicious cyber infrastructures based on the three roles those servers play: compromised servers, malicious servers accessed through redirection, and malicious servers accessed through direct connection. We characterize these three roles from four novel perspectives and demonstrate our detection technologies in four systems: PoisonAmplifier, SMASH, VisHunter and NeighbourWatcher. PoisonAmplifier focuses on compromised servers.
    It explores the fact that cybercriminals tend to use compromised servers to trick benign users during the attacking process. Therefore, it is designed to proactively find more compromised servers. SMASH focuses on malicious servers accessed through direct connection. It explores the fact that multiple backups are usually used in malicious cyber infrastructures to avoid server takedowns. Therefore, it leverages the correlation among malicious servers to infer a group of malicious servers. VisHunter focuses on the redirections from compromised servers to malicious servers. It explores the fact that cybercriminals usually conceal their core malicious servers. Therefore, it is designed to detect those “invisible” malicious servers. NeighbourWatcher focuses on all general malicious servers promoted by spammers. It explores the observation that spammers tend to promote certain servers (e.g., phishing servers) on special websites (e.g., forums and wikis) to trick benign users and to improve the reputation of their malicious servers. In short, we build a comprehensive inference framework to identify servers involved in malicious cyber infrastructures from four novel perspectives and implement different inference techniques in different systems that complement each other. Our inference framework has been evaluated on live networks and/or real-world network traces. The evaluation results show that it can accurately detect malicious servers involved in malicious cyber infrastructures with a very low false positive rate. We found that the three roles of malicious servers we proposed can characterize most of the servers involved in malicious cyber infrastructures, and that the four principles we developed for detection are invariant across different malicious cyber infrastructures. We believe our experience and lessons are of great benefit to future study and detection of malicious cyber infrastructures.
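    The correlation idea behind grouping malicious servers (servers backing the same infrastructure tend to share hosting) can be sketched roughly as follows. The domains, IPs, and shared-IP criterion below are invented for illustration; this is not SMASH's actual algorithm:

    ```python
    # Hedged sketch: starting from a known-malicious seed domain, flag other
    # domains that resolve to any of the seed's IPs as likely members of the
    # same infrastructure (e.g., backup C&C servers).

    def correlated_servers(resolutions, seed):
        """resolutions maps domain -> set of hosting IPs.
        Return domains sharing at least one IP with the seed."""
        seed_ips = resolutions[seed]
        return sorted(d for d, ips in resolutions.items()
                      if d != seed and ips & seed_ips)

    resolutions = {
        "evil-c2.example":   {"203.0.113.5", "203.0.113.6"},
        "backup-c2.example": {"203.0.113.6"},   # shares an IP with the seed
        "benign.example":    {"198.51.100.9"},  # no overlap
    }
    print(correlated_servers(resolutions, "evil-c2.example"))  # ['backup-c2.example']
    ```

    Real systems would combine several such correlation signals (IPs, registration data, access timing) rather than shared IPs alone.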

    Technologically scaffolded atypical cognition: the case of YouTube’s recommender system

    YouTube has been implicated in the transformation of users into extremists and conspiracy theorists. The alleged mechanism for this radicalizing process is YouTube’s recommender system, which is optimized to amplify and promote clips that users are likely to watch through to the end. YouTube optimizes for watch-through for economic reasons: people who watch a video through to the end are likely to then watch the next recommended video as well, which means that more advertisements can be served to them. This is a seemingly innocuous design choice, but it has a troubling side-effect. Critics of YouTube have alleged that the recommender system tends to recommend extremist content and conspiracy theories, as such videos are especially likely to capture and keep users’ attention. To date, the problem of radicalization via the YouTube recommender system has been a matter of speculation. The current study represents the first systematic, pre-registered attempt to establish whether and to what extent the recommender system tends to promote such content. We begin by contextualizing our study in the framework of technological seduction. Next, we explain our methodology. After that, we present our results, which are consistent with the radicalization hypothesis. Finally, we discuss our findings, as well as directions for future research and recommendations for users, industry, and policy-makers.
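    The watch-through optimization the abstract describes amounts to ranking candidate videos by a predicted probability of being watched to the end. A caricature of that ranking step, with invented video names and scores (not YouTube's actual system):

    ```python
    # Toy sketch: rank candidates by predicted watch-through probability.
    # Content that maximally captures attention rises to the top, which is
    # exactly the side-effect the study investigates.

    def recommend(candidates, top_k=2):
        """candidates: list of (video_id, predicted_watch_through_prob)."""
        ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
        return [vid for vid, _ in ranked[:top_k]]

    candidates = [("how_to_bake", 0.41), ("conspiracy_clip", 0.87), ("news", 0.55)]
    print(recommend(candidates))  # ['conspiracy_clip', 'news']
    ```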

    Self adapting websites: mining user access logs.

    The Web can be regarded as a large repository of diversified information in the form of millions of websites distributed across the globe. However, the ever-increasing number of websites has made it extremely difficult for users to find the right information that satisfies their current needs. To address this problem, many researchers have explored Web Mining as a way of developing intelligent websites, which could present the information available on a website in a more meaningful way by relating it to a user's needs. Web Mining applies data mining techniques to web usage, web content or web structure data to discover useful knowledge such as topical relations between documents, users' access patterns and website usage statistics. This knowledge is then used to develop intelligent websites that can personalise the content of a website based on a user's preferences. However, existing intelligent websites are too focussed on filtering the information available on a website to match a user's need, ignoring the true source of users' problems on the Web. The majority of problems faced by users on the Web today can be reduced to issues related to a website's design. All too often, users' needs change rapidly but websites remain static, and existing approaches such as customisation, personalisation and recommender systems provide only temporary solutions to this problem. An idea introduced to address this limitation is the development of adaptive websites: sites that automatically change their organisation and presentation based on users' access patterns. Shortcutting is a sophisticated method used to change the organisation of a website. It involves connecting two documents that were previously unlinked in a website by adding a new hyperlink between them, based on correlations in users' visits.
    Existing methods tend to minimize the number of clicks required to find a target document by providing a shortcut between the initial and target documents in a user's navigational path. This approach assumes the sequence of intermediate documents appearing in the path is insignificant to a user's information need, and bypasses them. In this work, we explore the idea of adaptive websites and present our approach to it, using wayposts to address the above-mentioned limitation. Wayposts are intermediate documents in a user's path which may contain information significant to his need and which could lead him to his intended target document. Our work identifies such wayposts from frequently travelled user paths and suggests them as potential navigational shortcuts, which could be used to improve a website's organisation.
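    Identifying waypost candidates from access logs might be sketched as follows. The session format and support threshold are assumptions for illustration, not the thesis's actual method:

    ```python
    # Count how often each intermediate page appears on users' navigation
    # paths; pages that recur across many sessions are waypost candidates
    # worth exposing as shortcuts.
    from collections import Counter

    def frequent_wayposts(sessions, min_support=2):
        """sessions: lists of page IDs from entry to target document."""
        counts = Counter()
        for path in sessions:
            for page in path[1:-1]:      # skip the entry and target pages
                counts[page] += 1
        return {p for p, n in counts.items() if n >= min_support}

    sessions = [
        ["home", "products", "specs", "buy"],
        ["home", "products", "buy"],
        ["home", "about"],
    ]
    print(sorted(frequent_wayposts(sessions)))  # ['products']
    ```

    Pure shortcutting would link "home" straight to "buy"; the waypost view instead keeps "products" visible because it recurs on the frequent paths and likely carries information the user needs en route.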