1,558 research outputs found

    Alternative approach to tree-structured web log representation and mining

    More recent approaches to web log data representation aim to capture user navigational patterns with respect to the overall structure of the web site. One such representation, and the focus of this work, is the tree-structured log file. Most existing methods for analyzing such data rely on frequent subtree mining techniques to extract frequent user activity and navigational paths. In this paper we evaluate the use of other standard data mining techniques, enabled by a recently proposed structure-preserving flat data representation for tree-structured data. The initially proposed framework was adjusted to better suit the web log mining task. Experimental evaluation is performed on two real-world web log datasets, and comparisons are made with an existing state-of-the-art classifier for tree-structured data. The results show the great potential of the method in enabling the application of a wider range of data mining and analysis techniques to tree-structured web log data.
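
    As a rough illustration of the general idea (not the authors' actual framework), a navigation tree can be flattened into a fixed-length feature vector, after which any standard classifier or clustering method applies. The encoding below, page-label counts plus parent-to-child transition counts, is an assumption made for this sketch.

        # Minimal sketch: flatten tree-structured web logs into feature vectors
        # so standard data mining techniques apply. The encoding (label counts
        # plus parent->child edge counts) is illustrative, not the paper's
        # structure-preserving representation.
        from collections import Counter
        from typing import List, Tuple

        # A session tree: a page label and the subtrees reached from it.
        Tree = Tuple[str, List["Tree"]]

        def flatten(tree: Tree) -> Counter:
            """Count page labels and parent->child navigation edges."""
            label, children = tree
            feats = Counter({("page", label): 1})
            for child in children:
                feats[("edge", label, child[0])] += 1
                feats.update(flatten(child))
            return feats

        def to_matrix(trees: List[Tree]):
            """Build a shared feature index and dense vectors for all sessions."""
            counts = [flatten(t) for t in trees]
            index = {f: i for i, f in enumerate(sorted({f for c in counts for f in c}))}
            return [[c.get(f, 0) for f in index] for c in counts], index

        if __name__ == "__main__":
            sessions = [
                ("home", [("products", [("cart", [])]), ("about", [])]),
                ("home", [("blog", [("post", [])])]),
            ]
            X, index = to_matrix(sessions)
            print(len(index), "features per session:", X)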

    Degree of Scaffolding: Learning Objective Metadata: A Prototype Learning System Design for Integrating GIS into a Civil Engineering Curriculum

    Digital media and networking offer great potential as tools for enhancing classroom learning environments, both local and distant. One concept, and related technological tool, that can facilitate the effective application and distribution of digital educational resources is the learning object, in combination with the SCORM (Sharable Content Object Reference Model) compliance framework. Progressive scaffolding is a learning design approach for educational systems that provides flexible guidance to students. We are in the process of applying this approach within a SCORM framework in the form of a multi-level instructional design, in which the metadata required by SCORM describes the degree of scaffolding. This paper discusses progressive scaffolding as it relates to SCORM-compliant learning objects, within the context of the design of an application for integrating Geographic Information Systems (GIS) into the civil engineering curriculum at the University of Missouri - Rolla.
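
    As a hedged sketch of the idea, a degree-of-scaffolding value could travel as an extra field alongside standard learning-object metadata and be used to pick the level of guidance a learner sees. The field names, module content, and selection rule below are assumptions for illustration, not the authors' actual SCORM manifest.

        # Illustrative sketch only: a learning object annotated with a
        # hypothetical "degree_of_scaffolding" field alongside LOM-style
        # metadata, plus a helper that fades scaffolding as experience grows.
        GIS_MODULE = {
            "general": {"title": "Intro to GIS for Civil Engineering"},
            "educational": {"interactivity_type": "active"},
            "scaffolding_levels": [
                {"degree_of_scaffolding": 3, "content": "step-by-step walkthrough"},
                {"degree_of_scaffolding": 2, "content": "guided exercise with hints"},
                {"degree_of_scaffolding": 1, "content": "open-ended design problem"},
            ],
        }

        def select_level(module: dict, learner_experience: int) -> dict:
            """More experienced learners receive less scaffolding (progressive fading)."""
            levels = sorted(module["scaffolding_levels"],
                            key=lambda lv: lv["degree_of_scaffolding"],
                            reverse=True)
            # Clamp experience to the available range, then pick the matching level.
            idx = min(max(learner_experience, 0), len(levels) - 1)
            return levels[idx]

        print(select_level(GIS_MODULE, learner_experience=0)["content"])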

    Bidirectional Growth based Mining and Cyclic Behaviour Analysis of Web Sequential Patterns

    Web sequential patterns are important for analyzing and understanding users' behaviour in order to improve the quality of service offered by the World Wide Web. Web prefetching is one such technique; it uses prefetching rules derived through cyclic model analysis of the mined Web sequential patterns. The prediction is more accurate, and the prefetching results more satisfying, when a highly efficient and scalable mining technique such as the Bidirectional Growth based Directed Acyclic Graph is used. In this paper, we propose a novel algorithm called Bidirectional Growth based mining and Cyclic behaviour Analysis of web sequential Patterns (BGCAP) that effectively combines these strategies to generate prefetching rules in the form of 2-sequence patterns with a periodicity and a threshold of cyclic behaviour. These rules can be used to prefetch Web pages effectively, thus reducing users' perceived latency. As BGCAP is based on bidirectional pattern growth, it performs only (log n + 1) levels of recursion to mine n Web sequential patterns. Our experimental results show that prefetching rules are generated 5-10 percent faster with BGCAP than with TD-Mine for different data sizes, and 10-15 percent faster for a fixed data size. In addition, BGCAP generates about 5-15 percent more prefetching rules than TD-Mine.
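
    The mining algorithm itself is involved, but the shape of its output, 2-sequence prefetching rules filtered by support and by recurrence across periods, can be sketched directly from sessionised logs. The thresholds and the crude per-period recurrence check below are assumptions for illustration; this is not BGCAP itself.

        # Sketch (not BGCAP): derive 2-sequence prefetching rules "A -> B"
        # from sessionised page requests, keeping rules whose support and
        # per-period recurrence (a crude stand-in for cyclic behaviour)
        # clear assumed thresholds.
        from collections import Counter, defaultdict

        def mine_rules(sessions, min_support=2, min_periods=2):
            """sessions: list of (period_id, [page1, page2, ...])."""
            support = Counter()
            periods = defaultdict(set)
            for period, pages in sessions:
                for a, b in zip(pages, pages[1:]):      # consecutive 2-sequences
                    support[(a, b)] += 1
                    periods[(a, b)].add(period)         # periods in which the pair recurs
            return {
                rule: {"support": s, "periods": len(periods[rule])}
                for rule, s in support.items()
                if s >= min_support and len(periods[rule]) >= min_periods
            }

        if __name__ == "__main__":
            log = [
                (1, ["index", "news", "sports"]),
                (1, ["index", "news"]),
                (2, ["index", "news", "weather"]),
            ]
            for (a, b), stats in mine_rules(log).items():
                print(f"prefetch {b} after {a}: {stats}")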

    How Do Tor Users Interact With Onion Services?

    Onion services are anonymous network services that are exposed over the Tor network. In contrast to conventional Internet services, onion services are private, generally not indexed by search engines, and use self-certifying domain names that are long and difficult for humans to read. In this paper, we study how people perceive, understand, and use onion services, based on data from 17 semi-structured interviews and an online survey of 517 users. We find that users have an incomplete mental model of onion services, use these services for anonymity, and have varying degrees of trust in onion services in general. Users also have difficulty discovering, tracking, and authenticating onion sites. Finally, users want technical improvements to onion services and better information on how to use them. Our findings suggest various improvements for the security and usability of Tor onion services, including ways to automatically detect phishing of onion services, clearer security indicators, and ways to manage onion domain names that are difficult to remember.
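
    The "self-certifying" property mentioned above can be made concrete: a v3 onion address is derived directly from the service's public key, which is also why it is a 56-character string that humans cannot easily read or remember. The sketch below follows the published v3 address format; the sample key is a placeholder, not a real service.

        # Sketch of how a v3 onion address is derived from an ed25519 public
        # key (Tor rend-spec-v3 address format): the name certifies the key,
        # but the result is 56 base32 characters.
        import base64
        import hashlib

        def onion_v3_address(pubkey: bytes) -> str:
            assert len(pubkey) == 32                  # ed25519 public key
            version = b"\x03"
            checksum = hashlib.sha3_256(b".onion checksum" + pubkey + version).digest()[:2]
            return base64.b32encode(pubkey + checksum + version).decode().lower() + ".onion"

        # Placeholder key for illustration only; a real service would use its
        # long-term ed25519 identity key.
        fake_key = bytes(range(32))
        print(onion_v3_address(fake_key))             # 56-character, key-derived name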

    A workbench to support development and maintenance of World-Wide Web documents

    The World-Wide Web is one of the most dominant features of the Internet. In its short life it has become an important part of information technology, with a role to play in all sectors. Unfortunately, it has many problems too. Because of its fast evolution, World-Wide Web document development is undisciplined and has resulted in the appearance of much poor-quality work. This is largely due to the inexperience of authors and the lack of conventions, standards or guidelines, as well as of useful tools for the development and maintenance of Web documents. One solution to the major problem of poor-quality World-Wide Web documents is improved maintenance of such documents. Maintenance is an important area that, as in software engineering, receives little attention compared with development. In order to address the problems of World-Wide Web document maintenance, research into the area was carried out through a literature survey and case studies of organisations that manage World-Wide Web sites. The results of this research led to a workbench which supports both developers and maintainers of Web documents. This workbench consists of methods, guidelines and tools for World-Wide Web development and maintenance.
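
    The abstract does not describe the individual tools, but one staple of web document maintenance is dead-link detection. The short sketch below is a hypothetical example of that kind of maintenance aid; it is not taken from the thesis, and the page content and behaviour are assumptions.

        # Hypothetical example of a web-maintenance aid of the kind such a
        # workbench might include: report broken links found in HTML text.
        import re
        import urllib.error
        import urllib.request

        LINK_RE = re.compile(r'href="(https?://[^"]+)"')

        def broken_links(html: str, timeout: float = 5.0):
            """Yield (url, reason) for links that do not respond successfully."""
            for url in LINK_RE.findall(html):
                try:
                    with urllib.request.urlopen(url, timeout=timeout) as resp:
                        if resp.status >= 400:
                            yield url, f"HTTP {resp.status}"
                except (urllib.error.URLError, OSError) as err:
                    yield url, str(err)

        if __name__ == "__main__":
            page = '<a href="https://example.com/">ok</a> <a href="https://example.invalid/">bad</a>'
            for url, reason in broken_links(page):
                print("broken:", url, "--", reason)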

    A Literature Survey on Web Content Mining

    The Web is a collection of interrelated documents hosted on one or more web servers, while web mining means extracting useful information from web data. Web mining is one of the domains in which data mining methods are used to extract information from web servers. Web data includes web pages, web links, queries on the web, and web logs. Web mining is used to understand user behaviour and to evaluate a particular web site based on the information stored in web log files. Web mining is carried out using data mining techniques, in particular association rules, classification and clustering. It has several useful application areas, for example electronic commerce, e-learning, e-government, e-policies, e-democracy, e-business, security, crime analysis and digital libraries. Retrieving the required web page from the web efficiently and effectively is a challenging task, since the web consists of unstructured data that carries a large amount of information and increases the complexity of dealing with data from different web service providers. The collected data becomes hard to find, extract, filter or evaluate for the information relevant to users. In this paper, we cover the basic concepts of web mining, its classification, techniques and issues. In addition, the paper analyses web mining research challenges.
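
    As a small, hedged illustration of one of the techniques named above (clustering applied to web content), page text can be represented as TF-IDF vectors and grouped with k-means. The library calls assume scikit-learn is available, and the tiny corpus is made up for the demonstration; neither is part of the survey.

        # Illustrative sketch: clustering web page text with TF-IDF + k-means,
        # one of the standard data mining techniques the survey discusses.
        # Requires scikit-learn; the toy corpus below is invented for the demo.
        from sklearn.cluster import KMeans
        from sklearn.feature_extraction.text import TfidfVectorizer

        pages = [
            "buy shoes online shop discount checkout",
            "online store cart payment shipping",
            "university course lecture exam student",
            "student learning course assignment grades",
        ]

        vectors = TfidfVectorizer().fit_transform(pages)   # page content -> term weights
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
        for page, label in zip(pages, labels):
            print(label, page[:40])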

    Modelling Web Usage in a Changing Environment

    Eiben, A.E. [Promotor]; Kowalczyk, W. [Copromotor]

    Intelligent Support for Information Retrieval of Web Documents

    The main goal of this research was to investigate means of intelligent support for the retrieval of web documents. We have proposed the architecture of a web tool system, Trillian, which discovers the interests of users without their interaction and uses them to search autonomously for related web content. Discovered pages are suggested to the user. The discovery of user interests is based on analysis of documents the users visited previously. We have created a module for completely transparent tracking of the user's movement on the web, which logs both visited URLs and the contents of web pages. The post-analysis step is based on a variant of the suffix tree clustering algorithm. We focus primarily on the overall Trillian architecture design and the process of discovering topics of interest. We have implemented an experimental prototype of Trillian and evaluated the quality, speed and usefulness of the proposed system. We have shown that clustering is a feasible technique for extracting interests from web documents. We consider the proposed architecture promising and suitable for future extensions.
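
    A hedged sketch of the clustering step: suffix tree clustering groups documents that share common phrases. The simplified version below finds shared word n-grams and scores the resulting base clusters, which conveys the principle without building an actual generalised suffix tree; the function names, scoring and thresholds are assumptions, not Trillian's implementation.

        # Simplified sketch of the phrase-sharing idea behind suffix tree
        # clustering (STC): group visited pages that share common word
        # phrases and score those base clusters. A real STC builds a
        # generalised suffix tree; this n-gram version only illustrates it.
        from collections import defaultdict

        def base_clusters(docs, max_len=3, min_docs=2):
            """Map each shared phrase to the set of documents containing it."""
            phrase_docs = defaultdict(set)
            for doc_id, text in enumerate(docs):
                words = text.lower().split()
                for n in range(1, max_len + 1):
                    for i in range(len(words) - n + 1):
                        phrase_docs[" ".join(words[i:i + n])].add(doc_id)
            # Score a cluster by document coverage times phrase length.
            scored = [(len(ids) * len(phrase.split()), phrase, ids)
                      for phrase, ids in phrase_docs.items() if len(ids) >= min_docs]
            return sorted(scored, reverse=True)

        visited = [
            "python web scraping tutorial",
            "web scraping with python examples",
            "holiday recipes for winter",
        ]
        for score, phrase, ids in base_clusters(visited)[:3]:
            print(score, repr(phrase), "pages", sorted(ids))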