1,558 research outputs found

    Alternative approach to tree-structured web log representation and mining

    More recent approaches to web log data representation aim to capture user navigational patterns with respect to the overall structure of the web site. One such representation, and the focus of this work, is the tree-structured log file. Most existing methods for analyzing such data rely on frequent subtree mining techniques to extract frequent user activity and navigational paths. In this paper we evaluate the use of other standard data mining techniques, enabled by a recently proposed structure-preserving flat data representation for tree-structured data. The initially proposed framework was adjusted to better suit the web log mining task. Experimental evaluation is performed on two real-world web log datasets, and comparisons are made with an existing state-of-the-art classifier for tree-structured data. The results show the great potential of the method in enabling the application of a wider range of data mining and analysis techniques to tree-structured web log data.
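
    As a rough illustration of the general idea (not the authors' actual framework), a navigation tree can be flattened into a fixed-length feature vector, after which any standard classifier or clustering method applies. The encoding below, page-label counts plus parent-to-child transition counts, is an assumption made for this sketch.

        # Minimal sketch: flatten tree-structured web logs into feature vectors
        # so standard data mining techniques apply. The encoding (label counts
        # plus parent->child edge counts) is illustrative, not the paper's
        # structure-preserving representation.
        from collections import Counter
        from typing import List, Tuple

        # A session tree: a page label and the subtrees reached from it.
        Tree = Tuple[str, List["Tree"]]

        def flatten(tree: Tree) -> Counter:
            """Count page labels and parent->child navigation edges."""
            label, children = tree
            feats = Counter({("page", label): 1})
            for child in children:
                feats[("edge", label, child[0])] += 1
                feats.update(flatten(child))
            return feats

        def to_matrix(trees: List[Tree]):
            """Build a shared feature index and dense vectors for all sessions."""
            counts = [flatten(t) for t in trees]
            index = {f: i for i, f in enumerate(sorted({f for c in counts for f in c}))}
            return [[c.get(f, 0) for f in index] for c in counts], index

        if __name__ == "__main__":
            sessions = [
                ("home", [("products", [("cart", [])]), ("about", [])]),
                ("home", [("blog", [("post", [])])]),
            ]
            X, index = to_matrix(sessions)
            print(len(index), "features per session:", X)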

    Degree of Scaffolding: Learning Objective Metadata: A Prototype Learning System Design for Integrating GIS into a Civil Engineering Curriculum

    Digital media and networking offer great potential as tools for enhancing classroom learning environments, both local and distant. One concept, and related technological tool, that can facilitate the effective application and distribution of digital educational resources is the learning object, in combination with the SCORM (Sharable Content Object Reference Model) compliance framework. Progressive scaffolding is a learning design approach for educational systems that provides flexible guidance to students. We are in the process of applying this approach within a SCORM framework in the form of a multi-level instructional design, in which the metadata required by SCORM describes the degree of scaffolding. This paper discusses progressive scaffolding as it relates to SCORM-compliant learning objects, within the context of the design of an application for integrating Geographic Information Systems (GIS) into the civil engineering curriculum at the University of Missouri - Rolla.
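
    As a hedged sketch of the idea, a degree-of-scaffolding value could travel as an extra field alongside standard learning-object metadata and be used to pick the level of guidance a learner sees. The field names, module content, and selection rule below are assumptions for illustration, not the authors' actual SCORM manifest.

        # Illustrative sketch only: a learning object annotated with a
        # hypothetical "degree_of_scaffolding" field alongside LOM-style
        # metadata, plus a helper that fades scaffolding as experience grows.
        GIS_MODULE = {
            "general": {"title": "Intro to GIS for Civil Engineering"},
            "educational": {"interactivity_type": "active"},
            "scaffolding_levels": [
                {"degree_of_scaffolding": 3, "content": "step-by-step walkthrough"},
                {"degree_of_scaffolding": 2, "content": "guided exercise with hints"},
                {"degree_of_scaffolding": 1, "content": "open-ended design problem"},
            ],
        }

        def select_level(module: dict, learner_experience: int) -> dict:
            """More experienced learners receive less scaffolding (progressive fading)."""
            levels = sorted(module["scaffolding_levels"],
                            key=lambda lv: lv["degree_of_scaffolding"],
                            reverse=True)
            # Clamp experience to the available range, then pick the matching level.
            idx = min(max(learner_experience, 0), len(levels) - 1)
            return levels[idx]

        print(select_level(GIS_MODULE, learner_experience=0)["content"])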

    Bidirectional Growth based Mining and Cyclic Behaviour Analysis of Web Sequential Patterns

    Web sequential patterns are important for analyzing and understanding users' behaviour in order to improve the quality of service offered by the World Wide Web. Web prefetching is one such technique; it uses prefetching rules derived through cyclic model analysis of the mined Web sequential patterns. The prediction is more accurate, and the prefetching results more satisfying, when a highly efficient and scalable mining technique such as the Bidirectional Growth based Directed Acyclic Graph is used. In this paper, we propose a novel algorithm called Bidirectional Growth based mining and Cyclic behaviour Analysis of web sequential Patterns (BGCAP) that effectively combines these strategies to generate prefetching rules in the form of 2-sequence patterns with a periodicity and a threshold of cyclic behaviour. These rules can be used to prefetch Web pages effectively, thus reducing users' perceived latency. As BGCAP is based on bidirectional pattern growth, it performs only (log n + 1) levels of recursion to mine n Web sequential patterns. Our experimental results show that prefetching rules are generated 5-10 percent faster with BGCAP than with TD-Mine for different data sizes, and 10-15 percent faster for a fixed data size. In addition, BGCAP generates about 5-15 percent more prefetching rules than TD-Mine.
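
    The mining algorithm itself is involved, but the shape of its output, 2-sequence prefetching rules filtered by support and by recurrence across periods, can be sketched directly from sessionised logs. The thresholds and the crude per-period recurrence check below are assumptions for illustration; this is not BGCAP itself.

        # Sketch (not BGCAP): derive 2-sequence prefetching rules "A -> B"
        # from sessionised page requests, keeping rules whose support and
        # per-period recurrence (a crude stand-in for cyclic behaviour)
        # clear assumed thresholds.
        from collections import Counter, defaultdict

        def mine_rules(sessions, min_support=2, min_periods=2):
            """sessions: list of (period_id, [page1, page2, ...])."""
            support = Counter()
            periods = defaultdict(set)
            for period, pages in sessions:
                for a, b in zip(pages, pages[1:]):      # consecutive 2-sequences
                    support[(a, b)] += 1
                    periods[(a, b)].add(period)         # periods in which the pair recurs
            return {
                rule: {"support": s, "periods": len(periods[rule])}
                for rule, s in support.items()
                if s >= min_support and len(periods[rule]) >= min_periods
            }

        if __name__ == "__main__":
            log = [
                (1, ["index", "news", "sports"]),
                (1, ["index", "news"]),
                (2, ["index", "news", "weather"]),
            ]
            for (a, b), stats in mine_rules(log).items():
                print(f"prefetch {b} after {a}: {stats}")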

    How Do Tor Users Interact With Onion Services?

    Onion services are anonymous network services that are exposed over the Tor network. In contrast to conventional Internet services, onion services are private, generally not indexed by search engines, and use self-certifying domain names that are long and difficult for humans to read. In this paper, we study how people perceive, understand, and use onion services, based on data from 17 semi-structured interviews and an online survey of 517 users. We find that users have an incomplete mental model of onion services, use these services for anonymity, and have varying degrees of trust in onion services in general. Users also have difficulty discovering, tracking, and authenticating onion sites. Finally, users want technical improvements to onion services and better information on how to use them. Our findings suggest various improvements for the security and usability of Tor onion services, including ways to automatically detect phishing of onion services, clearer security indicators, and ways to manage onion domain names that are difficult to remember.
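
    The "self-certifying" property mentioned above can be made concrete: a v3 onion address is derived directly from the service's public key, which is also why it is a 56-character string that humans cannot easily read or remember. The sketch below follows the published v3 address format; the sample key is a placeholder, not a real service.

        # Sketch of how a v3 onion address is derived from an ed25519 public
        # key (Tor rend-spec-v3 address format): the name certifies the key,
        # but the result is 56 base32 characters.
        import base64
        import hashlib

        def onion_v3_address(pubkey: bytes) -> str:
            assert len(pubkey) == 32                  # ed25519 public key
            version = b"\x03"
            checksum = hashlib.sha3_256(b".onion checksum" + pubkey + version).digest()[:2]
            return base64.b32encode(pubkey + checksum + version).decode().lower() + ".onion"

        # Placeholder key for illustration only; a real service would use its
        # long-term ed25519 identity key.
        fake_key = bytes(range(32))
        print(onion_v3_address(fake_key))             # 56-character, key-derived name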

    A workbench to support development and maintenance of World-Wide Web documents

    The World-Wide Web is one of the most dominant features of the Internet. In its short life it has become an important part of information technology, with a role to play in all sectors. Unfortunately, it has many problems too. Because of its fast evolution, World-Wide Web document development is undisciplined and has resulted in the appearance of much poor-quality work. This is largely due to the inexperience of authors and the lack of conventions, standards or guidelines, as well as of useful tools for the development and maintenance of Web documents. One solution to the major problem of poor-quality World-Wide Web documents is improved maintenance of such documents. Maintenance is an important area that, as in software engineering, receives little attention compared with development. In order to address the problems of World-Wide Web document maintenance, research into the area was carried out through a literature survey and case studies of organisations that manage World-Wide Web sites. The results of this research led to a workbench which supports both developers and maintainers of Web documents. This workbench consists of methods, guidelines and tools for World-Wide Web development and maintenance.
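
    The abstract does not describe the individual tools, but one staple of web document maintenance is dead-link detection. The short sketch below is a hypothetical example of that kind of maintenance aid; it is not taken from the thesis, and the page content and behaviour are assumptions.

        # Hypothetical example of a web-maintenance aid of the kind such a
        # workbench might include: report broken links found in HTML text.
        import re
        import urllib.error
        import urllib.request

        LINK_RE = re.compile(r'href="(https?://[^"]+)"')

        def broken_links(html: str, timeout: float = 5.0):
            """Yield (url, reason) for links that do not respond successfully."""
            for url in LINK_RE.findall(html):
                try:
                    with urllib.request.urlopen(url, timeout=timeout) as resp:
                        if resp.status >= 400:
                            yield url, f"HTTP {resp.status}"
                except (urllib.error.URLError, OSError) as err:
                    yield url, str(err)

        if __name__ == "__main__":
            page = '<a href="https://example.com/">ok</a> <a href="https://example.invalid/">bad</a>'
            for url, reason in broken_links(page):
                print("broken:", url, "--", reason)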

    A Literature Survey on Web Content Mining

    The Web is a collection of interrelated documents hosted on one or more web servers, while web mining means extracting useful information from web data. Web mining is one of the domains in which data mining methods are used to extract information from web servers. Web data includes web pages, web links, queries on the web, and web logs. Web mining is used to understand user behaviour and to evaluate a particular web site based on the information stored in web log files. Web mining is carried out using data mining techniques, in particular association rules, classification and clustering. It has several useful application areas, for example electronic commerce, e-learning, e-government, e-policies, e-democracy, e-business, security, crime analysis and digital libraries. Retrieving the required web page from the web efficiently and effectively is a challenging task, since the web consists of unstructured data that carries a large amount of information and increases the complexity of dealing with data from different web service providers. The collected data becomes hard to find, extract, filter or evaluate for the information relevant to users. In this paper, we cover the basic concepts of web mining, its classification, techniques and issues. In addition, the paper analyses web mining research challenges.
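
    As a small, hedged illustration of one of the techniques named above (clustering applied to web content), page text can be represented as TF-IDF vectors and grouped with k-means. The library calls assume scikit-learn is available, and the tiny corpus is made up for the demonstration; neither is part of the survey.

        # Illustrative sketch: clustering web page text with TF-IDF + k-means,
        # one of the standard data mining techniques the survey discusses.
        # Requires scikit-learn; the toy corpus below is invented for the demo.
        from sklearn.cluster import KMeans
        from sklearn.feature_extraction.text import TfidfVectorizer

        pages = [
            "buy shoes online shop discount checkout",
            "online store cart payment shipping",
            "university course lecture exam student",
            "student learning course assignment grades",
        ]

        vectors = TfidfVectorizer().fit_transform(pages)   # page content -> term weights
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
        for page, label in zip(pages, labels):
            print(label, page[:40])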

    Modelling Web Usage in a Changing Environment

    Eiben, A.E. [Promotor]; Kowalczyk, W. [Copromotor]

    Intelligent Support for Information Retrieval of Web Documents

    The main goal of this research was to investigate means of intelligent support for the retrieval of web documents. We have proposed the architecture of a web tool system, Trillian, which discovers the interests of users without their interaction and uses them to search autonomously for related web content. Discovered pages are suggested to the user. The discovery of user interests is based on analysis of documents the users visited previously. We have created a module for completely transparent tracking of the user's movement on the web, which logs both visited URLs and the contents of web pages. The post-analysis step is based on a variant of the suffix tree clustering algorithm. We focus primarily on the overall Trillian architecture design and the process of discovering topics of interest. We have implemented an experimental prototype of Trillian and evaluated the quality, speed and usefulness of the proposed system. We have shown that clustering is a feasible technique for extracting interests from web documents. We consider the proposed architecture promising and suitable for future extensions.
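
    A hedged sketch of the clustering step: suffix tree clustering groups documents that share common phrases. The simplified version below finds shared word n-grams and scores the resulting base clusters, which conveys the principle without building an actual generalised suffix tree; the function names, scoring and thresholds are assumptions, not Trillian's implementation.

        # Simplified sketch of the phrase-sharing idea behind suffix tree
        # clustering (STC): group visited pages that share common word
        # phrases and score those base clusters. A real STC builds a
        # generalised suffix tree; this n-gram version only illustrates it.
        from collections import defaultdict

        def base_clusters(docs, max_len=3, min_docs=2):
            """Map each shared phrase to the set of documents containing it."""
            phrase_docs = defaultdict(set)
            for doc_id, text in enumerate(docs):
                words = text.lower().split()
                for n in range(1, max_len + 1):
                    for i in range(len(words) - n + 1):
                        phrase_docs[" ".join(words[i:i + n])].add(doc_id)
            # Score a cluster by document coverage times phrase length.
            scored = [(len(ids) * len(phrase.split()), phrase, ids)
                      for phrase, ids in phrase_docs.items() if len(ids) >= min_docs]
            return sorted(scored, reverse=True)

        visited = [
            "python web scraping tutorial",
            "web scraping with python examples",
            "holiday recipes for winter",
        ]
        for score, phrase, ids in base_clusters(visited)[:3]:
            print(score, repr(phrase), "pages", sorted(ids))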