744 research outputs found

Website Content Extraction Using Web Structure Analysis

Author: Daraham Nor Hayati
Publication venue: Universiti Teknologi Petronas
Publication date: 01/12/2005

The Web is the largest data repository ever available in the history of humankind. Major efforts have been made to provide efficient access to relevant information within this huge repository of data. Although several techniques have been developed to address the problem of Web data extraction, their use is still not widespread, mostly because of the need for high human intervention and the low quality of the extraction results. This project takes a domain-oriented approach to Web data extraction and discusses its application to extracting news from Web sites. It uses the abstraction method to identify important sections in a web document. Relevant information is taken into account and highlighted in order to produce focused web content output. Facts and data for the project were gathered from various sources, such as the Internet and books. The methodology used is the Waterfall Model, which involves several phases: Planning, Analysis, Design and Implementation. The result of this project is a display and review of web content extraction and of how it is currently being developed, with the goal of giving web users greater usability and ease of use.

UTPedia

Web Mining-Based Objective Metrics for Measuring Website Navigability

Author: Chau Michael
Fang Xiao
Hu Paul
Sheng Olivia
Yang Zhuo
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/01/2006

Web site design is critical to the success of electronic commerce and digital government. Effective design requires appropriate evaluation methods and measurement metrics. The current research examines Web site navigability, a fundamental structural aspect of Web site design. We define Web site navigability as the extent to which a visitor can use a Web site’s hyperlink structure to locate target contents successfully in an easy and efficient manner. We propose a systematic Web site navigability evaluation method built on Web mining techniques. To complement the subjective self-reported metrics commonly used by previous research, we develop three objective metrics for measuring Web site navigability on the basis of the Law of Surfing. We illustrate the use of the proposed methods and measurement metrics with two large Web sites.

AIS Electronic Library (AISeL)

HKU Scholars Hub

Visual Architecture based Web Information Extraction

Publication venue: Bonfring

Crossref

Web Mining for Web Personalization

Author: Berendt B.
Buchner A. G.
Chen M. S.
Coenen F.
Cooley R.
Huang Z.
Joachims T.
Joshi A.
Lieberman H.
Magdalini Eirinaki
Masseglia F.
Michalis Vazirgiannis
Mladenic D.
Mobasher B.
Nasraoui O.
Perkowitz M.
Shahabi C.
Spiliopoulou M.
Yan T. W.
Zaiane O. R.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/02/2003

Web personalization is the process of customizing a Web site to the needs of specific users, taking advantage of the knowledge acquired from the analysis of the user's navigational behavior (usage data) in correlation with other information collected in the Web context, namely, structure, content, and user profile data. Due to the explosive growth of the Web, the domain of Web personalization has gained great momentum both in the research and commercial areas. In this article we present a survey of the use of Web mining for Web personalization. More specifically, we introduce the modules that comprise a Web personalization system, emphasizing the Web usage mining module. A review of the most common methods that are used as well as technical issues that occur is given, along with a brief overview of the most popular tools and applications available from software vendors. Moreover, the most important research initiatives in the Web usage mining and personalization areas are presented.

Crossref

SJSU ScholarWorks

The use of web analytics on a small data set in an online media company: Shifter's case study

Author: Ribeiro João Pedro de Almeida
Publication date: 09/01/2017

Project Work presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence.

The primary struggle in data analysis is the lack of talent for performing relevant, fit-to-business analyses that retrieve knowledge and provide concise and clear action plans to today’s startups and small enterprises that exist online. Tracking and understanding the navigational patterns of user behavior over a three-month collection period, using an Excel spreadsheet tool, provided context for each piece of content produced and published by Shifter, an online media company. Investigations made after acquiring Shifter’s data resulted in recommendations to rethink and redesign the editorial content of the business to answer the needs of different communities.

Repositório da Universidade Nova de Lisboa

BlogForever D2.6: Data Extraction Methodology

Author: Banos V.
Davis R.
Gkotsis G.
Pincent E.
Stepanyan K.
Publication date: 25/10/2013

This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Discovering Web Server Logs Patterns Using Generalized Association Rules Algorithm

Author: Mohamad Farhan Mohamad Mohsin
Mohd Helmy Abd Wahab
Mohd Norzali Haji Mohd
Publication venue: IntechOpen
Publication date: 01/03/2010

IntechOpen