
    Structure-based analysis of Web sites

    The performance of information retrieval on the Web is heavily influenced by the organization of Web pages, user navigation patterns, and guidance-related functions. Having observed the lack of measures that reflect this factor, this paper focuses on an approach based on both structural properties and navigation data to analyze and improve the performance of a Web site. Two types of indices are defined as the two major factors for analysis and improvement: "accessibility" reflects the structural property and measures how easily the user can access the pages, and "popularity" reflects the navigation data, based primarily on log statistics. The accessibility and popularity (A-P) plot serves as a compass for the Web designer to get an overview of the current performance status and to explore possible directions for improvement that balance design anticipation and navigation expectation.

    Characterising Web Site Link Structure

    The topological structures of the Internet and the Web have received considerable attention. However, there has been little research on the topological properties of individual web sites. In this paper, we consider whether web sites (as opposed to the entire Web) exhibit structural similarities. To do so, we exhaustively crawled 18 web sites as diverse as governmental departments, commercial companies and university departments in different countries. These web sites ranged in size from a few thousand pages to millions of pages. Statistical analysis of these 18 sites revealed that the internal link structures of the web sites are significantly different when measured with first- and second-order topological properties, i.e. properties based on the connectivity of an individual node or a pair of nodes. However, examination of a third-order topological property, which considers the connectivity between three nodes that form a triangle, revealed a strong correspondence across web sites, suggestive of an invariant. Comparison with the Web, the AS Internet, and a citation network showed that this third-order property is not shared across other types of networks. Nor is the property exhibited in generative network models such as that of Barabasi and Albert. Comment: To appear at IEEE/WSE0
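The abstract does not spell out the third-order property itself, so the following is only a minimal sketch of the general idea of a triangle-based measure. It assumes the link graph is treated as undirected, and the function names (`build_adjacency`, `count_triangles`, `transitivity`) are illustrative, not taken from the paper.

```python
from itertools import combinations


def build_adjacency(edges):
    """Build an undirected adjacency map from (page, page) link pairs."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-links
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_triangles(adj):
    """Count sets of three mutually linked nodes."""
    seen = 0
    for neighbours in adj.values():
        for a, b in combinations(neighbours, 2):
            if b in adj[a]:
                seen += 1
    return seen // 3  # each triangle is found once from each of its corners


def transitivity(adj):
    """Fraction of connected triples that are closed into triangles."""
    triples = sum(len(n) * (len(n) - 1) // 2 for n in adj.values())
    return 3 * count_triangles(adj) / triples if triples else 0.0


# Toy site: pages a, b, c link to one another; d hangs off c.
links = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
adj = build_adjacency(links)
print(count_triangles(adj), transitivity(adj))
```

Comparing such a value across crawled sites is one way a triangle-based regularity could be checked; the statistic actually used in the paper may differ.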

    Web site structure mining using social network analysis

    Purpose – Web sites are typically designed attending to a variety of criteria. However, web site structure determines browsing behavior and way-finding results. The aim of this study is to identify the main profiles of web sites’ organizational structure by modeling them as graphs and considering several social network analysis features. Design/methodology/approach – A case study based on the institutional web sites of 80 Spanish universities has been used for this purpose. For each root domain, two different networks have been considered: the first is the domain network, and the second is the page network. In both cases, several indicators related to social network analysis have been evaluated to characterize the web site structure. Factor analysis provides the statistical methodology to adequately extract the main web site profiles in terms of their internal structure. Findings – This paper allows the categorization of web site design styles and provides general guidelines to help designers better identify areas for creating and improving institutional web sites. The findings of this study offer practical implications to web site designers for creating and maintaining an effective web presence, and for improving usability. Research limitations/implications – The research is limited to the institutional web sites of 80 Spanish universities. Institutional university web sites from other countries can be analyzed, and the conclusions could be compared or enlarged. Originality/value – This paper highlights the importance of the internal structure of web sites and its implications for usability and way-finding results. Unlike previous research, the paper focuses on comparing the internal structure of institutional web sites, rather than analyzing the web as a whole or the interrelations among web sites. Ministerio de Educación y Ciencia DPI2007-60128. Junta de Andalucía. Consejería de Innovación, Ciencia y Empresa P07-TIC-0262

    Preprocessing and Content/Navigational Pages Identification as Premises for an Extended Web Usage Mining Model Development

    Since its appearance, the internet has seen spectacular growth, not only in the number of websites and the volume of information, but also in the number of visitors. This created the need for an overall analysis of both web sites and the content they provide. Thus, a new branch of research was developed, namely web mining, which aims to discover useful information and knowledge based not only on the analysis of websites and content, but also on the way in which users interact with them. The aim of the present paper is to design a database that captures only the relevant data from logs, in a way that allows large sets of temporal data to be stored and managed with common tools in real time. In our work, we rely on different web sites or website sections with known architecture, and we test several hypotheses from the literature in order to extend the framework to sites with unknown or chaotic structure, in which the type of the visited pages is not transparent. In doing this, we start from non-proprietary, preexisting raw server logs. Keywords: Knowledge Management, Web Mining, Data Preprocessing, Decision Trees, Databases

    Coarse-grained Classification of Web Sites by Their Structural Properties

    In this paper, we identify and analyze structural properties which reflect the functionality of a Web site. These structural properties consider the size, the organization, the composition of URLs, and the link structure of Web sites. In contrast to previous work, we perform a comprehensive measurement study to delve into the relation between the structure and the functionality of Web sites. Our study focuses on five of the most relevant functional classes, namely Academic, Blog, Corporate, Personal, and Shop. It is based upon more than 1,400 Web sites composed of 7 million crawled and 47 million known Web pages. We present a detailed statistical analysis which provides insight into how structural properties can be used to distinguish between Web sites from different functional classes. Building on these results, we introduce a content-independent approach for the automated coarse-grained classification of Web sites. A naïve Bayesian classifier with advanced density estimation yields a precision of 82% and recall of 80% for the classification of Web sites into the considered classes.

    Food-Web Structure of Seagrass Communities across Different Spatial Scales and Human Impacts

    Seagrass beds provide important habitat for a wide range of marine species but are threatened by multiple human impacts in coastal waters. Although seagrass communities have been well-studied in the field, a quantification of their food-web structure and functioning, and how these change across space and human impacts, has been lacking. Motivated by extensive field surveys and literature information, we analyzed the structural features of food webs associated with Zostera marina across 16 study sites in 3 provinces in Atlantic Canada. Our goals were to (i) quantify differences in food-web structure across local and regional scales and human impacts, (ii) assess the robustness of seagrass webs to simulated species loss, and (iii) compare food-web structure in temperate Atlantic seagrass beds with those of other aquatic ecosystems. We constructed individual food webs for each study site and cumulative webs for each province and the entire region based on presence/absence of species, and calculated 16 structural properties for each web. Our results indicate that food-web structure was similar among low impact sites across regions. With increasing human impacts associated with eutrophication, however, food-web structure shows evidence of degradation as indicated by fewer trophic groups, lower maximum trophic level of the highest top predator, fewer trophic links connecting top to basal species, higher fractions of herbivores and intermediate consumers, and higher number of prey per species. These structural changes translate into functional changes, with impacted sites being less robust to simulated species loss. Temperate Atlantic seagrass webs are similar to a tropical seagrass web, yet differ from other aquatic webs, suggesting consistent food-web characteristics across seagrass ecosystems in different regions.
    Our study illustrates that food-web structure and functioning of seagrass habitats change with human impacts and that the spatial scale of food-web analysis is critical for determining results.

    Exploring the Structure of Library and Information Science Web Space Based on Multivariate Analysis of Social Tags

    Introduction. This study examines the structure of Web space in the field of library and information science using multivariate analysis of social tags from the Website Delicious.com. A few studies have examined mathematical modelling of tags, mainly examining tagging in terms of tri-partite graphs, pattern tracing and descriptive statistics. This study is one of the few to employ multivariate analysis in investigating dimensions of Web spaces based on social tagging data. Method. This study examines the post data collected, using a Web crawler, from a set of library and information science related Websites bookmarked on Delicious.com. Post data consist of the URL, usernames, tags and comments assigned by users of Delicious.com. The collected tag data were analysed using multivariate methods, such as multidimensional scaling and structural equation modelling. Analysis. Collected data were first analysed using multidimensional scaling to explore initial relationships amongst the selected Websites. Then, confirmatory factor analysis based on structural equation modelling was employed to examine the hierarchical structure of the library and information science Web space. Results. Social tag data exhibit different dimensions in the Web space of the library and information science field. In addition, social tags confirmed the hierarchical structure of the field by showing significantly stronger relationships between sites with similar characteristics. That is, the structure of the tagging data shows connections similar to those present in the real world. Conclusions. This study suggests a new statistical approach in social tagging and Web space analysis studies. Tag information can be used to explain the hierarchical structure of a certain domain. Methodologically, this study suggests that structural equation modelling can be a compelling method to explore hierarchical structures of nodes in the Web space.

    Empirical analysis of web-based user-object bipartite networks

    Understanding the structure and evolution of web-based user-object networks is a significant task since they play a crucial role in e-commerce nowadays. This Letter reports an empirical analysis of two large-scale web sites, audioscrobbler.com and del.icio.us, where users are connected with music groups and bookmarks, respectively. The degree distributions and degree-degree correlations for both users and objects are reported. We propose a new index, named collaborative clustering coefficient, to quantify the clustering behavior based on the collaborative selection. Accordingly, the clustering properties and clustering-degree correlations are investigated. We report some novel phenomena that characterize the selection mechanism of web users well and outline the relevance of these phenomena to the information recommendation problem. Comment: 6 pages, 7 figures and 1 table
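As a rough illustration of the first quantity the abstract mentions, the sketch below computes degree distributions for both sides of a user-object bipartite network. It assumes the data is a list of (user, object) selection pairs; the function name `degree_distributions` and the toy data are hypothetical, and the collaborative clustering coefficient is not attempted because the abstract does not define it.

```python
from collections import Counter


def degree_distributions(pairs):
    """Return (user_dist, object_dist), each mapping degree k to a node count.

    A user's degree is the number of distinct objects selected;
    an object's degree is the number of distinct users selecting it.
    """
    unique = set(pairs)  # a repeated selection counts once
    user_degree = Counter(user for user, _ in unique)
    object_degree = Counter(obj for _, obj in unique)
    return Counter(user_degree.values()), Counter(object_degree.values())


# Toy data: three users bookmarking two objects, with one repeated pair.
selections = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u3", "a"), ("u1", "a")]
user_dist, object_dist = degree_distributions(selections)
print(dict(user_dist))    # degree -> number of users
print(dict(object_dist))  # degree -> number of objects
```

Dividing each count by the number of users (or objects) turns these into the empirical probabilities usually plotted on log-log axes.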

    What a User Wants: Redesigning a Library's Web Site Based on a Card-Sort Analysis

    Web site usability concerns anyone with a web site to maintain. Libraries, however, are often the biggest offenders in terms of usability. In our efforts to provide users with everything they need to do research, we often overwhelm them with sites that are confusing in structure, difficult to navigate, and weighed down with jargon. Dowling College Library recently completed a redesign of its web site based upon the concept of usability. For smaller libraries in particular, this can be a challenge. The web site is often maintained by one or two people, and finding the time and resources to conduct a usability study is difficult in that situation. Additional demands of a site redesign, from restructuring page layouts to adding visual appeal, only add to the burden. However, our team of four librarians was able to do it. We focused on vocabulary and organizational structure using a card-sort analysis. This analysis taught us how our users approach the information on our site. Task-based testing confirmed what the card-sort analysis had taught us and smoothed out design problems. Incorporating user feedback at nearly every stage of the process allowed us to create a site that more closely mirrors how our users look for information on our site. This study details how testing and analyzing results throughout the redesign process created a better, more user-friendly web site.