37 research outputs found

    Web Mining for Web Personalization

    Web personalization is the process of customizing a Web site to the needs of specific users, taking advantage of the knowledge acquired from the analysis of the user's navigational behavior (usage data) in correlation with other information collected in the Web context, namely structure, content, and user profile data. Due to the explosive growth of the Web, the domain of Web personalization has gained great momentum in both research and commercial areas. In this article we present a survey of the use of Web mining for Web personalization. More specifically, we introduce the modules that comprise a Web personalization system, emphasizing the Web usage mining module. We review the most common methods used and the technical issues that arise, along with a brief overview of the most popular tools and applications available from software vendors. Moreover, the most important research initiatives in the Web usage mining and personalization areas are presented.
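
    As a rough illustration of the Web usage mining step this survey covers, the sketch below sessionizes a server access log and counts which pages co-occur in the same session, a simple basis for "users who viewed X also viewed Y" personalization. The log file name, the 30-minute timeout, and the co-occurrence heuristic are illustrative assumptions, not methods prescribed by the article.

```python
# Minimal usage-mining sketch: sessionize a Common Log Format access log
# ("access.log" is a placeholder) and count page co-occurrence per session.
import re
from collections import defaultdict
from datetime import datetime, timedelta

LOG_LINE = re.compile(r'(\S+) \S+ \S+ \[([^\]]+)\] "GET (\S+) HTTP/\d\.\d"')
TIMEOUT = timedelta(minutes=30)  # assumed inactivity threshold

def sessionize(path):
    """Group requests into sessions per client IP using an inactivity timeout."""
    sessions, last_seen, current = [], {}, defaultdict(list)
    with open(path) as fh:
        for line in fh:
            m = LOG_LINE.match(line)
            if not m:
                continue
            ip, ts_raw, page = m.groups()
            ts = datetime.strptime(ts_raw.split()[0], "%d/%b/%Y:%H:%M:%S")
            if ip in last_seen and ts - last_seen[ip] > TIMEOUT:
                sessions.append(current.pop(ip))  # close the stale session
            last_seen[ip] = ts
            current[ip].append(page)
    sessions.extend(current.values())
    return sessions

def co_occurrence(sessions):
    """Count how often pairs of pages appear in the same session."""
    counts = defaultdict(int)
    for pages in sessions:
        unique = sorted(set(pages))
        for i, a in enumerate(unique):
            for b in unique[i + 1:]:
                counts[(a, b)] += 1
    return counts

if __name__ == "__main__":
    pairs = co_occurrence(sessionize("access.log"))
    for (a, b), n in sorted(pairs.items(), key=lambda kv: -kv[1])[:10]:
        print(f"{a} <-> {b}: {n} sessions")  # candidate personalized links
```

    In practice such sessions would typically feed richer models (clustering, association rules, sequential patterns), which is the territory the surveyed usage-mining methods cover.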

    Services and resource profiles as metrics for the allocation of IT infrastructure costs


    Website visualizer: a tool for the visual analysis of website usage

    Master's thesis in Electronics, Telecommunications and Informatics Engineering. Websites are incorporated in organizations to support their mission and to guarantee effective information delivery within an efficient information workflow framework. In this context, content managers have to constantly monitor the business needs and reflect them in the structure, contents and interaction paradigm of the institutional websites. This task is neither trivial nor automated, and it is difficult to guarantee that these websites remain synchronized with the actual business requirements. The overall goal of this work is the conceptualization, development and evaluation of an application able to assist usability experts in the analysis and visualization of interaction patterns of organizational web-based systems. It should be able to highlight the most critical website areas, based on the analysis of website structure, contents and interconnections. For this purpose, a conceptual model and architecture have been proposed, as well as a set of visualization methods designed to facilitate that analysis. In order to validate the proposed conceptual model, architecture, information structures and visualization methods, a prototype was developed, evaluated and refined. It can be considered an experimental research platform, capable of integrating and testing specific visualization schemes and visual correlation procedures, and is part of an ongoing research program at the University of Aveiro. Specifically, this work introduces a layered architecture that supports multiple, simultaneously synchronized views, as well as novel visualization, inspection and interaction mechanisms. The prototype integrates these visualization methods in an application able to capture, compile and analyze information related to the structure, contents and usage patterns of a website. The work is mainly intended to help usability experts and content managers organize the informational space of an institutional website. It is not meant to provide solutions for the usability problems of the site directly, but to help its users make decisions based on the interpretation of the usability problems identified and highlighted during the analysis process.
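
    A minimal sketch of the kind of structure/usage correlation such a tool can surface, assuming the crawl has already been reduced to a list of links and the access log to a page-view count per URL; the example data and the "heavily used but deeply buried" heuristic are illustrative assumptions, not the prototype's actual algorithm.

```python
# Combine site structure (click depth from the homepage) with usage counts
# to flag pages that attract many visits yet sit far from the entry point.
from collections import defaultdict, deque

links = [("/", "/news"), ("/news", "/news/2023"), ("/news/2023", "/reports/q4"),
         ("/", "/about")]                                        # (source, target) pairs from a crawl
views = {"/": 900, "/news": 400, "/reports/q4": 350, "/about": 20}  # hits per page from the usage log

def depth_from_home(links, home="/"):
    """Breadth-first search: minimum number of clicks from the homepage."""
    graph = defaultdict(list)
    for src, dst in links:
        graph[src].append(dst)
    depth, queue = {home: 0}, deque([home])
    while queue:
        page = queue.popleft()
        for nxt in graph[page]:
            if nxt not in depth:
                depth[nxt] = depth[page] + 1
                queue.append(nxt)
    return depth

depth = depth_from_home(links)
for page, n in sorted(views.items(), key=lambda kv: -kv[1]):
    if depth.get(page, float("inf")) >= 3 and n >= 100:
        print(f"{page}: {n} views at depth {depth.get(page)} -- candidate for restructuring")
```

    Pages flagged this way are natural candidates to inspect further in the synchronized structure, content and usage views the prototype provides.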

    A Comparative Analysis of Automated Web Site Evaluation Tools


    Website Log Analysis: Case Study of USA Cycling's Website

    Numerous articles exist on the reasons for web-log analysis. Much of the web-log analysis literature deals with how to collect data, technical aspects, and how to select the appropriate software for collecting the data. The aim of this paper is to create a user profile for USA Cycling's website by using WebTrends software to analyze web-log files. After the user profile has been developed, it is shown that the web-log analysis of USA Cycling's website can be used to make daily and long-term decisions about its functionality. In addition, this paper covers the basic issues of web-log analysis and explores the practical application for USA Cycling. To accomplish these tasks, USA Cycling's web logs from August 1999 to April 2002 were analyzed using WebTrends log-analyzing software, and key questions based on the observations were developed and sent to USA Cycling for clarification.
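
    A minimal sketch of the kind of aggregate user profile a log analyzer such as WebTrends reports, assuming the raw log has already been parsed into timestamped page hits; the input file "hits.csv" and its columns are hypothetical stand-ins, not the USA Cycling data.

```python
# Build a simple user profile: busiest weekdays and most requested pages.
import csv
from collections import Counter
from datetime import datetime

def profile(path):
    by_weekday, by_page = Counter(), Counter()
    with open(path) as fh:
        for row in csv.DictReader(fh):          # expects columns: timestamp (ISO 8601), page
            ts = datetime.fromisoformat(row["timestamp"])
            by_weekday[ts.strftime("%A")] += 1
            by_page[row["page"]] += 1
    return by_weekday, by_page

if __name__ == "__main__":
    weekdays, pages = profile("hits.csv")
    print("Busiest days:", weekdays.most_common(3))   # when users visit
    print("Top pages:  ", pages.most_common(5))       # what they look for
```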

    FITTEST logging and log-analysis infrastructure for PHP

    Master's dissertation in Informatics Engineering. Many applications employ logging in order to provide tracing information about their executions. The generated logs offer a large amount of information that can be valuable in numerous ways. A good approach would be a scenario in which the main program is protected against any type of error, from wrong data up to, in the worst cases, attacks such as code injection. Currently, there are no solutions that support all these features in a single framework. The approach followed to generate these logs is FITTEST (Future Internet Testing), a continuous testing method chosen to handle the increased dynamics of future internet applications. To support the FITTEST approach, logs have to be generated systematically and in a well-defined format. The logging solution can log both high-level and low-level events. High-level events are events that can be seen as produced by users, whereas low-level events tell us what happens inside a function execution as part of the target program's reaction to a high-level event. The goal of this project is the development of an infrastructure to generate FITTEST logs for PHP, the application of that infrastructure to a test web application, and the analysis of the generated logs. The logs are stored in a compressed format, but they can also be exported to XML, enabling post-processing by other tools.
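
    The sketch below only illustrates the two event granularities and the compressed-storage-plus-XML-export workflow described above, using invented field names and JSON-lines storage in Python; the real FITTEST log format and the PHP instrumentation are defined by the FITTEST tooling itself and are not reproduced here.

```python
# Illustrative structured event log: high-level (user-visible) vs low-level
# (inside a function) events, stored compressed and exported to XML.
import gzip
import json
import xml.etree.ElementTree as ET

events = [
    {"kind": "high", "name": "click", "target": "submitOrder"},                 # user-visible event
    {"kind": "low",  "name": "call",  "target": "validateCart", "ret": "ok"},   # inside a function
]

# Store events compactly: one JSON record per line, gzip-compressed.
with gzip.open("events.log.gz", "wt") as fh:
    for ev in events:
        fh.write(json.dumps(ev) + "\n")

# Export the same events to XML so other tools can post-process them.
root = ET.Element("log")
with gzip.open("events.log.gz", "rt") as fh:
    for line in fh:
        ev = json.loads(line)
        ET.SubElement(root, "event", {k: str(v) for k, v in ev.items()})
ET.ElementTree(root).write("events.xml", encoding="utf-8", xml_declaration=True)
```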

    Visualização de interação em cenários de comunicação humano-computador (Visualization of interaction in human-computer communication scenarios)

    Doctoral thesis in Information and Communication in Digital Platforms. Technologically mediated info-communicational scenarios are becoming more and more pervasive in the day-to-day activity of a growing number of individuals and institutions. Specifically, internet/web technologies and services have a strong presence in institutions worldwide. Internal websites (also known as intranets) are developed in compliance with internal communication strategies, reflecting internal information, workflow and related communication services. An emerging problem concerns the management of these constantly growing internal info-communicational platforms (intranets) and their external counterparts (extranets). Organizational communication specialists lack efficient tools to analyze activity and behavioral patterns and to understand what is really going on inside their institutions. In fact, existing instruments tend to be based on classical technical metrics and, in most situations, are intended for technical tuning rather than for organizational communication and information analysis. This thesis focuses on the conception and evaluation of such diagnostic tools, in order to contribute to the development of these sophisticated infrastructures and, consequently, to improve the efficiency of their internal info-communicational processes. One of the issues lies in identifying user-system mismatches at the human-computer interaction level; these must be thoroughly identified and the problems promptly reported to the team that designs and develops the info-communicational platforms. The system must serve the organization, adapt to its internal communication strategies, and efficiently sustain its information and workflow patterns. Efficient feedback instruments are therefore fundamental to the rigorous identification of problems in an institution's info-communicational platform. The proposals presented demonstrate the ability to diagnose structural and content issues at two levels: at the level of the info-communication services' interface itself, and at the level of the internal structure, or relational layout, of the information. The diagnostic services presented are based on contextual analysis, strongly supported by visual assessment techniques, and a set of empirical experiments shows that they answer the challenge posed by this thesis.

    GAUGING PUBLIC INTEREST FROM SERVER LOGS, SURVEYS AND INLINKS: A Multi-Method Approach to Analyze News Websites

    As the World Wide Web (the Web) has turned into a full-fledged medium for disseminating news, it is very important for journalism and information science researchers to investigate how Web users access online news reports and how to interpret such usage patterns. This doctoral thesis collected and analyzed Web server log statistics, online survey results, online reprints of the top 50 news reports, and external inlink data of a leading comprehensive online newspaper (the People's Daily Online) in China, one of the biggest Web/information markets in today's world. The aim of the thesis was to explore various methods of gauging public interest from a Webometrics perspective. A total of 129 days of Web server log statistics, including the top 50 Chinese and English news stories with the highest daily pageview numbers, the comments attracted by these news items, and how often the same stories were e-mailed, were collected from October 2007 to September 2008. These top 50 news items' positions on the Chinese and English homepages and the top 50 queries submitted to the website search engine of the People's Daily Online were also retrieved. Results of the two online surveys launched in March 2008 and March 2009 were collected after their respective closing dates. The external inlinks to the People's Daily Online were retrieved via Yahoo! (Chinese and English versions), and the online reprints were retrieved via Google. Besides the general usage patterns identified from the top 50 news stories, this study, by conducting statistical tests on the data sets, also reveals the following findings. First, the editors' choices and the readers' favorites do not always match; the content of a news title is more important than its homepage position in attracting online visits. Second, the Chinese and English readers' interests in the same events differ. Third, the pageview numbers and comments posted to the news items reflect the unfavorable attitudes of Chinese readers toward the United States and Japan, which might, after necessary modifications, offer a method to investigate public interest in other issues or nations. More importantly, some publicly available data, such as the comments posted to the news stories and the online survey results, further show that the pageview measure does reflect readers' interests and needs truthfully, as demonstrated by the strong correlations between the top news reports and the relevant top queries. The external inlinks to the news websites and the online reprints of the top news items help examine readers' interests from other perspectives, as well as establish online profiles of the news websites. Such publicly accessible information could be an alternative data source for researchers studying readers' interests when Web server log data are not available. This doctoral thesis not only shows the usefulness of Web server log statistics, survey results, and other publicly accessible data in studying Web users' information needs, but also offers practical suggestions for online news sites to improve their contents and homepage designs. However, no single method can draw a complete picture of online news readers' interests; the above-mentioned research methodologies should be employed together in order to reach more comprehensive conclusions. Future research is especially needed to investigate the continued rapid growth of "Mobile News Readers," which poses both challenges and opportunities for the press industry in the 21st century.
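
    A minimal sketch of one of the statistical checks mentioned above: a rank correlation between how often top stories were viewed and how often related queries were submitted to the site search engine. The numbers below are invented placeholders, not the People's Daily Online data.

```python
# Rank correlation between pageviews of top stories and matching query counts.
from scipy.stats import spearmanr

pageviews   = [125_000, 98_000, 76_500, 54_200, 41_300]   # daily views of five top stories
query_count = [4_100,   3_600,  2_900,  1_200,  1_500]    # submissions of the matching queries

rho, p_value = spearmanr(pageviews, query_count)
print(f"Spearman rho = {rho:.2f}, p = {p_value:.3f}")
# A high rho with a small p-value would support the thesis's claim that
# pageview counts reflect the interests readers also express through search.
```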

    Web design for effective online training and instruction.

    The following is a research/experimental thesis that surveys and examines web design for effective online training and instruction. The purpose of the thesis is to create, from a variety of relevant learning theories and practical web-design strategies advocated in the research literature, a Web-based instruction checklist that can be used to develop and assess online instructional materials. This checklist, referred to as WeBIC, is structured around the common ISD processes of analysis, design, development, implementation, and evaluation, with a focus on ‘Web Usability’ and ‘the Five Ps’ of preparation, presentation, participation, practice and performance. To determine the usefulness of WeBIC as a design and evaluation tool, three studies were conducted: (1) an experimental comparison study of online instructional materials in two formats (a web-based one following the guidelines and strategies outlined by WeBIC, and a text-only one based on a modified form of thesis-writing guidelines); (2) an analysis of server data related to website access and instructional activity at ESLenglish.com and during the comparison study; and (3) an evaluation of the instructional materials used in the comparison study and of the instructional materials available at ESLenglish.com. The comparison study showed learning gains of 2.1% that, under closer analysis, were found to be statistically non-significant. The server analysis confirmed the importance of designing for ‘speed of access’ and ‘navigation ease’; it also called into question the reliability of web-mining data and highlighted the need for proper operational definitions. The evaluation study produced WeBIC scores for ESLenglish.com and for the comparison-study learning materials that can be used as benchmarks for further research.
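
    A minimal sketch of the kind of significance check behind the "2.1% gains were non-significant" result: an independent-samples t-test comparing score gains of the two instructional formats. The score values are invented placeholders, not the thesis data.

```python
# Compare learning gains of the two formats with an independent-samples t-test.
from scipy.stats import ttest_ind

web_format_gains  = [4.0, 1.5, 3.0, 2.5, 0.5, 3.5]   # % gains, WeBIC-guided materials
text_format_gains = [2.0, 1.0, 2.5, 0.0, 1.5, 2.0]   # % gains, text-only materials

t_stat, p_value = ttest_ind(web_format_gains, text_format_gains)
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
# With p above 0.05 the observed difference would, as in the study,
# be treated as statistically non-significant.
```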