184 research outputs found

    BlogForever D2.6: Data Extraction Methodology

    This report outlines an inquiry into web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments in implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities for extracting the semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report then proposes a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.

    Transitioning From Relational to NoSQL: A Case Study

    Data storage requirements have increased dramatically in recent years due to the explosion in data volumes brought about by the Web 2.0 era. Changing priorities for database system requirements have seen NoSQL databases emerge as an alternative to the relational database systems that have dominated this market for over 40 years. Web-enabled, always-on applications mean that availability of the database system is critically important, as any downtime can translate into unrecoverable financial loss. Cost is also hugely important in an era where credit is difficult to obtain and organizations look to get the maximum from their IT infrastructure for the least investment. The purpose of this study is to evaluate the current NoSQL market and assess its suitability as an alternative to a relational database. The research looks at a case study of a bulletin board application that uses a relational database for data storage and evaluates how such an application can be converted to use a NoSQL database. This case study is also used to assess the performance attributes of a NoSQL database when implemented on a low-cost hardware platform. The findings will provide insight to those who are considering making the switch from a relational database system to a NoSQL database system.

    Developing Adventures in Sustainable Urbanism's Web Site

    We designed and built a portion of a website to accompany Adventures in Sustainable Urbanism, a book co-written by Robert Krueger (our advisor), Tim Freytag, and Samuel Mössner. Specifically, we designed and partially implemented the Field Trips section of the site. We researched the elements of good website design (especially educational website design), and from there created two examples of good field trips and a means for people to add their own field trips.

    Performance Challenges with Data Visualizations in Browser Environment

    Information exists in many forms: text, equations, video, audio, and graphical media. Graphical or visual media often make information easier to absorb than textual descriptions do. Graphs are important vehicles for conveying information. Creating a good graph requires attention to certain attributes, such as which variables are displayed on which axis, which visual elements are used, and how large they are. Given the internet and the amount of data now being generated, how can all this data be fitted into a single graph? That question is the motivation for this thesis. Presenting large data in visualizations involves a great deal of thought, effort, and ingenuity in deciding what information to convey. Obtaining the data for such visualizations sometimes comes with its own challenges. This thesis investigates the obstacles facing a company's internal tool with regard to its data retrieval method. It also aims to research an efficient and easy-to-use method for presenting large data on a webpage.

    Scrolling vs Paging: Reading Performance and Preference of Reading Modes in Long-form Online News

    This study explores the impact of scrolling and dynamic pagination in long-form online documents on reader performance and reader experience. Previous research has produced mixed results, indicating either no difference between modes or a positive effect favouring scrolling. Recent advances in web standards have enabled simpler, dynamic, performant methods of pagination that tailor content responsively to any screen, meriting renewed study in this area. This paper uses one such method to load subsequent online news pages instantly without buffering. In an online browser experiment with 38 participants, a statistically significant increase in reading speed was found in the scrolling mode. This is in line with previous research, which has suggested that while a scrolling presentation style places extra demands on working memory capacity (WMC), many current web users have developed compensatory strategies and cognitive flexibility for navigating scrolling web documents.