248 research outputs found

    An Extensible Framework for Creating Personal Archives of Web Resources Requiring Authentication

    The key factors in the success of the World Wide Web are its large size and the lack of centralized control over its contents. In recent years, many advances have been made in preserving web content, but much of this content (namely, social media content) was not archived, or to this day is still not being archived, for various reasons. Tools built to accomplish this frequently break because of the dynamic structure of social media websites. Because many social media websites share a common content hierarchy, it would be worthwhile to set up a means of referencing this hierarchy that tools can leverage to remain adaptive as the target websites evolve. As relying on the service to provide this means is problematic in the context of archiving, we can surmise that the only way to avoid these shortcomings is to rely on the original context in which the user views the content, i.e., the web browser. In this thesis I will describe an abstract specification, and concrete implementations of that specification, that allow tools to leverage the context of the web browser to capture content into personal web archives. These tools will then be able to accomplish personal web archiving in a way that makes them more robust. As evaluation, I will make a change to the hierarchy of a synthetic social media website and to its respective specification. Then, I will show that an adapted tool, using the specification, continues to function and is able to archive the social media website.
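
    A minimal sketch of the idea of an externally maintained hierarchy specification that archiving tools consult instead of hard-coding selectors. Everything below (the SITE_SPEC mapping, its selectors, and the extract helper) is a hypothetical illustration assuming Python with BeautifulSoup, not the specification defined in the thesis.

    # Hypothetical hierarchy specification for a synthetic social media site.
    # Selectors are placeholders; a real specification would be published with
    # the site and updated as its markup evolves, keeping tools adaptive.
    SITE_SPEC = {
        "feed":    {"selector": "div.stream"},
        "post":    {"selector": "article.post", "parent": "feed"},
        "comment": {"selector": "li.comment",   "parent": "post"},
    }

    def extract(dom, spec, level):
        """Walk the DOM via the specification instead of hard-coded paths."""
        node = spec[level]
        scope = extract(dom, spec, node["parent"]) if "parent" in node else [dom]
        found = []
        for container in scope:
            found.extend(container.select(node["selector"]))
        return found

    # Usage (assuming BeautifulSoup):
    #   from bs4 import BeautifulSoup
    #   dom = BeautifulSoup(page_html, "html.parser")
    #   posts = extract(dom, SITE_SPEC, "post")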

    An Updated Portrait of the Portuguese Web

    This study presents an updated characterization of the Portuguese Web, derived from a crawl of 48 million contents belonging to all media types (2.5 TB of data) performed in March 2008. The resulting data was analyzed to characterize contents, sites and domains. This study was performed within the scope of the Portuguese Web Archive. (POSC/EU, UMI)

    Prediction of Student Personality Using Naïve Bayes

    College students are in a transitional phase from youth to adulthood. During this transition, students are still unstable in controlling their emotions; their curiosity about new things increases, which in turn reveals their personality traits. The purpose of this study was to find out how to collect personality data from students and how to classify personality from the collected data. The research method runs from collecting data with questionnaires, through text preprocessing, training, and classification, to testing and making predictions. After applying the Naïve Bayes classification algorithm, the train score is 0.947 and the test score is 0.879. Trials were also carried out to make predictions on new data, and the results were correct.
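
    A minimal sketch of such a pipeline (preprocessing, training, classification, testing, prediction), assuming scikit-learn. The texts and labels are placeholders, and the scores printed here are not the 0.947/0.879 reported by the study.

    # Sketch of a Naive Bayes text-classification pipeline, assuming scikit-learn.
    # The texts and labels below are placeholders for the questionnaire answers
    # and personality classes collected in the study.
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import MultinomialNB
    from sklearn.pipeline import make_pipeline

    texts = ["enjoys meeting new people", "prefers working alone",
             "likes trying new activities", "avoids large gatherings"]
    labels = ["extrovert", "introvert", "extrovert", "introvert"]

    X_train, X_test, y_train, y_test = train_test_split(texts, labels, test_size=0.25)

    model = make_pipeline(CountVectorizer(), MultinomialNB())  # preprocessing + classifier
    model.fit(X_train, y_train)                                # training
    print("train score:", model.score(X_train, y_train))
    print("test score:", model.score(X_test, y_test))
    print(model.predict(["curious about new hobbies"]))        # prediction on new data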

    Smart Three Phase Crawler for Mining Deep Web Interfaces

    As the deep web grows at a rapid pace, there has been increased interest in techniques that help efficiently locate deep-web interfaces. However, due to the large volume of web resources and the dynamic nature of the deep web, achieving wide coverage and high efficiency is a challenging problem. In this work we propose a three-phase framework for efficiently harvesting deep-web interfaces. In the first phase, the web crawler performs site-based searching for center pages with the help of search engines, avoiding visits to a large number of pages. In this paper we also survey how web crawlers work and which approaches are available in existing systems from various researchers.
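
    A rough sketch of the staged structure the abstract describes, assuming Python with requests and BeautifulSoup. The phase functions and the search_engine helper are illustrative stand-ins, not the proposed system.

    # Rough sketch of a staged crawler for locating searchable (deep-web) interfaces.
    import requests
    from bs4 import BeautifulSoup

    def phase1_site_search(topic, search_engine):
        """Phase 1: use a search engine to find candidate sites (center pages)
        instead of blindly visiting a large number of pages.
        `search_engine` is a caller-supplied function, e.g. a search API wrapper."""
        return search_engine(f"{topic} database search")

    def phase2_in_site_crawl(site_url, limit=20):
        """Phase 2: explore links within a candidate site (relative URLs are kept
        as-is here for brevity; a real crawler would resolve them with urljoin)."""
        soup = BeautifulSoup(requests.get(site_url, timeout=10).text, "html.parser")
        return [a["href"] for a in soup.find_all("a", href=True)][:limit]

    def phase3_find_search_forms(page_url):
        """Phase 3: keep pages exposing searchable forms, i.e. deep-web interfaces."""
        soup = BeautifulSoup(requests.get(page_url, timeout=10).text, "html.parser")
        return [f for f in soup.find_all("form") if f.find("input", {"type": "text"})]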

    Eliminating Code Duplication in Cascading Style Sheets

    Cascading Style Sheets (i.e., CSS) is the standard styling language, widely used for defining the presentation semantics of user interfaces for web, mobile and desktop applications. Despite its popularity, CSS has not received much attention from academia. Indeed, developing and maintaining CSS code is rather challenging, due to the inherent language design shortcomings, the interplay of CSS with other programming languages (e.g., HTML and JavaScript), the lack of empirically-evaluated coding best practices, and immature tool support. As a result, the quality of CSS code bases is poor in many cases. In this thesis, we focus on one of the major issues found in CSS code bases, i.e., duplicated code. In a large, representative dataset of CSS code, we found an average of 68% duplication in style declarations. To alleviate this, we devise techniques for refactoring CSS code (i.e., grouping style declarations into new style rules), or migrating CSS code to take advantage of the code abstraction features provided by CSS preprocessor languages (i.e., superset languages for CSS that augment it with extra features that facilitate code maintenance). Specifically for the migration transformations, we attempt to align the resulting code with manually-developed code, relying on the knowledge gained from an empirical study on the use of CSS preprocessors, which revealed the common coding practices of developers who use CSS preprocessor languages. To guarantee the behavior preservation of the proposed transformations, we come up with a list of preconditions that should be met, and we also describe a lightweight testing technique. By applying a large number of transformations on several websites and web applications, we show that the transformations are indeed presentation-preserving and can effectively reduce the amount of duplicated code in CSS.
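
    A toy illustration of the grouping refactoring described above, written in Python rather than as a CSS tool: declarations that appear in more than one rule are lifted into a single combined rule. The input rules are made up for the example, and a real refactoring must also respect declaration order and selector specificity (the preconditions the thesis mentions).

    # Toy illustration of grouping duplicated declarations into a new style rule.
    from collections import defaultdict

    rules = {  # hypothetical stylesheet: selector -> set of declarations
        ".button": {"color: #fff", "padding: 8px", "border: none"},
        ".link":   {"color: #fff", "padding: 8px", "text-decoration: none"},
        ".badge":  {"color: #fff", "padding: 8px"},
    }

    by_declaration = defaultdict(list)
    for selector, declarations in rules.items():
        for decl in sorted(declarations):
            by_declaration[decl].append(selector)

    for decl, selectors in by_declaration.items():
        if len(selectors) > 1:  # duplicated: emit one grouped rule instead
            print(f"{', '.join(selectors)} {{ {decl}; }}")
    # e.g. ".button, .link, .badge { color: #fff; }"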