Web log file analysis: backlinks and queries
As has been described elsewhere, web log files are a useful source of information about visitor site use, navigation behaviour, and, to some extent, demographics. But log files can also reveal the existence of both web pages and search engine queries that are sources of new visitors. This study extracts such information from a single web log file and uses it to illustrate its value, not only to the site owner but also to those interested in investigating the online behaviour of web users.
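As a rough illustration of the kind of extraction this abstract describes, the sketch below pulls referrer URLs out of log lines and separates search queries from ordinary backlinks. It assumes the Apache Combined Log Format and a search engine that carries the query in a `q` URL parameter; the function name `classify_referrers` and the `query_param` argument are hypothetical and not taken from the study.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumption: Combined Log Format, which ends with
#   "request" status bytes "referrer" "user-agent"
LOG_RE = re.compile(r'"[^"]*" \d{3} \S+ "(?P<referrer>[^"]*)" "[^"]*"$')


def classify_referrers(lines, query_param="q"):
    """Split referrers into plain backlinks and search-engine queries.

    query_param is an assumed name for the URL parameter that holds the
    search terms; real search engines differ.
    """
    backlinks, queries = [], []
    for line in lines:
        match = LOG_RE.search(line.strip())
        if not match:
            continue  # line is not in the assumed format
        referrer = match.group("referrer")
        if referrer in ("", "-"):
            continue  # no referrer recorded for this request
        terms = parse_qs(urlsplit(referrer).query).get(query_param)
        if terms:
            queries.append(terms[0])    # visitor arrived from a search
        else:
            backlinks.append(referrer)  # visitor followed a link
    return backlinks, queries
```

Counting the entries in each returned list would then give a first picture of which pages and which queries bring in new visitors.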
Cache-Aware Memory Manager for Optimistic Simulations
Parallel Discrete Event Simulation is a well-known technique for executing complex general-purpose simulations where models are described as objects whose interaction is expressed through the generation of impulsive events. In particular, Optimistic Simulation allows full exploitation of the available computational power, avoiding the need to compute safety properties for the events to be executed. Optimistic Simulation platforms internally rely on several data structures, which are meant to support operations aimed at ensuring correctness, inter-kernel communication and/or event scheduling. These housekeeping and management operations access them according to complex patterns, commonly suffering from misuse of memory caching architectures. In particular, operations like log/restore access data structures on a periodic basis, which evicts in-cache buffers belonging to the actual working set of the application logic and causes a non-negligible performance drop.
In this work we propose generally applicable design principles for a new memory management subsystem targeted at Optimistic Simulation platforms, which addresses this issue by allocating memory buffers according to their actual future access patterns in order to enhance event-execution memory locality. Additionally, an application-transparent implementation within ROOT-Sim, an open-source general-purpose optimistic simulation platform, is presented along with experimental results testing our proposal.
Realistic Traffic Generation for Web Robots
Critical to evaluating the capacity, scalability, and availability of web
systems are realistic web traffic generators. Web traffic generation is a
classic research problem, yet no generator accounts for the characteristics of web
robots or crawlers that are now the dominant source of traffic to a web server.
Administrators are thus unable to test, stress, and evaluate how their systems
perform in the face of ever increasing levels of web robot traffic. To resolve
this problem, this paper introduces a novel approach to generate synthetic web
robot traffic with high fidelity. It generates traffic that accounts for both
the temporal and behavioral qualities of robot traffic by statistical and
Bayesian models that are fitted to the properties of robot traffic seen in web
logs from North America and Europe. We evaluate our traffic generator by
comparing the characteristics of generated traffic to those of the original
data. We look at session arrival rates, inter-arrival times and session
lengths, comparing and contrasting them between generated and real traffic.
Finally, we show that our generated traffic affects cache performance similarly
to actual traffic, using the common LRU and LFU eviction policies.
Comment: 8 pages
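The cache comparison mentioned at the end of this abstract can be pictured with a small LRU simulation: replay a request trace through a fixed-size cache and record the hit ratio, once for real traffic and once for generated traffic. This is only a minimal sketch of the general LRU policy, not the authors' evaluation code; the function name `lru_hit_ratio` and the trace format (a list of resource identifiers) are assumptions.

```python
from collections import OrderedDict


def lru_hit_ratio(requests, capacity):
    """Replay a list of requested resources through an LRU cache.

    Returns the fraction of requests served from the cache.
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for key in requests:
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used
    return hits / len(requests) if requests else 0.0
```

Similar hit ratios for a real trace and a synthetic one at the same capacity would suggest, in this simplified setting, that the two exercise the cache in a comparable way.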
Memory and Parallelism Analysis Using a Platform-Independent Approach
Emerging computing architectures such as near-memory computing (NMC) promise
improved performance for applications by reducing the data movement between CPU
and memory. However, detecting such applications is not a trivial task. In this
ongoing work, we extend the state-of-the-art platform-independent software
analysis tool with NMC related metrics such as memory entropy, spatial
locality, data-level, and basic-block-level parallelism. These metrics help to
identify the applications more suitable for NMC architectures.
Comment: 22nd ACM International Workshop on Software and Compilers for
Embedded Systems (SCOPES '19), May 201
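To make the memory entropy metric named above more concrete, the sketch below computes the Shannon entropy of the addresses in a memory trace. This is one common formulation and is assumed here; the abstract does not give the tool's exact definition. The function name `memory_entropy` and the `drop_low_bits` parameter (which groups addresses into coarser blocks) are hypothetical.

```python
import math
from collections import Counter


def memory_entropy(addresses, drop_low_bits=0):
    """Shannon entropy (in bits) of the accessed addresses.

    drop_low_bits discards the lowest address bits so that accesses are
    grouped into blocks of 2**drop_low_bits bytes (an assumed knob).
    """
    if not addresses:
        return 0.0
    counts = Counter(addr >> drop_low_bits for addr in addresses)
    total = len(addresses)
    # Sum of p * log2(1/p) over the observed address distribution
    return sum((c / total) * math.log2(total / c) for c in counts.values())
```

Under this formulation, a trace that keeps touching the same block has entropy close to zero, while accesses spread evenly over many blocks push the value up.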
Component Substitution through Dynamic Reconfigurations
Component substitution has numerous practical applications and constitutes an
active research topic. This paper proposes to enrich an existing
component-based framework--a model with dynamic reconfigurations making the
system evolve--with a new reconfiguration operation which "substitutes"
components by other components, and to study its impact on sequences of dynamic
reconfigurations.
Firstly, we define substitutability constraints which ensure the component
encapsulation while performing reconfigurations by component substitutions.
Then, we integrate them into a substitutability-based simulation to take these
substituting reconfigurations into account on sequences of dynamic
reconfigurations. Thirdly, as this new relation is in general undecidable
for infinite-state systems, we propose a semi-algorithm to check it on the fly.
Finally, we report on experimentations using the B tools to show the
feasibility of the developed approach, and to illustrate the paper's proposals
on an example of the HTTP server.
Comment: In Proceedings FESCA 2014, arXiv:1404.043
On the Implementation of GNU Prolog
GNU Prolog is a general-purpose implementation of the Prolog language, which
distinguishes itself from most other systems by being, above all else, a
native-code compiler which produces standalone executables which don't rely on
any byte-code emulator or meta-interpreter. Other aspects which stand out
include the explicit organization of the Prolog system as a multipass compiler,
where intermediate representations are materialized, in Unix compiler
tradition. GNU Prolog also includes an extensible and high-performance finite
domain constraint solver, integrated with the Prolog language but implemented
using independent lower-level mechanisms. This article discusses the main
issues involved in designing and implementing GNU Prolog: requirements, system
organization, performance and portability issues as well as its position with
respect to other Prolog system implementations and the ISO standardization
initiative.
Comment: 30 pages, 3 figures, To appear in Theory and Practice of Logic
Programming (TPLP); Keywords: Prolog, logic programming system, GNU, ISO,
WAM, native code compilation, Finite Domain constraint
The AliEn system, status and perspectives
AliEn is a production environment that implements several components of the
Grid paradigm needed to simulate, reconstruct and analyse HEP data in a
distributed way. The system is built around Open Source components, uses the
Web Services model and standard network protocols to implement the computing
platform that is currently being used to produce and analyse Monte Carlo data
at over 30 sites on four continents. The aim of this paper is to present the
current AliEn architecture and outline its future developments in the light of
emerging standards.
Comment: Talk from the 2003 Computing in High Energy and Nuclear Physics
(CHEP03), La Jolla, Ca, USA, March 2003, 10 pages, Word, 10 figures. PSN
MOAT00
Tracking Users across the Web via TLS Session Resumption
User tracking on the Internet can come in various forms, e.g., via cookies or
by fingerprinting web browsers. A technique that has received less attention so far is
user tracking based on TLS and specifically based on the TLS session resumption
mechanism. To the best of our knowledge, we are the first to investigate the
applicability of TLS session resumption for user tracking. For that, we
evaluated the configuration of 48 popular browsers and one million of the most
popular websites. Moreover, we present a so-called prolongation attack, which
allows extending the tracking period beyond the lifetime of the session
resumption mechanism. To show that under the observed browser configurations
tracking via TLS session resumptions is feasible, we also looked into DNS data
to understand the longest consecutive tracking period for a user by a
particular website. Our results indicate that with the standard setting of the
session resumption lifetime in many current browsers, the average user can be
tracked for up to eight days. With a session resumption lifetime of seven days,
the recommended upper limit in the draft for TLS version 1.3, 65% of all users
in our dataset can be tracked permanently.
Comment: 11 pages
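The idea of a longest consecutive tracking period can be illustrated with a simplified model: assume every successful resumption hands the browser a fresh ticket, so a user stays linkable as long as each return visit falls within the resumption lifetime of the previous one. The sketch below is written under that assumption and is not the authors' measurement method; the function name and the day-based timestamps are hypothetical.

```python
def longest_tracking_period(visit_days, lifetime_days):
    """Longest span (in days) over which consecutive visits stay linkable.

    Assumed model: a visit is linkable to the previous one if the gap
    between them does not exceed lifetime_days, and each linked visit
    renews the lifetime.
    """
    visits = sorted(visit_days)
    if not visits:
        return 0
    best = 0
    chain_start = prev = visits[0]
    for day in visits[1:]:
        if day - prev > lifetime_days:
            chain_start = day  # ticket expired, tracking chain restarts
        best = max(best, day - chain_start)
        prev = day
    return best
```

For example, with visits on days 0, 1, 3, 10 and 11, a two-day lifetime links only the first three visits, whereas a seven-day lifetime links all five.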