16,779 research outputs found
Government mandated blocking of foreign Web content
Blocking of foreign Web content by Internet access providers has been a hot
topic for the last 18 months in Germany. Since fall 2001 the state of
North-Rhine-Westphalia very actively tries to mandate such blocking. This paper
will take a technical view on the problems imposed by the blocking orders and
blocking content at access or network provider level in general. It will also
give some empirical data on the effects of the blocking orders to help in the
legal assessment of the orders.Comment: Preprint, revised 30.6.200
Context Aware Computing for The Internet of Things: A Survey
As we are moving towards the Internet of Things (IoT), the number of sensors
deployed around the world is growing at a rapid pace. Market research has shown
a significant growth of sensor deployments over the past decade and has
predicted a significant increment of the growth rate in the future. These
sensors continuously generate enormous amounts of data. However, in order to
add value to raw sensor data we need to understand it. Collection, modelling,
reasoning, and distribution of context in relation to sensor data plays
critical role in this challenge. Context-aware computing has proven to be
successful in understanding sensor data. In this paper, we survey context
awareness from an IoT perspective. We present the necessary background by
introducing the IoT paradigm and context-aware fundamentals at the beginning.
Then we provide an in-depth analysis of context life cycle. We evaluate a
subset of projects (50) which represent the majority of research and commercial
solutions proposed in the field of context-aware computing conducted over the
last decade (2001-2011) based on our own taxonomy. Finally, based on our
evaluation, we highlight the lessons to be learnt from the past and some
possible directions for future research. The survey addresses a broad range of
techniques, methods, models, functionalities, systems, applications, and
middleware solutions related to context awareness and IoT. Our goal is not only
to analyse, compare and consolidate past research work but also to appreciate
their findings and discuss their applicability towards the IoT.Comment: IEEE Communications Surveys & Tutorials Journal, 201
BlogForever: D2.5 Weblog Spam Filtering Report and Associated Methodology
This report is written as a first attempt to define the BlogForever spam detection strategy. It comprises a survey of weblog spam technology and approaches to their detection. While the report was written to help identify possible approaches to spam detection as a component within the BlogForver software, the discussion has been extended to include observations related to the historical, social and practical value of spam, and proposals of other ways of dealing with spam within the repository without necessarily removing them. It contains a general overview of spam types, ready-made anti-spam APIs available for weblogs, possible methods that have been suggested for preventing the introduction of spam into a blog, and research related to spam focusing on those that appear in the weblog context, concluding in a proposal for a spam detection workflow that might form the basis for the spam detection component of the BlogForever software
Addressing the new generation of spam (Spam 2.0) through Web usage models
New Internet collaborative media introduce new ways of communicating that are not immune to abuse. A fake eye-catching profile in social networking websites, a promotional review, a response to a thread in online forums with unsolicited content or a manipulated Wiki page, are examples of new the generation of spam on the web, referred to as Web 2.0 Spam or Spam 2.0. Spam 2.0 is defined as the propagation of unsolicited, anonymous, mass content to infiltrate legitimate Web 2.0 applications.The current literature does not address Spam 2.0 in depth and the outcome of efforts to date are inadequate. The aim of this research is to formalise a definition for Spam 2.0 and provide Spam 2.0 filtering solutions. Early-detection, extendibility, robustness and adaptability are key factors in the design of the proposed method.This dissertation provides a comprehensive survey of the state-of-the-art web spam and Spam 2.0 filtering methods to highlight the unresolved issues and open problems, while at the same time effectively capturing the knowledge in the domain of spam filtering.This dissertation proposes three solutions in the area of Spam 2.0 filtering including: (1) characterising and profiling Spam 2.0, (2) Early-Detection based Spam 2.0 Filtering (EDSF) approach, and (3) On-the-Fly Spam 2.0 Filtering (OFSF) approach. All the proposed solutions are tested against real-world datasets and their performance is compared with that of existing Spam 2.0 filtering methods.This work has coined the term âSpam 2.0â, provided insight into the nature of Spam 2.0, and proposed filtering mechanisms to address this new and rapidly evolving problem
BlogForever D2.6: Data Extraction Methodology
This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report proceeds to propose a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform
Debugging of Web Applications with Web-TLR
Web-TLR is a Web verification engine that is based on the well-established
Rewriting Logic--Maude/LTLR tandem for Web system specification and
model-checking. In Web-TLR, Web applications are expressed as rewrite theories
that can be formally verified by using the Maude built-in LTLR model-checker.
Whenever a property is refuted, a counterexample trace is delivered that
reveals an undesired, erroneous navigation sequence. Unfortunately, the
analysis (or even the simple inspection) of such counterexamples may be
unfeasible because of the size and complexity of the traces under examination.
In this paper, we endow Web-TLR with a new Web debugging facility that supports
the efficient manipulation of counterexample traces. This facility is based on
a backward trace-slicing technique for rewriting logic theories that allows the
pieces of information that we are interested to be traced back through inverse
rewrite sequences. The slicing process drastically simplifies the computation
trace by dropping useless data that do not influence the final result. By using
this facility, the Web engineer can focus on the relevant fragments of the
failing application, which greatly reduces the manual debugging effort and also
decreases the number of iterative verifications.Comment: In Proceedings WWV 2011, arXiv:1108.208
A Process Framework for Semantics-aware Tourism Information Systems
The growing sophistication of user requirements in tourism due to the advent of new technologies such as the Semantic Web and mobile computing has imposed new possibilities for improved intelligence in Tourism Information Systems (TIS). Traditional software engineering and web engineering approaches cannot suffice, hence the need to find new product development approaches that would sufficiently enable the next generation of TIS. The next generation of TIS are expected among other things to: enable
semantics-based information processing, exhibit natural language capabilities, facilitate inter-organization exchange of information in a seamless way, and
evolve proactively in tandem with dynamic user requirements. In this paper, a product development approach called Product Line for Ontology-based Semantics-Aware Tourism Information Systems (PLOSATIS) which is a novel
hybridization of software product line engineering, and Semantic Web engineering concepts is proposed. PLOSATIS is presented as potentially effective, predictable and amenable to software process improvement initiatives
- âŠ