A Taxonomy of Hyperlink Hiding Techniques
Hidden links are designed solely for search engines rather than visitors. To
get high search engine rankings, link hiding techniques are usually used to
boost the profits of black-market industries, such as illicit game servers, false
medical services, illegal gambling, and other less attractive high-profit
industries. This paper investigates hyperlink hiding techniques on the Web and gives
a detailed taxonomy. We believe the taxonomy can help develop appropriate
countermeasures. A study of 5,583,451 Chinese sites' home pages indicates that
link hiding techniques are very prevalent on the Web. We also tried to explore
the attitude of Google towards link hiding spam by analyzing the PageRank
values of relative links. The results show that more should be done to punish
the hidden link spam.
Comment: 12 pages, 2 figures
Hikester - the event management application
Today, social networks and services are among the most important parts of our
everyday life. Most daily activities, such as communicating with
friends, reading news, or dating, are usually done using social networks. However,
there are activities for which social networks do not yet provide adequate
support. This paper focuses on event management and introduces "Hikester". The
main objective of this service is to provide users with the possibility to
create any event they desire and to invite other users. "Hikester" supports the
creation and management of events such as attending football matches, quest
rooms, sharing train rides, or visiting museums in foreign countries. Here we
discuss the project architecture as well as the detailed implementation of the
system components: the recommender system, the spam recognition service and the
parameter optimizer.
BlogForever D2.4: Weblog spider prototype and associated methodology
The purpose of this document is to present the evaluation of different solutions for capturing blogs and the established methodology, and to describe the developed blog spider prototype.
The Best Answers? Think Twice: Online Detection of Commercial Campaigns in the CQA Forums
In an emerging trend, more and more Internet users search for information
from Community Question and Answer (CQA) websites, as interactive communication
in such websites provides users with a rare feeling of trust. More often than
not, end users look for instant help when they browse the CQA websites for the
best answers. Hence, it is imperative that they should be warned of any
potential commercial campaigns hidden behind the answers. However, existing
research focuses more on the quality of answers and does not meet the above
need. In this paper, we develop a system that automatically analyzes the hidden
patterns of commercial spam and raises alarms instantaneously to end users
whenever a potential commercial campaign is detected. Our detection method
integrates semantic analysis and posters' track records and utilizes the
special features of CQA websites, which differ largely from those of other types of
forums such as microblogs or news reports. Our system is adaptive and
accommodates new evidence uncovered by the detection algorithms over time.
Validated with real-world trace data from a popular Chinese CQA website over a
period of three months, our system shows great potential towards adaptive
online detection of CQA spam.
Comment: 9 pages, 10 figures
Web Tracking: Mechanisms, Implications, and Defenses
This article surveys the existing literature on the methods currently used
by web services to track users online, as well as their purposes,
implications, and possible user defenses. A significant majority of the reviewed
articles and web resources are from the years 2012-2014. Privacy seems to be the
Achilles' heel of today's web. Web services make continuous efforts to obtain
as much information as they can about the things we search, the sites we visit,
the people we contact, and the products we buy. Tracking is usually
performed for commercial purposes. We present 5 main groups of methods used for
user tracking, which are based on sessions, client storage, client cache,
fingerprinting, or yet other approaches. A special focus is placed on
mechanisms that use web caches, operational caches, and fingerprinting, as they
are usually very rich in terms of using various creative methodologies. We also
show how the users can be identified on the web and associated with their real
names, e-mail addresses, phone numbers, or even street addresses. We show why
tracking is being used and its possible implications for the users (price
discrimination, assessing financial credibility, determining insurance
coverage, government surveillance, and identity theft). For each of the
tracking methods, we present possible defenses. Apart from describing the
methods and tools used to keep personal data from being tracked,
we also present several tools that were used for research purposes; their main
goal is to discover how and by which entity the users are being tracked on
their desktop computers or smartphones, provide this information to the users,
and visualize it in an accessible and easy-to-follow way. Finally, we present
the currently proposed future approaches to track the user and show that they
can potentially pose significant threats to the users' privacy.
Comment: 29 pages, 212 references