An Empirical Examination of the Associations between Social Tags and Web Queries
Introduction. We aim to discover the associations between social tags for a Web page and Web queries that would retrieve the same Web page in three major search engines.
Method. 4,827 query terms were submitted to the three major search engines to acquire search engine results pages. A series of Perl scripts were written to read search engine results pages and to identify, analyse, and extract organic links.
Analysis. Web pages from the organic links in search engine results pages were examined to see whether and how they had been tagged in Delicious. Only the Web pages tagged by at least 100 taggers were included in this study. The thirty most popular social tags used were harvested. The two sets of data were quantitatively analysed to investigate the research questions.
Results. At least 60% of search engines' query terms overlapped with social tags in Delicious; higher ranked social tags were more likely to be used as query terms for the same Web resources; and the co-occurring pattern of query terms and social tags over social ranking resembled a power law distribution.
Conclusions. Socially tagged resources are likely to be highly ranked in search engine results pages. The findings are applicable to future studies of Web resource-related tasks such as Web searching and Web indexing.
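The overlap measure described in the Results can be illustrated with a small sketch. This is not the authors' Perl pipeline: the function name `tag_query_overlap`, the case-insensitive exact matching, and the sample data are all assumptions made for illustration.

```python
def tag_query_overlap(query_terms, ranked_tags):
    """Return (fraction of query terms that also appear as tags,
    1-based ranks of the tags that matched a query term).

    Assumption: terms and tags match when equal after lowercasing.
    """
    queries = {q.lower() for q in query_terms}
    tags = {t.lower() for t in ranked_tags}
    # Ranks of tags (most popular first) that are also used as query terms.
    matched_ranks = [
        rank
        for rank, tag in enumerate(ranked_tags, start=1)
        if tag.lower() in queries
    ]
    fraction = len(queries & tags) / len(queries) if queries else 0.0
    return fraction, matched_ranks


# Hypothetical data for one Web page: queries that retrieved it and
# its social tags ordered by popularity.
query_terms = ["python", "tutorial", "beginner"]
ranked_tags = ["python", "programming", "tutorial", "reference"]

fraction, ranks = tag_query_overlap(query_terms, ranked_tags)
print(f"{fraction:.2f}", ranks)  # prints: 0.67 [1, 3]
```

Collecting the matched ranks over many pages would give the distribution of overlap by tag rank, which is the quantity the abstract compares to a power law.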
BlogForever: D2.5 Weblog Spam Filtering Report and Associated Methodology
This report is written as a first attempt to define the BlogForever spam detection strategy. It comprises a survey of weblog spam technology and approaches to its detection. While the report was written to help identify possible approaches to spam detection as a component within the BlogForever software, the discussion has been extended to include observations related to the historical, social and practical value of spam, and proposals of other ways of dealing with spam within the repository without necessarily removing it. It contains a general overview of spam types, ready-made anti-spam APIs available for weblogs, possible methods that have been suggested for preventing the introduction of spam into a blog, and research related to spam, focusing on spam that appears in the weblog context. It concludes with a proposal for a spam detection workflow that might form the basis for the spam detection component of the BlogForever software.
Piggy Bank: Experience the Semantic Web Inside Your Web Browser
The original publication is available at www.springerlink.com
http://dx.doi.org/10.1007/11574620_31
The Semantic Web Initiative envisions a Web wherein information is offered free of presentation, allowing more effective exchange and mixing across web sites and across web pages. But without substantial Semantic Web content, few tools will be written to consume it; without many such tools, there is little appeal to publish Semantic Web content.
To break this chicken-and-egg problem, thus enabling more flexible information access, we have created a web browser extension called Piggy Bank that lets users make use of Semantic Web content within Web content as users browse the Web. Wherever Semantic Web content is not available, Piggy Bank can invoke screenscrapers to restructure information within web pages into Semantic Web format. Through the use of Semantic Web technologies, Piggy Bank provides direct, immediate benefits to users in their use of the existing Web. Thus, the existence of even just a few Semantic Web-enabled sites or a few scrapers already benefits users. Piggy Bank thereby offers an easy, incremental upgrade path to users without requiring a wholesale adoption of the Semantic Web's vision.
To further improve this Semantic Web experience, we have created Semantic Bank, a web server application that lets Piggy Bank users share the Semantic Web information they have collected, enabling collaborative efforts to build sophisticated Semantic Web information repositories through simple, everyday use of Piggy Bank.
A Large-Scale Study of Phishing PDF Documents
Phishing PDFs are malicious PDF documents that do not embed malware but trick
victims into visiting malicious web pages leading to password theft or drive-by
downloads. While recent reports indicate a surge of phishing PDFs, prior works
have largely neglected this new threat, positioning phishing PDFs as
accessories distributed via email phishing campaigns.
This paper challenges this belief and presents the first systematic and
comprehensive study centered on phishing PDFs. Starting from a real-world
dataset, we first identify 44 phishing PDF campaigns via clustering and
characterize them by looking at their volumetric, temporal, and visual
features. Among these, we identify three large campaigns covering 89% of the
dataset, exhibiting significantly different volumetric and temporal properties
compared to classical email phishing, and relying on web UI elements as visual
baits. Finally, we look at the distribution vectors and show that phishing PDFs
are not only distributed via attachments but also via SEO attacks, placing
phishing PDFs outside the email distribution ecosystem.
This paper also assesses the usefulness of the VirusTotal scoring system,
showing that phishing PDFs are ranked considerably low, creating a blind spot
for organizations. While URL blocklists can help to prevent victims from
visiting the attack web pages, PDF documents do not seem to be subjected to any
form of content-based filtering or detection.
Toward location-aware Web: extraction method, applications and evaluation
Location-based services (LBS) are among the most popular types of services today. However, a recurring issue is that most of the content in LBS has to be created from scratch and needs to be explicitly tagged to locations, which makes existing Web content not directly usable for LBS. In this paper, we aim at making Web sites location-aware and feeding this information to LBS. Our approach toward a location-aware Web is threefold: First, we present a location extraction method: SALT. It receives Web sites as input and equips them with location tags. Compared to other approaches, SALT is capable of extracting locations with a precision up to the street level. Performance evaluations further show high applicability for practice. Second, we present three applications for SALT: Webnear.me, Local Browsing and Local Facebook. Webnear.me offers location-aware Web surfing through a mobile Web site and a smartphone app. Local Browsing adds the feature to browse by nearby tags, extracted from Web sites delivered by SALT. Local Facebook extends location tagging to social networks, allowing SALT to be run on one's own and one's friends' timelines. Finally, we evaluate SALT for technology acceptance of Webnear.me through a formative user study. Through real user data, collected during a three-month pilot field deployment of Webnear.me, we assess whether SALT is a proper instance of "location of a Web site".
Social bookmarking in the classroom
The vast amount of content on the Internet makes that content difficult to manage. The purpose of this literature review is to uncover the viability of social bookmarking for managing Internet content for classroom learning. It also reveals how collaboration in social bookmarking can increase its effectiveness in the classroom and how social bookmarking models best facilitate learning. Sources researched were published in the last seven years, when social bookmarking started to become widely recognized. Studies in the areas of content organization, searching, collaboration, and education were reviewed. The conclusions acknowledge social bookmarking as not a replacement for, but a complement to, more traditional methods of managing Internet content.
Folksonomy: the New Way to Serendipity
Folksonomy expands the collaborative process by allowing contributors to index content. It rests on three powerful properties: the absence of a prior taxonomy, multi-indexation and the absence of a thesaurus. It supports a more exploratory kind of search than a query entered in a search engine. Its original relationship-based structure (the three-way relationship between users, content and tags) means that folksonomy allows various modalities of curious exploration: a cultural exploration and a social exploration. The paper has two goals. Firstly, it tries to draw a general picture of the various folksonomy websites. Secondly, since labelling lacks any standardisation, folksonomies are often under threat of invasion by noise. This paper consequently tries to explore the different possible ways of regulating the self-generated indexation process.
Keywords: taxonomy; indexation; innovation and user-created content
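The three-way relationship between users, content and tags can be pictured as a set of (user, resource, tag) triples. The sketch below is only an illustration: the class `Folksonomy`, its method names, and the sample data are hypothetical. Reading tag co-occurrence as "cultural" exploration and shared tag use as "social" exploration is an interpretation, not something the abstract specifies.

```python
class Folksonomy:
    """A folksonomy stored as a set of (user, resource, tag) triples."""

    def __init__(self):
        self.triples = set()

    def add(self, user, resource, tag):
        # Multi-indexation: any user may attach any tag to any resource.
        self.triples.add((user, resource, tag))

    def resources_tagged(self, tag):
        return {r for (_, r, t) in self.triples if t == tag}

    def co_occurring_tags(self, tag):
        # Exploration through content: other tags found on the same resources.
        resources = self.resources_tagged(tag)
        return {t for (_, r, t) in self.triples if r in resources and t != tag}

    def users_of_tag(self, tag):
        # Exploration through people: users who applied the same tag.
        return {u for (u, _, t) in self.triples if t == tag}


# Hypothetical sample data.
f = Folksonomy()
f.add("ann", "page1", "jazz")
f.add("bob", "page1", "music")
f.add("bob", "page2", "jazz")
f.add("cat", "page2", "vinyl")

print(sorted(f.co_occurring_tags("jazz")))  # prints: ['music', 'vinyl']
print(sorted(f.users_of_tag("jazz")))       # prints: ['ann', 'bob']
```

Because no taxonomy or thesaurus constrains the tags, nothing in this structure prevents noisy or inconsistent labels, which is the regulation problem the abstract raises.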
NPCs as People, Too: The Extreme AI Personality Engine
PK Dick once asked "Do Androids Dream of Electric Sheep?" In video games, a similar question could be asked of non-player characters: Do NPCs have dreams? Can they live and change as humans do? Can NPCs have personalities, and can these develop through interactions with players, other NPCs, and the world around them? Despite advances in personality AI for games, most NPCs are still undeveloped and undeveloping, reacting with flat affect and predictable routines that make them far less than human; in fact, they become little more than bits of the scenery that give out parcels of information. This need not be the case. Extreme AI, a psychology-based personality engine, creates adaptive NPC personalities. Originally developed as part of the thesis "NPCs as People: Using Databases and Behaviour Trees to Give Non-Player Characters Personality," Extreme AI is now a fully functioning personality engine using all thirty facets of the Five Factor model of personality and an AI system that is live throughout gameplay. This paper discusses the research leading to Extreme AI; develops the ideas found in that thesis; discusses the development of other personality engines; and provides examples of Extreme AI's use in two game demos.
Social Computing: An Overview
A collection of technologies termed social computing is driving a dramatic evolution of the Web, matching the dot-com era in growth, excitement, and investment. All of these share a high degree of community formation, user-level content creation, and computing, and a variety of other characteristics. We provide an overview of social computing and identify salient characteristics. We argue that social computing holds tremendous disruptive potential in the business world and can significantly impact society, and outline possible changes in organized human action that could be brought about. Social computing can also have deleterious effects associated with it, including security issues. We suggest that social computing should be a priority for researchers and business leaders and illustrate the fundamental shifts in communication, computing, collaboration, and commerce brought about by this trend.