937 research outputs found

    An Empirical Examination of the Associations between Social Tags and Web Queries

    Introduction. We aim to discover the associations between social tags for a Web page and the Web queries that would retrieve the same Web page in three major search engines. Method. 4,827 query terms were submitted to the three major search engines to acquire search engine results pages. A series of Perl scripts were written to read search engine results pages and to identify, analyse, and extract organic links. Analysis. Web pages from the organic links in search engine results pages were examined to see whether and how they had been tagged in Delicious. Only the Web pages tagged by at least 100 taggers were included in this study. The top thirty popular social tags used were harvested. The two sets of data were quantitatively analysed to investigate the research questions. Results. At least 60% of search engines' query terms overlapped with social tags in Delicious; higher ranked social tags were more likely to be used as query terms for the same Web resources; and the co-occurring pattern of query terms and social tags over social ranking resembled a power law distribution. Conclusions. Socially tagged resources are likely to be highly ranked in search engine results pages. The findings are applicable to future studies of Web resource related tasks such as Web searching and Web indexing.
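The core measurement in this abstract — the fraction of query terms that also appear among a page's social tags — can be sketched as follows. This is a minimal illustration, not the authors' actual method; the query terms and tags below are invented.

```python
# Hypothetical sketch: given the query terms that retrieved a Web page and
# the page's top social tags in Delicious, compute the overlap fraction.

def query_tag_overlap(query_terms, social_tags):
    """Fraction of query terms that also occur as social tags
    (case-insensitive comparison)."""
    tags = {t.lower() for t in social_tags}
    matched = [q for q in query_terms if q.lower() in tags]
    return len(matched) / len(query_terms) if query_terms else 0.0

# Invented example data
queries = ["python", "tutorial", "beginner"]
tags = ["Python", "programming", "tutorial", "reference"]
print(query_tag_overlap(queries, tags))  # 2 of 3 query terms overlap
```

Aggregating this fraction over many pages, and plotting co-occurrence counts against tag rank, is the kind of analysis that would reveal the power-law pattern the study reports.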

    BlogForever: D2.5 Weblog Spam Filtering Report and Associated Methodology

    This report is written as a first attempt to define the BlogForever spam detection strategy. It comprises a survey of weblog spam technology and approaches to its detection. While the report was written to help identify possible approaches to spam detection as a component within the BlogForever software, the discussion has been extended to include observations related to the historical, social and practical value of spam, and proposals of other ways of dealing with spam within the repository without necessarily removing it. It contains a general overview of spam types, ready-made anti-spam APIs available for weblogs, possible methods that have been suggested for preventing the introduction of spam into a blog, and research related to spam, focusing on spam that appears in the weblog context, concluding in a proposal for a spam detection workflow that might form the basis for the spam detection component of the BlogForever software.
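One common ingredient of the spam-detection workflows such a report would survey is a simple keyword-based comment score. The sketch below is an invented illustration of that idea, not the BlogForever component itself; the keyword list and threshold are assumptions, and a real system would combine several such signals.

```python
# Toy keyword-based spam score for a weblog comment (all values invented).
SPAM_KEYWORDS = {"viagra", "casino", "free money", "click here"}

def spam_score(comment: str) -> float:
    """Return the fraction of known spam keywords found in the comment."""
    text = comment.lower()
    hits = sum(1 for kw in SPAM_KEYWORDS if kw in text)
    return hits / len(SPAM_KEYWORDS)

def is_spam(comment: str, threshold: float = 0.25) -> bool:
    """Flag a comment when its score reaches the (invented) threshold."""
    return spam_score(comment) >= threshold

print(is_spam("Click here for free money!!!"))     # True
print(is_spam("Nice post, thanks for sharing."))   # False
```

In practice such a rule would be one stage in a pipeline alongside ready-made anti-spam APIs and statistical classifiers, as the report's proposed workflow suggests.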

    Piggy Bank: Experience the Semantic Web Inside Your Web Browser

    The original publication is available at www.springerlink.com (http://dx.doi.org/10.1007/11574620_31). The Semantic Web Initiative envisions a Web wherein information is offered free of presentation, allowing more effective exchange and mixing across web sites and across web pages. But without substantial Semantic Web content, few tools will be written to consume it; without many such tools, there is little appeal to publish Semantic Web content. To break this chicken-and-egg problem, thus enabling more flexible information access, we have created a web browser extension called Piggy Bank that lets users make use of Semantic Web content within Web content as users browse the Web. Wherever Semantic Web content is not available, Piggy Bank can invoke screenscrapers to restructure information within web pages into Semantic Web format. Through the use of Semantic Web technologies, Piggy Bank provides direct, immediate benefits to users in their use of the existing Web. Thus, the existence of even just a few Semantic Web-enabled sites or a few scrapers already benefits users. Piggy Bank thereby offers an easy, incremental upgrade path to users without requiring a wholesale adoption of the Semantic Web's vision. To further improve this Semantic Web experience, we have created Semantic Bank, a web server application that lets Piggy Bank users share the Semantic Web information they have collected, enabling collaborative efforts to build sophisticated Semantic Web information repositories through simple, everyday use of Piggy Bank.

    A Large-Scale Study of Phishing PDF Documents

    Phishing PDFs are malicious PDF documents that do not embed malware but trick victims into visiting malicious web pages leading to password theft or drive-by downloads. While recent reports indicate a surge of phishing PDFs, prior works have largely neglected this new threat, positioning phishing PDFs as accessories distributed via email phishing campaigns. This paper challenges this belief and presents the first systematic and comprehensive study centered on phishing PDFs. Starting from a real-world dataset, we first identify 44 phishing PDF campaigns via clustering and characterize them by looking at their volumetric, temporal, and visual features. Among these, we identify three large campaigns covering 89% of the dataset, exhibiting significantly different volumetric and temporal properties compared to classical email phishing, and relying on web UI elements as visual baits. Finally, we look at the distribution vectors and show that phishing PDFs are not only distributed via attachments but also via SEO attacks, placing phishing PDFs outside the email distribution ecosystem. This paper also assesses the usefulness of the VirusTotal scoring system, showing that phishing PDFs receive considerably low scores, creating a blind spot for organizations. While URL blocklists can help to prevent victims from visiting the attack web pages, PDF documents do not appear to be subject to any form of content-based filtering or detection.
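The campaign-identification step described above can be pictured as grouping PDF samples by a coarse feature fingerprint. The sketch below is a deliberately simplified illustration of that idea; the field names, bait types, and domains are invented, and the paper's actual clustering over volumetric, temporal, and visual features is far richer.

```python
# Hypothetical sketch: group phishing PDF samples into "campaigns" by a
# shared (visual bait, landing domain) fingerprint. All data is invented.
from collections import defaultdict

def cluster_by_fingerprint(samples):
    """Group samples sharing the same (bait, landing_domain) pair."""
    clusters = defaultdict(list)
    for s in samples:
        key = (s["bait"], s["landing_domain"])
        clusters[key].append(s["sha256"])
    return dict(clusters)

samples = [
    {"sha256": "a1", "bait": "fake_captcha", "landing_domain": "evil.example"},
    {"sha256": "b2", "bait": "fake_captcha", "landing_domain": "evil.example"},
    {"sha256": "c3", "bait": "fake_play_button", "landing_domain": "other.example"},
]
clusters = cluster_by_fingerprint(samples)
print(len(clusters))  # 2 campaigns
```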

    Toward location-aware Web: extraction method, applications and evaluation

    Location-based services (LBS) belong to one of the most popular types of services today. However, a recurring issue is that most of the content in LBS has to be created from scratch and needs to be explicitly tagged to locations, which makes existing Web content not directly usable for LBS. In this paper, we aim at making Web sites location-aware and feeding this information to LBS. Our approach toward a location-aware Web is threefold. First, we present a location extraction method: SALT. It receives Web sites as input and equips them with location tags. Compared to other approaches, SALT is capable of extracting locations with a precision down to the street level. Performance evaluations further show high applicability for practice. Second, we present three applications for SALT: Webnear.me, Local Browsing and Local Facebook. Webnear.me offers location-aware Web surfing through a mobile Web site and a smartphone app. Local Browsing adds the feature to browse by nearby tags, extracted from Web sites processed by SALT. Local Facebook extends location tagging to social networks, allowing users to run SALT on their own and their friends' timelines. Finally, we evaluate SALT for technology acceptance of Webnear.me through a formative user study. Through real user data, collected during a three-month pilot field deployment of Webnear.me, we assess whether SALT is a proper instance of "location of a Web site".
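Street-level location extraction of the kind SALT performs can be caricatured with a pattern-matching pass over page text. The regex and example below are invented for illustration and do not reflect SALT's actual (unspecified) technique, which would need to handle many address formats and languages.

```python
# Toy street-level address extractor (pattern and data invented).
import re

# Matches simple patterns like "Bahnhofstrasse 12" or "Main Street 5".
ADDRESS_RE = re.compile(r"\b([A-Z][a-z]+(?:strasse| Street| Avenue))\s+(\d+)\b")

def extract_addresses(page_text: str):
    """Return street-level address strings found in the page text."""
    return [f"{street} {number}" for street, number in ADDRESS_RE.findall(page_text)]

page = "Visit us at Bahnhofstrasse 12, 8001 Zurich. Not far from Main Street 5."
print(extract_addresses(page))  # ['Bahnhofstrasse 12', 'Main Street 5']
```

The extracted addresses would then be geocoded to coordinates and attached to the page as location tags, making the page consumable by an LBS.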

    Social bookmarking in the classroom

    The vast amount of content on the Internet is difficult to manage. The purpose of this literature review is to assess the viability of social bookmarking for managing Internet content for classroom learning. It also examines how collaboration in social bookmarking can increase its effectiveness in the classroom and which social bookmarking models best facilitate learning. Sources reviewed were published in the last seven years, when social bookmarking started to become widely recognized. Studies in the areas of content organization, searching, collaboration, and education were reviewed. The conclusions acknowledge social bookmarking not as a replacement for, but as a complement to, more traditional methods of managing Internet content.

    Folksonomy: the New Way to Serendipity

    Folksonomy expands the collaborative process by allowing contributors to index content. It rests on three powerful properties: the absence of a prior taxonomy, multi-indexation and the absence of a thesaurus. It concerns a more exploratory search than an entry in a search engine. Its original relationship-based structure (the three-way relationship between users, content and tags) means that folksonomy allows various modalities of curious exploration: a cultural exploration and a social exploration. The paper has two goals. Firstly, it tries to draw a general picture of the various folksonomy websites. Secondly, since labelling lacks any standardisation, folksonomies are often under threat of invasion by noise. This paper consequently tries to explore the different possible ways of regulating the self-generated indexation process.
    Keywords: taxonomy; indexation; innovation and user-created content
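The three-way user-content-tag relationship at the heart of a folksonomy can be modelled as a set of triples, which directly supports both exploration modes the abstract mentions. This is a minimal sketch with invented data, not an implementation from the paper.

```python
# Folksonomy as (user, resource, tag) triples (all data invented).
triples = [
    ("alice", "page1", "semanticweb"),
    ("bob",   "page1", "rdf"),
    ("alice", "page2", "rdf"),
]

def resources_for_tag(triples, tag):
    """Cultural exploration: which resources share a given tag?"""
    return sorted({res for _, res, t in triples if t == tag})

def tags_for_user(triples, user):
    """Social exploration: which tags does a given user employ?"""
    return sorted({t for u, _, t in triples if u == user})

print(resources_for_tag(triples, "rdf"))   # ['page1', 'page2']
print(tags_for_user(triples, "alice"))     # ['rdf', 'semanticweb']
```

Pivoting on any one element of the triple (tag, user, or resource) yields a different serendipitous browsing path, which is what distinguishes folksonomy navigation from a search-engine query.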

    Social Computing: An Overview

    A collection of technologies termed social computing is driving a dramatic evolution of the Web, matching the dot-com era in growth, excitement, and investment. All of these technologies share a high degree of community formation, user-level content creation, and a variety of other characteristics. We provide an overview of social computing and identify salient characteristics. We argue that social computing holds tremendous disruptive potential in the business world and can significantly impact society, and we outline possible changes in organized human action that could be brought about. Social computing can also have deleterious effects associated with it, including security issues. We suggest that social computing should be a priority for researchers and business leaders, and we illustrate the fundamental shifts in communication, computing, collaboration, and commerce brought about by this trend.