475 research outputs found

    Protecting Web Contents Against Persistent Crawlers

    Get PDF
    Web crawlers have been developed for several malicious purposes, such as downloading server data without the website administrator's permission. Armored, stealthy crawlers keep evolving against new anti-crawler mechanisms in the arms race between crawler developers and crawler defenders. In this paper, we develop a new anti-crawler mechanism called PathMarker to detect and constrain crawlers that crawl server content stealthily and persistently. The basic idea is to add a marker to each web page URL and then encrypt the URL and marker. By using the URL path and user information contained in the marker as novel machine-learning features, we can accurately detect stealthy crawlers at the earliest stage. Besides effectively detecting crawlers, PathMarker can also dramatically suppress the efficiency of crawlers before they are detected, by misleading them into visiting the same page through URLs carrying different markers. We deploy our approach on a forum website to collect normal users' data. The evaluation results show that PathMarker can quickly capture all 12 open-source and in-house crawlers, plus two external crawlers (i.e., Googlebot and Yahoo Slurp).
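
    To make the URL-marking idea concrete, here is a minimal sketch in TypeScript using Node's crypto module. It assumes a server-held symmetric key and a marker carrying the page path, the visiting user, and a timestamp; the token layout, the "/p/" routing prefix, and the key handling are illustrative assumptions, not PathMarker's actual design.

    ```typescript
    // Illustrative sketch only: embed a per-user marker in each page URL and
    // encrypt it, so the same page is linked through different opaque URLs.
    // Key handling, marker layout, and routing are assumptions, not PathMarker's spec.
    import { createCipheriv, createDecipheriv, randomBytes } from "crypto";

    const KEY = randomBytes(32); // server-side secret (hypothetical key management)

    function markUrl(path: string, userId: string): string {
      const marker = JSON.stringify({ path, userId, ts: Date.now() });
      const iv = randomBytes(12);
      const cipher = createCipheriv("aes-256-gcm", KEY, iv);
      const enc = Buffer.concat([cipher.update(marker, "utf8"), cipher.final()]);
      const tag = cipher.getAuthTag();
      // The visible URL carries only the opaque token.
      return "/p/" + Buffer.concat([iv, tag, enc]).toString("base64url");
    }

    function unmarkUrl(token: string): { path: string; userId: string; ts: number } {
      const buf = Buffer.from(token.replace(/^\/p\//, ""), "base64url");
      const iv = buf.subarray(0, 12);
      const tag = buf.subarray(12, 28);
      const enc = buf.subarray(28);
      const decipher = createDecipheriv("aes-256-gcm", KEY, iv);
      decipher.setAuthTag(tag);
      const plain = Buffer.concat([decipher.update(enc), decipher.final()]).toString("utf8");
      return JSON.parse(plain);
    }
    ```

    Because the marker is regenerated on every render, two visits to the same page produce different URLs: a crawler that deduplicates by URL keeps re-fetching pages it has already seen, while the server can decrypt each token and use the recovered per-user URL path trail as classification features.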

    State of the art 2015: a literature review of social media intelligence capabilities for counter-terrorism

    Get PDF
    Overview: This paper is a review of how information and insight can be drawn from open social media sources. It focuses on the specific research techniques that have emerged, the capabilities they provide, the possible insights they offer, and the ethical and legal questions they raise. These techniques are considered relevant and valuable in so far as they can help to maintain public safety by preventing terrorism, preparing for it, protecting the public from it and pursuing its perpetrators. The report also considers how far this can be achieved against the backdrop of radically changing technology and public attitudes towards surveillance. This is an updated version of a 2013 report on the same subject, State of the Art. Since 2013, there have been significant changes in social media, how it is used by terrorist groups, and the methods being developed to make sense of it. The paper is structured as follows: Part 1 is an overview of social media use, focused on how it is used by groups of interest to those involved in counter-terrorism; this includes new sections on trends in social media platforms and on Islamic State (IS). Part 2 provides an introduction to the key approaches of social media intelligence (henceforth 'SOCMINT') for counter-terrorism. Part 3 sets out a series of SOCMINT techniques; for each technique, the capabilities and insights it offers are described, its validity and reliability are assessed, and its possible application to counter-terrorism work is explored. Part 4 outlines a number of important legal, ethical and practical considerations for undertaking SOCMINT work.

    JISC Preservation of Web Resources (PoWR) Handbook

    Get PDF
    Handbook of Web Preservation produced by the JISC-PoWR project, which ran from April to November 2008. The handbook specifically addresses digital preservation issues that are relevant to the UK HE/FE web management community. The project was undertaken jointly by UKOLN at the University of Bath and the ULCC Digital Archives department.

    Understanding emerging client-Side web vulnerabilities using dynamic program analysis

    Get PDF
    Today's Web heavily relies on JavaScript, as it is the main driving force behind the plethora of Web applications that we enjoy daily. The complexity and amount of this client-side code have been steadily increasing over the years. At the same time, new vulnerabilities keep being uncovered, for which we mostly rely on the manual analysis of security experts. Unfortunately, such manual efforts do not scale to the problem space at hand. Therefore, in this thesis we present techniques capable of automatically finding, at scale, vulnerabilities that originate from malicious inputs to postMessage handlers, polluted prototypes, and client-side storage mechanisms. Our results highlight that the investigated vulnerabilities are prevalent even among the most popular sites, showing the need for automated systems that help developers uncover them in a timely manner. Using the insights gained during our empirical studies, we provide recommendations for developers and browser vendors to tackle the underlying problems in the future. Furthermore, we show that security mechanisms designed to mitigate such and similar issues cannot currently be deployed by first-party applications due to their reliance on third-party functionality. This leaves developers in a no-win situation, in which either functionality can be preserved or security enforced.
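
    As an illustration of one class of flaw studied here, the TypeScript sketch below contrasts an unchecked postMessage handler that forwards attacker-controlled data into an HTML sink with a hardened variant; the element id and the trusted origin are hypothetical, and the code is not taken from the thesis.

    ```typescript
    // Illustrative example of the postMessage vulnerability class, not thesis code.
    // Vulnerable: any window may post a message, and the payload reaches innerHTML.
    window.addEventListener("message", (event: MessageEvent) => {
      document.getElementById("banner")!.innerHTML = event.data; // attacker-controlled sink
    });

    // Hardened variant: check the sender's origin and avoid an HTML-interpreting sink.
    const TRUSTED_ORIGIN = "https://example.com"; // hypothetical trusted embedder
    window.addEventListener("message", (event: MessageEvent) => {
      if (event.origin !== TRUSTED_ORIGIN) return;
      document.getElementById("banner")!.textContent = String(event.data);
    });
    ```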

    Privacy Issues of the W3C Geolocation API

    Full text link
    The W3C's Geolocation API may rapidly standardize the transmission of location information on the Web, but, in dealing with such sensitive information, it also raises serious privacy concerns. We analyze the manner and extent to which the current W3C Geolocation API provides mechanisms to support privacy. We propose a privacy framework for the consideration of location information and use it to evaluate the W3C Geolocation API, both the specification and its use in the wild, and recommend some modifications to the API as a result of our analysis.
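
    For reference, the sketch below shows roughly how a page requests location through the W3C Geolocation API; the permission prompt that the browser displays before invoking the success callback is the main user-facing privacy control the paper examines. The option values are illustrative.

    ```typescript
    // Minimal use of the W3C Geolocation API (illustrative options).
    function requestLocation(): void {
      if (!("geolocation" in navigator)) {
        console.warn("Geolocation API not available");
        return;
      }
      navigator.geolocation.getCurrentPosition(
        (pos: GeolocationPosition) => {
          // coords carries latitude, longitude, and accuracy (in metres)
          console.log(pos.coords.latitude, pos.coords.longitude, pos.coords.accuracy);
        },
        (err: GeolocationPositionError) => console.warn("Location denied or failed:", err.message),
        { enableHighAccuracy: false, maximumAge: 600_000, timeout: 10_000 }
      );
    }
    ```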

    An automated system to search, track, classify and report sensitive information exposed on an intranet

    Get PDF
    Master's thesis in Information Security (Segurança InformĂĄtica), Universidade de Lisboa, Faculdade de CiĂȘncias, 2015. Over time, enterprises have focused their attention mainly on cyber attacks against their infrastructures coming from the outside, and in doing so have tended to underrate the dangers present on their internal network. As a result, little importance is given to the information available to every employee connected to the internal network, even though it may be of a sensitive nature and most likely should not be accessible to everyone. Currently, the detection of documents with sensitive or confidential information unduly exposed on PTP's (Portugal Telecom Portugal) internal network is a rather time-consuming manual process. This project's contribution is Hound, an automated system that searches for documents with potentially sensitive content that are exposed to all employees, classifies them according to their degree of sensitivity, and generates reports with the gathered information. The system was integrated into a larger PT project in order to provide the DCY (Cybersecurity Department) with mechanisms to improve its effectiveness in vulnerability detection, in terms of the exposure of files/documents with sensitive or confidential information on its internal network.
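
    The TypeScript sketch below illustrates the general search-and-classify idea, not Hound's actual rules or architecture: it walks a directory tree, flags plain-text documents containing sensitive keywords, and assigns a coarse sensitivity score for reporting. The keyword list, file-type filter, and scoring weights are hypothetical.

    ```typescript
    // Hypothetical keyword-based scanner, illustrating the concept only.
    import { readdirSync, readFileSync, statSync } from "fs";
    import { join } from "path";

    const KEYWORDS: Record<string, number> = {
      password: 3, confidential: 3, "credit card": 2, salary: 2, internal: 1,
    };

    interface Finding { file: string; hits: string[]; score: number; }

    function scan(root: string, findings: Finding[] = []): Finding[] {
      for (const name of readdirSync(root)) {
        const path = join(root, name);
        if (statSync(path).isDirectory()) { scan(path, findings); continue; }
        if (!/\.(txt|csv|md|log)$/i.test(name)) continue; // plain-text formats only in this sketch
        const text = readFileSync(path, "utf8").toLowerCase();
        const hits = Object.keys(KEYWORDS).filter(k => text.includes(k));
        if (hits.length > 0) {
          findings.push({ file: path, hits, score: hits.reduce((s, k) => s + KEYWORDS[k], 0) });
        }
      }
      return findings;
    }

    // Example report: console.table(scan("/mnt/intranet-share"));
    ```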

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Get PDF
    Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS think-tanks and workshops, this document reports on the state of the art of multimedia content search from a technical and socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives that measure the performance of multimedia search engines. From a socio-economic perspective, we survey the impact and legal consequences of these technical advances and point out future directions of research.

    Don't Trust The Locals: Investigating the Prevalence of Persistent Client-Side Cross-Site Scripting in the Wild

    Get PDF
    The Web has become highly interactive and an important driver of modern life, enabling information retrieval, social exchange, and online shopping. From the security perspective, Cross-Site Scripting (XSS) is one of the most nefarious attacks against Web clients. Research has long focused on three categories of XSS: Reflected, Persistent, and DOM-based XSS. In this paper, we argue that our community must consider at least four important classes of XSS, and present the first systematic study of the threat of Persistent Client-Side XSS, caused by the insecure use of client-side storage. While the existence of this class has been acknowledged, especially by non-academic communities such as OWASP, prior works have either found such flaws only as side effects of other analyses or focused on a limited set of applications to analyze. Therefore, the community lacks in-depth knowledge about the actual prevalence of Persistent Client-Side XSS in the wild. To close this research gap, we leverage taint tracking to identify suspicious flows from client-side persistent storage (Web Storage, cookies) to dangerous sinks (HTML, JavaScript, and script.src). We discuss two attacker models capable of injecting malicious payloads into storage, i.e., a Network Attacker capable of temporarily hijacking HTTP communication (e.g., on public WiFi), and a Web Attacker who can leverage flows into storage or an existing reflected XSS flaw to persist their payload. With our taint-aware browser and these models in mind, we study the prevalence of Persistent Client-Side XSS in the Alexa Top 5,000 domains. We find that more than 8% of them have unfiltered data flows from persistent storage to a dangerous sink, which showcases the developers' inherent trust in the integrity of storage content. Even worse, if we only consider sites that make use of data originating from storage, 21% of the sites are vulnerable. For those sites with vulnerable flows from storage to sink, we find that at least 70% are directly exploitable by our attacker models. Finally, investigating the vulnerable flows originating from storage allows us to categorize them into four disjoint categories and propose appropriate mitigations.
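
    A minimal illustration of the storage-to-sink flow measured in the paper (not code from the paper): a value persisted in Web Storage is later written into an HTML sink, so a payload planted once by a network or web attacker executes on every subsequent page load. The element id and storage key are hypothetical.

    ```typescript
    // Illustrative persistent client-side XSS flow and a safer alternative.

    // Vulnerable pattern: a storage value flows unfiltered into an HTML sink.
    const cached = localStorage.getItem("newsTicker");
    if (cached !== null) {
      document.getElementById("ticker")!.innerHTML = cached; // dangerous sink
    }

    // Safer pattern: treat storage as untrusted input and avoid HTML interpretation.
    const cachedSafe = localStorage.getItem("newsTicker");
    if (cachedSafe !== null) {
      document.getElementById("ticker")!.textContent = cachedSafe;
    }
    ```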
    • 

    corecore