475 research outputs found

    Protecting Web Contents Against Persistent Crawlers

    Get PDF
    Web crawlers have been developed for several malicious purposes, such as downloading server data without the website administrator's permission. Armored, stealthy crawlers keep evolving against new anti-crawler mechanisms in the arms race between crawler developers and crawler defenders. In this paper, we develop a new anti-crawler mechanism called PathMarker to detect and constrain crawlers that crawl server content stealthily and persistently. The basic idea is to add a marker to each web page URL and then encrypt the URL and marker. By using the URL path and user information contained in the marker as novel machine-learning features, we can accurately detect stealthy crawlers at the earliest stage. Besides effectively detecting crawlers, PathMarker can also dramatically suppress the efficiency of crawlers before they are detected, by misleading them into visiting the same page through URLs carrying different markers. We deploy our approach on a forum website to collect normal users' data. The evaluation results show that PathMarker can quickly capture all 12 open-source and in-house crawlers, plus two external crawlers (i.e., Googlebot and Yahoo Slurp).
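
    To make the URL-marking idea concrete, here is a minimal sketch in TypeScript using Node's crypto module. It assumes a server-held symmetric key and a marker carrying the page path, the visiting user, and a timestamp; the token layout, the "/p/" routing prefix, and the key handling are illustrative assumptions, not PathMarker's actual design.

    ```typescript
    // Illustrative sketch only: embed a per-user marker in each page URL and
    // encrypt it, so the same page is linked through different opaque URLs.
    // Key handling, marker layout, and routing are assumptions, not PathMarker's spec.
    import { createCipheriv, createDecipheriv, randomBytes } from "crypto";

    const KEY = randomBytes(32); // server-side secret (hypothetical key management)

    function markUrl(path: string, userId: string): string {
      const marker = JSON.stringify({ path, userId, ts: Date.now() });
      const iv = randomBytes(12);
      const cipher = createCipheriv("aes-256-gcm", KEY, iv);
      const enc = Buffer.concat([cipher.update(marker, "utf8"), cipher.final()]);
      const tag = cipher.getAuthTag();
      // The visible URL carries only the opaque token.
      return "/p/" + Buffer.concat([iv, tag, enc]).toString("base64url");
    }

    function unmarkUrl(token: string): { path: string; userId: string; ts: number } {
      const buf = Buffer.from(token.replace(/^\/p\//, ""), "base64url");
      const iv = buf.subarray(0, 12);
      const tag = buf.subarray(12, 28);
      const enc = buf.subarray(28);
      const decipher = createDecipheriv("aes-256-gcm", KEY, iv);
      decipher.setAuthTag(tag);
      const plain = Buffer.concat([decipher.update(enc), decipher.final()]).toString("utf8");
      return JSON.parse(plain);
    }
    ```

    Because the marker is regenerated on every render, two visits to the same page produce different URLs: a crawler that deduplicates by URL keeps re-fetching pages it has already seen, while the server can decrypt each token and use the recovered per-user URL path trail as classification features.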

    State of the art 2015: a literature review of social media intelligence capabilities for counter-terrorism

    Get PDF
    Overview: This paper is a review of how information and insight can be drawn from open social media sources. It focuses on the specific research techniques that have emerged, the capabilities they provide, the possible insights they offer, and the ethical and legal questions they raise. These techniques are considered relevant and valuable in so far as they can help to maintain public safety by preventing terrorism, preparing for it, protecting the public from it and pursuing its perpetrators. The report also considers how far this can be achieved against the backdrop of radically changing technology and public attitudes towards surveillance. This is an updated version of a 2013 report on the same subject, State of the Art. Since 2013, there have been significant changes in social media, how it is used by terrorist groups, and the methods being developed to make sense of it. The paper is structured as follows: Part 1 is an overview of social media use, focused on how it is used by groups of interest to those involved in counter-terrorism; this includes new sections on trends in social media platforms and on Islamic State (IS). Part 2 provides an introduction to the key approaches of social media intelligence (henceforth 'SOCMINT') for counter-terrorism. Part 3 sets out a series of SOCMINT techniques; for each technique, the capabilities and insights it offers are described, its validity and reliability are assessed, and its possible application to counter-terrorism work is explored. Part 4 outlines a number of important legal, ethical and practical considerations for undertaking SOCMINT work.

    JISC Preservation of Web Resources (PoWR) Handbook

    Get PDF
    Handbook of Web Preservation produced by the JISC-PoWR project, which ran from April to November 2008. The handbook specifically addresses digital preservation issues that are relevant to the UK HE/FE web management community. The project was undertaken jointly by UKOLN at the University of Bath and the ULCC Digital Archives department.

    Understanding emerging client-Side web vulnerabilities using dynamic program analysis

    Get PDF
    Today's Web heavily relies on JavaScript, as it is the main driving force behind the plethora of Web applications that we enjoy daily. The complexity and amount of this client-side code have been steadily increasing over the years. At the same time, new vulnerabilities keep being uncovered, for which we mostly rely on the manual analysis of security experts. Unfortunately, such manual efforts do not scale to the problem space at hand. Therefore, in this thesis we present techniques capable of automatically finding, at scale, vulnerabilities that originate from malicious inputs to postMessage handlers, polluted prototypes, and client-side storage mechanisms. Our results highlight that the investigated vulnerabilities are prevalent even among the most popular sites, showing the need for automated systems that help developers uncover them in a timely manner. Using the insights gained during our empirical studies, we provide recommendations for developers and browser vendors to tackle the underlying problems in the future. Furthermore, we show that security mechanisms designed to mitigate such and similar issues cannot currently be deployed by first-party applications due to their reliance on third-party functionality. This leaves developers in a no-win situation, in which either functionality can be preserved or security enforced.
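
    As an illustration of one class of flaw studied here, the TypeScript sketch below contrasts an unchecked postMessage handler that forwards attacker-controlled data into an HTML sink with a hardened variant; the element id and the trusted origin are hypothetical, and the code is not taken from the thesis.

    ```typescript
    // Illustrative example of the postMessage vulnerability class, not thesis code.
    // Vulnerable: any window may post a message, and the payload reaches innerHTML.
    window.addEventListener("message", (event: MessageEvent) => {
      document.getElementById("banner")!.innerHTML = event.data; // attacker-controlled sink
    });

    // Hardened variant: check the sender's origin and avoid an HTML-interpreting sink.
    const TRUSTED_ORIGIN = "https://example.com"; // hypothetical trusted embedder
    window.addEventListener("message", (event: MessageEvent) => {
      if (event.origin !== TRUSTED_ORIGIN) return;
      document.getElementById("banner")!.textContent = String(event.data);
    });
    ```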

    Privacy Issues of the W3C Geolocation API

    Full text link
    The W3C's Geolocation API may rapidly standardize the transmission of location information on the Web, but, in dealing with such sensitive information, it also raises serious privacy concerns. We analyze the manner and extent to which the current W3C Geolocation API provides mechanisms to support privacy. We propose a privacy framework for the consideration of location information and use it to evaluate the W3C Geolocation API, both the specification and its use in the wild, and recommend some modifications to the API as a result of our analysis.
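
    For reference, the sketch below shows roughly how a page requests location through the W3C Geolocation API; the permission prompt that the browser displays before invoking the success callback is the main user-facing privacy control the paper examines. The option values are illustrative.

    ```typescript
    // Minimal use of the W3C Geolocation API (illustrative options).
    function requestLocation(): void {
      if (!("geolocation" in navigator)) {
        console.warn("Geolocation API not available");
        return;
      }
      navigator.geolocation.getCurrentPosition(
        (pos: GeolocationPosition) => {
          // coords carries latitude, longitude, and accuracy (in metres)
          console.log(pos.coords.latitude, pos.coords.longitude, pos.coords.accuracy);
        },
        (err: GeolocationPositionError) => console.warn("Location denied or failed:", err.message),
        { enableHighAccuracy: false, maximumAge: 600_000, timeout: 10_000 }
      );
    }
    ```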

    An automated system to search, track, classify and report sensitive information exposed on an intranet

    Get PDF
    Master's thesis in Information Security (Segurança InformĂĄtica), Universidade de Lisboa, Faculdade de CiĂȘncias, 2015. Over time, enterprises have focused their attention mainly on cyber attacks against their infrastructures coming from the outside, and in doing so have tended to underrate the dangers present on their internal network. As a result, little importance is given to the information available to every employee connected to the internal network, even though it may be of a sensitive nature and most likely should not be accessible to everyone. Currently, the detection of documents with sensitive or confidential information unduly exposed on PTP's (Portugal Telecom Portugal) internal network is a rather time-consuming manual process. This project's contribution is Hound, an automated system that searches for documents with potentially sensitive content that are exposed to all employees, classifies them according to their degree of sensitivity, and generates reports with the gathered information. The system was integrated into a larger PT project in order to provide the DCY (Cybersecurity Department) with mechanisms to improve its effectiveness in vulnerability detection, in terms of the exposure of files/documents with sensitive or confidential information on its internal network.
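
    The TypeScript sketch below illustrates the general search-and-classify idea, not Hound's actual rules or architecture: it walks a directory tree, flags plain-text documents containing sensitive keywords, and assigns a coarse sensitivity score for reporting. The keyword list, file-type filter, and scoring weights are hypothetical.

    ```typescript
    // Hypothetical keyword-based scanner, illustrating the concept only.
    import { readdirSync, readFileSync, statSync } from "fs";
    import { join } from "path";

    const KEYWORDS: Record<string, number> = {
      password: 3, confidential: 3, "credit card": 2, salary: 2, internal: 1,
    };

    interface Finding { file: string; hits: string[]; score: number; }

    function scan(root: string, findings: Finding[] = []): Finding[] {
      for (const name of readdirSync(root)) {
        const path = join(root, name);
        if (statSync(path).isDirectory()) { scan(path, findings); continue; }
        if (!/\.(txt|csv|md|log)$/i.test(name)) continue; // plain-text formats only in this sketch
        const text = readFileSync(path, "utf8").toLowerCase();
        const hits = Object.keys(KEYWORDS).filter(k => text.includes(k));
        if (hits.length > 0) {
          findings.push({ file: path, hits, score: hits.reduce((s, k) => s + KEYWORDS[k], 0) });
        }
      }
      return findings;
    }

    // Example report: console.table(scan("/mnt/intranet-share"));
    ```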

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Get PDF
    Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS think-tanks and workshops, this document reports on the state of the art of multimedia content search from a technical and socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives that measure the performance of multimedia search engines. From a socio-economic perspective, we survey the impact and legal consequences of these technical advances and point out future directions of research.

    Don't Trust The Locals: Investigating the Prevalence of Persistent Client-Side Cross-Site Scripting in the Wild

    Get PDF
    The Web has become highly interactive and an important driver of modern life, enabling information retrieval, social exchange, and online shopping. From the security perspective, Cross-Site Scripting (XSS) is one of the most nefarious attacks against Web clients. Research has long focused on three categories of XSS: Reflected, Persistent, and DOM-based XSS. In this paper, we argue that our community must consider at least four important classes of XSS, and present the first systematic study of the threat of Persistent Client-Side XSS, caused by the insecure use of client-side storage. While the existence of this class has been acknowledged, especially by non-academic communities such as OWASP, prior works have either found such flaws only as side effects of other analyses or focused on a limited set of applications to analyze. Therefore, the community lacks in-depth knowledge about the actual prevalence of Persistent Client-Side XSS in the wild. To close this research gap, we leverage taint tracking to identify suspicious flows from client-side persistent storage (Web Storage, cookies) to dangerous sinks (HTML, JavaScript, and script.src). We discuss two attacker models capable of injecting malicious payloads into storage, i.e., a Network Attacker capable of temporarily hijacking HTTP communication (e.g., on public WiFi), and a Web Attacker who can leverage flows into storage or an existing reflected XSS flaw to persist their payload. With our taint-aware browser and these models in mind, we study the prevalence of Persistent Client-Side XSS in the Alexa Top 5,000 domains. We find that more than 8% of them have unfiltered data flows from persistent storage to a dangerous sink, which showcases the developers' inherent trust in the integrity of storage content. Even worse, if we only consider sites that make use of data originating from storage, 21% of the sites are vulnerable. For those sites with vulnerable flows from storage to sink, we find that at least 70% are directly exploitable by our attacker models. Finally, investigating the vulnerable flows originating from storage allows us to categorize them into four disjoint categories and propose appropriate mitigations.
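
    A minimal illustration of the storage-to-sink flow measured in the paper (not code from the paper): a value persisted in Web Storage is later written into an HTML sink, so a payload planted once by a network or web attacker executes on every subsequent page load. The element id and storage key are hypothetical.

    ```typescript
    // Illustrative persistent client-side XSS flow and a safer alternative.

    // Vulnerable pattern: a storage value flows unfiltered into an HTML sink.
    const cached = localStorage.getItem("newsTicker");
    if (cached !== null) {
      document.getElementById("ticker")!.innerHTML = cached; // dangerous sink
    }

    // Safer pattern: treat storage as untrusted input and avoid HTML interpretation.
    const cachedSafe = localStorage.getItem("newsTicker");
    if (cachedSafe !== null) {
      document.getElementById("ticker")!.textContent = cachedSafe;
    }
    ```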
    • 

    corecore