50 research outputs found

    Lawful Hacking: Using Existing Vulnerabilities for Wiretapping on the Internet

    Get PDF
    For years, legal wiretapping was straightforward: the officer doing the intercept connected a tape recorder or the like to a single pair of wires. By the 1990s, however, the changing structure of telecommunications—there was no longer just “Ma Bell” to talk to—and new technologies such as ISDN and cellular telephony made executing a wiretap more complicated for law enforcement. Simple technologies would no longer suffice. In response, Congress passed the Communications Assistance for Law Enforcement Act (CALEA) which mandated a standardized lawful intercept interface on all local phone switches. Since its passage, technology has continued to progress, and in the face of new forms of communication—Skype, voice chat during multiplayer online games, instant messaging, etc.—law enforcement is again experiencing problems. The FBI has called this “Going Dark”: their loss of access to suspects’ communication. According to news reports, law enforcement wants changes to the wiretap laws to require a CALEA-like interface in Internet software. CALEA, though, has its own issues: it is complex software specifically intended to create a security hole—eavesdropping capability—in the already-complex environment of a phone switch. It has unfortunately made wiretapping easier for everyone, not just law enforcement. Congress failed to heed experts’ warnings of the danger posed by this mandated vulnerability, and time has proven the experts right. The so-called “Athens Affair,” where someone used the built-in lawful intercept mechanism to listen to the cell phone calls of high Greek officials, including the Prime Minister, is but one example. In an earlier work, we showed why extending CALEA to the Internet would create very serious problems, including the security problems it has visited on the phone system. In this paper, we explore the viability and implications of an alternative method for addressing law enforcements need to access communications: legalized hacking of target devices through existing vulnerabilities in end-user software and platforms. The FBI already uses this approach on a small scale; we expect that its use will increase, especially as centralized wiretapping capabilities become less viable. Relying on vulnerabilities and hacking poses a large set of legal and policy questions, some practical and some normative. Among these are: (1) Will it create disincentives to patching? (2) Will there be a negative effect on innovation? (Lessons from the so-called “Crypto Wars” of the 1990s, and in particular the debate over export controls on cryptography, are instructive here.) (3) Will law enforcement’s participation in vulnerabilities purchasing skew the market? (4) Do local and even state law enforcement agencies have the technical sophistication to develop and use exploits? If not, how should this be handled? A larger FBI role? (5) Should law enforcement even be participating in a market where many of the sellers and other buyers are themselves criminals? (6) What happens if these tools are captured and repurposed by miscreants? (7) Should we sanction otherwise illegal network activity to aid law enforcement? (8) Is the probability of success from such an approach too low for it to be useful? As we will show, these issues are indeed challenging. We regard the issues raised by using vulnerabilities as, on balance, preferable to adding more complexity and insecurity to online systems

    Mining complex trees for hidden fruit : a graph–based computational solution to detect latent criminal networks : a thesis presented in partial fulfilment of the requirements for the degree of Doctor of Philosophy in Information Technology at Massey University, Albany, New Zealand.

    Get PDF
    The detection of crime is a complex and difficult endeavour. Public and private organisations – focusing on law enforcement, intelligence, and compliance – commonly apply the rational isolated actor approach premised on observability and materiality. This is manifested largely as conducting entity-level risk management sourcing ‘leads’ from reactive covert human intelligence sources and/or proactive sources by applying simple rules-based models. Focusing on discrete observable and material actors simply ignores that criminal activity exists within a complex system deriving its fundamental structural fabric from the complex interactions between actors - with those most unobservable likely to be both criminally proficient and influential. The graph-based computational solution developed to detect latent criminal networks is a response to the inadequacy of the rational isolated actor approach that ignores the connectedness and complexity of criminality. The core computational solution, written in the R language, consists of novel entity resolution, link discovery, and knowledge discovery technology. Entity resolution enables the fusion of multiple datasets with high accuracy (mean F-measure of 0.986 versus competitors 0.872), generating a graph-based expressive view of the problem. Link discovery is comprised of link prediction and link inference, enabling the high-performance detection (accuracy of ~0.8 versus relevant published models ~0.45) of unobserved relationships such as identity fraud. Knowledge discovery uses the fused graph generated and applies the “GraphExtract” algorithm to create a set of subgraphs representing latent functional criminal groups, and a mesoscopic graph representing how this set of criminal groups are interconnected. Latent knowledge is generated from a range of metrics including the “Super-broker” metric and attitude prediction. The computational solution has been evaluated on a range of datasets that mimic an applied setting, demonstrating a scalable (tested on ~18 million node graphs) and performant (~33 hours runtime on a non-distributed platform) solution that successfully detects relevant latent functional criminal groups in around 90% of cases sampled and enables the contextual understanding of the broader criminal system through the mesoscopic graph and associated metadata. The augmented data assets generated provide a multi-perspective systems view of criminal activity that enable advanced informed decision making across the microscopic mesoscopic macroscopic spectrum

    low-energy mobile packet radio networks : routing, scheduling, and architecture

    Get PDF
    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, February 2000.Includes bibliographical references (leaves 171-176).Packet Radio Networks (PRNETs), which are also called ad-hoc networks, have the capability of fast (and ad-hoc) deployment and set-up, and therefore potentially have several useful civilian and military applications. Building low-energy PRNETs is an important design goal, because the communication devices are typically powered by batteries, and therefore are useless when the batteries are depleted. We choose to look at low-energy PRNETs by focusing on the problem of minimum-energy communication over a PRNET, resolving any related issues or design decisions in a manner consistent with the overall goal of low-energy PRNETs. We conclude that the problem of minimum-energy communication over a PRNET is really a joint routing-scheduling-topological problem. We find the joint problem to be intractable, and therefore propose to solve it by decomposing it, solving each component separately. The resulting solution is not optimal but the degree of suboptimality depends on how the problem is decomposed. Therefore we compare different decomposition methods, and select the one that is likely to yield the best solution to the joint problem. After deciding how to decompose the joint problem, we study the separate components. For the topological problem we decide that nodes should communicate with a limited number of other nodes, referred to as neighbors. We also propose and analyze the performance of a procedure for managing the set of neighbors. For the scheduling problem, we propose a novel and practical class of scheduling algorithms. The routing problem is more complex than wireline routing because of interference and fading. When they are incorporated, routing becomes a non-convex problem; and we overcome this by a novel approach that is non-optimal, but is more robust than the optimal approach.by Hisham Ibrahim Kassab.Ph.D

    Graphical Models and Symmetries : Loopy Belief Propagation Approaches

    Get PDF
    Whenever a person or an automated system has to reason in uncertain domains, probability theory is necessary. Probabilistic graphical models allow us to build statistical models that capture complex dependencies between random variables. Inference in these models, however, can easily become intractable. Typical ways to address this scaling issue are inference by approximate message-passing, stochastic gradients, and MapReduce, among others. Exploiting the symmetries of graphical models, however, has not yet been considered for scaling statistical machine learning applications. One instance of graphical models that are inherently symmetric are statistical relational models. These have recently gained attraction within the machine learning and AI communities and combine probability theory with first-order logic, thereby allowing for an efficient representation of structured relational domains. The provided formalisms to compactly represent complex real-world domains enable us to effectively describe large problem instances. Inference within and training of graphical models, however, have not been able to keep pace with the increased representational power. This thesis tackles two major aspects of graphical models and shows that both inference and training can indeed benefit from exploiting symmetries. It first deals with efficient inference exploiting symmetries in graphical models for various query types. We introduce lifted loopy belief propagation (lifted LBP), the first lifted parallel inference approach for relational as well as propositional graphical models. Lifted LBP can effectively speed up marginal inference, but cannot straightforwardly be applied to other types of queries. Thus we also demonstrate efficient lifted algorithms for MAP inference and higher order marginals, as well as the efficient handling of multiple inference tasks. Then we turn to the training of graphical models and introduce the first lifted online training for relational models. Our training procedure and the MapReduce lifting for loopy belief propagation combine lifting with the traditional statistical approaches to scaling, thereby bridging the gap between statistical relational learning and traditional statistical machine learning

    Algorithms to Explore the Structure and Evolution of Biological Networks

    Get PDF
    High-throughput experimental protocols have revealed thousands of relationships amongst genes and proteins under various conditions. These putative associations are being aggressively mined to decipher the structural and functional architecture of the cell. One useful tool for exploring this data has been computational network analysis. In this thesis, we propose a collection of novel algorithms to explore the structure and evolution of large, noisy, and sparsely annotated biological networks. We first introduce two information-theoretic algorithms to extract interesting patterns and modules embedded in large graphs. The first, graph summarization, uses the minimum description length principle to find compressible parts of the graph. The second, VI-Cut, uses the variation of information to non-parametrically find groups of topologically cohesive and similarly annotated nodes in the network. We show that both algorithms find structure in biological data that is consistent with known biological processes, protein complexes, genetic diseases, and operational taxonomic units. We also propose several algorithms to systematically generate an ensemble of near-optimal network clusterings and show how these multiple views can be used together to identify clustering dynamics that any single solution approach would miss. To facilitate the study of ancient networks, we introduce a framework called ``network archaeology'') for reconstructing the node-by-node and edge-by-edge arrival history of a network. Starting with a present-day network, we apply a probabilistic growth model backwards in time to find high-likelihood previous states of the graph. This allows us to explore how interactions and modules may have evolved over time. In experiments with real-world social and biological networks, we find that our algorithms can recover significant features of ancestral networks that have long since disappeared. Our work is motivated by the need to understand large and complex biological systems that are being revealed to us by imperfect data. As data continues to pour in, we believe that computational network analysis will continue to be an essential tool towards this end

    Improving the Scalability of High Performance Computer Systems

    Full text link
    Improving the performance of future computing systems will be based upon the ability of increasing the scalability of current technology. New paths need to be explored, as operating principles that were applied up to now are becoming irrelevant for upcoming computer architectures. It appears that scaling the number of cores, processors and nodes within an system represents the only feasible alternative to achieve Exascale performance. To accomplish this goal, we propose three novel techniques addressing different layers of computer systems. The Tightly Coupled Cluster technique significantly improves the communication for inter node communication within compute clusters. By improving the latency by an order of magnitude over existing solutions the cost of communication is considerably reduced. This enables to exploit fine grain parallelism within applications, thereby, extending the scalability considerably. The mechanism virtually moves the network interconnect into the processor, bypassing the latency of the I/O interface and rendering protocol conversions unnecessary. The technique is implemented entirely through firmware and kernel layer software utilizing off-the-shelf AMD processors. We present a proof-of-concept implementation and real world benchmarks to demonstrate the superior performance of our technique. In particular, our approach achieves a software-to-software communication latency of 240 ns between two remote compute nodes. The second part of the dissertation introduces a new framework for scalable Networks-on-Chip. A novel rapid prototyping methodology is proposed, that accelerates the design and implementation substantially. Due to its flexibility and modularity a large application space is covered ranging from Systems-on-chip, to high performance many-core processors. The Network-on-Chip compiler enables to generate complex networks in the form of synthesizable register transfer level code from an abstract design description. Our engine supports different target technologies including Field Programmable Gate Arrays and Application Specific Integrated Circuits. The framework enables to build large designs while minimizing development and verification efforts. Many topologies and routing algorithms are supported by partitioning the tasks into several layers and by the introduction of a protocol agnostic architecture. We provide a thorough evaluation of the design that shows excellent results regarding performance and scalability. The third part of the dissertation addresses the Processor-Memory Interface within computer architectures. The increasing compute power of many-core processors, leads to an equally growing demand for more memory bandwidth and capacity. Current processor designs exhibit physical limitations that restrict the scalability of main memory. To address this issue we propose a memory extension technique that attaches large amounts of DRAM memory to the processor via a low pin count interface using high speed serial transceivers. Our technique transparently integrates the extension memory into the system architecture by providing full cache coherency. Therefore, applications can utilize the memory extension by applying regular shared memory programming techniques. By supporting daisy chained memory extension devices and by introducing the asymmetric probing approach, the proposed mechanism ensures high scalability. We furthermore propose a DMA offloading technique to improve the performance of the processor memory interface. The design has been implemented in a Field Programmable Gate Array based prototype. Driver software and firmware modifications have been developed to bring up the prototype in a Linux based system. We show microbenchmarks that prove the feasibility of our design

    Distributed WEB caching

    Get PDF
    Ph.DDOCTOR OF PHILOSOPH

    Visualizing multidimensional data similarities:Improvements and applications

    Get PDF
    Multidimensional data is increasingly more prominent and important in many application domains. Such data typically consist of a large set of elements, each of which described by several measurements (dimensions). During the design of techniques and tools to process this data, a key component is to gather insights into their structure and patterns, which can be described by the notion of similarity between elements. Among these techniques, multidimensional projections and similarity trees can effectively capture similarity patterns and handle a large number of data elements and dimensions. However, understanding and interpreting these patterns in terms of the original data dimensions is still hard. This thesis addresses the development of visual explanatory techniques for the easy interpretation of similarity patterns present in multidimensional projections and similarity trees, by several contributions. First, we propose methods that make the computation of similarity trees efficient for large datasets, and also enhance its visual representation to allow the exploration of more data in a limited screen. Secondly, we propose methods for the visual explanation of multidimensional projections in terms of groups of similar elements. These are automatically annotated to describe which dimensions are more important to define their notion of group similarity. We show next how these explanatory mechanisms can be adapted to handle both static and time-dependent data. Our proposed techniques are designed to be easy to use, work nearly automatically, and are demonstrated on a variety of real-world large data obtained from image collections, text archives, scientific measurements, and software engineering
    corecore