25 research outputs found

    Community-Based Intrusion Detection

    Get PDF
    Today, virtually every company world-wide is connected to the Internet. This wide-spread connectivity has given rise to sophisticated, targeted, Internet-based attacks. For example, between 2012 and 2013 security researchers counted an average of about 74 targeted attacks per day. These attacks are motivated by economical, financial, or political interests and commonly referred to as “Advanced Persistent Threat (APT)” attacks. Unfortunately, many of these attacks are successful and the adversaries manage to steal important data or disrupt vital services. Victims are preferably companies from vital industries, such as banks, defense contractors, or power plants. Given that these industries are well-protected, often employing a team of security specialists, the question is: How can these attacks be so successful? Researchers have identified several properties of APT attacks which make them so efficient. First, they are adaptable. This means that they can change the way they attack and the tools they use for this purpose at any given moment in time. Second, they conceal their actions and communication by using encryption, for example. This renders many defense systems useless as they assume complete access to the actual communication content. Third, their actions are stealthy — either by keeping communication to the bare minimum or by mimicking legitimate users. This makes them “fly below the radar” of defense systems which check for anomalous communication. And finally, with the goal to increase their impact or monetisation prospects, their attacks are targeted against several companies from the same industry. Since months can pass between the first attack, its detection, and comprehensive analysis, it is often too late to deploy appropriate counter-measures at businesses peers. Instead, it is much more likely that they have already been attacked successfully. This thesis tries to answer the question whether the last property (industry-wide attacks) can be used to detect such attacks. It presents the design, implementation and evaluation of a community-based intrusion detection system, capable of protecting businesses at industry-scale. The contributions of this thesis are as follows. First, it presents a novel algorithm for community detection which can detect an industry (e.g., energy, financial, or defense industries) in Internet communication. Second, it demonstrates the design, implementation, and evaluation of a distributed graph mining engine that is able to scale with the throughput of the input data while maintaining an end-to-end latency for updates in the range of a few milliseconds. Third, it illustrates the usage of this engine to detect APT attacks against industries by analyzing IP flow information from an Internet service provider. Finally, it introduces a detection algorithm- and input-agnostic intrusion detection engine which supports not only intrusion detection on IP flow but any other intrusion detection algorithm and data-source as well

    Multifaceted Geotagging for Streaming News

    Get PDF
    News sources on the Web generate constant streams of information, describing the events that shape our world. In particular, geography plays a key role in the news, and understanding the geographic information present in news allows for its useful spatial browsing and retrieval. This process of understanding is called geotagging, and involves first finding in the document all textual references to geographic locations, known as toponyms, and second, assigning the correct lat/long values to each toponym, steps which are termed toponym recognition and toponym resolution, respectively. These steps are difficult due to ambiguities in natural language: some toponyms share names with non-location entities, and further, a given toponym can have many location interpretations. Removing these ambiguities is crucial for successful geotagging. To this end, geotagging methods are described which were developed for streaming news. First, a spatio-textual search engine named STEWARD, and an interactive map-based news browsing system named NewsStand are described, which feature geotaggers as central components, and served as motivating systems and experimental testbeds for developing geotagging methods. Next, a geotagging methodology is presented that follows a multifaceted approach involving a variety of techniques. First, a multifaceted toponym recognition process is described that uses both rule-based and machine learning–based methods to ensure high toponym recall. Next, various forms of toponym resolution evidence are explored. One such type of evidence is lists of toponyms, termed comma groups, whose toponyms share a common thread in their geographic properties that enables correct resolution. In addition to explicit evidence, authors take advantage of the implicit geographic knowledge of their audiences. Understanding the local places known by an audience, termed its local lexicon, affords great performance gains when geotagging articles from local newspapers, which account for the vast majority of news on the Web. Finally, considering windows of text of varying size around each toponym, termed adaptive context, allows for a tradeoff between geotagging execution speed and toponym resolution accuracy. Extensive experimental evaluations of all the above methods, using existing and two newly-created, large corpora of streaming news, show great performance gains over several competing prominent geotagging methods

    Spectra of Uniform Hypergraphs

    Get PDF
    We present a spectral theory of hypergraphs that closely parallels Spectral Graph Theory. A number of recent developments building upon classical work has led to a rich understanding of "hyperdeterminants" of hypermatrices, a.k.a. multidimensional arrays. Hyperdeterminants share many properties with determinants, but the context of multilinear algebra is substantially more complicated than the linear algebra required to address Spectral Graph Theory (i.e., ordinary matrices). Nonetheless, it is possible to define eigenvalues of a hypermatrix via its characteristic polynomial as well as variationally. We apply this notion to the "adjacency hypermatrix" of a uniform hypergraph, and prove a number of natural analogues of basic results in Spectral Graph Theory. Open problems abound, and we present a number of directions for further study.Comment: 32 pages, no figure

    Algorithms for the Identification of Central Nodes in Large Real-World Networks

    Get PDF

    35th Symposium on Theoretical Aspects of Computer Science: STACS 2018, February 28-March 3, 2018, Caen, France

    Get PDF
    corecore