109,702 research outputs found

    Finding the most relevant fragments in networks

    Get PDF
    We study a point pattern detection problem on networks, motivated by applications in geographical analysis, such as crime hotspot detection. Given a network N (a connected graph with non-negative edge lengths) together with a set of sites, which lie on the edges or vertices of N, we look for a connected subnetwork F of N of small total length that contains many sites. The edges of F can form parts of the edges of N. We consider different variants of this problem where N is either a general graph or restricted to a tree, and the subnetwork F that we are looking for is either a simple path or a tree. We give polynomial-time algorithms, NP-hardness and NP-completeness proofs, approximation algorithms, and also fixed-parameter tractable algorithms

    Entity Ranking on Graphs: Studies on Expert Finding

    Get PDF
    Todays web search engines try to offer services for finding various information in addition to simple web pages, like showing locations or answering simple fact queries. Understanding the association of named entities and documents is one of the key steps towards such semantic search tasks. This paper addresses the ranking of entities and models it in a graph-based relevance propagation framework. In particular we study the problem of expert finding as an example of an entity ranking task. Entity containment graphs are introduced that represent the relationship between text fragments on the one hand and their contained entities on the other hand. The paper shows how these graphs can be used to propagate relevance information from the pre-ranked text fragments to their entities. We use this propagation framework to model existing approaches to expert finding based on the entity's indegree and extend them by recursive relevance propagation based on a probabilistic random walk over the entity containment graphs. Experiments on the TREC expert search task compare the retrieval performance of the different graph and propagation models

    Distributed Computing on Core-Periphery Networks: Axiom-based Design

    Full text link
    Inspired by social networks and complex systems, we propose a core-periphery network architecture that supports fast computation for many distributed algorithms and is robust and efficient in number of links. Rather than providing a concrete network model, we take an axiom-based design approach. We provide three intuitive (and independent) algorithmic axioms and prove that any network that satisfies all axioms enjoys an efficient algorithm for a range of tasks (e.g., MST, sparse matrix multiplication, etc.). We also show the minimality of our axiom set: for networks that satisfy any subset of the axioms, the same efficiency cannot be guaranteed for any deterministic algorithm

    Optimizing MDS Codes for Caching at the Edge

    Full text link
    In this paper we investigate the problem of optimal MDS-encoded cache placement at the wireless edge to minimize the backhaul rate in heterogeneous networks. We derive the backhaul rate performance of any caching scheme based on file splitting and MDS encoding and we formulate the optimal caching scheme as a convex optimization problem. We then thoroughly investigate the performance of this optimal scheme for an important heterogeneous network scenario. We compare it to several other caching strategies and we analyze the influence of the system parameters, such as the popularity and size of the library files and the capabilities of the small-cell base stations, on the overall performance of our optimal caching strategy. Our results show that the careful placement of MDS-encoded content in caches at the wireless edge leads to a significant decrease of the load of the network backhaul and hence to a considerable performance enhancement of the network.Comment: to appear in Globecom 201

    Evaluation of forensic DNA traces when propositions of interest relate to activities: analysis and discussion of recurrent concerns

    Get PDF
    When forensic scientists evaluate and report on the probative strength of single DNA traces, they commonly rely on only one number, expressing the rarity of the DNA profile in the population of interest. This is so because the focus is on propositions regarding the source of the recovered trace material, such as “the person of interest is the source of the crime stain.” In particular, when the alternative proposition is “an unknown person is the source of the crime stain,” one is directed to think about the rarity of the profile. However, in the era of DNA profiling technology capable of producing results from small quantities of trace material (i.e., non-visible staining) that is subject to easy and ubiquitous modes of transfer, the issue of source is becoming less central, to the point that it is often not contested. There is now a shift from the question “whose DNA is this?” to the question “how did it get there?” As a consequence, recipients of expert information are now very much in need of assistance with the evaluation of the meaning and probative strength of DNA profiling results when the competing propositions of interest refer to different activities. This need is widely demonstrated in day-to-day forensic practice and is also voiced in specialized literature. Yet many forensic scientists remain reluctant to assess their results given propositions that relate to different activities. Some scientists consider evaluations beyond the issue of source as being overly speculative, because of the lack of relevant data and knowledge regarding phenomena and mechanisms of transfer, persistence and background of DNA. Similarly, encouragements to deal with these activity issues, expressed in a recently released European guideline on evaluative reporting (Willis et al., 2015), which highlights the need for rethinking current practice, are sometimes viewed skeptically or are not considered feasible. In this discussion paper, we select and discuss recurrent skeptical views brought to our attention, as well as some of the alternative solutions that have been suggested. We will argue that the way forward is to address now, rather than later, the challenges associated with the evaluation of DNA results (from small quantities of trace material) in light of different activities to prevent them being misrepresented in court

    Dynamic load balancing for the distributed mining of molecular structures

    Get PDF
    In molecular biology, it is often desirable to find common properties in large numbers of drug candidates. One family of methods stems from the data mining community, where algorithms to find frequent graphs have received increasing attention over the past years. However, the computational complexity of the underlying problem and the large amount of data to be explored essentially render sequential algorithms useless. In this paper, we present a distributed approach to the frequent subgraph mining problem to discover interesting patterns in molecular compounds. This problem is characterized by a highly irregular search tree, whereby no reliable workload prediction is available. We describe the three main aspects of the proposed distributed algorithm, namely, a dynamic partitioning of the search space, a distribution process based on a peer-to-peer communication framework, and a novel receiverinitiated load balancing algorithm. The effectiveness of the distributed method has been evaluated on the well-known National Cancer Institute’s HIV-screening data set, where we were able to show close-to linear speedup in a network of workstations. The proposed approach also allows for dynamic resource aggregation in a non dedicated computational environment. These features make it suitable for large-scale, multi-domain, heterogeneous environments, such as computational grids

    Flow Logic

    Get PDF
    Flow networks have attracted a lot of research in computer science. Indeed, many questions in numerous application areas can be reduced to questions about flow networks. Many of these applications would benefit from a framework in which one can formally reason about properties of flow networks that go beyond their maximal flow. We introduce Flow Logics: modal logics that treat flow functions as explicit first-order objects and enable the specification of rich properties of flow networks. The syntax of our logic BFL* (Branching Flow Logic) is similar to the syntax of the temporal logic CTL*, except that atomic assertions may be flow propositions, like >γ> \gamma or γ\geq \gamma, for γN\gamma \in \mathbb{N}, which refer to the value of the flow in a vertex, and that first-order quantification can be applied both to paths and to flow functions. We present an exhaustive study of the theoretical and practical aspects of BFL*, as well as extensions and fragments of it. Our extensions include flow quantifications that range over non-integral flow functions or over maximal flow functions, path quantification that ranges over paths along which non-zero flow travels, past operators, and first-order quantification of flow values. We focus on the model-checking problem and show that it is PSPACE-complete, as it is for CTL*. Handling of flow quantifiers, however, increases the complexity in terms of the network to PNP{\rm P}^{\rm NP}, even for the LFL and BFL fragments, which are the flow-counterparts of LTL and CTL. We are still able to point to a useful fragment of BFL* for which the model-checking problem can be solved in polynomial time. Finally, we introduce and study the query-checking problem for BFL*, where under-specified BFL* formulas are used for network exploration

    Hypermedia-based discovery for source selection using low-cost linked data interfaces

    Get PDF
    Evaluating federated Linked Data queries requires consulting multiple sources on the Web. Before a client can execute queries, it must discover data sources, and determine which ones are relevant. Federated query execution research focuses on the actual execution, while data source discovery is often marginally discussed-even though it has a strong impact on selecting sources that contribute to the query results. Therefore, the authors introduce a discovery approach for Linked Data interfaces based on hypermedia links and controls, and apply it to federated query execution with Triple Pattern Fragments. In addition, the authors identify quantitative metrics to evaluate this discovery approach. This article describes generic evaluation measures and results for their concrete approach. With low-cost data summaries as seed, interfaces to eight large real-world datasets can discover each other within 7 minutes. Hypermedia-based client-side querying shows a promising gain of up to 50% in execution time, but demands algorithms that visit a higher number of interfaces to improve result completeness

    The Impact of IPv6 on Penetration Testing

    Get PDF
    In this paper we discuss the impact the use of IPv6 has on remote penetration testing of servers and web applications. Several modifications to the penetration testing process are proposed to accommodate IPv6. Among these modifications are ways of performing fragmentation attacks, host discovery and brute-force protection. We also propose new checks for IPv6-specific vulnerabilities, such as bypassing firewalls using extension headers and reaching internal hosts through available transition mechanisms. The changes to the penetration testing process proposed in this paper can be used by security companies to make their penetration testing process applicable to IPv6 targets

    Data fragmentation for parallel transitive closure strategies

    Get PDF
    Addresses the problem of fragmenting a relation to make the parallel computation of the transitive closure efficient, based on the disconnection set approach. To better understand this design problem, the authors focus on transportation networks. These are characterized by loosely interconnected clusters of nodes with a high internal connectivity rate. Three requirements that have to be fulfilled by a fragmentation are formulated, and three different fragmentation strategies are presented, each emphasizing one of these requirements. Some test results are presented to show the performance of the various fragmentation strategie
    corecore