142 research outputs found
Efficient pruning of large knowledge graphs
In this paper we present an efficient and highly accurate algorithm to prune noisy or over-ambiguous knowledge graphs, given as input an extensional definition of a domain of interest, namely a set of instances or concepts. Our method climbs the graph in a bottom-up fashion, iteratively layering the graph and pruning nodes and edges in each layer while not compromising the connectivity of the set of input nodes. Iterative layering and protection of pre-defined nodes make it possible to extract semantically coherent DAG structures from noisy or over-ambiguous cyclic graphs, without loss of information and without incurring computational bottlenecks, which are the main problem of state-of-the-art methods for cleaning large, i.e., Web-scale, knowledge graphs. We apply our algorithm to the tasks of pruning automatically acquired taxonomies using benchmarking data from a SemEval evaluation exercise, as well as the extraction of a domain-adapted taxonomy from the Wikipedia category hierarchy. The results show the superiority of our approach over state-of-the-art algorithms in terms of both output quality and computational efficiency.
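As a rough illustration of the bottom-up layering idea (a simplification, not the authors' implementation), the sketch below assumes a networkx digraph whose edges point from narrower to broader concepts and a user-supplied set of protected input nodes; the function name and the exact layering rule are our own.

    # Illustrative sketch (not the paper's algorithm): prune a noisy concept graph
    # bottom-up, keeping only nodes on upward paths from a protected input set and
    # dropping edges that do not climb to the next layer, so the result is a DAG.
    import networkx as nx

    def prune_bottom_up(graph: nx.DiGraph, protected: set) -> nx.DiGraph:
        # 1. Keep only nodes reachable by climbing up from the protected set.
        reachable = set(protected)
        for seed in protected:
            reachable |= nx.descendants(graph, seed)
        pruned = graph.subgraph(reachable).copy()

        # 2. Layer nodes by shortest upward distance from the protected set.
        layer = nx.multi_source_dijkstra_path_length(pruned, set(protected))

        # 3. Keep only edges that climb exactly one layer; every non-protected
        #    node retains the edge that gave it its layer, so connectivity to
        #    the protected set is preserved while all cycles are broken.
        for u, v in list(pruned.edges()):
            if layer[v] != layer[u] + 1:
                pruned.remove_edge(u, v)
        return pruned

Because every surviving edge strictly increases the layer index, the output is acyclic, and each retained node keeps at least one in-edge from the layer below, so nothing reachable from the protected set is orphaned.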
A Survey of Symbolic Execution Techniques
Many security and software testing applications require checking whether
certain properties of a program hold for any possible usage scenario. For
instance, a tool for identifying software vulnerabilities may need to rule out
the existence of any backdoor to bypass a program's authentication. One
approach would be to test the program using different, possibly random inputs.
As the backdoor may only be hit for very specific program workloads, automated
exploration of the space of possible inputs is of the essence. Symbolic
execution provides an elegant solution to the problem, by systematically
exploring many possible execution paths at the same time without necessarily
requiring concrete inputs. Rather than taking on fully specified input values,
the technique abstractly represents them as symbols, resorting to constraint
solvers to construct actual instances that would cause property violations.
Symbolic execution has been incubated in dozens of tools developed over the
last four decades, leading to major practical breakthroughs in a number of
prominent software reliability applications. The goal of this survey is to
provide an overview of the main ideas, challenges, and solutions developed in
the area, distilling them for a broad audience.
This survey has been accepted for publication at ACM Computing Surveys; this is the authors' pre-print copy. If you are considering citing it, we would appreciate if you could use the BibTeX entry at http://goo.gl/Hf5Fvc
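To make the core idea concrete, here is a toy sketch (not taken from the survey) using the z3 constraint solver: the input is treated as a symbol, the condition guarding a hypothetical "backdoor" branch becomes a path constraint, and the solver produces a concrete input that reaches it. The program under test is invented for the example.

    # Toy illustration of symbolic execution's core idea: represent the input
    # symbolically and ask a constraint solver whether the path condition of a
    # hidden "backdoor" branch is satisfiable. The program under test is made up.
    from z3 import Int, Solver, sat

    def program(x):
        # Concrete program under test (hypothetical).
        if x * 7 - 3 == 67:
            return "backdoor"
        return "normal"

    def find_backdoor_input():
        x = Int("x")                    # symbolic stand-in for the concrete input
        solver = Solver()
        solver.add(x * 7 - 3 == 67)     # path condition of the 'backdoor' branch
        if solver.check() == sat:
            return solver.model()[x].as_long()   # concrete witness input
        return None

    if __name__ == "__main__":
        witness = find_backdoor_input()
        print(witness, program(witness))         # 10 backdoor

A full symbolic executor would enumerate path conditions for every feasible branch combination automatically; the sketch hand-writes the single constraint to keep the idea visible.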
Conflict-free star-access in parallel memory systems
We study conflict-free data distribution schemes in parallel memories in multiprocessor system architectures. Given a host graph G, the problem is to map the nodes of G into memory modules such that any instance of a template type T in G can be accessed without memory conflicts. A conflict occurs if two or more nodes of T are mapped to the same memory module. The mapping algorithm should: (i) be fast in terms of data access (possibly mapping each node in constant time); (ii) minimize the required number of memory modules for accessing any instance in G of the given template type; and (iii) guarantee load balancing on the modules. In this paper, we consider conflict-free access to star templates, i.e., to any node of G along with all of its neighbors. Such a template type arises in many classical algorithms like breadth-first search in a graph, message broadcasting in networks, and nearest-neighbor-based approximation in numerical computation. We consider the star-template access problem on two specific host graphs, tori and hypercubes, which are also popular interconnection network topologies. The proposed conflict-free mappings on these graphs are fast, use an optimal or provably good number of memory modules, and guarantee load balancing.
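As a small, self-contained illustration of what such a mapping can look like (a textbook-style diagonal scheme, not necessarily the construction proposed in the paper), five memory modules suffice for conflict-free star access on a 2D torus whose side lengths are multiples of 5:

    # Illustrative conflict-free mapping for star templates on an n x m torus
    # (n and m multiples of 5). Node (i, j) goes to module (i + 2*j) mod 5, so a
    # node and its four torus neighbors always land in five distinct modules.
    # This is a classic diagonal/Latin-square-style scheme, not the paper's.

    def module(i: int, j: int) -> int:
        return (i + 2 * j) % 5

    def star(i: int, j: int, n: int, m: int):
        # The star template of (i, j): the node and its four torus neighbors.
        return [(i, j),
                ((i + 1) % n, j), ((i - 1) % n, j),
                (i, (j + 1) % m), (i, (j - 1) % m)]

    def conflict_free(n: int = 10, m: int = 15) -> bool:
        # Exhaustively verify conflict-freedom for every star in an n x m torus.
        assert n % 5 == 0 and m % 5 == 0, "wrap-around needs sides divisible by 5"
        return all(
            len({module(a, b) for a, b in star(i, j, n, m)}) == 5
            for i in range(n) for j in range(m)
        )

    if __name__ == "__main__":
        print(conflict_free())   # True

Five modules are also the minimum possible, since a star contains five nodes, and the scheme is perfectly load balanced: each module receives exactly n*m/5 nodes.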
The Power of Pivoting for Exact Clique Counting
Clique counting is a fundamental task in network analysis, and even the simplest setting of 3-cliques (triangles) has been the center of much recent research. Getting the count of k-cliques for larger k is algorithmically challenging, due to the exponential blowup in the search space of large cliques. But a number of recent applications (especially for community detection or clustering) use larger clique counts. Moreover, one often desires local counts, the number of k-cliques per vertex/edge.
Our main result is Pivoter, an algorithm that exactly counts the number of k-cliques, for all values of k. It is surprisingly effective in practice, and is able to get clique counts of graphs that were beyond the reach of previous work. For example, Pivoter gets all clique counts in a social network with 100M edges within two hours on a commodity machine. Previous parallel algorithms do not terminate in days. Pivoter can also feasibly get local per-vertex and per-edge k-clique counts (for all k) for many public data sets with tens of millions of edges. To the best of our knowledge, this is the first algorithm that achieves such results.
The main insight is the construction of a Succinct Clique Tree (SCT) that stores a compressed unique representation of all cliques in an input graph. It is built using a technique called pivoting, a classic approach by Bron-Kerbosch to reduce the recursion tree of backtracking algorithms for maximal cliques. Remarkably, the SCT can be built without actually enumerating all cliques, and provides a succinct data structure from which exact clique statistics (k-clique counts, local counts) can be read off efficiently.
Comment: 10 pages, WSDM 2020
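For context, the pivoting technique the abstract refers to is the one used in Bron-Kerbosch maximal-clique enumeration; a standard sketch is below. This is background only, not Pivoter itself, which counts cliques without enumerating them.

    # Classic Bron-Kerbosch maximal-clique enumeration with pivoting, the
    # recursion-shrinking technique the SCT construction builds on. Pivoter
    # itself avoids enumerating every clique; this only shows the pivot step.
    from collections import defaultdict

    def bron_kerbosch_pivot(adj, R=None, P=None, X=None, out=None):
        # adj maps each vertex to the set of its neighbors.
        if R is None:
            R, P, X, out = set(), set(adj), set(), []
        if not P and not X:
            out.append(set(R))            # R is a maximal clique
            return out
        # Pivot: a vertex adjacent to as many candidates as possible, so the
        # recursion only branches on candidates NOT adjacent to the pivot.
        pivot = max(P | X, key=lambda u: len(P & adj[u]))
        for v in list(P - adj[pivot]):
            bron_kerbosch_pivot(adj, R | {v}, P & adj[v], X & adj[v], out)
            P.remove(v)
            X.add(v)
        return out

    if __name__ == "__main__":
        # Two triangles sharing an edge: maximal cliques {0, 1, 2} and {1, 2, 3}.
        edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
        adj = defaultdict(set)
        for a, b in edges:
            adj[a].add(b)
            adj[b].add(a)
        print(bron_kerbosch_pivot(dict(adj)))

The pivot choice guarantees that every maximal clique is still reported while the branching factor shrinks, which is the recursion-tree reduction the abstract credits the SCT construction with reusing.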
Outcome of hematopoietic cell transplantation for DNA double-strand break repair disorders
Background: Rare DNA breakage repair disorders predispose to infection and lymphoreticular malignancies. Hematopoietic cell transplantation (HCT) is curative, but coadministered chemotherapy or radiotherapy is damaging because of systemic radiosensitivity. We collected HCT outcome data for Nijmegen breakage syndrome, DNA ligase IV deficiency, Cernunnos-XRCC4-like factor (Cernunnos-XLF) deficiency, and ataxia-telangiectasia (AT).
Methods: Data from 38 centers worldwide, including indication, donor, conditioning regimen, graft-versus-host disease, and outcome, were analyzed. Conditioning was classified as myeloablative conditioning (MAC) if it contained radiotherapy or alkylators, and as reduced-intensity conditioning (RIC) if no alkylators and/or 150 mg/m² fludarabine or less and 40 mg/kg cyclophosphamide or less were used.
Results: Fifty-five new, 14 updated, and 18 previously published patients were analyzed. Median age at HCT was 48 months (range, 1.5-552 months). Twenty-nine patients underwent transplantation for infection, 21 had malignancy, 13 had bone marrow failure, 13 received pre-emptive transplantation, 5 had multiple indications, and 6 had no information. Twenty-two received MAC, 59 received RIC, and 4 were infused; information was unavailable for 2 patients. Seventy-three of 77 patients with DNA ligase IV deficiency, Cernunnos-XLF deficiency, or Nijmegen breakage syndrome received conditioning. Survival was 53 (69%) of 77 and was worse for those receiving MAC than for those receiving RIC (P=.006). Most deaths occurred early after transplantation, suggesting poor tolerance of conditioning. Survival in patients with AT was 25%. Forty-one (49%) of 83 patients experienced acute GvHD, which was less frequent in those receiving RIC than in those receiving MAC (26/56 [46%] vs 12/21 [57%], P=.45). Median follow-up was 35 months (range, 2-168 months). No secondary malignancies were reported during 15 years of follow-up. Growth and developmental delay remained after HCT; immune-mediated complications resolved.
Conclusion: RIC HCT resolves the immunodeficiency associated with these DNA repair disorders. Long-term follow-up is required for secondary malignancy surveillance. Routine HCT for AT is not recommended.
- …