18,120 research outputs found

    Growing Graphs with Hyperedge Replacement Graph Grammars

    Full text link
    Discovering the underlying structures present in large real world graphs is a fundamental scientific problem. In this paper we show that a graph's clique tree can be used to extract a hyperedge replacement grammar. If we store an ordering from the extraction process, the extracted graph grammar is guaranteed to generate an isomorphic copy of the original graph. Or, a stochastic application of the graph grammar rules can be used to quickly create random graphs. In experiments on large real world networks, we show that random graphs, generated from extracted graph grammars, exhibit a wide range of properties that are very similar to the original graphs. In addition to graph properties like degree or eigenvector centrality, what a graph "looks like" ultimately depends on small details in local graph substructures that are difficult to define at a global level. We show that our generative graph model is able to preserve these local substructures when generating new graphs and performs well on new and difficult tests of model robustness.Comment: 18 pages, 19 figures, accepted to CIKM 2016 in Indianapolis, I

    Wide-coverage deep statistical parsing using automatic dependency structure annotation

    Get PDF
    A number of researchers (Lin 1995; Carroll, Briscoe, and Sanfilippo 1998; Carroll et al. 2002; Clark and Hockenmaier 2002; King et al. 2003; Preiss 2003; Kaplan et al. 2004;Miyao and Tsujii 2004) have convincingly argued for the use of dependency (rather than CFG-tree) representations for parser evaluation. Preiss (2003) and Kaplan et al. (2004) conducted a number of experiments comparing “deep” hand-crafted wide-coverage with “shallow” treebank- and machine-learning based parsers at the level of dependencies, using simple and automatic methods to convert tree output generated by the shallow parsers into dependencies. In this article, we revisit the experiments in Preiss (2003) and Kaplan et al. (2004), this time using the sophisticated automatic LFG f-structure annotation methodologies of Cahill et al. (2002b, 2004) and Burke (2006), with surprising results. We compare various PCFG and history-based parsers (based on Collins, 1999; Charniak, 2000; Bikel, 2002) to find a baseline parsing system that fits best into our automatic dependency structure annotation technique. This combined system of syntactic parser and dependency structure annotation is compared to two hand-crafted, deep constraint-based parsers (Carroll and Briscoe 2002; Riezler et al. 2002). We evaluate using dependency-based gold standards (DCU 105, PARC 700, CBS 500 and dependencies for WSJ Section 22) and use the Approximate Randomization Test (Noreen 1989) to test the statistical significance of the results. Our experiments show that machine-learning-based shallow grammars augmented with sophisticated automatic dependency annotation technology outperform hand-crafted, deep, widecoverage constraint grammars. Currently our best system achieves an f-score of 82.73% against the PARC 700 Dependency Bank (King et al. 2003), a statistically significant improvement of 2.18%over the most recent results of 80.55%for the hand-crafted LFG grammar and XLE parsing system of Riezler et al. (2002), and an f-score of 80.23% against the CBS 500 Dependency Bank (Carroll, Briscoe, and Sanfilippo 1998), a statistically significant 3.66% improvement over the 76.57% achieved by the hand-crafted RASP grammar and parsing system of Carroll and Briscoe (2002)

    Feature selection for an SVM based webpage classifier

    Get PDF

    Towards automatic extraction of expressive elements from motion pictures : tempo

    Full text link
    This paper proposes a unique computational approach to extraction of expressive elements of motion pictures for deriving high level semantics of stories portrayed, thus enabling better video annotation and interpretation systems. This approach, motivated and directed by the existing cinematic conventions known as film grammar, as a first step towards demonstrating its effectiveness, uses the attributes of motion and shot length to define and compute a novel measure of tempo of a movie. Tempo flow plots are defined and derived for four full-length movies and edge analysis is performed leading to the extraction of dramatic story sections and events signaled by their unique tempo. The results confirm tempo as a useful attribute in its own right and a promising component of semantic constructs such as tone or mood of a film
    • 

    corecore