26 research outputs found

    GTRACE-RS: Efficient Graph Sequence Mining using Reverse Search

    The mining of frequent subgraphs from labeled graph data has been studied extensively. Furthermore, much attention has recently been paid to frequent pattern mining from graph sequences. A method called GTRACE has been proposed to mine frequent patterns from graph sequences under the assumption that changes in graphs are gradual. Although GTRACE mines the frequent patterns efficiently, it still needs substantial computation time to mine the patterns from graph sequences containing large graphs and long sequences. In this paper, we propose a new version of GTRACE that enables efficient mining of frequent patterns based on the principle of reverse search. Reverse search is a general scheme for designing efficient algorithms for hard enumeration problems. Our performance study shows that the proposed method is efficient and scalable for mining both long and large graph sequence patterns and is several orders of magnitude faster than the original GTRACE.

    A nationwide, multi-center, retrospective study of symptomatic small bowel stricture in patients with Crohn's disease.

    BACKGROUND: Small bowel stricture is one of the most common complications in patients with Crohn's disease (CD). Endoscopic balloon dilatation (EBD) is a minimally invasive treatment intended to avoid surgery; however, whether EBD prevents subsequent surgery remains unclear. We aimed to reveal the factors contributing to surgery in patients with small bowel stricture and the factors associated with subsequent surgery after initial EBD. METHODS: Data were retrospectively collected from surgically untreated CD patients who developed symptomatic small bowel stricture after 2008 when the use of balloon-assisted enteroscopy and maintenance therapy with anti-tumor necrosis factor (TNF) became available. RESULTS: A total of 305 cases from 32 tertiary referral centers were enrolled. Cumulative surgery-free survival was 74.0% at 1 year, 54.4% at 5 years, and 44.3% at 10 years. The factors associated with avoiding surgery were non-stricturing, non-penetrating disease at onset, mild severity of symptoms, successful EBD, stricture length < 2 cm, and immunomodulator or anti-TNF added after onset of obstructive symptoms. In 95 cases with successful initial EBD, longer EBD interval was associated with lower risk of surgery. Receiver operating characteristic analysis revealed that an EBD interval of ≤ 446 days predicted subsequent surgery, and the proportion of smokers was significantly high in patients who required frequent dilatation. CONCLUSIONS: In CD patients with symptomatic small bowel stricture, addition of immunomodulator or anti-TNF and smoking cessation may improve the outcome of symptomatic small bowel stricture, by avoiding frequent EBD and subsequent surgery after initial EBD.

    Similar Supergraph Search Based on Graph Edit Distance

    Subgraph and supergraph search methods are promising techniques for the development of new drugs. For example, the chemical structure of favipiravir, an antiviral treatment for influenza, resembles the structure of some components of RNA. Represented as graphs, such compounds are similar to a subgraph of favipiravir. However, the existing supergraph search methods can only discover compounds that match exactly. We propose a novel problem, called similar supergraph search, and design an efficient algorithm to solve it. The problem is to identify all graphs in a database that are similar to any subgraph of a query graph, where similarity is defined as edit distance. Our algorithm represents the set of candidate subgraphs by a code tree, which it uses to efficiently compute edit distance. With a distance threshold of zero, our algorithm is equivalent to an existing efficient algorithm for exact supergraph search. Our experiments show that the computation time increased exponentially as the distance threshold increased, but increased sublinearly with the number of graphs in the database.

    Marginalized kernels between labeled graphs

    A new kernel function between two labeled graphs is presented. Feature vectors are defined as the counts of label paths produced by random walks on graphs. The kernel computation boils down to obtaining the stationary state of a discrete-time linear system, and is thus efficiently performed by solving simultaneous linear equations. Our kernel is based on an infinite dimensional feature space, so it is fundamentally different from other string or tree kernels based on dynamic programming. We will present promising empirical results in classification of chemical compounds.
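
    The idea in this abstract can be illustrated with a much simplified sketch. It is not the authors' implementation: it assumes node labels only (no edge labels), a uniform start distribution, a constant stop probability, and uniform transitions to neighbours. The function name and the `stop` and `iters` parameters are hypothetical, and the linear system is solved here by fixed-point iteration rather than by a direct solver.

```python
def marginalized_kernel(labels1, adj1, labels2, adj2, stop=0.1, iters=200):
    """Simplified random-walk kernel between two node-labeled graphs.

    labels*: list of node labels; adj*: adjacency lists (lists of node indices).
    All modelling choices (uniform start, constant stop probability) are
    assumptions of this sketch.
    """
    n1, n2 = len(labels1), len(labels2)
    # Stop probability per node; a node without neighbours must stop.
    q1 = [stop if adj1[i] else 1.0 for i in range(n1)]
    q2 = [stop if adj2[j] else 1.0 for j in range(n2)]
    # Probability of stepping to one particular neighbour (uniform choice).
    t1 = [(1.0 - q1[i]) / len(adj1[i]) if adj1[i] else 0.0 for i in range(n1)]
    t2 = [(1.0 - q2[j]) / len(adj2[j]) if adj2[j] else 0.0 for j in range(n2)]
    # match[i][j] is 1 when the two nodes carry the same label, else 0.
    match = [[1.0 if labels1[i] == labels2[j] else 0.0 for j in range(n2)]
             for i in range(n1)]

    # R[i][j] sums the joint probability of all label-matching walk pairs
    # that continue from (i, j). It satisfies the linear system
    #   R[i][j] = q1[i]*q2[j] + t1[i]*t2[j] * sum_{k,l} match[k][l] * R[k][l]
    # over neighbours k of i and l of j. The iteration below converges
    # because the continuation probabilities are strictly below one.
    R = [[q1[i] * q2[j] for j in range(n2)] for i in range(n1)]
    for _ in range(iters):
        R = [[q1[i] * q2[j] + t1[i] * t2[j] *
              sum(match[k][l] * R[k][l] for k in adj1[i] for l in adj2[j])
              for j in range(n2)] for i in range(n1)]

    # Uniform start distribution over both graphs.
    return sum(match[i][j] * R[i][j]
               for i in range(n1) for j in range(n2)) / (n1 * n2)


# Toy usage: a C-C-O path compared with a C-O edge.
k = marginalized_kernel(["C", "C", "O"], [[1], [0, 2], [1]],
                        ["C", "O"], [[1], [0]])
```

    Because the system is linear in R, a direct solver could replace the loop; the iteration is used only to keep the sketch free of dependencies.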

    A General Framework for Mining Frequent Subgraphs from Labeled Graphs (IOS Press)

    The derivation of frequent subgraphs from a dataset of labeled graphs has high computational complexity because the hard problems of isomorphism and subgraph isomorphism have to be solved as part of this derivation. To deal with this computational complexity, all previous approaches have focused on one particular kind of graph. In this paper, we propose an approach to conduct a complete search for various classes of frequent subgraphs in a massive dataset of labeled graphs within a practical time. The power of our approach comes from the algebraic representation of graphs, its associated operations and well-organized bias constraints to limit the search space efficiently. The performance has been evaluated using real world datasets, and the high scalability and flexibility of our approach have been confirmed with respect to the amount of data and the computation time.

    Complete mining of frequent patterns from graphs: Mining graph data

    Basket Analysis, which is a standard method for data mining, derives frequent itemsets from a database. However, its mining ability is limited to transaction data consisting of items. In reality, there are many applications where data are described in a more structural way, e.g. chemical compounds and Web browsing history. There are a few approaches that can discover characteristic patterns from graph-structured data in the field of machine learning. However, almost none of them is suitable for applications that require a complete search for all frequent subgraph patterns in the data. In this paper, we propose a novel principle and its algorithm that derive the characteristic patterns which frequently appear in graph-structured data. Our algorithm can derive all frequent induced subgraphs from both directed and undirected graph-structured data having loops (including self-loops) with labeled or unlabeled nodes and links. Its performance is evaluated through applications to Web browsing pattern analysis and chemical carcinogenesis analysis.
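
    For orientation, the itemset setting that this abstract starts from (Basket Analysis on transaction data) can be sketched as a level-wise, Apriori-style search. This is only an illustration of frequent itemset mining, not the graph mining algorithm the paper proposes; the function name, the `min_support` parameter, and the toy transactions are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for all itemsets that occur in at
    least `min_support` transactions (level-wise, Apriori-style search)."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    result = {}
    while candidates:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        # Join frequent k-itemsets into (k+1)-candidates and keep only those
        # whose k-subsets are all frequent (anti-monotonicity of support).
        next_candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == len(a) + 1 and all(
                frozenset(s) in frequent for s in combinations(union, len(a))
            ):
                next_candidates.add(union)
        candidates = list(next_candidates)
    return result


baskets = [{"bread", "milk"}, {"bread", "butter"},
           {"bread", "milk", "butter"}, {"milk"}]
found = frequent_itemsets(baskets, min_support=2)
```

    The same pruning principle, that a pattern can only be frequent if its sub-patterns are, is what makes a complete search feasible when the patterns are subgraphs instead of itemsets.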