
    Razor: Mining distance-constrained embedded subtrees

    Our work is focused on the task of mining frequent subtrees from a database of rooted ordered labeled subtrees. Previously we developed an efficient algorithm, MB3 [12], for mining frequent embedded subtrees from a database of rooted ordered labeled subtrees. The efficiency comes from the use of a novel Embedding List representation for Tree Model Guided (TMG) candidate generation. As an extension, the IMB3 [13] algorithm introduces the Level of Embedding constraint. In this study we extend our past work by developing an algorithm, Razor, for mining embedded subtrees where the distance of nodes relative to the root of the subtree needs to be considered. This notion of distance-constrained embedded tree mining will have important applications in web information systems, conceptual model analysis and more sophisticated ontology matching. Domains representing their knowledge in a tree-structured form may require this additional distance information, as it commonly indicates the amount of specific knowledge stored about a particular concept within the hierarchy. Structure-based approaches to schema matching commonly take the distance among the concept nodes within a sub-structure into account when evaluating concept similarity across different schemas. We present an encoding strategy to efficiently enumerate candidate subtrees, taking the distance of nodes relative to the root of the subtree into account. The algorithm is applied to both synthetic and real-world datasets, and the experimental results demonstrate the correctness and effectiveness of the proposed technique.
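
    To make the role of the distance constraint concrete, the sketch below counts two-node embedded (ancestor, descendant) patterns in a tiny tree, once by labels alone and once with the descendant's distance from the pattern root added to the key. The tree encoding and the function name `two_node_patterns` are assumptions made for illustration; this is not the encoding strategy used by Razor, and it counts raw occurrences rather than support over a database.

```python
from collections import Counter

# Hypothetical toy tree: node id -> (label, parent id or None).
# Shape: a is the root, with children b and c; b has one child c.
TREE = {
    0: ("a", None),
    1: ("b", 0),
    2: ("c", 1),
    3: ("c", 0),
}

def two_node_patterns(tree, with_distance):
    """Count (ancestor label, descendant label) pairs, optionally
    distinguishing them by the number of edges between the two nodes."""
    counts = Counter()
    for node, (label, parent) in tree.items():
        distance, ancestor = 1, parent
        # Walk up to the root, visiting every ancestor of this node.
        while ancestor is not None:
            ancestor_label = tree[ancestor][0]
            if with_distance:
                key = (ancestor_label, label, distance)
            else:
                key = (ancestor_label, label)
            counts[key] += 1
            ancestor = tree[ancestor][1]
            distance += 1
    return counts

plain = two_node_patterns(TREE, with_distance=False)
constrained = two_node_patterns(TREE, with_distance=True)
```

    Without distances, the two a-c occurrences collapse into a single pattern; with distances they become separate candidates, one at distance 1 and one at distance 2.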

    SEQUEST: Mining frequent subsequences using DMA strips

    Sequential patterns exist in data such as DNA string databases, occurrences of recurrent illness, etc. In this study, we present an algorithm, SEQUEST, to mine frequent subsequences from sequential patterns. Mining a very large database of sequences is computationally expensive and requires large memory space. SEQUEST uses a Direct Memory Access Strips (DMA-Strips) structure to efficiently generate candidate subsequences. The DMA-Strips structure provides direct access to each item to be manipulated and is thus optimized for speed and space performance. In addition, the proposed technique uses a hybrid principle of frequency counting by the vertical join approach and candidate generation by a structure-guided method. The structure-guided method is adapted from the TMG approach used for enumerating subtrees in our previous work [8]. Experiments on very large databases of sequences, which compare our technique with the existing technique PLWAP [4], demonstrate the effectiveness of our proposed technique.
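
    As a reminder of what "frequent subsequence" means here, the sketch below counts the support of a candidate pattern in a small sequence database by brute force. It does not model DMA-Strips, the vertical join, or structure-guided candidate generation; the names `is_subsequence` and `support` and the toy database are assumptions made for illustration.

```python
def is_subsequence(pattern, sequence):
    """True if the items of `pattern` occur in `sequence` in the same
    order, with gaps allowed between them."""
    remaining = iter(sequence)
    # `item in remaining` advances the iterator past the first match,
    # so later items must be found strictly after earlier ones.
    return all(item in remaining for item in pattern)

def support(pattern, database):
    """Number of sequences in `database` that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database)

# Hypothetical toy database of DNA-like strings.
DATABASE = ["ACGT", "AGT", "CAT"]
```

    A pattern is then called frequent when `support(pattern, DATABASE)` meets a chosen minimum support threshold.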

    Mining substructures in protein data

    In this paper we consider the 'Prions' database, which describes protein instances stored for Human Prion Proteins. The Prions database can be viewed as a database of rooted ordered labeled subtrees. Mining frequent substructures from tree databases is an important task that has gained considerable interest in areas such as XML mining, bioinformatics and Web mining. This has given rise to many tree mining algorithms that can aid in structural comparisons, association rule discovery and, in general, mining of tree-structured knowledge representations. Previously we developed the MB3 tree mining algorithm which, given a minimum support threshold, efficiently discovers all frequent embedded subtrees from a database of rooted ordered labeled subtrees. In this work we apply the algorithm to the Prions database in order to extract the frequently occurring patterns, which in this case are of induced subtree type. Obtaining the set of frequent induced subtrees from the Prions database can potentially reveal some useful knowledge. This aspect will be demonstrated by providing an analysis of the extracted frequent subtrees with respect to discovering interesting protein information. Furthermore, the minimum support threshold can be used as the controlling factor for answering specific queries posed on the Prions dataset. This approach is shown to be a viable technique for mining protein data.

    Anomalous f-electron Hall Effect in the Heavy-Fermion System CeTIn$_5$ (T = Co, Ir, or Rh)

    The in-plane Hall coefficient $R_H(T)$ of CeRhIn$_5$, CeIrIn$_5$, and CeCoIn$_5$ and their respective non-magnetic lanthanum analogs are reported in fields to 90 kOe and at temperatures from 2 K to 325 K. $R_H(T)$ is negative, field-independent, and dominated by skew-scattering above $\sim 50$ K in the Ce compounds. $R_H(H \to 0)$ becomes increasingly negative below 50 K and varies with temperature in a manner that is inconsistent with skew scattering. Field-dependent measurements show that the low-T anomaly is strongly suppressed when the applied field is increased to 90 kOe. Measurements on LaRhIn$_5$, LaIrIn$_5$, and LaCoIn$_5$ indicate that the same anomalous temperature dependence is present in the Hall coefficient of these non-magnetic analogs, albeit with a reduced amplitude and no field dependence. Hall angle ($\theta_H$) measurements find that the ratio $\rho_{xx}/\rho_{xy} = \cot(\theta_H)$ varies as $T^2$ below 20 K for all three Ce-115 compounds. The Hall angle of the La-115 compounds follows this T-dependence as well. These data suggest that the electronic-structure contribution dominates the Hall effect in the 115 compounds, with $f$-electron and Kondo interactions acting to magnify the influence of the underlying complex band structure. This is in stark contrast to the situation in most $4f$ and $5f$ heavy-fermion compounds, where the normal carrier contribution to the Hall effect provides only a small, T-independent background to $R_H$. Comment: 23 pages and 8 figures

    Genome-wide features of neuroendocrine regulation in Drosophila by the basic helix-loop-helix transcription factor DIMMED.

    Neuroendocrine (NE) cells use large dense core vesicles (LDCVs) to traffic, process, store and secrete neuropeptide hormones through the regulated secretory pathway. The dimmed (DIMM) basic helix-loop-helix transcription factor of Drosophila controls the level of regulated secretory activity in NE cells. To pursue its mechanisms, we have performed two independent genome-wide analyses of DIMM's activities: (i) in vivo chromatin immunoprecipitation (ChIP) to define genomic sites of DIMM occupancy and (ii) deep sequencing of purified DIMM neurons to characterize their transcriptional profile. By this combined approach, we showed that DIMM binds to conserved E-boxes in enhancers of 212 genes whose expression is enriched in DIMM-expressing NE cells. DIMM binds preferentially to certain E-boxes within first introns of specific gene isoforms. Statistical machine learning revealed that flanking regions of putative DIMM binding sites contribute to its DNA binding specificity. DIMM's transcriptional repertoire features at least 20 LDCV constituents. In addition, DIMM notably targets the pro-secretory transcription factor, creb-A, but significantly, DIMM does not target any neuropeptide genes. DIMM therefore prescribes the scale of secretory activity in NE neurons, by a systematic control of both proximal and distal points in the regulated secretory pathway.

    Continuously-variable survival exponent for random walks with movable partial reflectors

    We study a one-dimensional lattice random walk with an absorbing boundary at the origin and a movable partial reflector. On encountering the reflector, at site x, the walker is reflected (with probability r) to x-1 and the reflector is simultaneously pushed to x+1. Iteration of the transition matrix and asymptotic analysis of the probability generating function show that the critical exponent delta governing the survival probability varies continuously between 1/2 and 1 as r varies between 0 and 1. Our study suggests a mechanism for nonuniversal kinetic critical behavior, observed in models with an infinite number of absorbing configurations. Comment: 5 pages, 3 figures
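
    The process described above can be explored with a small Monte Carlo sketch. The abstract does not state the initial positions or what happens when the walker is not reflected, so the following are assumptions made here: the walker starts at site 1 with the reflector at site 2, steps are unbiased, and with probability 1 - r the walker simply stays on the reflector's site while the reflector does not move. The function name `survival_curve` is also hypothetical. The paper itself uses transition-matrix iteration and generating functions, not simulation.

```python
import random

def survival_curve(r, n_walkers=2000, t_max=200, seed=0):
    """Estimate the survival probability P(t) for t = 0..t_max.

    Assumed setup (not given in the abstract): walker starts at x = 1,
    reflector starts at x = 2, absorbing boundary at x = 0.
    """
    rng = random.Random(seed)
    alive_at = [0] * (t_max + 1)
    for _ in range(n_walkers):
        x, reflector = 1, 2
        alive_at[0] += 1
        for t in range(1, t_max + 1):
            x += rng.choice((-1, 1))      # unbiased nearest-neighbour step
            if x == 0:                    # absorbed at the origin
                break
            if x == reflector and rng.random() < r:
                x -= 1                    # walker reflected to x - 1
                reflector += 1            # reflector pushed to x + 1
            alive_at[t] += 1
    return [count / n_walkers for count in alive_at]

curve = survival_curve(r=0.5, n_walkers=500, t_max=100, seed=1)
```

    Plotting `curve` against t on log-log axes for several values of r gives a rough, finite-time view of how the decay changes with r; under these assumptions, r = 0 reduces to an ordinary random walk with an absorbing origin.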

    Artificial Intelligence in Radiation Therapy

    Artificial intelligence (AI) has great potential to transform the clinical workflow of radiotherapy. Since the introduction of deep neural networks, many AI-based methods have been proposed to address challenges in different aspects of radiotherapy. Commercial vendors have started to release AI-based tools that can be readily integrated into the established clinical workflow. To show the recent progress in AI-aided radiotherapy, we have reviewed AI-based studies in five major aspects of radiotherapy: image reconstruction, image registration, image segmentation, image synthesis, and automatic treatment planning. In each section, we summarize and categorize the recently published methods, followed by a discussion of the challenges, concerns, and future development. Given the rapid development of AI-aided radiotherapy, the efficiency and effectiveness of radiotherapy could be substantially improved in the future through intelligent automation of its various aspects.

    Tree model guided candidate generation for mining frequent subtrees from XML

    Due to the inherent flexibilities in both structure and semantics, XML association rule mining faces a few challenges, such as a more complicated hierarchical data structure and ordered data context. Mining frequent patterns from XML documents can be recast as mining frequent tree structures from a database of XML documents. In this study, we model a database of XML documents as a database of rooted labeled ordered subtrees. In particular, we are mainly concerned with mining frequent induced and embedded ordered subtrees. Our main contributions are as follows. We describe our unique embedding list representation of the tree structure, which enables efficient implementation of our Tree Model Guided (TMG) candidate generation. TMG is an optimal, non-redundant enumeration strategy which enumerates all the valid candidates that conform to the structural aspects of the data. We show through a mathematical model and experiments that TMG has better complexity compared to the commonly used join approach. In this paper, we propose two algorithms, MB3-Miner and iMB3-Miner. MB3-Miner mines embedded subtrees. iMB3-Miner mines induced and/or embedded subtrees by using the maximum level of embedding constraint. Our experiments with both synthetic and real datasets against two well-known algorithms for mining induced and embedded subtrees demonstrate the effectiveness and the efficiency of the proposed techniques.

    Large Anomalous Hall effect in a silicon-based magnetic semiconductor

    Magnetic semiconductors are attracting high interest because of their potential use for spintronics, a new technology which merges electronics and manipulation of conduction electron spins. (GaMn)As and (GaMn)N have recently emerged as the most popular materials for this new technology. While Curie temperatures are rising towards room temperature, these materials can only be fabricated in thin film form, are heavily defective, and are not obviously compatible with Si. We show here that it is productive to consider transition metal monosilicides as potential alternatives. In particular, we report the discovery that the bulk metallic magnets derived from doping the narrow gap insulator FeSi with Co share the very high anomalous Hall conductance of (GaMn)As, while displaying Curie temperatures as high as 53 K. Our work opens up a new arena for spintronics, involving a bulk material based only on transition metals and Si, and which we have proven to display a variety of large magnetic field effects on easily measured electrical properties. Comment: 19 pages with 5 figures