1,639 research outputs found

    Extraction of Keyphrases from Text: Evaluation of Four Algorithms

    This report presents an empirical evaluation of four algorithms for automatically extracting keywords and keyphrases from documents. The four algorithms are compared using five different collections of documents. For each document, we have a target set of keyphrases, which were generated by hand. The target keyphrases were generated for human readers; they were not tailored for any of the four keyphrase extraction algorithms. Each of the algorithms was evaluated by the degree to which the algorithm’s keyphrases matched the manually generated keyphrases. The four algorithms were (1) the AutoSummarize feature in Microsoft’s Word 97, (2) an algorithm based on Eric Brill’s part-of-speech tagger, (3) the Summarize feature in Verity’s Search 97, and (4) NRC’s Extractor algorithm. For all five document collections, NRC’s Extractor yields the best match with the manually generated keyphrases.
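One way to picture the evaluation described in this abstract is a set-overlap score between extracted and hand-generated keyphrases. The sketch below is only an illustration: the abstract does not state the report's matching criterion, so exact matching after lowercasing, the use of precision and recall, and the names `normalize` and `match_scores` are all assumptions.

```python
def normalize(phrase: str) -> str:
    """Lowercase and collapse whitespace (an assumed normalization)."""
    return " ".join(phrase.lower().split())


def match_scores(extracted, target):
    """Return (precision, recall) of extracted keyphrases against the target set."""
    extracted_set = {normalize(p) for p in extracted}
    target_set = {normalize(p) for p in target}
    matched = extracted_set & target_set
    # Guard against empty inputs so the ratios are always defined
    precision = len(matched) / len(extracted_set) if extracted_set else 0.0
    recall = len(matched) / len(target_set) if target_set else 0.0
    return precision, recall


# Hypothetical example: 2 of the 3 extracted phrases appear among 4 target phrases
precision, recall = match_scores(
    ["Neural Networks", "learning", "text"],
    ["neural networks", "text", "keyphrase extraction", "evaluation"],
)
```

A stricter or looser criterion (for example, matching on word stems) would change the scores, so any comparison between algorithms depends on fixing this choice first.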

    A Hybrid Radial Basis Function - Pseudospectral Method for Thermal Convection in a 3-D Spherical Shell

    A novel hybrid spectral method that combines radial basis function (RBF) and Chebyshev pseudospectral (PS) methods in a “2+1” approach is presented for numerically simulating thermal convection in a 3-D spherical shell. This is the first study to apply RBFs to a full 3-D physical model in spherical geometry. In addition to being spectrally accurate, RBFs are not defined in terms of any surface-based coordinate system such as spherical coordinates. As a result, when used in the lateral directions, as in this study, they completely circumvent the pole issue with the further advantage that nodes can be “scattered” over the surface of a sphere. In the radial direction, Chebyshev polynomials are used, which are also spectrally accurate and provide the necessary clustering near the boundaries to resolve boundary layers. Applications of this new hybrid methodology are given to the problem of convection in the Earth’s mantle, which is modeled by a Boussinesq fluid at infinite Prandtl number. To see whether this numerical technique warrants further investigation, the study limits itself to an isoviscous mantle. Benchmark comparisons are presented with other currently used mantle convection codes for Rayleigh numbers $7 \times 10^3$ and $10^5$. The algorithmic simplicity of the code (mostly due to RBFs) allows it to be written in less than 400 lines of Matlab and run on a single workstation. We find that our method is very competitive with those currently used in the literature.
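The boundary clustering mentioned in this abstract can be seen by listing Chebyshev nodes on a radial interval. This is a minimal sketch, assuming Chebyshev-Gauss-Lobatto points (the abstract does not say which Chebyshev node family is used); the function name and the shell radii 0.55 and 1.0 are hypothetical values chosen only for illustration.

```python
import math


def chebyshev_radial_nodes(n, r_inner, r_outer):
    """Map the n + 1 points cos(pi * j / n) from [-1, 1] onto [r_inner, r_outer]."""
    half_width = 0.5 * (r_outer - r_inner)
    midpoint = 0.5 * (r_outer + r_inner)
    # Using (n - j) makes the radii ascend from r_inner (j = 0) to r_outer (j = n)
    return [
        midpoint + half_width * math.cos(math.pi * (n - j) / n)
        for j in range(n + 1)
    ]


# Hypothetical shell: inner radius 0.55, outer radius 1.0, 17 radial nodes
nodes = chebyshev_radial_nodes(16, 0.55, 1.0)
gaps = [b - a for a, b in zip(nodes, nodes[1:])]
# gaps[0] (next to the inner boundary) is much smaller than the mid-shell gap
```

The node spacing shrinks roughly quadratically toward both ends of the interval, which is what lets a modest number of radial points resolve thin boundary layers.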

    Learning to Extract Keyphrases from Text

    Many academic journals ask their authors to provide a list of about five to fifteen key words, to appear on the first page of each article. Since these key words are often phrases of two or more words, we prefer to call them keyphrases. There is a surprisingly wide variety of tasks for which keyphrases are useful, as we discuss in this paper. Recent commercial software, such as Microsoft’s Word 97 and Verity’s Search 97, includes algorithms that automatically extract keyphrases from documents. In this paper, we approach the problem of automatically extracting keyphrases from text as a supervised learning task. We treat a document as a set of phrases, which the learning algorithm must learn to classify as positive or negative examples of keyphrases. Our first set of experiments applies the C4.5 decision tree induction algorithm to this learning task. The second set of experiments applies the GenEx algorithm to the task. We developed the GenEx algorithm specifically for this task. The third set of experiments examines the performance of GenEx on the task of metadata generation, relative to the performance of Microsoft’s Word 97. The fourth and final set of experiments investigates the performance of GenEx on the task of highlighting, relative to Verity’s Search 97. The experimental results support the claim that a specialized learning algorithm (GenEx) can generate better keyphrases than a general-purpose learning algorithm (C4.5) and the non-learning algorithms that are used in commercial software (Word 97 and Search 97).

    Using noun phrases extraction for the improvement of hybrid clustering with text- and citation-based components. The example of “Information Systems Research”

    The hybrid clustering approach combining lexical and link-based similarities suffered for a long time from the different properties of the underlying networks. We propose a method based on noun phrase extraction using natural language processing to improve the measurement of the lexical component. Term shingles of different lengths are created from each of the extracted noun phrases. Hybrid networks are built based on a weighted combination of the two types of similarities with seven different weights. We conclude that removing all single-term shingles provides the best results at the level of computational feasibility, comparability with bibliographic coupling and also in a community detection application.
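The shingling and weighting steps in this abstract can be sketched as follows. Two assumptions are made that the abstract does not confirm: a shingle is taken to be a contiguous run of terms within a noun phrase, and the hybrid similarity is taken to be a simple convex combination of the lexical and citation-based scores. The function names and the example weight are hypothetical.

```python
def term_shingles(noun_phrase, min_len=1):
    """List all contiguous term runs of at least min_len terms in a noun phrase."""
    terms = noun_phrase.lower().split()
    return [
        " ".join(terms[i:i + k])
        for k in range(min_len, len(terms) + 1)
        for i in range(len(terms) - k + 1)
    ]


def hybrid_similarity(lexical, citation, weight):
    """Blend two similarity scores; weight is the share given to the lexical one."""
    return weight * lexical + (1.0 - weight) * citation


# min_len=2 drops all single-term shingles, the variant the abstract favours
shingles = term_shingles("Information Systems Research", min_len=2)
combined = hybrid_similarity(lexical=0.8, citation=0.4, weight=0.25)
```

With `min_len=2`, generic single words such as "information" no longer link otherwise unrelated documents, which is one plausible reading of why dropping them helps.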

    Iterative approach to computational enzyme design

    A general approach for the computational design of enzymes to catalyze arbitrary reactions is a goal at the forefront of the field of protein design. Recently, computationally designed enzymes have been produced for three chemical reactions through the synthesis and screening of a large number of variants. Here, we present an iterative approach that has led to the development of the most catalytically efficient computationally designed enzyme for the Kemp elimination to date. Previously established computational techniques were used to generate an initial design, HG-1, which was catalytically inactive. Analysis of HG-1 with molecular dynamics (MD) simulations and X-ray crystallography indicated that the inactivity might be due to bound waters and high flexibility of residues within the active site. This analysis guided changes to our design procedure, moved the design deeper into the interior of the protein, and resulted in an active Kemp eliminase, HG-2. The cocrystal structure of this enzyme with a transition state analog (TSA) revealed that the TSA was bound in the active site and interacted with the intended catalytic base in a catalytically relevant manner, but was flipped relative to the design model. MD analysis of HG-2 led to an additional point mutation, yielding HG-3, which produced a further threefold improvement in activity. This iterative approach to computational enzyme design, including detailed MD and structural analysis of both active and inactive designs, promises a more complete understanding of the underlying principles of enzymatic catalysis and furthers progress toward reliably producing active enzymes.

    Nanoscale alpha-structural domains in the phonon-glass thermoelectric material beta-Zn4Sb3

    A study of the local atomic structure of the promising thermoelectric material beta-Zn4Sb3, using atomic pair distribution function (PDF) analysis of x-ray- and neutron-diffraction data, suggests that the material is nanostructured. The local structure of the beta phase closely resembles that of the low-temperature alpha phase. The alpha structure contains ordered zinc interstitial atoms, which are not long-range ordered in the beta phase. A rough estimate of the domain size from a visual inspection of the PDF is <~10 nm. It is probable that the nanoscale domains found in this study play an important role in the exceptionally low thermal conductivity of beta-Zn4Sb3.

    Exchange biasing of single-domain Ni nanoparticles spontaneously grown in an antiferromagnetic MnO matrix

    Exchange biased composites of ferromagnetic single-domain Ni nanoparticles embedded within large grains of MnO have been prepared by reduction of Ni$_x$Mn$_{1-x}$O$_4$ phases in flowing hydrogen. The Ni precipitates are 15-30 nm in extent, and the majority are completely encased within the MnO matrix. The manner in which the Ni nanoparticles are spontaneously formed imparts a high ferromagnetic-antiferromagnetic interface/volume ratio, which results in substantial exchange bias effects. Exchange bias fields of up to 100 Oe are observed, in cases where the starting Ni content $x$ in the precursor Ni$_x$Mn$_{1-x}$O$_4$ phase is small. For particles of approximately the same size, the exchange bias leads to significant hardening of the magnetization, with the coercive field scaling nearly linearly with the exchange bias field.