51 research outputs found

    Automatic term identification for bibliometric mapping

    A term map visualizes the structure of a scientific field by showing the relations between the field's important terms. The terms shown in a term map are usually selected manually with the help of domain experts. Manual term selection has the disadvantages of being subjective and labor-intensive. To overcome these disadvantages, we propose a methodology for automatic term identification and use it to select the terms to be included in a term map. To evaluate the proposed methodology, we use it to construct a term map of the field of operations research. The quality of the map is assessed by a number of operations research experts. It turns out that, in general, the proposed methodology performs quite well.
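    The abstract does not spell out the scoring step, so the following is a minimal, hypothetical sketch of corpus-based term selection (candidate extraction plus a frequency/document-spread score), not the authors' exact algorithm; all function names and thresholds are illustrative.

    ```python
    # Hedged sketch: pick term-map candidates by rewarding phrases that occur
    # often and are spread across many documents. Not the paper's method.
    from collections import Counter
    import math
    import re

    def candidate_phrases(text, max_len=3):
        """Very rough candidate extraction: contiguous runs of alphabetic words."""
        words = re.findall(r"[a-z]+", text.lower())
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                yield " ".join(words[i:i + n])

    def select_terms(documents, top_k=50):
        term_freq = Counter()   # total occurrences of each candidate phrase
        doc_freq = Counter()    # number of documents containing the phrase
        for doc in documents:
            phrases = list(candidate_phrases(doc))
            term_freq.update(phrases)
            doc_freq.update(set(phrases))
        scores = {
            t: term_freq[t] * math.log1p(doc_freq[t])  # reward spread over documents
            for t in term_freq if term_freq[t] >= 2
        }
        return sorted(scores, key=scores.get, reverse=True)[:top_k]

    docs = ["integer programming models for scheduling problems",
            "heuristics for vehicle routing and scheduling"]
    print(select_terms(docs, top_k=10))
    ```

    In practice the ranked list would be pruned further (for example against a stop list) before the surviving terms are laid out on the map.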

    Facilitating the development of controlled vocabularies for metabolomics technologies with text mining

    BACKGROUND: Many bioinformatics applications rely on controlled vocabularies or ontologies to consistently interpret and seamlessly integrate information scattered across public resources. Experimental data sets from metabolomics studies need to be integrated with one another, but also with data produced by other types of omics studies in the spirit of systems biology, hence the pressing need for vocabularies and ontologies in metabolomics. However, constructing these resources manually is time-consuming and non-trivial. RESULTS: We describe a methodology for the rapid development of controlled vocabularies, a study originally motivated by the need for vocabularies describing metabolomics technologies. We present case studies involving two controlled vocabularies (for nuclear magnetic resonance spectroscopy and gas chromatography) whose development is currently underway as part of the Metabolomics Standards Initiative. The initial vocabularies were compiled manually, providing a total of 243 and 152 terms, respectively. A total of 5,699 and 2,612 new terms were acquired automatically from the literature. Analysis of the results showed that full-text articles (especially the Materials and Methods sections), rather than paper abstracts, are the major source of technology-specific terms. CONCLUSIONS: We suggest a text mining method for efficient corpus-based term acquisition as a way of rapidly expanding a set of controlled vocabularies with the terms used in the scientific literature. We adopted an integrative approach, combining relatively generic software and data resources for time- and cost-effective development of a text mining tool for the expansion of controlled vocabularies across various domains, as a practical alternative to both manual term collection and tailor-made named entity recognition methods.
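    As a rough illustration of the corpus-based expansion idea (not the authors' tool), the sketch below harvests frequent multi-word candidates from full-text sections and keeps those absent from a seed vocabulary; the seed terms, section text, and frequency threshold are all made up for the example.

    ```python
    # Hedged sketch of corpus-based vocabulary expansion: frequent multi-word
    # candidates from full-text sections, minus terms already in the seed list.
    from collections import Counter
    import re

    def harvest_candidates(section_texts, min_freq=2, max_len=4):
        counts = Counter()
        for text in section_texts:
            tokens = re.findall(r"[A-Za-z][A-Za-z-]+", text.lower())
            for n in range(2, max_len + 1):
                for i in range(len(tokens) - n + 1):
                    counts[" ".join(tokens[i:i + n])] += 1
        return {term for term, c in counts.items() if c >= min_freq}

    def expand_vocabulary(seed_terms, methods_sections):
        seed = {t.lower() for t in seed_terms}
        # Candidates for expert curation, not automatically accepted terms.
        return sorted(harvest_candidates(methods_sections) - seed)

    # Illustrative use with a tiny seed vocabulary and invented section text;
    # real use would feed many Materials and Methods sections.
    seed = ["chemical shift", "magic angle spinning"]
    sections = [
        "samples were analysed by gas chromatography coupled to mass spectrometry",
        "gas chromatography retention times were recorded for each metabolite",
    ]
    print(expand_vocabulary(seed, sections))
    ```

    The output of such a step is a ranked candidate list that domain experts review before terms enter the controlled vocabulary.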

    An Improved Method for Analysis of Oxidation Dyes in Human Hair


    Multi Word Term Queries for Focused Information Retrieval.

    In this paper, we address both standard and focused retrieval tasks based on comprehensible language models and interactive query expansion (IQE). Query topics are expanded using an initial set of Multi-Word Terms (MWTs) selected from the top n ranked documents. MWTs are special text units that represent domain concepts and objects; as such, they can represent query topics better than ordinary phrases or n-grams. We tested different query representations: bag-of-words, phrases, flat lists of MWTs, and subsets of MWTs. We also combined the initial set of MWTs obtained in the IQE process with automatic query expansion (AQE) using language models and a smoothing mechanism. As a baseline, we chose the Indri IR engine, which is based on a language model with Dirichlet smoothing. The experiments were carried out on two benchmarks: the TREC Enterprise track (TRECent) 2007 and 2008 collections, and the INEX 2008 Ad Hoc track using the Wikipedia collection.
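    The baseline named in the abstract is query-likelihood retrieval with Dirichlet prior smoothing; a minimal sketch of that scoring function follows (document ranking only, without the IQE/MWT expansion step). The toy collection and the value of mu are illustrative.

    ```python
    # Dirichlet-smoothed query-likelihood scoring:
    #   P(w|D) = (tf(w,D) + mu * P(w|C)) / (|D| + mu)
    # and documents are ranked by sum_w log P(w|D) over query terms.
    import math
    from collections import Counter

    def dirichlet_score(query_terms, doc_tokens, collection_prob, mu=2000.0):
        """log P(Q|D) with Dirichlet prior smoothing."""
        tf = Counter(doc_tokens)
        dlen = len(doc_tokens)
        score = 0.0
        for w in query_terms:
            p_wc = collection_prob.get(w, 1e-9)  # background model P(w|C)
            score += math.log((tf[w] + mu * p_wc) / (dlen + mu))
        return score

    # Toy collection statistics.
    docs = [["query", "expansion", "with", "multi", "word", "terms"],
            ["language", "models", "for", "information", "retrieval"]]
    all_tokens = [w for d in docs for w in d]
    coll = Counter(all_tokens)
    collection_prob = {w: c / len(all_tokens) for w, c in coll.items()}

    query = ["query", "expansion"]
    ranked = sorted(docs,
                    key=lambda d: dirichlet_score(query, d, collection_prob),
                    reverse=True)
    print(ranked[0])
    ```

    In the paper's setup, the expanded query (MWTs plus AQE terms) would replace the plain term list fed to such a scorer.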

    Creating ontologies for content representation – the OntoSeed suite

    Bontas Simperl EP, Schlangen D, Schrader T. Creating ontologies for content representation – the OntoSeed suite. In: Meersman R, ed. Proceedings of the CoopIS/DOA/ODBASE. Berlin, Heidelberg: Springer-Verlag; 2005: 924.

    Demonstration of a wideband submillimeter-wave low-noise receiver with 4–21 GHz IF output digitized by a high-speed 32 GSps ADC

    We report on a 275–500 GHz heterodyne receiver system in combination with a wideband intermediate-frequency (IF) backend to realize 17 GHz instantaneous bandwidth. The receiver frontend implements a heterodyne mixer module that integrates a superconductor-insulator-superconductor (SIS) mixer chip and a cryogenic low-noise preamplifier. The SIS mixer is developed based on high-current-density junction technologies to achieve wideband radio frequency (RF) and IF coverage. The IF backend comprises an IF chain divided into two channels, 4.0–11.5 GHz and 11.3–21.0 GHz, and an analog-to-digital converter (ADC) module capable of high-speed sampling at 32 gigasamples per second with 12.5 GHz bandwidth per channel and an effective number of bits of 6.5. The IF backend allows us to simultaneously cover the full 4–21 GHz IF range of the receiver frontend. The measured noise temperature of the receiver frontend was below three times the quantum noise (hf/kB) over the entire RF band. A dual-polarization sideband-separating receiver based on this technique could provide up to 64 GHz of instantaneous bandwidth, which demonstrates the possibility of future wideband radio astronomical observations with advanced submillimeter-wave heterodyne receivers.
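    To make the quoted noise benchmark concrete, the quantum-noise temperature scale hf/kB can be evaluated at the band edges; the short calculation below uses only the stated 275–500 GHz band and standard physical constants (the derived kelvin values are our arithmetic, not figures from the paper).

    ```python
    # Quantum-noise temperature T_q = h*f / k_B at the RF band edges.
    # "Below three times the quantum noise" then corresponds to receiver noise
    # temperatures of order 40-70 K across 275-500 GHz.
    h = 6.62607015e-34   # Planck constant, J*s
    kB = 1.380649e-23    # Boltzmann constant, J/K

    for f_ghz in (275, 500):
        t_q = h * f_ghz * 1e9 / kB
        print(f"{f_ghz} GHz: T_q = {t_q:.1f} K, 3*T_q = {3 * t_q:.1f} K")
    ```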