302,553 research outputs found

    User's operating procedures. Volume 2: Scout project financial analysis program

    Get PDF
    A review is presented of the user's operating procedures for the Scout Project Automatic Data system, called SPADS. SPADS is the result of the past seven years of software development on a Prime mini-computer located at the Scout Project Office, NASA Langley Research Center, Hampton, Virginia. SPADS was developed as a single entry, multiple cross-reference data management and information retrieval system for the automation of Project office tasks, including engineering, financial, managerial, and clerical support. This volume, two (2) of three (3), provides the instructions to operate the Scout Project Financial Analysis program in data retrieval and file maintenance via the user friendly menu drivers

    Earth observing system. Output data products and input requirements, version 2.0. Volume 3: Algorithm summary tables and non-EOS data products

    Get PDF
    Volume 3 assists Earth Observing System (EOS) investigators in locating required non-EOS data products by identifying their non-EOS input requirements and providing the information on data sets available at various Distributed Active Archive Centers (DAAC's), including those from Pathfinder Activities and Earth Probes. Volume 3 is intended to complement, not to duplicate, the the EOSDIS Science Data Plan (SDP) by providing detailed data set information which was not presented in the SDP. Section 9 of this volume discusses the algorithm summary tables containing information on retrieval algorithms, expected outputs and required input data. Section 10 describes the non-EOS input requirements of instrument teams and IDS investigators. Also described are the current and future data holdings of the original seven DAACS and data products planned from the future missions and projects including Earth Probes and Pathfinder Activities. Section 11 describes source of information used in compiling data set information presented in this volume. A list of data set attributes used to describe various data sets is presented in section 12 along with their descriptions. Finally, Section 13 presents the SPSO's future plan to improve this report

    Earth observing system. Output data products and input requirements, version 2.0. Volume 1: Instrument data product characteristics

    Get PDF
    Information on Earth Observing System (EOS) output data products and input data requirements that has been compiled by the Science Processing Support Office (SPSO) at GSFC is presented. Since Version 1.0 of the SPSO Report was released in August 1991, there have been significant changes in the EOS program. In anticipation of a likely budget cut for the EOS Project, NASA HQ restructured the EOS program. An initial program consisting of two large platforms was replaced by plans for multiple, smaller platforms, and some EOS instruments were either deselected or descoped. Updated payload information reflecting the restructured EOS program superseding the August 1991 version of the SPSO report is included. This report has been expanded to cover information on non-EOS data products, and consists of three volumes (Volumes 1, 2, and 3). Volume 1 provides information on instrument outputs and input requirements. Volume 2 is devoted to Interdisciplinary Science (IDS) outputs and input requirements, including the 'best' and 'alternative' match analysis. Volume 3 provides information about retrieval algorithms, non-EOS input requirements of instrument teams and IDS investigators, and availability of non-EOS data products at seven primary Distributed Active Archive Centers (DAAC's)

    Evaluation of the effects of varying moisture contents on microwave thermal emissions from agriculture fields

    Get PDF
    Three tasks related to soil moisture sensing at microwave wavelengths were undertaken: (1) analysis of data at L, X and K sub 21 band wavelengths over bare and vegetated fields from the 1975 NASA sponsored flight experiment over Phoenix, Arizona; (2) modeling of vegetation canopy at microwave wavelengths taking into consideration both absorption and volume scattering effects; and (3) investigation of overall atmospheric effects at microwave wavelengths that can affect soil moisture retrieval. Data for both bare and vegetated fields are found to agree well with theoretical estimates. It is observed that the retrieval of surface and near surface soil moisture information is feasible through multi-spectral and multi-temporal analysis. It is also established that at long wavelengths, which are optimal for surface sensing, atmospheric effects are generally minimal. At shorter wavelengths, which are optimal for atmosheric retrieval, the background surface properties are also established

    Cross-Language Plagiarism Detection

    Full text link
    Cross-language plagiarism detection deals with the automatic identification and extraction of plagiarism in a multilingual setting. In this setting, a suspicious document is given, and the task is to retrieve all sections from the document that originate from a large, multilingual document collection. Our contributions in this field are as follows: (1) a comprehensive retrieval process for cross-language plagiarism detection is introduced, highlighting the differences to monolingual plagiarism detection, (2) state-of-the-art solutions for two important subtasks are reviewed, (3) retrieval models for the assessment of cross-language similarity are surveyed, and, (4) the three models CL-CNG, CL-ESA and CL-ASA are compared. Our evaluation is of realistic scale: it relies on 120,000 test documents which are selected from the corpora JRC-Acquis and Wikipedia, so that for each test document highly similar documents are available in all of the six languages English, German, Spanish, French, Dutch, and Polish. The models are employed in a series of ranking tasks, and more than 100 million similarities are computed with each model. The results of our evaluation indicate that CL-CNG, despite its simple approach, is the best choice to rank and compare texts across languages if they are syntactically related. CL-ESA almost matches the performance of CL-CNG, but on arbitrary pairs of languages. CL-ASA works best on "exact" translations but does not generalize well.This work was partially supported by the TEXT-ENTERPRISE 2.0 TIN2009-13391-C04-03 project and the CONACyT-Mexico 192021 grant.Potthast, M.; Barrón Cedeño, LA.; Stein, B.; Rosso, P. (2011). Cross-Language Plagiarism Detection. Language Resources and Evaluation. 45(1):45-62. https://doi.org/10.1007/s10579-009-9114-zS4562451Ballesteros, L. A. (2001). Resolving ambiguity for cross-language information retrieval: A dictionary approach. PhD thesis, University of Massachusetts Amherst, USA, Bruce Croft.Barrón-Cedeño, A., Rosso, P., Pinto, D., & Juan A. (2008). On cross-lingual plagiarism analysis using a statistical model. In S. Benno, S. Efstathios, & K. Moshe (Eds.), ECAI 2008 workshop on uncovering plagiarism, authorship, and social software misuse (PAN 08) (pp. 9–13). Patras, Greece.Baum, L. E. (1972). An inequality and associated maximization technique in statistical estimation of probabilistic functions of a Markov process. Inequalities, 3, 1–8.Berger, A., & Lafferty, J. (1999). Information retrieval as statistical translation. In SIGIR’99: Proceedings of the 22nd annual international ACM SIGIR conference on research and development in information retrieval (vol. 4629, pp. 222–229). Berkeley, California, United States: ACM.Brin, S., Davis, J., & Garcia-Molina, H. (1995). Copy detection mechanisms for digital documents. In SIGMOD ’95 (pp. 398–409). New York, NY, USA: ACM Press.Brown, P. F., Della Pietra, S. A., Della Pietra, V. J., & Mercer R. L. (1993). The mathematics of statistical machine translation: Parameter estimation. Computational Linguistics, 19(2), 263–311.Ceska, Z., Toman, M., & Jezek, K. (2008). Multilingual plagiarism detection. In AIMSA’08: Proceedings of the 13th international conference on artificial intelligence (pp. 83–92). Berlin, Heidelberg: Springer.Clough, P. (2003). Old and new challenges in automatic plagiarism detection. National UK Plagiarism Advisory Service, http://www.ir.shef.ac.uk/cloughie/papers/pas_plagiarism.pdf .Dempster A. P., Laird N. M., Rubin D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society. Series B (Methodological), 39(1), 1–38.Dumais, S. T., Letsche, T. A., Littman, M. L., & Landauer, T. K. (1997). Automatic cross-language retrieval using latent semantic indexing. In D. Hull & D. Oard (Eds.), AAAI-97 spring symposium series: Cross-language text and speech retrieval (pp. 18–24). Stanford University, American Association for Artificial Intelligence.Gabrilovich, E., & Markovitch, S. (2007). Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In Proceedings of the 20th international joint conference for artificial intelligence, Hyderabad, India.Hoad T. C., & Zobel, J. (2003). Methods for identifying versioned and plagiarised documents. American Society for Information Science and Technology, 54(3), 203–215.Levow, G.-A., Oard, D. W., & Resnik, P. (2005). Dictionary-based techniques for cross-language information retrieval. Information Processing & Management, 41(3), 523–547.Littman, M., Dumais, S. T., & Landauer, T. K. (1998). Automatic cross-language information retrieval using latent semantic indexing. In Cross-language information retrieval, chap. 5 (pp. 51–62). Kluwer.Maurer, H., Kappe, F., & Zaka, B. (2006). Plagiarism—a survey. Journal of Universal Computer Science, 12(8), 1050–1084.McCabe, D. (2005). Research report of the Center for Academic Integrity. http://www.academicintegrity.org .Mcnamee, P., & Mayfield, J. (2004). Character N-gram tokenization for European language text retrieval. Information Retrieval, 7(1–2), 73–97.Meyer zu Eissen, S., & Stein, B. (2006). Intrinsic plagiarism detection. In M. Lalmas, A. MacFarlane, S. M. Rüger, A. Tombros, T. Tsikrika, & A. Yavlinsky (Eds.), Proceedings of the European conference on information retrieval (ECIR 2006), volume 3936 of Lecture Notes in Computer Science (pp. 565–569). Springer.Meyer zu Eissen, S., Stein, B., & Kulig, M. (2007). Plagiarism detection without reference collections. In R. Decker & H. J. Lenz (Eds.), Advances in data analysis (pp. 359–366), Springer.Och, F. J., & Ney, H. (2003). A systematic comparison of various statistical alignment models. Computational Linguistics, 29(1), 19–51.Pinto, D., Juan, A., & Rosso, P. (2007). Using query-relevant documents pairs for cross-lingual information retrieval. In V. Matousek & P. Mautner (Eds.), Lecture Notes in Artificial Intelligence (pp. 630–637). Pilsen, Czech Republic.Pinto, D., Civera, J., Barrón-Cedeño, A., Juan, A., & Rosso, P. (2009). A statistical approach to cross-lingual natural language tasks. Journal of Algorithms, 64(1), 51–60.Potthast, M. (2007). Wikipedia in the pocket-indexing technology for near-duplicate detection and high similarity search. In C. Clarke, N. Fuhr, N. Kando, W. Kraaij, & A. de Vries (Eds.), 30th Annual international ACM SIGIR conference (pp. 909–909). ACM.Potthast, M., Stein, B., & Anderka, M. (2008). A Wikipedia-based multilingual retrieval model. In C. Macdonald, I. Ounis, V. Plachouras, I. Ruthven, & R. W. White (Eds.), 30th European conference on IR research, ECIR 2008, Glasgow , volume 4956 LNCS of Lecture Notes in Computer Science (pp. 522–530). Berlin: Springer.Pouliquen, B., Steinberger, R., & Ignat, C. (2003a). Automatic annotation of multilingual text collections with a conceptual thesaurus. In Proceedings of the workshop ’ontologies and information extraction’ at the Summer School ’The Semantic Web and Language Technology—its potential and practicalities’ (EUROLAN’2003) (pp. 9–28), Bucharest, Romania.Pouliquen, B., Steinberger, R., & Ignat, C. (2003b). Automatic identification of document translations in large multilingual document collections. In Proceedings of the international conference recent advances in natural language processing (RANLP’2003) (pp. 401–408). Borovets, Bulgaria.Stein, B. (2007). Principles of hash-based text retrieval. In C. Clarke, N. Fuhr, N. Kando, W. Kraaij, & A. de Vries (Eds.), 30th Annual international ACM SIGIR conference (pp. 527–534). ACM.Stein, B. (2005). Fuzzy-fingerprints for text-based information retrieval. In K. Tochtermann & H. Maurer (Eds.), Proceedings of the 5th international conference on knowledge management (I-KNOW 05), Graz, Journal of Universal Computer Science. (pp. 572–579). Know-Center.Stein, B., & Anderka, M. (2009). Collection-relative representations: A unifying view to retrieval models. In A. M. Tjoa & R. R. Wagner (Eds.), 20th International conference on database and expert systems applications (DEXA 09) (pp. 383–387). IEEE.Stein, B., & Meyer zu Eissen, S. (2007). Intrinsic plagiarism analysis with meta learning. In B. Stein, M. Koppel, & E. Stamatatos (Eds.), SIGIR workshop on plagiarism analysis, authorship identification, and near-duplicate detection (PAN 07) (pp. 45–50). CEUR-WS.org.Stein, B., & Potthast, M. (2007). Construction of compact retrieval models. In S. Dominich & F. Kiss (Eds.), Studies in theory of information retrieval (pp. 85–93). Foundation for Information Society.Stein, B., Meyer zu Eissen, S., & Potthast, M. (2007). Strategies for retrieving plagiarized documents. In C. Clarke, N. Fuhr, N. Kando, W. Kraaij, & A. de Vries (Eds.), 30th Annual international ACM SIGIR conference (pp. 825–826). ACM.Steinberger, R., Pouliquen, B., Widiger, A., Ignat, C., Erjavec, T., Tufis, D., & Varga, D. (2006). The JRC-Acquis: A multilingual aligned parallel corpus with 20+ languages. In Proceedings of the 5th international conference on language resources and evaluation (LREC’2006).Steinberger, R., Pouliquen, B., & Ignat, C. (2004). Exploiting multilingual nomenclatures and language-independent text features as an interlingua for cross-lingual text analysis applications. In Proceedings of the 4th Slovenian language technology conference. Information Society 2004 (IS’2004).Vinokourov, A., Shawe-Taylor, J., & Cristianini, N. (2003). Inferring a semantic representation of text via cross-language correlation analysis. In S. Becker, S. Thrun, & K. Obermayer (Eds.), NIPS-02: Advances in neural information processing systems (pp. 1473–1480). MIT Press.Yang, Y., Carbonell, J. G., Brown, R. D., & Frederking, R. E. (1998). Translingual information retrieval: Learning from bilingual corpora. Artificial Intelligence, 103(1–2), 323–345

    Organizing gene literature retrieval, profiling, and visualization training workshops for early career researchers

    Get PDF
    Developing the skills needed to effectively search and extract information from biomedical literature is essential for early-career researchers. It is, for instance, on this basis that the novelty of experimental results, and therefore publishing opportunities, can be evaluated. Given the unprecedented volume of publications in the field of biomedical research, new systematic approaches need to be devised and adopted for the retrieval and curation of literature relevant to a specific theme. Here we describe a hands-on training curriculum aimed at retrieval, profiling, and visualization of literature associated with a given topic. This curriculum was implemented in a workshop in January 2021. We provide supporting material and step-by-step implementation guidelines with the ISG15 gene literature serving as an illustrative use case. Through participation in such a workshop, trainees can learn: 1) to build and troubleshoot PubMed queries in order to retrieve the literature associated with a gene of interest; 2) to identify key concepts relevant to given themes (such as cell types, diseases, and biological processes); 3) to measure the prevalence of these concepts in the gene literature; 4) to extract key information from relevant articles, and 5) to develop a background section or summary on the basis of this information. Finally, trainees can learn to consolidate the structured information captured through this process for presentation via an interactive web application
    • …
    corecore