13,985 research outputs found

    Matching MEDLINE/PubMed data with Web of Science (WoS): a routine in R language

    We present a novel routine, namely medlineR, based on the R language, that enables the user to match data from MEDLINE/PubMed with records indexed in the ISI Web of Science (WoS) database. The matching allows exploiting the rich and controlled vocabulary of Medical Subject Headings (MeSH) of MEDLINE/PubMed with additional fields of WoS. The integration provides data (e.g. citation data, lists of cited references, lists of the addresses of authors’ host organisations, WoS subject categories) to perform a variety of scientometric analyses. This brief communication describes medlineR, the methodology on which it relies, and the steps the user should follow to perform the matching across the two databases. In order to specify the differences from Leydesdorff and Opthof (2013), we conclude the brief communication by testing the routine on the case of the "Brugada Syndrome".
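    The abstract does not show medlineR's interface, but a minimal sketch of the kind of title-based record matching such a routine might perform is given below; the data frames, field names, and normalisation rule are illustrative assumptions, not the routine's actual code.

    # Hypothetical sketch of matching MEDLINE/PubMed records to WoS records
    # by normalised title; medlineR's real matching logic may differ.
    normalise_title <- function(x) {
      tolower(gsub("[^[:alnum:] ]", "", x))  # drop punctuation, lower-case
    }

    match_pubmed_wos <- function(pubmed, wos) {
      # 'pubmed' and 'wos' are assumed to be data frames with a 'title' column
      pubmed$key <- normalise_title(pubmed$title)
      wos$key    <- normalise_title(wos$title)
      merge(pubmed, wos, by = "key", suffixes = c(".pubmed", ".wos"))
    }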

    Computation of multiple correspondence analysis, with code in R

    The generalization of simple correspondence analysis, for two categorical variables, to multiple correspondence analysis, where there may be three or more variables, is not straightforward, both from a mathematical and a computational point of view. In this paper we detail the exact computational steps involved in performing a multiple correspondence analysis, including the special aspects of adjusting the principal inertias to correct the percentages of inertia, supplementary points and subset analysis. Furthermore, we give the algorithm for joint correspondence analysis, where the cross-tabulations of all unique pairs of variables are analysed jointly. The code in the R language for every step of the computations is given, as well as the results of each computation.
    Keywords: adjustment of principal inertias, Burt matrix, correspondence analysis, multiple correspondence analysis, R language, singular value decomposition, subset analysis
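    As a companion to the paper's step-by-step account, here is a minimal sketch of correspondence analysis of the indicator matrix via the singular value decomposition; it follows the standard computation rather than the paper's exact code, and omits the inertia adjustment and subset analysis.

    # Minimal MCA sketch: CA of the indicator matrix via the SVD
    # (standard computation; omits adjusted inertias and subset analysis).
    mca_indicator <- function(df) {
      # Full dummy coding of every categorical variable
      Z <- model.matrix(~ . - 1, data = df,
                        contrasts.arg = lapply(df, contrasts, contrasts = FALSE))
      P <- Z / sum(Z)                       # correspondence matrix
      r <- rowSums(P); c <- colSums(P)      # row and column masses
      S <- diag(1 / sqrt(r)) %*% (P - r %o% c) %*% diag(1 / sqrt(c))
      dec <- svd(S)                         # SVD of standardised residuals
      list(inertias  = dec$d^2,             # principal inertias
           row_coord = diag(1 / sqrt(r)) %*% dec$u %*% diag(dec$d))
    }

    # Toy example with two categorical variables
    df <- data.frame(smoke = factor(c("yes", "no", "no", "yes", "no")),
                     sex   = factor(c("m", "f", "m", "f", "f")))
    mca_indicator(df)$inertias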

    Web Scraping in the R Language: A Tutorial

    Information Systems researchers can now more easily access vast amounts of data on the World Wide Web to answer both familiar and new questions with more rigor, precision, and timeliness. The main goal of this tutorial is to explain how Information Systems researchers can automatically “scrape” data from the web using the R programming language. This article provides a conceptual overview of the web scraping process. The tutorial then discusses two R packages useful for web scraping, “rvest” and “xml2”, and provides simple examples involving both. The tutorial concludes with an example of a complex web scraping task involving retrieving data from Bayt.com, a leading employment website in the Middle East.
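    As a flavour of the workflow the tutorial covers, here is a minimal rvest/xml2 sketch; the target site (a public scraping sandbox) and the CSS selectors are illustrative assumptions, not the article's Bayt.com example.

    # Minimal web-scraping sketch with rvest (which parses HTML via xml2).
    # The URL and selectors are illustrative; check a site's terms of use
    # and robots.txt before scraping it.
    library(rvest)

    page <- read_html("http://quotes.toscrape.com")   # fetch and parse HTML

    quotes  <- html_text2(html_elements(page, ".quote .text"))
    authors <- html_text2(html_elements(page, ".quote .author"))

    head(data.frame(author = authors, quote = quotes))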

    Some Notes on the Past and Future of Lisp-Stat

    Lisp-Stat was originally developed as a framework for experimenting with dynamic graphics in statistics. To support this use, it evolved into a platform for more general statistical computing. The choice of the Lisp language as the basis of the system was in part coincidence and in part a very deliberate decision. This paper describes the background behind the choice of Lisp, as well as the advantages and disadvantages of this choice. The paper then discusses some lessons that can be drawn from experience with Lisp-Stat and with the R language to guide future development of Lisp-Stat, R, and similar systems.

    Using garch algorithm to analyze data in R language

    One of the challenging aspects of conditionally heteroskedastic series is that if we plot the correlogram of a series with volatility, we might still see what appears to be a realisation of stationary discrete white noise. That is, the volatility itself is hard to detect purely from the correlogram, despite the fact that the series is most definitely non-stationary, as its variance is not constant in time. ARCH and GARCH models have therefore become important tools in the analysis of time series data, particularly in financial applications. These models are especially useful when the goal of the study is to analyze and forecast volatility. This paper gives the motivation behind the simplest GARCH model and illustrates its usefulness in examining portfolio risk. An ARCH (autoregressive conditional heteroskedasticity) model is a model for the variance of a time series. ARCH models are used to describe a changing, possibly volatile variance. Although an ARCH model could possibly be used to describe a gradually increasing variance over time, most often it is used in situations in which there may be short periods of increased variation. (Gradually increasing variance connected to a gradually increasing mean level might be better handled by transforming the variable.) In this article we show what ARCH and GARCH models are, how they are helpful for analyzing economic and financial data, and how to use them in RStudio.
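    As one concrete way to do this in R, the sketch below simulates a GARCH(1,1) series and refits it with the rugarch package; the article's own code may use a different package, and the parameter values here are arbitrary assumptions.

    # Hedged sketch: simulate a GARCH(1,1) process, then recover its
    # parameters with the 'rugarch' package (one common choice in R).
    library(rugarch)

    set.seed(1)
    n <- 2000
    omega <- 0.1; alpha <- 0.1; beta <- 0.8   # arbitrary "true" values
    z    <- rnorm(n)
    sig2 <- numeric(n); eps <- numeric(n)
    sig2[1] <- omega / (1 - alpha - beta)     # unconditional variance
    eps[1]  <- sqrt(sig2[1]) * z[1]
    for (t in 2:n) {
      sig2[t] <- omega + alpha * eps[t - 1]^2 + beta * sig2[t - 1]
      eps[t]  <- sqrt(sig2[t]) * z[t]
    }

    spec <- ugarchspec(variance.model = list(model = "sGARCH",
                                             garchOrder = c(1, 1)),
                       mean.model = list(armaOrder = c(0, 0),
                                         include.mean = FALSE))
    fit <- ugarchfit(spec, data = eps)
    coef(fit)   # estimated omega, alpha1, beta1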

    On testing the significance of sets of genes

    This paper discusses the problem of identifying differentially expressed groups of genes from a microarray experiment. The groups of genes are externally defined, for example, sets of gene pathways derived from biological databases. Our starting point is the interesting Gene Set Enrichment Analysis (GSEA) procedure of Subramanian et al. [Proc. Natl. Acad. Sci. USA 102 (2005) 15545--15550]. We study the problem in some generality and propose two potential improvements to GSEA: the maxmean statistic for summarizing gene-sets, and restandardization for more accurate inferences. We discuss a variety of examples and extensions, including the use of gene-set scores for class predictions. We also describe a new R language package GSA that implements our ideas.
    Comment: Published at http://dx.doi.org/10.1214/07-AOAS101 in the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org)
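    A minimal sketch of the maxmean summary, as we read its definition in the paper, is given below; the full procedure, including restandardization, is implemented in the authors' GSA package.

    # Sketch of the maxmean gene-set statistic (our reading of the paper's
    # definition; the GSA package on CRAN implements the full procedure).
    maxmean <- function(z) {
      s_pos <- mean(pmax(z, 0))    # average positive part of gene scores
      s_neg <- mean(pmax(-z, 0))   # average negative part (absolute value)
      if (s_pos >= s_neg) s_pos else -s_neg
    }

    set.seed(1)
    z <- rnorm(50, mean = 0.2)     # hypothetical per-gene test statistics
    maxmean(z)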