51 research outputs found

    Infodemiology and Infoveillance: Scoping Review

    Background: Web-based sources are increasingly employed in the analysis, detection, and forecasting of diseases and epidemics, and in predicting human behavior toward several health topics. This use of the internet has come to be known as infodemiology, a concept introduced by Gunther Eysenbach. Infodemiology and infoveillance studies use web-based data and have become an integral part of health informatics research over the past decade. Objective: The aim of this paper is to provide a scoping review of the state-of-the-art in infodemiology along with the background and history of the concept, to identify sources and health categories and topics, to elaborate on the validity of the employed methods, and to discuss the gaps identified in current research. Methods: The PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) guidelines were followed to extract the publications that fall under the umbrella of infodemiology and infoveillance from the JMIR, PubMed, and Scopus databases. A total of 338 documents were extracted for assessment. Results: Of the 338 studies, the vast majority (n=282, 83.4%) were published with JMIR Publications. The Journal of Medical Internet Research features almost half of the publications (n=168, 49.7%), and JMIR Public Health and Surveillance has more than one-fifth of the examined studies (n=74, 21.9%). The interest in the subject has been increasing every year, with 2018 featuring more than one-fourth of the total publications (n=89, 26.3%), and the publications in 2017 and 2018 combined accounted for more than half (n=171, 50.6%) of the total number of publications in the last decade. The most popular source was Twitter with 45.0% (n=152), followed by Google with 24.6% (n=83), websites and platforms with 13.9% (n=47), blogs and forums with 10.1% (n=34), Facebook with 8.9% (n=30), and other search engines with 5.6% (n=19). 
As for the subjects examined, conditions and diseases with 17.2% (n=58) and epidemics and outbreaks with 15.7% (n=53) were the most popular categories identified in this review, followed by health care (n=39, 11.5%), drugs (n=40, 10.4%), and smoking and alcohol (n=29, 8.6%). Conclusions: The field of infodemiology is becoming increasingly popular, employing innovative methods and approaches for health assessment. The use of web-based sources, which provide information that would not otherwise be accessible and address the problems of time-consuming traditional methods, shows that infodemiology plays an important role in health informatics research.

    Association of the COVID-19 pandemic with Internet Search Volumes: A Google Trends™ Analysis

    Objectives: To assess the association of public interest in coronavirus infections with the actual number of infected cases for selected countries across the globe. Methods: We performed a Google Trends™ search for "Coronavirus" and compared Relative Search Volume (RSV) indices to the number of COVID-19 cases reported by the European Center for Disease Control (ECDC) using time-lag correlation analysis. Results: Worldwide public interest in Coronavirus reached its first peak at the end of January, when numbers of newly infected patients started to increase exponentially in China. The worldwide Google Trends™ index reached its peak on the 12th of March 2020, at a time when numbers of infected patients started to increase in Europe and COVID-19 was declared a pandemic. By this time, general interest in China and in the Republic of Korea had already decreased significantly compared to the end of January. Correlations between RSV indices and the number of new COVID-19 cases were observed across all investigated countries, with the highest correlations observed at a time lag of -11.5 days, i.e. the highest interest in coronavirus was observed 11.5 days before the peak of newly infected cases. This pattern was very consistent across European countries and also holds true for the US. In Brazil and Australia, the highest correlations were observed with a time lag of -7 days. In Egypt, the highest correlation occurs at a time lag of 0, potentially indicating that in this country, numbers of newly infected patients will increase exponentially within the course of April. Conclusions: Public interest indicated by RSV indices can help to monitor the progression of an outbreak such as the current COVID-19 pandemic. Public interest is on average highest 11.5 days before the peak of newly infected cases.
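
The time-lag correlation analysis mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`pearson`, `lagged_correlation`, `best_lag`) and the toy series are hypothetical, and the sketch assumes two equally spaced daily series of the same length. A negative lag means search interest leads reported cases, matching the sign convention in the abstract.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def lagged_correlation(rsv, cases, lag):
    """Correlate rsv[t + lag] with cases[t].

    A negative lag pairs earlier search volume with later case counts.
    """
    if lag < 0:
        x, y = rsv[:lag], cases[-lag:]
    elif lag > 0:
        x, y = rsv[lag:], cases[:-lag]
    else:
        x, y = rsv, cases
    return pearson(x, y)


def best_lag(rsv, cases, max_lag):
    """Return the lag in [-max_lag, max_lag] with the highest correlation."""
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda k: lagged_correlation(rsv, cases, k))


# Toy data (made up for illustration): cases repeat the search curve 2 days later.
toy_rsv = [1, 3, 7, 10, 6, 3, 2, 1, 1, 1]
toy_cases = [1, 1, 1, 3, 7, 10, 6, 3, 2, 1]
toy_best = best_lag(toy_rsv, toy_cases, max_lag=3)
```

With real data, `max_lag` should stay small relative to the series length, since each shift shortens the overlapping window used for the correlation.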

    Defying easy categorization: Wikipedia as primary, secondary and tertiary resource

    Wikipedia is the world’s largest information source, used daily by millions of individuals around the world, yet such is its uniqueness and dominance that the question is rarely asked: what exactly is Wikipedia? This article explores the categories of source that Wikipedia could be defined as (primary, secondary or tertiary) alongside the varied ways in which Wikipedia is used, which defy easy categorization. It draws on a broad-ranging literature review and focuses on the English-language Wikipedia. It concludes that Wikipedia cannot easily be placed in any single information category but is defined instead by the ways it is used and interpreted by its users.

    Social Media and Public Health: Opportunities and Challenges

    Social media has the potential to provide rapid insights into unfolding public health emergencies such as infectious disease outbreaks. It can also be drawn upon for rapid, survey-based insights into various health topics. Social media has also been utilised by medical professionals for the purposes of sharing scholarly works, international collaboration, and engaging in policy debates. One benefit of using social media platforms to gain insight into health is that they can capture unfiltered public opinion in large volumes, avoiding the potential biases introduced by surveys or interviews. Social media platforms can also be utilised to pilot surveys, for instance, through the use of Twitter polls. Social media data have also been drawn upon in medical emergencies and crisis situations as a public health surveillance tool. A number of software and online tools also exist, developed specifically to aid public health research utilising social media data. In recent years, ethical issues regarding the retrieval and analysis of data have also arisen.

    Field to Feedlot: How US Policy Promoted Cattle Concentration

    Senior Project submitted to The Division of Social Studies of Bard College

    [Title unrecoverable: non-Latin text lost to encoding damage]

    Big data analysis and machine learning are rising analytical tools in data analysis. Big data is an area that collects and maintains a huge amount of raw data for field-specific data analysis. Machine learning is the main analytical tool for handling such data. This study investigates the applicability of keyword search volume, and develops an ANN (Artificial Neural Network) model using panel data to analyze electricity demand and forecast prices. There is no analysis using keyword search volume in econometrics, especially energy economics. Therefore, this study intends to build a new electricity demand model. In addition, since there is no model building study that applies panel data, this study constructs a novel panel ANN model. This study consists of two essays: panel analysis model development and panel ANN model development. In the first essay, this analysis derives the relationship between US household electricity consumption and renewable energy. For this purpose, keyword search volume is used to present new influential factors in analyzing economic indicators. The model considers three keywords related to electricity consumption: “renewable,” “weather forecast,” and “temperature.” Furthermore, because there has been no way to quantify household renewable energy consumption, no studies have analyzed the correlation between renewable energy and US household electricity consumption. Such consumption is more difficult to estimate than that of other major sectors, including commerce and industry, because of issues related to personal information collection and the cost of measurement. This study therefore analyzes the correlation with household electricity consumption by constructing a model that includes interest in renewable energy, measured by keyword search volume. The model, which analyzes the impact of these keywords, is constructed using three regression equations based on the static energy demand model. 
In the household sector, although a variety of renewable energy is used, it is difficult to derive the economic implications of such use as it is not converted into a quantifiable value. Therefore, this study uses the search keyword “renewable” to estimate the impact of renewable energy. “Weather forecast” and “temperature” were also selected as Internet search keywords, because temperature is one of the important factors in determining household electricity consumption. The results show that all the variables are stationary, and the Hausman test indicates that the fixed effects estimation is more robust than the random effects estimation. In the case of the model using the keyword “renewable” as an explanatory variable, all the variables except the price variable are statistically significant at the 1% level; this search term has a negative correlation with household electricity consumption. Household electricity consumption decreases by 16.017 million kWh for every one-unit increase in search volume for the keyword “renewable.” “Temperature” also has a negative coefficient, which is similar to heating degree days. The correlation between the two variables, which intuitively appear to be unrelated, could have significant meaning. When one searches for “renewable” in the context of their household, they probably have a clear purpose. In the event that excessive electricity is consumed or electricity bills are high, households will search for alternatives to reduce electricity consumption. In the case of households equipped with renewable energy facilities, the power consumption will decrease in proportion to the capacity, which is consistent with the estimation results. This study finds that the correlation coefficient of the “renewable” variable is the highest, and the “temperature” variable also has a significant correlation with household electricity consumption. 
The “renewable” keyword has a large negative correlation with household electricity consumption, which can be interpreted as a result of the growing interest in renewable energy. Although the electricity consumption patterns of households are influenced by many variables, this study suggests that interest in renewable energy should also be included as a major factor influencing such consumption. In the second essay, this study predicts electricity prices using ANNs, which have already been used as tools for prediction in various fields. In general, ANNs have been used for short-term forecasting in many economic analysis studies. However, as the forecast horizon increases, the accuracy of prediction decreases sharply: on the same dataset, long-term forecasts are less accurate than short-term forecasts. Therefore, this study uses panel data to compensate for the decline in ANN forecasting accuracy in long-term forecasts on the same dataset. Panel data contain information that time series data do not have: trend information from the time series as well as state or country characteristics. However, there are very few studies in economics that have used panel data for prediction using ANNs. Existing studies use panel data without differentiating between entities in the model structure. These panel ANN studies did not differentiate between state and national data, nor did they train each entity independently, much like the pooled OLS method. Therefore, this study constructs a panel ANN structure using the advantages of panel data and analyzes its accuracy as the forecasting period changes. The model intends to improve the accuracy of predicted values by learning the unobserved heterogeneity contained in panel data from each state. The analysis is conducted on the assumption that it would be possible to learn not only time series information but also country or state information. 
The panel analysis removes the cross-sectional dependence in the unobserved heterogeneity of the panel data. Unlike panel analysis, this study constructs a model structure to learn the unobserved heterogeneity of such data. The learning is conducted separately for each state, and two or three hidden layers are inserted. After forecasting 6, 12, 18, and 24 months ahead, total RMSE and MAPE are estimated and the optimal model is selected. For empirical analysis, this study uses panel data of US electricity prices by state. Natural gas prices are also predicted for additional model verification. For the electricity price forecasting model, the accuracy of the 6- and 12-month forecasts is higher using time series data than using panel data. On the other hand, for the 18- and 24-month forecasts, the panel data results are much better. In the case of natural gas Citygate prices, the time series model is better only for the 6-month predictions, while the panel data model is more accurate for the other horizons. A noteworthy point is that panel data models tend to be more accurate as the forecast period increases. Although the timing of the improvement in accuracy differs, both the electricity and natural gas results show the panel data forecasting model improving in long-term predictions. According to the results, when estimating a small number of predicted values, the trend of the time-series data greatly influences the result and a time-series model produces better predictions. On the other hand, the longer the forecast period, the better the panel data model, which learns from the unobserved heterogeneity of the states rather than from the trends, performs. Since weights are updated without affecting each layer, it can be said that the model learns by considering the heterogeneity of each state. 
In comparison to a time series model in which only the trend is learned, the panel data model utilizes more information to improve accuracy by learning the trends and heterogeneity of each state. In this study, electricity consumption is analyzed using panel data and electricity price prediction is performed. The electricity consumption analysis suggests a new approach based on the model considered in household electricity consumption literature that incorporates data drawn from keyword search volume. This study used keyword search volume as a substitute variable to analyze a phenomenon that was impossible to explain due to the lack of quantitative data. This study shows that variables that have not been used hitherto, as they are not quantifiable or statistically significant, can be analyzed through keyword search volume. In the electricity price forecasting analysis, a novel panel ANN model is proposed to compensate for the decrease in forecasting accuracy when the forecasting period increases. Panel ANN is a model that can be applied from day-to-day and hourly forecasts to long-term trends of several years depending on the type of panel data. In analyzing long-term trends, a neural network model that can replace large-scale simulation models such as NEMS (National Energy Modeling System) and WEM (World Energy Model) can also be constructed. Therefore, this model can be applied in various fields ranging from the hourly price forecast of the next day's electricity market to the long-term trend of CO2 emissions.
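
The abstract above reports that total RMSE and MAPE are estimated for each forecast horizon before the optimal model is selected. A minimal sketch of those two error measures follows. The helper names and the rule of picking the model with the lowest RMSE are assumptions made for illustration; the abstract does not state how the two measures are combined.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between observed and forecast values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


def select_model(actual, forecasts):
    """Return the name of the forecast with the lowest RMSE.

    `forecasts` maps a hypothetical model name to its list of predictions.
    Choosing by RMSE alone is an assumption of this sketch.
    """
    return min(forecasts, key=lambda name: rmse(actual, forecasts[name]))


# Made-up prices for illustration only.
observed = [10.0, 20.0]
candidates = {
    "time_series": [12.0, 18.0],
    "panel": [10.5, 19.5],
}
chosen = select_model(observed, candidates)
```

In practice the comparison would be repeated per horizon (6, 12, 18 and 24 months in the abstract), since the better model can differ between short and long horizons.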

    2018 Touro College & University System Faculty Publications

    This is the 2018 edition of the Faculty Publications Book of the Touro College & University System. It includes all eligible 2018 publication citations of faculty within the Touro College & University System, including New York Medical College (NYMC). It was produced as a joint effort of the Touro College Libraries and the Health Sciences Library at NYMC.

    Low Back Pain (LBP)

    Low back pain (LBP) is a major public health problem, being the most commonly reported musculoskeletal disorder (MSD) and the leading cause of compromised quality of life and work absenteeism. Indeed, LBP is the leading worldwide cause of years lost to disability, and its burden is growing alongside the increasing and aging population. The etiology, pathogenesis, and occupational risk factors of LBP are still not fully understood. It is crucial to give a stronger focus to reducing the consequences of LBP, as well as preventing its onset. Primary prevention at the occupational level remains important for highly exposed groups. Therefore, it is essential to identify which treatment options and workplace-based intervention strategies are effective in increasing participation at work and encouraging early return-to-work to reduce the consequences of LBP. The present Special Issue offers a unique opportunity to review many of the recent advances and perspectives on this health problem. A number of topics will be covered in order to attract high-quality research papers, including the following major areas: prevalence and epidemiological data, etiology, prevention, assessment and treatment approaches, and health promotion strategies for LBP. We have received a wide range of submissions, including research on physical, psychosocial, environmental, and occupational perspectives, as well as work focused on workplace interventions.

    The impact of technology on data collection: Case studies in privacy and economics

    Technological advancement can often act as a catalyst for scientific paradigm shifts. Today the ability to collect and process large amounts of data about individuals is arguably a paradigm-shift enabling technology in action. One manifestation of this technology within the sciences is the ability to study historically qualitative fields with a more granular quantitative lens than ever before. Despite the potential of this technology, wide adoption is accompanied by some risks. In this thesis, I will present two case studies. The first focuses on the impact of machine learning in a cheapest-wins motor insurance market by designing a competition-based data collection mechanism. Pricing models in the insurance industry are changing from statistical methods to machine learning. In this game, close to 2000 participants, acting as insurance companies, trained and submitted pricing models to compete for profit using real motor insurance policies, with a roughly equal split between legacy and advanced models. With this trend towards machine learning in motion, preliminary analysis of the results suggests that future markets might realise cheaper prices for consumers. Additionally, legacy models competing against modern algorithms may experience a reduction in earning stability, accelerating machine learning adoption. Overall, the results of this field experiment demonstrate the potential for digital competition-based studies of markets in the future. The second case studies the privacy risks of data collection technologies. Despite a large body of research in re-identification of anonymous data, the question remains: if a dataset were big enough, would records become anonymous by being "lost in the crowd"? Using 3 months of location data, we show that the risk of re-identification decreases slowly with dataset size. 
This risk is modelled and extrapolated to larger populations: 93% of people would be uniquely identifiable using 4 points of auxiliary information among 60M people. These results show that the privacy of individuals is very unlikely to be preserved even in country-scale location datasets and that alternative paradigms of data sharing are still required.