
    An Open Source Based Data Warehouse Architecture to Support Decision Making in the Tourism Sector

    This paper proposes an alternative tourism-oriented data warehousing architecture that uses recent free and open source technologies such as Java, PostgreSQL and XML. The architecture aims to support the decision-making process and to give an integrated view of the whole tourism sector in an established context (local, regional, national, etc.) without requiring large investments in software. Keywords: Tourism, Data warehousing architecture

    Google Trends as a Method to predict new COVID-19 Cases and socio-psychological Consequences of the Pandemic

    Background: Understanding how people react to the COVID-19 crisis, and what the consequences of the pandemic are, is key to enabling public health and other agencies to develop optimal intervention strategies. Objective: Because timely identification of new cases of infection has proven to be key to responding quickly to the spread of infection within a particular region, we have developed a method that can detect and predict the emergence of new COVID-19 cases at an early stage. The method can also give useful insights into family life during the pandemic and predict birth rates. Methods: The basic methodological concept of our approach is to monitor the digital trace of language searches with the Google Trends analytical tool (GT). We divided the keyword frequency for selected words, giving us a search frequency index, and then compared searches with official statistics to test the significance of the results. Results: (1) Google Trends tools are suitable for predicting the emergence of new COVID-19 cases in Croatia. The data collected by this method correlate with official data. In Croatia, GT search activity for terms such as "PCR + COVID" and for symptoms such as "cough + corona", "pneumonia + corona" and "muscle pain + corona" correlates strongly with officially reported cases of the disease. (2) The method also shows effects on family life, an increase in stress, and domestic violence. (3) The birth rate in 2021 will be just 87% of what it would be in "a normal year" in Croatia. (4) This tool can give useful insights into domestic violence. Limitations: There are still significant open methodological issues, and the integrity of the data obtained from this source is questionable. A further problem is that GT does not report which population was sampled or how it was structured. Conclusion: Although these open issues pose serious challenges for making clear estimates, statistics offers a range of tools for dealing with imperfect data and for developing controls that take data quality into account. All these insights show that GT has the potential to capture attitudes across a broad spectrum of family life themes. The benefit of this method is reliable estimates that can enable public health officials to prepare for and better respond to the possible return of a pandemic in certain parts of the country, and to the need for responses that protect family well-being.
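The comparison of a search index with official statistics described in this abstract can be illustrated with a small sketch. It is not the authors' procedure: the abstract does not say how the index was built or which statistic was used, so the rescaling step, the Pearson coefficient, the function names and all numbers below are assumptions chosen for illustration.

```python
from math import sqrt

def rescale_to_100(values):
    """Rescale a series so its peak equals 100 (an assumed relative-interest scale)."""
    peak = max(values)
    if peak <= 0:
        raise ValueError("series needs a positive peak")
    return [100 * v / peak for v in values]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(vx * vy)

# Made-up weekly numbers, for illustration only (not data from the study).
search_index = rescale_to_100([5, 9, 20, 41, 80, 62])
reported_cases = [10, 25, 48, 90, 170, 130]

print(round(pearson(search_index, reported_cases), 3))
```

A high coefficient on its own says nothing about timing; testing whether searches lead cases would require shifting one series against the other.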

    Digital and Computational Approaches to Migration Studies: 3 Essays

    This dissertation aims to contribute to the literature on computational social sciences and presents three essays in migration studies and demography, using digital data and computational methods. The first essay focuses on visual comparison of migration patterns using Turkey as a case study. The internal migration patterns in Turkey are compared with the settlement patterns of Syrians under temporary protection in Turkey, while questioning whether there is a possibility for replacement migration policies. The second essay also uses the case of Syrians under temporary protection in Turkey and contributes to the literature on nowcasting & forecasting based on digital data by following the mobility patterns of Syrians inside Turkey using online search data from Google Trends. The third essay contributes to the literature on high-skilled migration and the use of bibliometric data. The essay uses the Brexit decision in 2016 and the academic environment in the United Kingdom as a case study and monitors the change that occurred in the in- and out-migration patterns of researchers with respect to the UK, before and after the Brexit referendum

    Queries to Google Search as Predictors of Migration Flows from Latin America to Spain

    This study evaluates the relationship between changes in the proportion of migration-related queries reported by Google Trends and changes in the volume of migration flows between origin and destination countries. It assesses whether the cost-free Google Trends service improves the prediction of international migratory flows and whether it could be proposed as a tool for organizations and policymakers. Previous research has used the activity of email users and other online services to track human mobility. At the same time, IP geolocation linked to Google Search has proven effective in geographically tracking outbreaks of illness, as well as in predicting changes in economic indicators and travel patterns. This research draws on both lines of work. It uses a regression analysis of time series data to compare the popularity of migration-related queries entered into Google Search in Colombia, Argentina and Peru with changes in the number of resident registrations made in Spain by immigrants from these countries between 2005 and 2010. The results show a significant correlation and weak to moderate predictability at lags of several months, depending on the country. The findings suggest that trends in Google Search queries, as provided by Google Trends, might constitute a useful predictor of migration flows. They also indicate the need for further technological developments to improve analytical capacities.
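A lagged regression of the kind this abstract describes can be sketched as follows. This is a minimal illustration under assumptions, not the study's model: it fits a simple one-variable least-squares line of registrations at month t on the query index at month t - lag, and the function names and the numbers are invented for the example.

```python
def ols(x, y):
    """Simple least-squares fit y = intercept + slope * x; returns (slope, intercept, r2)."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0:
        raise ValueError("regressor is constant")
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    return slope, intercept, r2

def lagged_fit(x, y, lag):
    """Regress y[t] on x[t - lag], dropping the observations that have no partner."""
    if len(x) != len(y):
        raise ValueError("series must have equal length")
    if lag < 0 or len(x) - lag < 3:
        raise ValueError("lag leaves fewer than 3 paired observations")
    return ols(x[:len(x) - lag], y[lag:])

# Made-up monthly series, for illustration only.
query_index = [1, 3, 2, 5, 4, 6, 8, 7]
registrations = [9, 4, 3, 7, 5, 11, 9, 13]  # built so that y[t] = 2 * x[t-2] + 1 for t >= 2

for lag in range(4):
    slope, intercept, r2 = lagged_fit(query_index, registrations, lag)
    print(lag, round(slope, 2), round(r2, 2))
```

Comparing the fit across lags gives a rough picture of how many months the search signal leads the registrations; a fuller analysis would also address trend, seasonality and autocorrelation.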

    Arrivals of tourists in Cyprus: mind the web search intensity

    This paper validates the raison d’être of easily retrieved web Search Intensity Indices (SII) for predicting tourist arrivals in Cyprus. Using monthly data (2004-2015) and two causality testing procedures, we find that, for properly selected key-phrases, web search intensity (adjusted for different languages and different search engines) conveys useful predictive content for tourist arrivals in Cyprus. Additionally, we show that when the prevailing shares of visitors come from countries with different languages, identifying the aggregate SII becomes complex. Hence, we argue that blindly using key-phrases to identify an aggregate SII is a leap into the unknown, since two sources of bias (the language bias and the search engine bias) are fully neglected. Given the importance of the tourism sector to the total economic activity of Cyprus, our findings might prove useful to governmental agencies, policy makers and other stakeholders of the sector when they aim to allocate limited resources effectively and to plan short- and long-run promotion and investment strategies.

    From Social Data Mining to Forecasting Socio-Economic Crisis

    Socio-economic data mining has great potential for gaining a better understanding of problems that our economy and society are facing, such as financial instability, shortages of resources, or conflicts. Without large-scale data mining, progress in these areas seems hard or impossible. Therefore, a suitable, distributed data mining infrastructure and research centers should be built in Europe. It also appears appropriate to build a network of Crisis Observatories. They can be imagined as laboratories devoted to gathering and processing enormous volumes of data on both natural systems, such as the Earth and its ecosystem, and human techno-socio-economic systems, so as to gain early warnings of impending events. Reality mining provides the chance to adapt more quickly and more accurately to changing situations. Further opportunities arise from individually customized services, which, however, should be provided in a privacy-respecting way. This requires the development of novel ICT (such as a self-organizing Web), but most likely new legal regulations and suitable institutions as well. As long as such regulations are lacking on a world-wide scale, it is in the public interest that scientists explore what can be done with the huge data available. Big data do have the potential to change or even threaten democratic societies. The same applies to sudden and large-scale failures of ICT systems. Therefore, dealing with data must be done with a large degree of responsibility and care. Self-interests of individuals, companies or institutions have limits where the public interest is affected, and public interest is not a sufficient justification to violate the human rights of individuals. Privacy is a high good, as is confidentiality, and damaging it would have serious side effects for society. Comment: 65 pages, 1 figure, Visioneer White Paper, see http://www.visioneer.ethz.c

    Phase II Final Project Report Paso del Norte Watershed Council Coordinated Water Resources Database and GIS Project

    The Coordinated Water Resources Database and GIS Project (Project) was developed to provide improved access to regional water resources data in the Paso del Norte region so that regional water stakeholders can make timely decisions in water operations and flood control. Tasks accomplished in Phase II include the complete migration of the Project Website and related databases to the ArcIMS software, which provides better spatial query capacity. The database was enhanced by incorporating more gauge stations, limited groundwater data (well information, water levels, water quality, and pumpage) and other new data, and data sharing was strengthened by implementing FGDC classic metadata. Protocols were explored for data sharing and spatial queries, along with opportunities for more active participation of volunteer regional data providers in the Project. The linkage of the PdNWC database with future groundwater and surface water model development was also assessed. Based on the experience gained in the Project, the recommendations for future Project work include:
    * Continued compilation of new data sources not yet included in the Project to enhance data sharing,
    * Installation of additional new monitoring stations and equipment and inclusion of these monitoring sites in future ArcIMS map products to fill data gaps and provide additional real-time data,
    * Strengthening the links with the Upper Rio Grande Water Operations Model (URGWOM) being advanced by the USACE. Special focus will be given to serving DEM and orthophoto data recently transferred from the USACE to NMWRRI and enhancing direct Web linkages with USACE and URGWOM project activities to improve model development capacity and enhance sharing of modeling results,
    * Development and implementation of a user needs survey focusing on new data sets of interest, enhanced access mechanisms, and other suggestions to improve the Project Website,
    * Development of a Microsoft Access database of Project water resource data, made available online for download, to provide search and query functions,
    * Development of an online help tutorial that would support online searches of the database, making the site easier for end users to navigate and utilize, and
    * Continued exploration of future funding opportunities for Project activities, especially through linkages with other regional data compilation and modeling projects.
    Part I of this report presents major historical and technical components of the Phase II development of the Database and GIS, prepared by C. Brown, Z. Sheng, and M. Bourdon. Groundwater elements of interest, relevant to the development of the coordinated database and to an integral understanding of the watershed’s mission and planning, are included as Part II of this report. This part, prepared by Z. Sheng and others, presents the sources of regional groundwater resources data compiled by different federal and state entities and outlines suggestions for regional groundwater data to be implemented with an ArcIMS interface so that these data can be shared and accessed by all Paso del Norte Watershed Council stakeholders. Part III, prepared by R. Srinivasan, presents the technical challenges posed to data sharing by multiple data collectors and sources and summarizes the different protocols available for effective transfer and sharing of data through a GIS ArcIMS interface. Part IV, prepared by Z. Sheng and D. Zhang, explores the possibility of linking the Database Project to a comprehensive development of regional hydrological models within the Rio Grande reach between Elephant Butte Dam, in New Mexico, and Fort Quitman, Texas. Finally, Part V, prepared by C. Brown, Z. Sheng, and M. Bourdon, presents closing comments as well as a summary of the recommendations made throughout the document. Dr. Hanks provided assistance in summarizing preliminary user survey results.

    Predictability analysis of the Pound's Brexit exchange rates based on Google Trends data

    During the last decade, online search traffic data have become popular for examining, analyzing, and predicting human behavior, with Google Trends being a popular tool for monitoring and analyzing users' online search patterns in several research areas, such as health, medicine, politics, economics, and finance. To explore the predictability of the Pound Sterling, we employ Google Trends data from the last 5 years (March 1st, 2015 to February 29th, 2020) and perform predictability analysis on the Pound’s exchange rates against the Euro and the Dollar. The selected period includes the 2016 UK referendum as well as the actual Brexit day (January 31st, 2020), and the analysis examines the Pound’s relationships with Google query data on Pound-related keywords and topics. A quantile dependence method, the cross-quantilogram, is employed to test for directional predictability from Google Trends data to the Pound’s exchange rates for lags from zero to 30 weeks. The results indicate that statistically significant quantile dependencies exist between Google query data and the Pound’s exchange rates, which bears on one of the main questions in this field: whether movements in one economic variable can cause reactions in other economic variables.
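The cross-quantilogram named in this abstract can be sketched as the sample cross-correlation of "quantile hits" between one series and a lagged second series. The version below is a simplified illustration under assumptions: it uses unconditional sample quantiles, one particular empirical-quantile convention, and made-up data; it omits the bootstrap inference a real analysis would need, and the function names are hypothetical.

```python
from math import ceil, sqrt

def empirical_quantile(values, alpha):
    """Smallest sample value with at least a fraction alpha of the data at or below it."""
    ordered = sorted(values)
    idx = max(0, ceil(alpha * len(ordered)) - 1)
    return ordered[idx]

def quantile_hit(value, q, alpha):
    """psi_alpha(u) = 1[u < 0] - alpha, evaluated at u = value - q."""
    return (1.0 if value < q else 0.0) - alpha

def cross_quantilogram(y1, y2, alpha1, alpha2, lag):
    """Sample cross-correlation between hits of y1[t] and hits of y2[t - lag]."""
    if len(y1) != len(y2):
        raise ValueError("series must have equal length")
    if not 0 <= lag < len(y1):
        raise ValueError("lag out of range")
    if not (0 < alpha1 < 1 and 0 < alpha2 < 1):
        raise ValueError("quantile levels must lie strictly between 0 and 1")
    q1 = empirical_quantile(y1, alpha1)
    q2 = empirical_quantile(y2, alpha2)
    h1 = [quantile_hit(v, q1, alpha1) for v in y1[lag:]]
    h2 = [quantile_hit(v, q2, alpha2) for v in y2[:len(y2) - lag]]
    num = sum(a * b for a, b in zip(h1, h2))
    den = sqrt(sum(a * a for a in h1) * sum(b * b for b in h2))
    return num / den

# Made-up series: y1 repeats y2 with a one-period delay, so lag 1 shows full dependence.
search_series = [1, 2, 3, 4] * 3
rate_series = [4, 1, 2, 3] * 3

for lag in (0, 1):
    print(lag, cross_quantilogram(rate_series, search_series, 0.5, 0.5, lag))
```

Here `rate_series` plays the role of the predicted variable and `search_series` the lagged predictor; a value near zero at a given lag and quantile pair indicates no directional dependence at that point.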

    Using newspapers for textual indicators: which and how many?

    This paper investigates the role that two key methodological choices play in the construction of textual indicators: the selection of local versus foreign newspapers and the breadth of the press coverage (i.e. the number of newspapers considered). The large literature in this field is almost silent about the robustness of research results to these two choices. We use as a case study the well-known economic policy uncertainty (EPU) index, taking as examples Latin America and Spain. First, we develop EPU measures based on press with different levels of proximity, i.e. local versus foreign, and corroborate that they deliver broadly similar narratives. Second, we examine the macroeconomic effects of EPU shocks computed using these different sources by means of a structural Bayesian vector autoregression framework and find statistically similar responses. Third, we show that constructing EPU indexes based on only one newspaper may yield biased responses. This suggests that it is important to maximize the breadth of press coverage when building text-based indicators, since this improves the credibility of results. In this regard, our first and second results are good news for researchers, given that they provide a justification for the combined use of a larger amount of data from local and foreign sources.

    A new economic policy uncertainty index for Spain

    We construct a new Economic Policy Uncertainty (EPU) index for Spain, building on the influential methodology of Baker, Bloom and Davis (2016), and compare it with the EPU index for Spain that these authors provide. We refine the index in several dimensions: we expand the coverage of leading newspapers from 2 to 7, including economic-financial ones, use a much richer set of keywords to form the search expressions, and cover a longer sample period. Two results stand out: (i) the new index presents a more consistent chronology of economic policy events; (ii) the macroeconomic effects of uncertainty shocks identified from the new index yield significant negative responses of GDP, private consumption and private investment, compared to muted responses obtained using the original one. Beyond the Spanish case, our results suggest that, in addition to the richness of the keywords in the search expressions, widening the press and time coverage is key to improving the quality of the aggregate EPU index.
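A newspaper-based EPU index in the style of Baker, Bloom and Davis (2016) can be sketched roughly as follows: flag articles containing at least one term from each of three groups (economy, policy, uncertainty), divide by the total number of articles per newspaper and month, scale each newspaper's series to unit standard deviation, average across newspapers, and normalize the result to a mean of 100. The term lists, function names and counts below are hypothetical, and the sketch standardizes over the full sample rather than a fixed reference period, which is a simplifying assumption.

```python
from statistics import mean, pstdev

# Hypothetical, heavily shortened term groups; real indexes use much richer lists.
TERMS = {
    "economy": {"economic", "economy"},
    "policy": {"government", "parliament", "regulation", "central bank", "deficit"},
    "uncertainty": {"uncertain", "uncertainty"},
}

def is_epu_article(text, terms=TERMS):
    """True if the text contains at least one term from every group (crude substring match)."""
    lowered = text.lower()
    return all(any(term in lowered for term in group) for group in terms.values())

def epu_index(counts):
    """counts maps newspaper -> list of (epu_articles, all_articles), one pair per month."""
    scaled = {}
    for paper, months in counts.items():
        share = [hits / total for hits, total in months]
        sd = pstdev(share)
        if sd == 0:
            raise ValueError(f"{paper}: share of EPU articles never varies")
        # Unit standard deviation per newspaper, so no single outlet dominates the average.
        scaled[paper] = [s / sd for s in share]
    n_months = len(next(iter(scaled.values())))
    averaged = [mean(series[m] for series in scaled.values()) for m in range(n_months)]
    base = mean(averaged)
    return [100 * v / base for v in averaged]

# Made-up monthly counts for two newspapers, for illustration only.
counts = {
    "paper_a": [(1, 100), (3, 100), (2, 100)],
    "paper_b": [(2, 200), (6, 200), (4, 200)],
}
print([round(v, 1) for v in epu_index(counts)])
```

Because each newspaper is standardized before averaging, adding more outlets changes the index only through differences in how their shares move over time, which is the channel the coverage comparison in the abstract is concerned with.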