1,467 research outputs found

    record linkage of banks and municipalities through multiple criteria and neural networks

    Record linkage aims to identify records from multiple data sources that refer to the same real-world entity. It is a well-known data quality process studied since the second half of the last century, with an established pipeline and a rich literature of case studies mainly covering census, administrative or health domains. In this paper, a method to recognize matching records from real municipalities and banks through multiple similarity criteria and a Neural Network classifier is proposed: starting from a labeled subset of the available data, first several similarity measures are combined and weighted to build a feature vector, then a Multi-Layer Perceptron (MLP) network is trained and tested to find matching pairs. For validation, seven real datasets have been used (three from banks and four from municipalities), purposely chosen in the same geographical area to increase the probability of matches. The training only involved two municipalities, while testing involved all sources (municipalities vs. municipalities, banks vs. banks and municipalities vs. banks). The proposed method scored remarkable results in terms of both precision and recall, clearly outperforming threshold-based competitors.
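The feature-vector step described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the field names (`name`, `address`, `birth_date`), the choice of similarity measures, and the sample records are hypothetical and are not taken from the paper, and the MLP classifier itself is left out.

```python
from difflib import SequenceMatcher


def token_jaccard(a, b):
    """Jaccard overlap of lowercase whitespace-separated tokens."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def edit_ratio(a, b):
    """Character-level similarity in [0, 1] computed by difflib."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def feature_vector(rec_a, rec_b):
    """One similarity score per criterion for a candidate record pair.

    The criteria below are assumed for illustration; the paper's own
    measures and weights are not reproduced here.
    """
    return [
        edit_ratio(rec_a["name"], rec_b["name"]),
        token_jaccard(rec_a["name"], rec_b["name"]),
        edit_ratio(rec_a["address"], rec_b["address"]),
        1.0 if rec_a["birth_date"] == rec_b["birth_date"] else 0.0,
    ]


# Made-up records: same person, name tokens in a different order.
a = {"name": "Maria Rossi", "address": "Via Roma 12", "birth_date": "1980-05-01"}
b = {"name": "Rossi Maria", "address": "Via Roma 12", "birth_date": "1980-05-01"}
features = feature_vector(a, b)
```

A vector like `features` would be the input to a trained classifier. Combining a token-based score with a character-based one is one way to stay robust to reordered name parts, which a single threshold on one measure handles poorly.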

    Analysis of out-of-town expenditures and tourist trips using credit card transaction data

    Credit card transaction data contains a vast amount of valuable information that can indicate consumer behaviour patterns and mark out human mobility. In this study we analyse the transactions carried out by a sample of 10,000 Istanbul-based customers of a Turkish bank to scrutinize expenditures incurred outside Istanbul. In our preliminary descriptive analysis, we examine the relation between demographic attributes and spending measures, and investigate the extent to which the population and the number of points of interest imply higher or lower credit card expenditure by visitors. We develop a methodology to extract tourist trips from consecutive credit card transactions. Subsequently, we implement a hierarchical clustering method to evaluate what the purpose of these trips might have been. Our results indicate 5 clusters of purpose: 'Leisure', 'Business', 'Acquisition', 'Visiting Friends and Relatives' and 'Package Holiday'. The same clustering method is applied to segment provinces of Turkey based on which product and service categories visitors prefer. We deploy a number of predictive models to estimate tourist expenditure and whether a person would embark on a trip in the upcoming months. The predictive power of these models is generally moderate; nevertheless, several of the most useful predictors are behavioural or are related to previous trips, factors that have not been considered in literature
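One plausible way to extract trips from consecutive transactions is sketched below. The abstract does not give the actual rule, so everything here is an assumption: a trip is taken to be a run of transactions outside the home city, closed either by a home transaction or by a gap longer than `MAX_GAP`. The threshold and the sample data are made up.

```python
from datetime import date, timedelta

HOME = "Istanbul"
MAX_GAP = timedelta(days=2)  # assumed threshold, not from the study


def extract_trips(transactions):
    """Group consecutive out-of-home transactions into trips.

    transactions: list of (date, province) tuples for one customer.
    Returns a list of trips, each a list of (date, province) tuples.
    """
    trips, current = [], []
    for day, province in sorted(transactions):
        if province == HOME:
            # A transaction at home closes any open trip.
            if current:
                trips.append(current)
                current = []
            continue
        # A long silence between away transactions also splits trips.
        if current and day - current[-1][0] > MAX_GAP:
            trips.append(current)
            current = []
        current.append((day, province))
    if current:
        trips.append(current)
    return trips


# Made-up transactions for a single hypothetical customer.
txns = [
    (date(2015, 7, 1), "Istanbul"),
    (date(2015, 7, 3), "Antalya"),
    (date(2015, 7, 4), "Antalya"),
    (date(2015, 7, 6), "Istanbul"),
    (date(2015, 8, 10), "Izmir"),
    (date(2015, 8, 20), "Izmir"),
]
trips = extract_trips(txns)
```

With this sample, the two Antalya transactions form one trip, while the two Izmir transactions are ten days apart and are split into separate trips. Per-trip features such as duration or spending by category could then feed a clustering step.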

    IMPACTS OF URBAN DEVELOPMENT PATTERN ON RUNOFF PEAK FLOWS AND STREAMFLOW FLASHINESS OF PERI-URBAN CATCHMENTS: ASSESSING THE PERFORMANCE OF PHYSICAL AND DATA-DRIVEN MODELS FOR REAL-TIME ENSEMBLE FLOOD FORECASTING

    Urban growth is a global phenomenon, and the associated impacts of land development on hydrology are expected to increase, especially in peri-urban catchments, which are newly developing catchments in proximity to growing cities. In northern climates, the hydrologic response of peri-urban catchments changes with the water budget and climatic conditions. As a result, the runoff response of northern peri-urban catchments can vary immensely across seasons. During warm seasons, evapotranspiration (ET) and infiltration rates are high, so urban floods are expected to occur during high-intensity, short-duration storm events. During cold seasons with below-freezing temperatures, surficial soils are typically frozen and nearly impervious. In addition, the ET rate is low throughout winter. Therefore, the difference in runoff response between peri-urban and natural catchments is least in winter. Furthermore, winter snow redistribution by plowing and endogenous urban heat affect snowmelt timing and frequency. Due to the limited availability of data on snow removal and redistribution activities in northern peri-urban catchments, cold-season hydrologic modeling for peri-urban catchments remains a challenging task in urban hydrology. Research on the cold-season hydrologic response of peri-urban catchments is mostly limited to Finland, Sweden, and Canada. The resulting research gap on seasonal change in hydrologic response of peri-urban catchments is common to many northern settings. In the first phase of this study, I use intensive discharge monitoring records at several peri-urban catchments near Syracuse, NY to calculate and compare seasonal runoff peak flows. The catchments are selected to provide a range of drainage area and imperviousness, to clarify the impact of urban development and catchment size on the seasonal hydrologic behavior of peri-urban catchments.
It is well understood that greater peak flows and higher stream flashiness are associated with increased surface imperviousness and storm location. However, the effect of the distribution of impervious areas on runoff peak flow response and stream flashiness of peri-urban catchments has not been well studied. In the second phase of this dissertation, I define a new geometric index, Relative Nearness of Imperviousness to the Catchment Outlet (RNICO), to correlate the imperviousness distribution of peri-urban catchments with runoff peak flows and stream flashiness. The study sites for this phase include ninety peri-urban catchments in proximity to nine large US cities: New York, NY (NYC), Syracuse, NY, Baltimore, MD, Portland, OR, Chicago, IL, Austin, TX, Houston, TX, San Francisco, CA, and Los Angeles, CA. Based on RNICO, all development patterns are divided into three classes: upstream, centralized, and downstream. Analysis results showed an obvious increase in runoff peak flows and decrease in time to peak as the centroid of imperviousness moves downstream. This indicates that RNICO is an effective tool for classifying urban development patterns and for macroscale understanding of the hydrologic behavior of small peri-urban catchments, despite the complexity of urban drainage systems. Results for the nine cities show strong positive correlations between RNICO and both runoff peak flows and the stream flashiness index for small peri-urban catchments. However, the area threshold used to distinguish small and large catchments differs slightly by location. For example, for Chicago, IL, NYC, NY, Baltimore, MD, Houston, TX, and Austin, TX, area threshold values of 55, 40, 50, 42, and 32 km², respectively, emerged; runoff peak flows in catchments with drainage areas below these values were positively correlated with RNICO.
This phase of the study suggests that RNICO is a stronger predictor of runoff peak flow and streamflow regime in humid northern and southern US study sites than in more arid western US study sites. This difference is likely due to the greater precipitation rates and greater antecedent soil moisture contents in humid climates. The extent of urban infrastructure is less likely to control the effectiveness of RNICO for predicting runoff peak flows and the R-B flashiness index for the selected study sites, due to the relatively similar urban development level within the peri-urban study catchments. Consistent forecasting of peak flows across scales in flood hydrographs remains a challenge for most hydrologic models. Urbanization increases the magnitude and frequency of peak flows, often challenging the forecasting ability of real-time flood prediction. Following advances in satellite and ground-based meteorological observations, global and continental real-time ensemble flood forecasting systems use a variety of physical hydrology models to predict urban peak flows. Artificial intelligence (AI) models provide an alternative approach to physical hydrology models for real-time flood forecasting. Despite recent advances in AI techniques for hydrologic prediction, ensemble streamflow prediction by these methods has been limited. In addition, application of AI models for flood forecasting has been limited to large river basins, with very limited research on the use of AI models for small peri-urban catchments. Flood forecasting in small urban catchments can be a critical task for urban safety due to the short time of concentration and quick precipitation-runoff response. AI flood forecasting models typically apply upstream streamflow measurements to forecast downstream flood discharge. Therefore, the storm direction may change the flood travel time and time to peak, which challenges accurate flood forecasting.
For example, if the storm moves in the upstream direction, an AI model trained on the upstream gage data may fail to accurately predict peak flow magnitude and timing at the outlet, because the downstream gage responds more quickly than the upstream station. There has been very limited focus on the impact of storm direction on the peak flow response of urban catchments, and the available literature is limited to lab-scale prototypes and rainfall simulators. These may not fully represent real-world flooding scenarios. Therefore, the impact of storm direction on flood forecasting performance in peri-urban catchments is another important research gap in real-time urban flood forecasting. In the third phase of my dissertation project, I initially assess the impact of storm direction on the flood forecasting performance of an Adaptive Neuro Fuzzy Inference System (ANFIS) at a peri-urban catchment in proximity to Syracuse, NY. Next, I compare the relative utility of physical hydrology and AI approaches for predicting flood hydrographs in peri-urban catchments. For this comparison, I selected ANFIS and the Sacramento Soil Moisture Accounting Model (SAC-SMA) for real-time ensemble re-forecasting of streamflow in several small to medium size suburban catchments near NYC for Hurricane Irene and a smaller storm event. The SAC-SMA model is a physical hydrology model that was initially developed by Burnash et al. (1973). The National Oceanic and Atmospheric Administration (NOAA) selected the SAC-SMA lumped model as a comparison baseline for participating distributed hydrologic models in the Distributed Model Intercomparison Project (DMIP), which aimed to identify the most suitable model for National Weather Service (NWS) streamflow prediction across the US (http://www.nws.noaa.gov/ohd/hrl/dmip/). More importantly, the NWS is currently using the lumped form of SAC-SMA for ensemble flood forecasting across the US (Emerton et al., 2016).
For these reasons, I chose to employ a lumped version of SAC-SMA in my dissertation project. SAC-SMA performed well for both large and small events and for lead times of three to 24 hours, but ANFIS predicted the Hurricane Irene flood discharge well only for short lead times in small study catchments. ANFIS had reasonable percent bias (PBIAS) for predicting the small storm event for all lead times, indicating the utility of ANFIS for small events. In addition, the accuracy of both SAC-SMA and ANFIS models for ensemble flood prediction did not change significantly with catchment size and imperviousness. Overall, results of the third phase of this study suggest that the lumped SAC-SMA model may be a reliable option for local urban flood forecasting for evacuation plan lead time up to 24 hours. Due to the uncertainties in future climatic conditions, my study emphasizes the importance of using physical hydrology models for real-time flood forecasting of large events in small urban catchments. This recommendation is based on the finding that the performance of data-driven models may greatly decrease with the storm scale if the training period includes storms of magnitude less than storms in the validation period
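Two of the metrics named in this abstract, percent bias (PBIAS) and the R-B (Richards-Baker) flashiness index, can be sketched as below. The abstract does not state the exact formulas used, so these are the commonly cited definitions and should be read as assumptions. In particular, the sign convention for PBIAS varies between references; here a positive value means the model underestimates. The flow values are made up.

```python
def percent_bias(observed, simulated):
    """PBIAS = 100 * sum(obs - sim) / sum(obs).

    Assumed convention: positive values indicate underestimation.
    """
    total_obs = sum(observed)
    if total_obs == 0:
        raise ValueError("observed flows sum to zero")
    return 100.0 * sum(o - s for o, s in zip(observed, simulated)) / total_obs


def rb_flashiness(flows):
    """Richards-Baker index: sum of absolute step changes over total flow."""
    total = sum(flows)
    if total == 0:
        raise ValueError("flows sum to zero")
    return sum(abs(b - a) for a, b in zip(flows, flows[1:])) / total


# Made-up discharge series (e.g. m^3/s at equal time steps).
obs = [2.0, 10.0, 6.0, 2.0]
sim = [2.0, 8.0, 6.0, 3.0]
pbias = percent_bias(obs, sim)    # 5.0: simulated volume is 5% low
flashiness = rb_flashiness(obs)   # 0.8
```

A PBIAS near zero indicates little volume error over the event, while a higher flashiness index indicates a stream whose flow changes more abruptly between time steps.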

    Using Flickr to identify and connect tourism Points of Interest: The case of Lisbon, Porto and Faro

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics. Understanding the movement of tourists helps not only to manage cities but also to enhance their most attractive places. The growing number of people on social media gives us greater access to information about user preferences, reviews, and shared moments. This information can be used to study tourist activity. Here, geo-tagged photographs from the social media platform Flickr are used to identify the locations of tourists' Points of Interest in Lisbon, Porto and Faro and to quantify their relationship from users' co-occurrence in the identified points. The results show that, using standard clustering methods, it is possible to identify likely candidate Points of Interest. The association of the Points of Interest from users' social media activity (i.e., posting of photos) results in a non-trivial network that breaks geographical proximity. It was found that, in all the cities under study, historical places (such as churches and cathedrals), viewpoints and beaches are captured
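The idea of linking Points of Interest through user co-occurrence can be sketched as below. This is a minimal illustration, assuming each user has already been assigned to the POIs they photographed; the user identifiers, POI names and the plain pair-count weighting are hypothetical and not taken from the dissertation.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(user_pois):
    """Count, for each pair of POIs, how many users photographed both.

    user_pois: dict mapping a user id to the list of POIs they visited.
    Returns a Counter keyed by alphabetically ordered POI pairs.
    """
    edges = Counter()
    for pois in user_pois.values():
        # Deduplicate so repeat photos at one POI count once per user.
        for pair in combinations(sorted(set(pois)), 2):
            edges[pair] += 1
    return edges


# Made-up assignments of users to POIs.
visits = {
    "user_a": ["Belem Tower", "Alfama", "Belem Tower"],
    "user_b": ["Alfama", "Belem Tower"],
    "user_c": ["Alfama"],
}
edges = cooccurrence_edges(visits)
```

Each entry in `edges` is a weighted edge of the POI network. Because the weights come from shared visitors rather than distance, two POIs far apart can be strongly connected, which matches the abstract's remark that the network breaks geographical proximity.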

    Sustainable Development of Real Estate

    This monograph reviews in detail the research, theoretical and practical tasks of the sustainable real estate development process and presents particular examples. It discusses the concept of the modern real estate development model and of the developer, and analyzes the peculiarities of developing the built environment and real estate objects, as well as assessment methods, models, and the management of real estate and investments aimed at increasing object value. The theoretical and practical analyses presented in the monograph show that intelligent and augmented reality technologies allow business managers to achieve better work quality and to organize a creative team of developers that delivers higher-quality products for society. The edition presents knowledge on the economic, legal, technological, technical, organizational, social, cultural, ethical, psychological, environmental and management aspects that are important for the development of real estate: publicly accepted sustainable development principles, urban development and aesthetic values, territory planning, participation of society and heritage protection. It is acknowledged that economic crises are inevitable, and the methods provided should help to reduce possible losses. The monograph includes references to the most recent international scientific literature. It is intended for researchers and for MSc and PhD students of construction economics and real estate development. The book may also be useful for other researchers, for MSc and PhD students of economics, management and other specialities, and for real estate business specialists. The publication of the monograph was funded by the European Social Fund under project No. VP1-2.2-ŠMM-07-K-02-060, Development and Implementation of Joint Master’s Study Programme “Sustainable Development of the Built Environment”.

    Proceedings of the 11th Toulon-Verona International Conference on Quality in Services

    The Toulon-Verona Conference was founded in 1998 by Prof. Claudio Baccarani of the University of Verona, Italy, and Prof. Michel Weill of the University of Toulon, France. It has been organized each year in a different place in Europe in cooperation with a host university (Toulon 1998, Verona 1999, Derby 2000, Mons 2001, Lisbon 2002, Oviedo 2003, Toulon 2004, Palermo 2005, Paisley 2006, Thessaloniki 2007, Florence 2008). Originally focusing on higher education institutions, the research themes have over the years been extended to the health sector, local government, tourism, logistics, and banking services. Around a hundred delegates from about twenty different countries participate each year, and nearly one thousand research papers have been published over the last ten years, making the conference one of the major events in the field of quality in services.

    Comparative Analysis of Student Learning: Technical, Methodological and Result Assessing of PISA-OECD and INVALSI-Italian Systems

    PISA is the most extensive international survey promoted by the OECD in the field of education, which measures the skills of fifteen-year-old students from more than 80 participating countries every three years. INVALSI are written tests carried out every year by all Italian students in some key moments of the school cycle, to evaluate the levels of some fundamental skills in Italian, Mathematics and English. Our comparison is made up to 2018, the last year of the PISA-OECD survey, even if INVALSI was carried out for the last edition in 2022. Our analysis focuses attention on the common part of the reference populations, which are the 15-year-old students of the 2nd class of secondary schools of II degree, where both sources give a similar picture of the students