
    Essays on data augmentation: the value of additional information


    The impact of geographical factors on churn prediction: an application to an insurance company in Madrid's urban area

    Geography has previously been noted as a decisive factor in business literature. This paper provides evidence of the significant role geography plays in customer lapse behaviour in an urban environment. This novel approach is based on the idea that the customers who cancel all policies and leave the company are not randomly distributed; rather, mimetic behaviour among nearby individuals is observed. The physical proximity of the customer to a geographical focus (a strategic centre, such as an insurance office) and the interaction with nearby customers are spatial factors that increase (or decrease) the probability of churning. An empirical analysis using more than 7000 spatially georeferenced offline customers of a Spanish insurance company in the urban area of Madrid (Spain) demonstrated that the customer's proximity to the offices of the insurance company under study decreases the probability of churning, whereas high lapse risk was detected in customers in the surroundings of the company's competitor branches. In addition, we identified spatial autocorrelation in churn probability, thus demonstrating that the probability of churn of a customer increases if nearby customers churn.
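The spatial autocorrelation mentioned in this abstract can be illustrated with global Moran's I, a standard statistic for this purpose. The abstract does not say which statistic the authors used, so this is only a minimal sketch: the function name `morans_i`, the binary distance-band weights, and the toy coordinates are all assumptions made for illustration.

```python
from math import hypot


def morans_i(points, values, radius):
    """Global Moran's I with binary distance-band weights.

    points: list of (x, y) customer coordinates (hypothetical planar units)
    values: one number per customer, e.g. 1 = churned, 0 = stayed
    radius: two distinct customers are neighbours if they are at most this far apart
    """
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]

    cross = 0.0   # sum of w_ij * dev_i * dev_j over neighbouring pairs
    w_sum = 0.0   # sum of all weights w_ij
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            dist = hypot(points[i][0] - points[j][0], points[i][1] - points[j][1])
            if dist <= radius:
                cross += dev[i] * dev[j]
                w_sum += 1.0

    variance_sum = sum(d * d for d in dev)
    if w_sum == 0 or variance_sum == 0:
        raise ValueError("Moran's I is undefined: no neighbours or no variation")
    return (n / w_sum) * (cross / variance_sum)


# Toy data: three churners clustered together, three loyal customers far away.
clustered_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clustered_churn = [1, 1, 1, 0, 0, 0]
clustered_i = morans_i(clustered_points, clustered_churn, radius=2)
```

Values well above zero indicate that churners tend to sit near other churners, which is the pattern the abstract reports; values near zero would indicate no spatial structure.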

    Data analytics 2016: proceedings of the fifth international conference on data analytics


    Load forecast on a Micro Grid level through Machine Learning algorithms

    Micro Grids (MGs) constitute a growing sector of the energy industry, representing a paradigm shift from remote central power generation plants to more localized, distributed generation. The capacity to isolate from the main electric grid and operate independently makes Micro Grids resilient systems, capable of conducting flexible operations while providing services that make the network more competitive. Additionally, Micro Grids supply clean, efficient, low-cost energy, enhance the coordination of flexible assets, and improve the operation and stability of the local electric grid through their capability to respond dynamically to energy resources. This requires intelligent coordination that balances all the available technologies. Hence the need to integrate accurate and robust load and production forecasting models into the MG management platform, allowing more precise coordination of flexible resources according to emerging demand. For these reasons, the HALOFMI methodology was developed, which focuses on the creation of a precise 24-hour load forecast model. In a first phase, the methodology uses a hybrid multi-level approach for the creation and selection of features, which are then fed to a neural network (Multi-Layer Perceptron) with hyper-parameter tuning. In a second phase, two modes of data operation are compared and assessed, which shows that the network can operate with a reduced number of training days without compromising the model's performance; this is attained by applying a sliding window. The methodology is applied in two case studies, both with 15-minute timesteps. The first is composed of aggregated load profiles of Standard Low Voltage clients, including production and self-consumption units (UPAC), and presents regular and very smooth load profile curves. The second concerns a touristic island and represents an irregular load curve with high granularity and abrupt variations that are difficult to predict, posing a greater challenge for 24-hour forecasting. From the results obtained, the work evaluates the impact of integrating a recursive intelligent feature selection routine, then assesses the sliding window application, and finally compares the errors of different estimators for the model through several well-defined performance metrics.
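The sliding-window idea in this abstract (training on a reduced number of recent days at 15-minute resolution to forecast the next 24 hours) might be sketched as follows. This is not the HALOFMI implementation: the helper names `make_samples` and `sliding_training_window`, and the choice of one day of lagged readings as input features, are assumptions made for illustration.

```python
STEPS_PER_DAY = 96  # 24 hours at 15-minute timesteps


def make_samples(load, n_lags=STEPS_PER_DAY, horizon=STEPS_PER_DAY):
    """Pair each run of n_lags past readings with the next `horizon` readings.

    Returns a list of (features, target) tuples, one per forecast origin,
    ordered from oldest to newest.
    """
    samples = []
    for t in range(n_lags, len(load) - horizon + 1):
        features = load[t - n_lags:t]
        target = load[t:t + horizon]
        samples.append((features, target))
    return samples


def sliding_training_window(samples, train_days):
    """Keep only the most recent train_days * STEPS_PER_DAY samples."""
    if train_days <= 0:
        raise ValueError("train_days must be positive")
    return samples[-train_days * STEPS_PER_DAY:]


# Toy series: ten days of synthetic readings (just the time index itself).
load_series = list(range(10 * STEPS_PER_DAY))
all_samples = make_samples(load_series)
recent_samples = sliding_training_window(all_samples, train_days=3)
```

The retained samples would then be passed to whichever estimator is being compared (the abstract names a Multi-Layer Perceptron); sliding the window forward as new readings arrive keeps the training set small and recent.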

    A COMPREHENSIVE GEOSPATIAL KNOWLEDGE DISCOVERY FRAMEWORK FOR SPATIAL ASSOCIATION RULE MINING

    Continuous advances in modern data collection techniques help spatial scientists gain access to massive and high-resolution spatial and spatio-temporal data. Thus there is an urgent need to develop effective and efficient methods seeking to find unknown and useful information embedded in datasets of unprecedentedly large size (e.g., millions of observations), high dimensionality (e.g., hundreds of variables), and complexity (e.g., heterogeneous data sources, space–time dynamics, multivariate connections, explicit and implicit spatial relations and interactions). Responding to this line of development, this research focuses on the utilization of the association rule (AR) mining technique for a geospatial knowledge discovery process. Prior attempts have sidestepped the complexity of the spatial dependence structure embedded in the studied phenomenon. Thus, adopting association rule mining in spatial analysis is rather problematic. Interestingly, a very similar predicament afflicts spatial regression analysis with a spatial weight matrix that would be assigned a priori, without validation on the specific domain of application. Besides, a dependable geospatial knowledge discovery process necessitates algorithms supporting automatic, robust, and accurate procedures for the evaluation of mined results. Surprisingly, this has received little attention in the context of spatial association rule mining. To remedy the deficiencies mentioned above, the foremost goal of this research is to construct a comprehensive geospatial knowledge discovery framework using spatial association rule mining for the detection of spatial patterns embedded in geospatial databases and to demonstrate its application within the domain of crime analysis. It is the first attempt at delivering a complete geospatial knowledge discovery framework using spatial association rule mining.
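For readers unfamiliar with association rule mining, the standard interestingness measures used to evaluate a mined rule (support, confidence, lift) can be computed as in the sketch below. This illustrates the general technique only, not the framework described in the abstract; the function name `rule_metrics` and the spatial predicates in the toy data (`near_bar`, `assault`, `near_school`) are hypothetical.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    transactions: iterable of item collections, e.g. the set of spatial
                  predicates that hold at one location or incident
    """
    a, c = set(antecedent), set(consequent)
    rows = [set(t) for t in transactions]
    n = len(rows)
    if n == 0:
        raise ValueError("no transactions")

    n_a = sum(1 for t in rows if a <= t)          # rows containing the antecedent
    n_c = sum(1 for t in rows if c <= t)          # rows containing the consequent
    n_ac = sum(1 for t in rows if (a | c) <= t)   # rows containing both

    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Toy data: each set lists predicates observed for one hypothetical location.
toy_rows = [
    {"near_bar", "assault"},
    {"near_bar", "assault"},
    {"near_bar"},
    {"near_school"},
]
toy_support, toy_confidence, toy_lift = rule_metrics(toy_rows, {"near_bar"}, {"assault"})
```

A lift above 1 suggests the antecedent and consequent co-occur more often than independence would predict. These measures ignore spatial dependence between transactions, which is the limitation the abstract raises.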