4,604 research outputs found

    A STUDY ON LITERACY AND USAGE BEHAVIOUR OF CREDIT CARDS USERS IN INDIA

    Get PDF
    Purpose of the study: This study aims to find credit card literacy (henceforward CCL) and credit card usage behavior (henceforward CCUB) in India. Methodology: A survey was conducted on 400 respondents who were using a credit card in India. The questionnaire used for collecting data consisted of three sections; demographic information, CCL, and CCUB. To check the CCL, the customers were asked to rate their awareness of the terms and conditions of the credit card providers, while CCUB was measured using five questions. Main findings: CCL is found to be 34% and the results of logistic regression show that CCL and demographic factors influence the CCUB. Implications of this study: An understanding of the CCUB will be helpful in controlling excessive debt and high-interest payments. Novelty/Originality of this study: This paper gives a unique insight into CCL and CCUB in India

    Data Science for Finance: Targeted Learning from (Big) Data to Economic Stability and Financial Risk Management

    Get PDF
    A thesis submitted in partial fulfillment of the requirements for the degree of Doctor in Information Management, specialization in Statistics and EconometricsThe modelling, measurement, and management of systemic financial stability remains a critical issue in most countries. Policymakers, regulators, and managers depend on complex models for financial stability and risk management. The models are compelled to be robust, realistic, and consistent with all relevant available data. This requires great data disclosure, which is deemed to have the highest quality standards. However, stressed situations, financial crises, and pandemics are the source of many new risks with new requirements such as new data sources and different models. This dissertation aims to show the data quality challenges of high-risk situations such as pandemics or economic crisis and it try to theorize the new machine learning models for predictive and longitudes time series models. In the first study (Chapter Two) we analyzed and compared the quality of official datasets available for COVID-19 as a best practice for a recent high-risk situation with dramatic effects on financial stability. We used comparative statistical analysis to evaluate the accuracy of data collection by a national (Chinese Center for Disease Control and Prevention) and two international (World Health Organization; European Centre for Disease Prevention and Control) organizations based on the value of systematic measurement errors. We combined excel files, text mining techniques, and manual data entries to extract the COVID-19 data from official reports and to generate an accurate profile for comparisons. The findings show noticeable and increasing measurement errors in the three datasets as the pandemic outbreak expanded and more countries contributed data for the official repositories, raising data comparability concerns and pointing to the need for better coordination and harmonized statistical methods. The study offers a COVID-19 combined dataset and dashboard with minimum systematic measurement errors and valuable insights into the potential problems in using databanks without carefully examining the metadata and additional documentation that describe the overall context of data. In the second study (Chapter Three) we discussed credit risk as the most significant source of risk in banking as one of the most important sectors of financial institutions. We proposed a new machine learning approach for online credit scoring which is enough conservative and robust for unstable and high-risk situations. This Chapter is aimed at the case of credit scoring in risk management and presents a novel method to be used for the default prediction of high-risk branches or customers. This study uses the Kruskal-Wallis non-parametric statistic to form a conservative credit-scoring model and to study its impact on modeling performance on the benefit of the credit provider. The findings show that the new credit scoring methodology represents a reasonable coefficient of determination and a very low false-negative rate. It is computationally less expensive with high accuracy with around 18% improvement in Recall/Sensitivity. Because of the recent perspective of continued credit/behavior scoring, our study suggests using this credit score for non-traditional data sources for online loan providers to allow them to study and reveal changes in client behavior over time and choose the reliable unbanked customers, based on their application data. This is the first study that develops an online non-parametric credit scoring system, which can reselect effective features automatically for continued credit evaluation and weigh them out by their level of contribution with a good diagnostic ability. In the third study (Chapter Four) we focus on the financial stability challenges faced by insurance companies and pension schemes when managing systematic (undiversifiable) mortality and longevity risk. For this purpose, we first developed a new ensemble learning strategy for panel time-series forecasting and studied its applications to tracking respiratory disease excess mortality during the COVID-19 pandemic. The layered learning approach is a solution related to ensemble learning to address a given predictive task by different predictive models when direct mapping from inputs to outputs is not accurate. We adopt a layered learning approach to an ensemble learning strategy to solve the predictive tasks with improved predictive performance and take advantage of multiple learning processes into an ensemble model. In this proposed strategy, the appropriate holdout for each model is specified individually. Additionally, the models in the ensemble are selected by a proposed selection approach to be combined dynamically based on their predictive performance. It provides a high-performance ensemble model to automatically cope with the different kinds of time series for each panel member. For the experimental section, we studied more than twelve thousand observations in a portfolio of 61-time series (countries) of reported respiratory disease deaths with monthly sampling frequency to show the amount of improvement in predictive performance. We then compare each country’s forecasts of respiratory disease deaths generated by our model with the corresponding COVID-19 deaths in 2020. The results of this large set of experiments show that the accuracy of the ensemble model is improved noticeably by using different holdouts for different contributed time series methods based on the proposed model selection method. These improved time series models provide us proper forecasting of respiratory disease deaths for each country, exhibiting high correlation (0.94) with Covid-19 deaths in 2020. In the fourth study (Chapter Five) we used the new ensemble learning approach for time series modeling, discussed in the previous Chapter, accompany by K-means clustering for forecasting life tables in COVID-19 times. Stochastic mortality modeling plays a critical role in public pension design, population and public health projections, and in the design, pricing, and risk management of life insurance contracts and longevity-linked securities. There is no general method to forecast the mortality rate applicable to all situations especially for unusual years such as the COVID-19 pandemic. In this Chapter, we investigate the feasibility of using an ensemble of traditional and machine learning time series methods to empower forecasts of age-specific mortality rates for groups of countries that share common longevity trends. We use Generalized Age-Period-Cohort stochastic mortality models to capture age and period effects, apply K-means clustering to time series to group countries following common longevity trends, and use ensemble learning to forecast life expectancy and annuity prices by age and sex. To calibrate models, we use data for 14 European countries from 1960 to 2018. The results show that the ensemble method presents the best robust results overall with minimum RMSE in the presence of structural changes in the shape of time series at the time of COVID-19. In this dissertation’s conclusions (Chapter Six), we provide more detailed insights about the overall contributions of this dissertation on the financial stability and risk management by data science, opportunities, limitations, and avenues for future research about the application of data science in finance and economy

    From Social Data Mining to Forecasting Socio-Economic Crisis

    Full text link
    Socio-economic data mining has a great potential in terms of gaining a better understanding of problems that our economy and society are facing, such as financial instability, shortages of resources, or conflicts. Without large-scale data mining, progress in these areas seems hard or impossible. Therefore, a suitable, distributed data mining infrastructure and research centers should be built in Europe. It also appears appropriate to build a network of Crisis Observatories. They can be imagined as laboratories devoted to the gathering and processing of enormous volumes of data on both natural systems such as the Earth and its ecosystem, as well as on human techno-socio-economic systems, so as to gain early warnings of impending events. Reality mining provides the chance to adapt more quickly and more accurately to changing situations. Further opportunities arise by individually customized services, which however should be provided in a privacy-respecting way. This requires the development of novel ICT (such as a self- organizing Web), but most likely new legal regulations and suitable institutions as well. As long as such regulations are lacking on a world-wide scale, it is in the public interest that scientists explore what can be done with the huge data available. Big data do have the potential to change or even threaten democratic societies. The same applies to sudden and large-scale failures of ICT systems. Therefore, dealing with data must be done with a large degree of responsibility and care. Self-interests of individuals, companies or institutions have limits, where the public interest is affected, and public interest is not a sufficient justification to violate human rights of individuals. Privacy is a high good, as confidentiality is, and damaging it would have serious side effects for society.Comment: 65 pages, 1 figure, Visioneer White Paper, see http://www.visioneer.ethz.c

    Advances in Data Mining Knowledge Discovery and Applications

    Get PDF
    Advances in Data Mining Knowledge Discovery and Applications aims to help data miners, researchers, scholars, and PhD students who wish to apply data mining techniques. The primary contribution of this book is highlighting frontier fields and implementations of the knowledge discovery and data mining. It seems to be same things are repeated again. But in general, same approach and techniques may help us in different fields and expertise areas. This book presents knowledge discovery and data mining applications in two different sections. As known that, data mining covers areas of statistics, machine learning, data management and databases, pattern recognition, artificial intelligence, and other areas. In this book, most of the areas are covered with different data mining applications. The eighteen chapters have been classified in two parts: Knowledge Discovery and Data Mining Applications

    International Conference on Computer Science and Communication Engineering

    Get PDF
    UBT Annual International Conference is the 8th international interdisciplinary peer reviewed conference which publishes works of the scientists as well as practitioners in the area where UBT is active in Education, Research and Development. The UBT aims to implement an integrated strategy to establish itself as an internationally competitive, research-intensive university, committed to the transfer of knowledge and the provision of a world-class education to the most talented students from all background. The main perspective of the conference is to connect the scientists and practitioners from different disciplines in the same place and make them be aware of the recent advancements in different research fields, and provide them with a unique forum to share their experiences. It is also the place to support the new academic staff for doing research and publish their work in international standard level. This conference consists of sub conferences in different fields like: – Computer Science and Communication Engineering– Management, Business and Economics– Mechatronics, System Engineering and Robotics– Energy Efficiency Engineering– Information Systems and Security– Architecture – Spatial Planning– Civil Engineering , Infrastructure and Environment– Law– Political Science– Journalism , Media and Communication– Food Science and Technology– Pharmaceutical and Natural Sciences– Design– Psychology– Education and Development– Fashion– Music– Art and Digital Media– Dentistry– Applied Medicine– Nursing This conference is the major scientific event of the UBT. It is organizing annually and always in cooperation with the partner universities from the region and Europe. We have to thank all Authors, partners, sponsors and also the conference organizing team making this event a real international scientific event. Edmond Hajrizi, President of UBTUBT – Higher Education Institutio

    Exploration space of human-data interaction

    Get PDF
    Data is everywhere. Starting with the invention of writing, representation artifacts brought the data to observable state which led to natural establishment of an interaction form between human and data. In the human-data interaction (HDI) environment, data representations and analytic systems act as an intermediary role. I suggest a new de nition for HDI in which this interaction is conceptualized as a communication model over a set of media. The interaction occurs with the exchange of messages originated from both human and data. Timing and content of the messages are employed to facilitate objective evaluation of properties of analytic system in question. To systematically investigate the complex nature of HDI, my methodology postulates the phenomenon as a high-dimensional space in which data analytic systems could be positioned based on their properties. Evaluation of the properties are performed based on solid de nitions of the dimensions. I de ne ve properties for data analytic systems, namely, responsiveness, communication media level, unit task diversity, closeness factor, and progressiveness level, and demonstrate how these properties could be objectively calculated. I visually explore the HDI space in which data analytic systems reported in my thesis are plotted on a two-dimensional Cartesian system whose axes are responsiveness and communication media level. Visually identi able patterns in this plot, which I call realms, are characterized by quantitative and qualitative analysis of objective, behavioral, and subjective data collected during the user interaction with the corresponding analytic system

    Open the Jail Cell Doors, HAL: A Guarded Embrace of Pretrial Risk Assessment Instruments

    Get PDF
    In recent years, criminal justice reformers have focused their attention on pretrial detention as a uniquely solvable contributor to the horrors of modern mass incarceration. While reform of bail practices can take many forms, one of the most pioneering and controversial techniques is the adoption of actuarial models to inform pretrial decision-making. These models are designed to supplement or replace the unpredictable and discriminatory status quo of judicial discretion at arraignment. This Note argues that policymakers should experiment with risk assessment instruments as a component of their bail reform efforts, but only if appropriate safeguards are in place. Concerns for protecting individual constitutional rights, mitigating racial disparities, and avoiding the drawbacks of machine learning are the key challenges facing reformers and jurisdictions adopting pretrial risk assessment instruments. Absent proper precautions, risk assessment instruments can reinforce, rather than alleviate, modern criminal justice disparities. Drawing from a case study of New Jersey’s recent bail reform program, this Note examines the efficacy, impact, and pitfalls of risk assessment instrument adoption. Finally, this Note offers a broad framework for policymakers seeking to thoughtfully experiment with risk assessment instruments in their own jurisdictions

    Determinants of online leisure travel planning decision processes :a segmented approach

    Get PDF
    D.B.A. ThesisThere is an abundance of information sources on the Internet that consumers use to plan and book their travel. This information reflects the fact that travel comprises a significant part of the business conducted through the web. Consumers are sometimes faced with a complex task of making purchasing decisions in the dynamic and fast-paced medium of the Internet. In spite of the importance of travel and the intricacies of the decision process, an integrated framework that identifies the various determinants of the online leisure travel planning decision process and how they interact, is largely absent in travel literature. This study aims to make a contribution by extracting from relevant literature useful elements that could comprise such a framework. It also uses several phases of qualitative research to refine the framework, and then a quantitative assessment of data collected from an online questionnaire completed by 1,198 respondents to test specific components of the framework that deal with online travel booking intention. In the final model building stage, three logistic regression models were compared. The first is a parsimonious one containing key determinants that lead to online travel booking intention. These determinants emerged from theoretical frameworks of the theory of reasoned action and innovation adoption theory. The second Model used strictly involvement, motivation, and knowledge variables that are thought to influence online booking intention. The third Model included a combination of relevant predictor variables from the other two Models. The relationship between various demographics and online travel booking intention was investigated yielding some interesting insights. Consequently, this study recommends these demographic variables be considered in segmenting travelers to find those more likely to book online. The determinants of online leisure travel booking decision processes could be used in conjunction with demographic variables to more accurately predict leisure travel website usage

    Feature selection strategies for improving data-driven decision support in bank telemarketing

    Get PDF
    The usage of data mining techniques to unveil previously undiscovered knowledge has been applied in past years to a wide number of domains, including banking and marketing. Raw data is the basic ingredient for successfully detecting interesting patterns. A key aspect of raw data manipulation is feature engineering and it is related with the correct characterization or selection of relevant features (or variables) that conceal relations with the target goal. This study is particularly focused on feature engineering, aiming at the unfolding features that best characterize the problem of selling long-term bank deposits through telemarketing campaigns. For the experimental setup, a case-study from a Portuguese bank, ranging the 2008-2013 year period and encompassing the recent global financial crisis, was addressed. To assess the relevance of such problem, a novel literature analysis using text mining and the latent Dirichlet allocation algorithm was conducted, confirming the existence of a research gap for bank telemarketing. Starting from a dataset containing typical telemarketing contacts and client information, research followed three different and complementary strategies: first, by enriching the dataset with social and economic context features; then, by including customer lifetime value related features; finally, by applying a divide and conquer strategy for splitting the problem in smaller fractions, leading to optimized sub-problems. Each of the three approaches improved previous results in terms of model metrics related to prediction performance. The relevance of the proposed features was evaluated, confirming the obtained models as credible and valuable for telemarketing campaign managers.A utilização de técnicas de data mining para a descoberta de conhecimento tem sido aplicada nos últimos anos a uma grande variedade de domínios, incluindo banca e marketing. Os dados no seu estado primitivo constituem o ingrediente básico para a deteção de padrões de informação. Um aspeto chave da manipulação de dados em bruto consiste na "engenharia de atributos", que compreende uma correta definição e seleção de atributos relevantes (ou variáveis) que se relacionem com o alvo da descoberta de conhecimento. Este trabalho foca-se numa abordagem de "engenharia de atributos" para definir as variáveis que melhor caraterizam o problema de vender depósitos bancários a prazo através de campanhas de telemarketing. Sendo um estudo empírico, foi utilizado um caso de estudo de um banco português, abrangendo o período 2008-2013, que inclui os efeitos da crise financeira internacional. Para aferir da importância deste problema, foi realizada uma inovadora análise da literatura recorrendo a text mining e ao algoritmo latent Dirichlet allocation, confirmando a existência de uma lacuna nesta matéria. Utilizando como base um conjunto de dados de contactos de telemarketing e informação sobre os clientes, três estratégias diferentes e complementares foram propostas: primeiro, os dados foram enriquecidos com atributos socioeconómicos; posteriormente, foram adicionadas características associadas ao valor do cliente ao longo do seu tempo de vida; finalmente, o problema foi dividido em problemas mais específicos, permitindo abordagens otimizadas a cada subproblema. Cada abordagem melhorou as métricas associadas à capacidade preditiva do modelo. Adicionalmente, a relevância dos atributos foi avaliada, confirmando os modelos obtidos como credíveis e valiosos para gestores de campanhas de telemarketing
    • …
    corecore