
92 research outputs found

Visa trial of international trade: evidence from support vector machines and neural networks

Author: Akman Engin
Karaman Abdullah
Kuzey Cemil
Publication venue: Murray State's Digital Commons
Publication date: 31/01/2020

International trade depends on networking, interaction, and in-person meetings, which stimulate cross-border travel. Countries seek policies that encourage inbound mobility to support bilateral trade, tourism, and foreign direct investment. Some nations have implemented liberal visa regimes as an important part of facilitating policies in view of security concerns. Turkey has been among the nations introducing liberal visa policies to support trade in the last decade and has recorded significant increases in export volumes. In this paper, we employ two machine learning methodologies, support vector machines (SVM) and neural networks (NN), to investigate the facilitating impact of liberal visa policies on bilateral trade, using export data from Turkey for the period 2000–2014. The research identifies the variables that have the strongest impact on trade using the SVM and NN models and shows that visa policies have a significant impact on bilateral trade. More relaxed visa policies are recommended for countries seeking to increase exports.

Murray State University

A Machine Learning Approach to Revenue Generation within the Professional Hair Care Industry

Author: Eliasen Linda
Sepenu Alexander K
Publication venue: SMU Scholar
Publication date: 02/06/2022

The cosmetic and beauty industry continues to grow and evolve to satisfy its patrons. In the United States, the industry is heavily science-driven, innovative, and fast-paced, suggesting that to remain productive and profitable, companies must seek smart alternatives to their current modus operandi or risk losing out on this multi-billion-dollar industry to fierce competition. In this paper, the authors use machine learning models such as clustering and regression to improve the efficiency of current sales and customer segmentation models and to help HairCo (pseudonym for confidentiality), a professional hair products manufacturer, strategize its marketing and sales efforts for revenue growth. The present challenge facing HairCo is the lack of models that learn from aggregated data centered on the buying behavior, demographics, and other publicly available data of end consumers tied to the historical sales data of its customers, i.e., salons and stylists. The proposed clustering and regression models achieved notably improved results using the aggregated data in comparison to models using only internal company-provided data. Recommendations are presented on which features from both models are most important for improving customer profiling and sales prediction. With these results, HairCo can increase its revenue and expand its market share.

Southern Methodist University

SMU Digital Repository

Analyzing Granger causality in climate data with time series classification methods

Author: Decubber Stijn
Demuzere Matthias
Miralles Diego
Papagiannopoulou Christina
Verhoest Niko
Waegeman Willem
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2017

Attribution studies in climate science aim to scientifically ascertain the influence of climatic variations on natural or anthropogenic factors. Many of those studies adopt the concept of Granger causality to infer statistical cause-effect relationships while using traditional autoregressive models. In this article, we investigate the potential of state-of-the-art time series classification techniques to enhance causal inference in climate science. We conduct a comparative experimental study of different types of algorithms on a large test suite that comprises a unique collection of datasets from the area of climate-vegetation dynamics. The results indicate that specialized time series classification methods are able to improve existing inference procedures. Substantial differences are observed among the methods that were tested.
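
For readers unfamiliar with the concept, the following is a minimal sketch of the traditional autoregressive Granger test that the abstract mentions as the common baseline. It is not the time series classification approach the authors evaluate. The function names, the synthetic data, and the choice of one lag are illustrative assumptions. The idea is to compare a model that predicts `y` from its own past with a model that also sees the past of `x`: if adding `x` reduces the residual error markedly, `x` is said to Granger-cause `y`.

```python
import numpy as np

def lag_matrix(series, n_lags):
    """Columns hold the values at t-1, ..., t-n_lags for t = n_lags .. len-1."""
    return np.column_stack(
        [series[n_lags - k: len(series) - k] for k in range(1, n_lags + 1)]
    )

def residual_ss(design, target):
    """Residual sum of squares of an ordinary least-squares fit with intercept."""
    design = np.column_stack([np.ones(len(target)), design])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    return float(resid @ resid)

def granger_f_statistic(x, y, n_lags):
    """F statistic for 'x Granger-causes y' using linear autoregressive models."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[n_lags:]
    own_lags = lag_matrix(y, n_lags)                          # restricted model
    all_lags = np.column_stack([own_lags, lag_matrix(x, n_lags)])  # full model
    rss_restricted = residual_ss(own_lags, target)
    rss_full = residual_ss(all_lags, target)
    dof = len(target) - (2 * n_lags + 1)                      # full-model residual d.o.f.
    return ((rss_restricted - rss_full) / n_lags) / (rss_full / dof)

# Synthetic example (assumed data): y follows x with a one-step delay.
rng = np.random.default_rng(0)
x = rng.normal(size=300)
noise = 0.1 * rng.normal(size=300)
y = np.zeros(300)
y[1:] = 0.8 * x[:-1] + noise[1:]

f_xy = granger_f_statistic(x, y, n_lags=1)  # does x help predict y?
f_yx = granger_f_statistic(y, x, n_lags=1)  # does y help predict x?
```

In this constructed example the statistic for "x causes y" is far larger than the one for the reverse direction. On real climate data the linear model is only one possible choice, which is the limitation the abstract addresses.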

Ghent University Academic Bibliography

Data mining in computational finance

Author: Jadhav Swati
Publication venue
Publication date: 01/12/2017

Computational finance is a relatively new discipline whose birth can be traced back to the early 1950s. Its major objective is to develop and study practical models, focusing on techniques that apply directly to financial analyses. The large number of decisions and computationally intensive problems involved in this discipline makes data mining and machine learning models integral to improving, automating, and expanding current processes. One objective of this research is to present the state of the art in data mining and machine learning techniques applied to the core areas of computational finance. Next, a detailed analysis of public and private finance datasets is performed in an attempt to find interesting facts in the data and to draw conclusions about the usefulness of features within the datasets. Credit risk evaluation is one of the crucial modern concerns in this field. Credit scoring is essentially a classification problem in which models are built using information about past applicants to categorise new applicants as ‘creditworthy’ or ‘non-creditworthy’. We appraise the performance of a few classical machine learning algorithms on the credit scoring problem. Typically, credit scoring databases are large and characterised by redundant and irrelevant features, making the classification task more computationally demanding. Feature selection is the process of selecting an optimal subset of relevant features. We propose an improved information-gain-directed wrapper feature selection method using genetic algorithms and successfully evaluate its effectiveness against baseline and generic wrapper methods on three benchmark datasets. One of the tasks of financial analysts is to estimate a company’s worth. In the last piece of work, this study predicts the growth rate of company earnings using three machine learning techniques. We employed the technique of lagged features, which allowed varying amounts of recent history to be brought into the prediction task and transformed the time series forecasting problem into a supervised learning problem. This work was applied to a private time series dataset.
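
The lagged-feature idea in the last part of this abstract can be sketched in a few lines. The helper name `make_lagged_dataset`, the sample growth rates, and the lag count are assumptions for illustration, and the thesis's own dataset is private. Each row of `X` holds the most recent `n_lags` observations and `y` holds the value that follows them, so any supervised regressor can then be trained on the pair.

```python
import numpy as np

def make_lagged_dataset(series, n_lags):
    """Turn a 1-D series into (X, y) for supervised learning.

    Row i of X contains series[i : i + n_lags]; y[i] is the next value,
    series[i + n_lags].
    """
    values = np.asarray(series, dtype=float)
    if n_lags < 1 or n_lags >= len(values):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    X = np.stack([values[i:i + n_lags] for i in range(len(values) - n_lags)])
    y = values[n_lags:]
    return X, y

# Hypothetical earnings growth rates, one per period.
growth = [0.02, 0.03, 0.05, 0.04, 0.06, 0.07]
X, y = make_lagged_dataset(growth, n_lags=3)
# X has one row per usable period; changing n_lags brings more or less
# recent history into each training example.
```

Varying `n_lags` is what the abstract describes as bringing "varying amounts of recent history" into the prediction task.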

Cranfield CERES

Quantitative Methods for Economics and Finance

Author
Publication venue: MDPI AG
Publication date: 01/05/2021

This book is a collection of papers for the Special Issue “Quantitative Methods for Economics and Finance” of the journal Mathematics. This Special Issue reflects on the latest developments in different fields of economics and finance where mathematics plays a significant role. The book gathers 19 papers on topics such as volatility clusters and volatility dynamics, forecasting, stocks, indexes, cryptocurrencies and commodities, trade agreements, the relationship between volume and price, trading strategies, efficiency, regression, utility models, fraud prediction, and intertemporal choice.

Directory of Open Access Books (DOAB)

Improving Demand Forecasting: The Challenge of Forecasting Studies Comparability and a Novel Approach to Hierarchical Time Series Forecasting

Author: Bauer Markus
Publication venue: KIT-Bibliothek, Karlsruhe
Publication date: 29/07/2023

Demand forecasts are indispensable in business. Based on expected customer demand, companies decide, for example, which products to develop, how many factories to build, how many staff to hire, or how much raw material to order. Misjudged demand forecasts can have serious consequences, lead to wrong decisions, and in the worst case bankrupt a company. Yet in many cases anticipating actual future demand is complex. The influencing factors can be diverse, for example macroeconomic developments, competitor behavior, or technological developments. Even when all influencing factors are known, the relationships and interactions among them are often hard to quantify. This dissertation contributes to improving the accuracy of demand forecasts. In the first part, within a comprehensive survey of the full spectrum of application fields of demand forecasting, a novel approach for systematically comparing demand forecasting studies is introduced and applied to 116 recent studies. Improving the comparability of studies is a substantial contribution to current research because, unlike in medical research for example, there are no substantial comparative quantitative meta-studies for demand forecasting. The reason is that empirical demand forecasting studies do not use a unified description for their data, methods, and results. If studies can instead be compared directly through systematic description, other researchers can better analyze how variations in approach affect forecast quality, without the costly need to repeat empirical experiments already described in studies. This work introduces such a descriptive scheme for the first time. The second part of this work deals with forecasting methods for intermittent time series, i.e., time series with a substantial share of zero demands. Such time series do not satisfy the continuity requirements of most forecasting methods, which is why common methods often achieve insufficient forecast quality. Nevertheless, the relevance of intermittent time series is high; spare parts in particular typically exhibit this demand pattern. First, this work shows in three studies that even the tested state-of-the-art machine learning approaches bring no general improvement on several well-known datasets. As its main contribution to research, the work then presents a novel method: the Similarity-based Time Series Forecasting (STSF) approach uses an aggregation-disaggregation procedure based on a self-generated hierarchy of statistical properties of the time series. Any available forecasting algorithm can be used in conjunction with the STSF approach, because aggregation satisfies the continuity condition. In experiments on a total of seven publicly known datasets and one proprietary dataset, the work shows that forecast quality (measured by the root mean square error, RMSE) can be improved in a statistically significant way by 1-5% on average compared with the same method without STSF. The method thus brings about a substantial improvement in forecast quality. In summary, this dissertation contributes substantially to the current state of research through the methods described above. The proposed method for standardizing empirical studies accelerates research progress because it enables comparative studies. And the STSF method provides an approach that reliably improves forecast quality while being flexibly usable with different kinds of forecasting algorithms. According to the findings of the comprehensive literature review, no comparable approaches have been described so far.

KITopen

Big Data, machine learning and challenges of high dimensionality in financial administration

Author: Yaohao Peng
Publication venue
Publication date: 09/08/2019

Thesis (doctorate), Universidade de Brasília, Faculdade de Economia, Administração e Contabilidade e Gestão Pública, Programa de Pós-Graduação em Administração, 2019. Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES). This thesis discusses the emergence of Big Data and machine learning and their applications in various aspects of business administration, emphasizing the methodological contributions of this inductive paradigm in finance and its improvements over econometric tools and traditional, well-established methods of data analysis. The statistical foundations of machine learning are introduced and the challenges of high dimensionality in finance problems are analyzed, including the practical implications of incorporating nonlinearities, regularizing the additional level of complexity, and forecasting with high-frequency data. Finally, three empirical applications are proposed, concerning, respectively, volatility forecasting, portfolio allocation, and stock price direction prediction. In those applications, different machine learning models are explored, and the insights from the results are discussed in light of both finance theory and the empirical evidence.

Repositório Institucional da Universidade de Brasília

Essays on Predictive Analytics in E-Commerce

Author: Urbanke Patrick
Publication venue
Publication date: 29/06/2016

The motivation for this dissertation is twofold: on the one hand, the dissertation is methodologically oriented and develops new statistical approaches and algorithms for machine learning. At the same time, it is practically oriented and focuses on the concrete use case of product returns in online retail. The “data explosion”, caused by the fact that the costs of storing and processing large amounts of data have fallen significantly (Bhimani and Willcocks, 2014), and the new technologies resulting from it represent the greatest discontinuity for business practice and business research since the development of the Internet (Agarwal and Dhar, 2014). Business intelligence (BI) in particular has been identified as an important research topic for practitioners and academics in the field of information systems (IS) (Chen et al., 2012). Machine learning has been applied successfully to a range of BI problems, such as sales forecasting (Choi et al., 2014; Sun et al., 2008), wind power generation forecasting (Wan et al., 2014), forecasting the disease progression of hospital patients (Liu et al., 2015), fraud detection (Abbasi et al., 2012), and recommender systems (Sahoo et al., 2012). However, there is little research addressing machine learning questions specifically related to BI: although existing algorithms are sometimes modified to fit a particular problem (Abbasi et al., 2010; Sahoo et al., 2012), IS research generally confines itself to applying existing algorithms, developed for questions other than BI, to BI questions (Abbasi et al., 2010; Sahoo et al., 2012). The first major goal of this dissertation is to help close this gap. The dissertation focuses on the important BI problem of product returns in online retail to illustrate the proposed concepts and apply them in practice. Many online retailers are not profitable (Rigby, 2014), and product returns are an important cause of this problem (Grewal et al., 2004). Beyond cost, product returns are problematic from an ecological perspective. In logistics research there is broad consensus that the “last mile” of the supply chain, when the product is delivered to the customer’s front door, is the most CO2-intensive (Browne et al., 2008; Halldórsson et al., 2010; Song et al., 2009). When products are returned, this energy-intensive step is repeated, which reduces the sustainability and environmental friendliness of the online retail business model relative to traditional retail. However, online retailers cannot simply prohibit product returns, since returns are an important part of their business model: the option to return products has positive effects on customer satisfaction (Cassill, 1998), purchasing behavior (Wood, 2001), future purchasing behavior (Petersen and Kumar, 2009), and customers’ emotional reactions (Suwelack et al., 2011). A promising approach is to focus on impulsive and compulsive (LaRose, 2001) as well as fraudulent purchasing behavior (Speights and Hilinski, 2005; Wachter et al., 2012). The current academic literature on the topic contains no such strategies. Most strategies do not distinguish between wanted and unwanted returns (Walsh et al., 2014). The second goal of this dissertation is therefore to develop the basis for a strategy of prediction and intervention with which consumption behavior with a high probability of return can be detected in advance and addressed in time. Several prediction models are developed in this dissertation, on the basis of which it is demonstrated that the strategy, assuming moderately effective interventions, yields considerable cost savings.

Georg-August-University Göttingen

Data-Driven Framework for Understanding & Modeling Ride-Sourcing Transportation Systems

Author: Kelleny Bishoy
Publication venue: ODU Digital Commons
Publication date: 01/05/2022

Ride-sourcing transportation services offered by transportation network companies (TNCs) like Uber and Lyft are disrupting the transportation landscape. The growing demand for these services, along with their potential short- and long-term impacts on the environment, society, and infrastructure, emphasizes the need to further understand the ride-sourcing system. Previously, there were insufficient data to fully understand the system and integrate it within regional multimodal transportation frameworks. This can be attributed to commercial and competition reasons, given the technology-enabled and innovative nature of the system. In 2019, the City of Chicago released extensive and complete ride-sourcing trip-level data for all trips made within the city since November 1, 2018. The data comprise the trip ends (pick-up and drop-off locations), trip timestamps, trip length and duration, fare including tipping amounts, and whether the trip was authorized to be shared (pooled) with another passenger. The main goal of this dissertation is therefore to develop a comprehensive data-driven framework to understand and model the system using these data from Chicago, in a reproducible and transferable fashion. Using a data fusion approach, sociodemographic, economic, parking supply, transit availability and accessibility, built environment, and crime data are collected from open sources to develop this framework. The framework is predicated on three pillars of analytics: (1) explorative and descriptive analytics, (2) diagnostic analytics, and (3) predictive analytics. The dissertation research framework also provides a guide to the key spatial and behavioral explanatory variables shaping the utility of the mode, driving the demand, and governing the interdependencies between the demand’s willingness to share and surge price. Thus, the key findings can be readily challenged, verified, and utilized in different geographies. In the explorative and descriptive analytics, the spatial and temporal dimensions of the ride-sourcing system are analyzed to achieve two objectives: (1) explore, reveal, and assess the significance of spatial effects, i.e., spatial dependence and heterogeneity, in the system behavior, and (2) develop a behavioral market segmentation and trend mining of the willingness to share. This is linked to the diagnostic analytics layer, as the revealed spatial effects motivate the adoption of spatial econometric models to analytically identify the determinants of the ride-sourcing system. Multiple linear regression (MLR) is used as a benchmark model against the spatial error model (SEM), the spatially lagged X (SLX) model, and the geographically weighted regression (GWR) model. Two innovative modeling constructs are introduced in the diagnostic analytics layer to deal with the ride-sourcing system’s spatial effects and multicollinearity: (1) the Calibrated Spatially Lagged X Ridge Model (CSLXR) and (2) the Calibrated Geographically Weighted Ridge Regression (CGWRR). The determinants identified in the diagnostic analytics layer are then fed into the predictive analytics layer to develop an interpretable machine learning (ML) modeling framework. The system’s annual average weekday origin-destination (AAWD OD) flow is modeled using the following state-of-the-art ML models: (1) Multilayer Perceptron (MLP) Regression, (2) Support Vector Machines Regression (SVR), and (3) tree-based ensemble learning methods, i.e., Random Forest Regression (RFR) and Extreme Gradient Boosting (XGBoost). The CGWRR construct developed in the diagnostic analytics is then validated in a predictive context and is found to outperform the state-of-the-art ML models with a testing score of 0.914, compared with 0.906 for XGBoost, 0.84 for RFR, 0.89 for SVR, and 0.86 for MLP. The CGWRR also outperforms them in terms of root mean squared error (RMSE) and mean absolute error (MAE). The findings of this dissertation partially bridge the gap between practice and research on understanding and integrating ride-sourcing transportation systems. The empirical findings from the descriptive and explorative analytics can be further utilized by regional agencies to fill practice and policymaking gaps in regulating ride-sourcing services using corridor or cordon tolls, optimally allocating standing areas to minimize deadheading, especially during off-peak periods, and promoting willingness to share rides in disadvantaged communities. The CGWRR provides researchers and practitioners with a reliable modeling and simulation tool to integrate the ride-sourcing system into multimodal transportation modeling frameworks, a simulation testbed for testing the long-range impacts of policies on ride-sourcing, such as improved transit supply, congestion pricing, or increased parking rates, and a means to plan ahead for similar futuristic transportation modes, such as shared autonomous vehicles.
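
The two error measures named in this abstract, RMSE and MAE (mean absolute error), can be computed as in the short sketch below. The flow values are made-up numbers for illustration, not results from the dissertation, and the function names are assumptions. RMSE squares the errors before averaging, so it penalizes large misses more heavily than MAE does.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the mean squared difference."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(err ** 2)))

def mae(y_true, y_pred):
    """Mean absolute error: mean of the absolute differences."""
    err = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(err)))

# Hypothetical observed and predicted origin-destination flows.
observed = [10, 20, 30, 40]
predicted = [12, 18, 33, 40]

flow_rmse = rmse(observed, predicted)  # errors -2, 2, -3, 0 -> sqrt(17 / 4)
flow_mae = mae(observed, predicted)    # (2 + 2 + 3 + 0) / 4 = 1.75
```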

Old Dominion University