Comparison of Logistic Regression and Classification Trees to Forecast Short Term Defaults on Repeat Consumer Loans
This dissertation compares the performance of two popular contemporary consumer loan credit scoring techniques: logistic regression and classification trees. The literature has shown logistic regression to outperform classification trees in predictiveness and robustness when forecasting consumer loan defaults over standard twelve-month outcome periods. A major shortcoming of classification trees is their tendency to overfit the data, which erodes robustness and makes them vulnerable to shifts in underlying population characteristics. Classification trees nevertheless remain popular owing to their ease of application (an algorithmic, machine-learning basis) and ease of model interpretation. Past research has found classification trees to perform marginally better than logistic regression in predictiveness and robustness when modelling short-term consumer credit default outcomes for previously unseen new-customer loan applications. This dissertation independently tested whether that finding, that classification trees would outperform logistic regression, extends to reloan consumer loan data: repeat customers who renewed loan facilities at a significant South African micro lender. Credit scoring models were built and tested for each technique on identical data sets to eliminate bias. Robustness tests were constructed via careful iterative data splits, and predictiveness and robustness were measured with a weighted sum of squared errors evaluation approach. Results reveal logistic regression to outperform classification trees on both predictiveness and robustness across the designed uniform iterative data splits, suggesting that logistic regression remains the superior technique when modelling short-term credit default outcomes on reloan consumer loan data.
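The comparison methodology (identical data sets, squared-error scoring of predicted default probabilities) can be sketched on synthetic data. Everything below is illustrative: the loan features, the one-split "tree", and the plain (unweighted) sum of squared errors are simplified stand-ins for the dissertation's actual reloan data, full classification trees, and weighted SSE approach.

```python
import math
import random

random.seed(0)

# Hypothetical synthetic reloan data: (months_on_book, prior_arrears) -> default flag.
def make_data(n):
    data = []
    for _ in range(n):
        x1 = random.uniform(0, 24)           # months since first loan
        x2 = 1.0 if random.random() < 0.3 else 0.0  # prior arrears indicator
        p = 1 / (1 + math.exp(-(-2.0 + 0.05 * x1 + 1.5 * x2)))
        data.append(((x1, x2), 1 if random.random() < p else 0))
    return data

train, test = make_data(400), make_data(200)

# Logistic regression fitted by batch gradient descent.
w = [0.0, 0.0, 0.0]                          # bias, months, arrears
for _ in range(2000):
    g = [0.0, 0.0, 0.0]
    for (x1, x2), y in train:
        p = 1 / (1 + math.exp(-(w[0] + w[1] * x1 + w[2] * x2)))
        for i, xi in enumerate((1.0, x1, x2)):
            g[i] += (p - y) * xi
    w = [wi - 0.01 * gi / len(train) for wi, gi in zip(w, g)]

def lr_prob(x):
    return 1 / (1 + math.exp(-(w[0] + w[1] * x[0] + w[2] * x[1])))

# One-level "classification tree": split on prior arrears, predict leaf default rates.
def leaf_rate(rows):
    return sum(y for _, y in rows) / max(len(rows), 1)

rate_yes = leaf_rate([(x, y) for x, y in train if x[1] == 1.0])
rate_no = leaf_rate([(x, y) for x, y in train if x[1] == 0.0])

def tree_prob(x):
    return rate_yes if x[1] == 1.0 else rate_no

# Sum of squared errors between predicted probability and observed outcome
# (the unweighted analogue of the dissertation's evaluation measure).
def sse(model):
    return sum((model(x) - y) ** 2 for x, y in test)

print(f"logistic SSE: {sse(lr_prob):.1f}  tree SSE: {sse(tree_prob):.1f}")
```

Lower SSE on held-out splits indicates better predictiveness; repeating the comparison across iterative data splits is what gives the robustness measure.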
Advanced Signal Processing and Adaptive Learning Methods
[No abstract available]
Creating Full Individual-level Location Timelines from Sparse Social Media Data
In many domain applications, a continuous timeline of human locations is
critical; for example for understanding possible locations where a disease may
spread, or the flow of traffic. While data sources such as GPS trackers or Call
Data Records are temporally-rich, they are expensive, often not publicly
available or garnered only in select locations, restricting their wide use.
Conversely, geo-located social media data are publicly and freely available,
but present challenges especially for full timeline inference due to their
sparse nature. We propose a stochastic framework, Intermediate Location
Computing (ILC) which uses prior knowledge about human mobility patterns to
predict every missing location from an individual's social media timeline. We
compare ILC with a state-of-the-art RNN baseline as well as methods that are
optimized for next-location prediction only. For three major cities, ILC
predicts the top 1 location for all missing locations in a timeline, at 1 and
2-hour resolution, with up to 77.2% accuracy (up to 6% better accuracy than all
compared methods). Specifically, ILC also outperforms the RNN in settings of
low data; both cases of very small number of users (under 50), as well as
settings with more users, but with sparser timelines. In general, the RNN model
needs a higher number of users to achieve the same performance as ILC. Overall,
this work illustrates the tradeoff between prior knowledge of heuristics and
more data, for an important societal problem of filling in entire timelines
using freely available, but sparse social media data.
Comment: 10 pages, 8 figures, 2 tables
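As a toy illustration of the core idea, filling missing timeline entries from prior knowledge of mobility patterns, one can back-fill each missing hour with the historically most frequent location for that hour. This is not the paper's actual ILC framework (which is stochastic and evaluated at city scale); all data and location labels below are made up.

```python
from collections import Counter, defaultdict

# Hypothetical one-day timeline: hour -> location label; None marks a missing observation.
observed = {0: "home", 1: "home", 2: None, 8: "office", 9: None, 10: "office",
            18: None, 19: "gym", 23: "home"}

# Prior knowledge: per-hour location counts built from earlier check-ins (assumed data).
history = [(2, "home"), (2, "home"), (2, "cafe"), (9, "office"), (9, "office"),
           (18, "gym"), (18, "home"), (18, "gym")]
prior = defaultdict(Counter)
for hour, loc in history:
    prior[hour][loc] += 1

def fill(timeline):
    """Fill each missing hour with the top prior location for that hour,
    falling back to the most recent observed location -- a crude stand-in
    for ILC's mobility-pattern-based inference."""
    filled, last = {}, None
    for hour in sorted(timeline):
        loc = timeline[hour]
        if loc is None:
            loc = prior[hour].most_common(1)[0][0] if prior[hour] else last
        filled[hour] = loc
        last = loc
    return filled

result = fill(observed)
print(result)
```

Here the missing hours 2, 9, and 18 are filled with "home", "office", and "gym" respectively, the top-1 prior location for each hour.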
Neural networks in multiphase reactors data mining: feature selection, prior knowledge, and model design
Artificial neural networks (ANN) have recently gained enormous popularity in many engineering fields, not only for their appealing "learning ability," but also for their versatility and superior performance with respect to classical approaches. Without assuming a particular equational form, ANNs mimic complex nonlinear relationships that might exist between an input feature vector x and a dependent (output) variable y. In the context of multiphase reactors the potential of neural networks is high, as modelling by resolution of first-principle equations to forecast the sought key hydrodynamic and transfer characteristics is intractable, especially for gas-liquid-solid systems. The general-purpose applicability of neural networks in regression and classification, however, poses some subsidiary difficulties that can make their use inappropriate for certain modeling problems. Some of these problems are general to any empirical modeling technique, including the feature selection step, in which one has to decide which subset xs ⊂ x should constitute the inputs (regressors) of the model. Other weaknesses specific to neural networks are overfitting, model design ambiguity (architecture and parameter identification), and the lack of interpretability of the resulting models.
This work addresses three issues in the application of neural networks: i) feature selection, ii) matching prior knowledge within the models (to answer, to some extent, the overfitting and interpretability issues), and iii) model design. Feature selection was conducted with genetic algorithms (yet another companion from the artificial intelligence area), which allowed identification of good combinations of dimensionless inputs to use in regression ANNs, or with sequential methods in a classification context. The types of a priori knowledge the resulting ANN models were required to match were monotonicity and/or concavity in regression, and class connectivity and unequal misclassification costs in classification. Even though the purpose of the study was rather methodological, some resulting ANN models might be considered contributions per se. These models, direct proofs of the underlying methodologies, are useful for predicting liquid hold-up and pressure drop in counter-current packed beds and flow regime type in trickle beds.
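Genetic-algorithm feature selection of the kind described, choosing a good subset of candidate inputs, can be illustrated with a minimal bitmask GA on synthetic data. The correlation-minus-size fitness below is a crude stand-in for the validation-error fitness a real wrapper approach would compute by training an ANN on each candidate subset; the data and GA settings are all illustrative.

```python
import random

random.seed(1)

# Synthetic data: y depends only on features 0 and 2 of five candidates.
N, F = 200, 5
X = [[random.gauss(0, 1) for _ in range(F)] for _ in range(N)]
y = [row[0] + 2 * row[2] + random.gauss(0, 0.1) for row in X]

def corr(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))
    va = sum((ai - ma) ** 2 for ai in a) ** 0.5
    vb = sum((bi - mb) ** 2 for bi in b) ** 0.5
    return cov / (va * vb)

def fitness(mask):
    # Reward inputs correlated with the target, penalise model size
    # (a proxy for the cross-validated model error a real GA wrapper uses).
    score = sum(abs(corr([r[i] for r in X], y)) for i in range(F) if mask[i])
    return score - 0.1 * sum(mask)

# Each chromosome is a bitmask over the candidate features.
pop = [[random.randint(0, 1) for _ in range(F)] for _ in range(20)]
for _ in range(40):
    pop.sort(key=fitness, reverse=True)
    elite = pop[:10]                          # keep the best half
    children = []
    for _ in range(10):
        a, b = random.sample(elite, 2)
        cut = random.randrange(1, F)          # one-point crossover
        child = a[:cut] + b[cut:]
        if random.random() < 0.2:             # bit-flip mutation
            j = random.randrange(F)
            child[j] ^= 1
        children.append(child)
    pop = elite + children

best = max(pop, key=fitness)
print(best)
```

The GA reliably converges on masks that include the truly informative features while dropping the noise inputs, which is the behaviour the dissertation exploits for selecting dimensionless groups.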
Exploring the Application of Wearable Movement Sensors in People with Knee Osteoarthritis
People with knee osteoarthritis have difficulty with functional activities, such as walking or getting into and out of a chair. This thesis explored the clinical relevance of biomechanics and how wearable sensor technology may be used to assess how people move when their clinician is unable to observe them directly, such as at home or at work. The findings of this thesis suggest that artificial intelligence can be used to process data from sensors to provide clinically important information about how people perform troublesome activities.
Data-driven Human Intention Analysis: Supported by Virtual Reality and Eye Tracking
The ability to determine an upcoming action, or what decision a human is about to take, can be useful in multiple areas, for example in manufacturing, where humans work alongside collaborative robots and knowing the operator's intent could give the robot important information to help it navigate more safely. Another field that could benefit from a system providing information about human intentions is psychological testing, where such a system could serve as a platform for new research or as one way to provide information in the diagnostic process. The work presented in this thesis investigates the potential use of virtual reality as a safe, customizable environment in which to collect gaze and movement data, eye tracking as the non-invasive system input that gives insight into the human mind, and deep machine learning as the tool that analyzes the data. The thesis defines an experimental procedure for constructing a virtual reality based testing system that gathers gaze and movement data, uses that procedure in a test study with human participants, and implements an artificial neural network to analyze human behaviour. This is followed by four studies that give evidence for the decisions made in the experimental procedure and show the potential uses of such a system.
Gaze Based Human Intention Analysis
The ability to determine an upcoming action, or what decision a human is about to take, can be useful in multiple areas, for example during human-robot collaboration in manufacturing, where knowing the operator's intent could give the robot important information to help it navigate more safely. Another field that could benefit from a system providing information about human intentions is psychological testing, where such a system could serve as a platform for new research or as one way to provide information in the diagnostic process. The work presented in this thesis investigates the potential use of virtual reality as a safe, measurable, and customizable environment in which to collect gaze and movement data, eye tracking as the non-invasive system input that gives insight into the human mind, and deep machine learning as one tool to analyze the data. The thesis defines an experimental procedure for constructing a virtual reality based testing system that gathers gaze and movement data, uses that procedure in a test study with human participants, and implements artificial neural networks to analyze human behaviour. This is followed by two studies that give evidence for the decisions made in the experimental procedure and show the potential uses of such a system.
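A heavily simplified sketch of the analysis step, classifying which object a person intends to act on from gaze features, can be given as follows. The theses train deep networks on real gaze and movement streams; here, a single softmax layer trained on synthetic per-object gaze shares stands in, and all numbers are invented for illustration.

```python
import math
import random

random.seed(0)

# Hypothetical setup: a participant looks at three objects; the intent label
# is the object they will reach for, and gaze share on that object is elevated.
def sample(target):
    shares = [random.uniform(0.0, 0.4) for _ in range(3)]
    shares[target] += 0.6
    s = sum(shares)
    return [v / s for v in shares], target

train = [sample(random.randrange(3)) for _ in range(300)]

# One softmax layer trained by stochastic gradient descent (a toy stand-in
# for the deep networks used on real gaze and movement data).
W = [[0.0] * 3 for _ in range(3)]            # W[class][feature]

def probs(x):
    z = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    m = max(z)                               # stabilise the exponentials
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

for _ in range(200):
    for x, y in train:
        p = probs(x)
        for c in range(3):
            grad = p[c] - (1.0 if c == y else 0.0)   # cross-entropy gradient
            for i in range(3):
                W[c][i] -= 0.5 * grad * x[i]

test = [sample(random.randrange(3)) for _ in range(100)]
acc = sum(max(range(3), key=lambda c, x=x: probs(x)[c]) == y for x, y in test) / 100
print(f"intent accuracy: {acc:.2f}")
```

Because the intended object receives the largest gaze share by construction, even this linear model separates the intents well; the interesting cases in the theses are the real, noisy gaze streams where deeper models are needed.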