375 research outputs found

    Identification of various user group in university counseling scene using machine learning algorithm

    Department of Biomedical Engineering (Human Factors Engineering). There is an increasing need to improve university students' mental health. However, university counseling centers lack the capacity, in both quantity and quality of service, to handle the growing demand and severity of cases. Individual counseling centers also struggle to solve these problems because they have too few counselors and limited budgets. To increase the service efficiency of university counseling centers, this study identified various user groups by applying machine learning algorithms to data from the initial stage of the counseling service. Specifically, user service groups, additional clinical groups, and latent groups were identified to reduce counselors' burden at the initial stage of service and to provide a reference for clinical decisions and future service planning. The study used data acquired from the UNIST healthcare center and analyzed it in two major steps. First, service groups (counseling, or clinical treatment with drug use) and clinical groups (suicidal risk and potential dropout) were classified using supervised learning algorithms, and the features important for classifying each group were identified. Then, latent groups reflecting the detailed characteristics of users were analyzed using latent class analysis. Latent user group identification detected 5 latent groups (lower risk, lower/moderate risk, moderate risk, higher risk with sleep issue, and sleep problem) and their distinctive characteristics. The study thus derived meaningful reference information for university counseling services from data focused on the initial stage of the service. Analyzing service effectiveness and using it as a reference for converting counseling service into clinical treatment, in line with these results, will increase the overall effectiveness of the service provided to individual users.
The suicidal-risk group classification identified features similar to those in prior research without using additional screening tools. Interestingly, the dropout group classification identified features not found in prior research, which can be used in future service planning to prevent user dropout during the service. After the user groups were classified, ensemble modeling with a stacking classifier was used to improve the results and achieve better type 2 error performance. Latent group identification found sub-groups that are applicable to existing counseling services, so that customized clinical approaches can be provided to individual latent groups. Further studies, including 1) improving machine learning algorithm performance by developing data collection methods that reflect user characteristics, 2) better service effectiveness analysis using overall service records, and 3) applying this research to other university counseling centers, will also contribute to reducing individual counselors' burden and enhancing the effectiveness of university counseling center services.

    Big Data - Supply Chain Management Framework for Forecasting: Data Preprocessing and Machine Learning Techniques

    This article intends to systematically identify and comparatively analyze state-of-the-art supply chain (SC) forecasting strategies and technologies. A novel framework is proposed that incorporates Big Data Analytics in SC Management (problem identification, data sources, exploratory data analysis, machine-learning model training, hyperparameter tuning, performance evaluation, and optimization) and addresses forecasting effects on the human workforce, inventory, and the overall SC. First, the need to collect data according to SC strategy, and how to collect it, is discussed. The article then discusses the need for different types of forecasting according to the period or SC objective. SC KPIs and error-measurement systems are recommended for optimizing the top-performing model. The adverse effects of phantom inventory on forecasting are illustrated, as is the dependence of managerial decisions on the SC KPIs for determining model performance parameters and improving operations management, transparency, and planning efficiency. The cyclic connection within the framework introduces preprocessing optimization based on the post-process KPIs, optimizing the overall control process (inventory management, workforce determination, cost, production and capacity planning). The contribution of this research lies in the proposed standard SC process framework, the recommended forecasting data analysis, the analysis of forecasting effects on SC performance, the machine learning algorithm optimization, and in shedding light on future research.

    Predicting potential customer needs and wants for agile design and manufacture in an industry 4.0 environment

    Manufacturing is currently experiencing a paradigm shift in the way that products are designed, produced and serviced. Such changes are brought about mainly by the extensive use of the Internet and digital technologies. As a result of this shift, a new industrial revolution is emerging, termed “Industry 4.0” (i4), which promises to accommodate mass customisation at a mass production cost. For i4 to become a reality, however, multiple challenges need to be addressed, highlighting the need for design for agile manufacturing and, for this, a framework capable of integrating big data analytics arising from the service end, business informatics through the manufacturing process, and artificial intelligence (AI) for the entire manufacturing value chain. This thesis attempts to address these issues, with a focus on the need for design for agile manufacturing. First, the state of the art in this field of research is reviewed on combining cutting-edge technologies in digital manufacturing with big data analysed to support agile manufacturing. Then, the work is focused on developing an AI-based framework to address one of the customisation issues in smart design and agile manufacturing, that is, prediction of potential customer needs and wants. With this framework, an AI-based approach is developed to predict design attributes that would help manufacturers to decide the best virtual designs to meet emerging customer needs and wants predictively. In particular, various machine learning approaches are developed to help explain at least 85% of the design variance when building a model to predict potential customer needs and wants. These approaches include k-means clustering, self-organizing maps, fuzzy k-means clustering, and decision trees, all supporting a vector machine to evaluate and extract conscious and subconscious customer needs and wants. 
A model capable of accurately predicting customer needs and wants for at least 85% of classified design attributes is thus obtained. Further, an analysis capable of determining the best design attributes and features for predicting customer needs and wants is also achieved. As the information analysed can be utilized to advise the selection of desired attributes, it is fed back in a closed loop of the manufacturing value chain: design → manufacture → management/service → → → design... For this, a total of 4 case studies are undertaken to test and demonstrate the efficacy and effectiveness of the framework developed. These case studies include: 1) an evaluation model of consumer cars with multiple attributes including categorical and numerical ones; 2) specifications of automotive vehicles in terms of various characteristics including categorical and numerical instances; 3) fuel consumption of various car models and makes, taking into account a desire for low fuel costs and low CO2 emissions; and 4) computer parts design for recommending the best design attributes when buying a computer. The results show that decision trees, as a machine learning approach, work best in predicting customer needs and wants for smart design. With the tested framework and methodology, this thesis overall presents a holistic attempt to address the gap between manufacture and customisation, that is, meeting customer needs and wants. Effective ways of achieving customization for i4 and smart manufacturing are identified. This is achieved through predicting potential customer needs and wants and applying the prediction at the product design stage for agile manufacturing to meet individual requirements at a mass production cost. Such agility is one key element in realising Industry 4.0. Finally, this thesis contributes to improving the process of analysing data to predict potential customer needs and wants, which are then used as inputs for agile customization of product designs.

    2020 SDSU Data Science Symposium Program


    Game-Theoretic and Machine-Learning Techniques for Cyber-Physical Security and Resilience in Smart Grid

    The smart grid is the next-generation electrical infrastructure utilizing Information and Communication Technologies (ICTs), whose architecture is evolving from a utility-centric structure to a distributed Cyber-Physical System (CPS) integrated with large-scale renewable energy resources. However, meeting reliability objectives in the smart grid becomes increasingly challenging owing to the high penetration of renewable resources and changing weather conditions. Moreover, cyber-physical attacks targeting the smart grid have become a major threat because millions of electronic devices interconnected via communication networks expose unprecedented vulnerabilities, thereby increasing the potential attack surface. This dissertation aims to develop novel game-theoretic and machine-learning techniques for addressing the reliability and security issues residing at multiple layers of the smart grid, including power distribution system reliability forecasting, risk assessment of cyber-physical attacks targeted at the grid, and cyber attack detection in the Advanced Metering Infrastructure (AMI) and renewable resources. This dissertation first comprehensively investigates the combined effect of various weather parameters on the reliability performance of the smart grid, and proposes a multilayer perceptron (MLP)-based framework to forecast the daily number of power interruptions in the distribution system using time series of common weather data. To evaluate the risk of cyber-physical attacks faced by the smart grid, a stochastic budget allocation game is proposed to analyze the strategic interactions between a malicious attacker and the grid defender. A reinforcement learning algorithm is developed to enable the two players to reach a game equilibrium, where the optimal budget allocation strategies of the two players, in terms of attacking/protecting the critical elements of the grid, can be obtained. 
In addition, the risk of a cyber-physical attack can be derived from the probability of a successful attack on various grid elements. Furthermore, this dissertation develops a multimodal data-driven framework for cyber attack detection in the power distribution system integrated with renewable resources. This approach introduces sparse feature learning into an ensemble classifier to improve detection efficiency, and implements spatiotemporal correlation analysis to differentiate attacked renewable energy measurements from fault scenarios. Numerical results based on the IEEE 34-bus system show that the proposed framework achieves the most accurate detection of cyber attacks reported in the literature. To address electricity theft in the AMI, a Distributed Intelligent Framework for Electricity Theft Detection (DIFETD) is proposed, which is equipped with Benford’s analysis for initial diagnostics on large smart meter data. A Stackelberg game between the utility and multiple electricity thieves is then formulated to model the electricity theft actions. Finally, a Likelihood Ratio Test (LRT) is utilized to detect potentially fraudulent meters.
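
As a rough illustration of the kind of Benford's-law screening mentioned in the abstract above, the sketch below compares the leading-digit distribution of a set of readings with the Benford expectation log10(1 + 1/d) using a chi-square statistic. This is a minimal sketch under stated assumptions, not the DIFETD method itself: the function names are hypothetical, and the abstract does not say which statistic or threshold the dissertation uses.

```python
import math
from collections import Counter


def first_digit(x):
    """Return the leading non-zero digit of x, or 0 if x is zero."""
    # Scientific notation always starts with the leading digit,
    # e.g. 0.3 -> '3.000000000000000e-01'.
    return int(format(abs(x), ".15e")[0])


def benford_expected():
    """Benford's law: P(first digit = d) = log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def benford_chi_square(readings):
    """Chi-square statistic of observed first digits against Benford's law.

    Hypothetical helper: zero readings are skipped because they have
    no leading digit.
    """
    digits = [first_digit(r) for r in readings]
    digits = [d for d in digits if d != 0]
    n = len(digits)
    if n == 0:
        raise ValueError("no non-zero readings to analyse")
    counts = Counter(digits)
    return sum(
        (counts.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in benford_expected().items()
    )
```

A larger statistic means the readings deviate more from Benford's law. With 8 degrees of freedom, a value above roughly 15.51 is a common 5% significance cut-off, although any threshold used for flagging meters here would be an assumption.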

    Predicting Account Receivables Outcomes with Machine-Learning

    Project Work presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. The Accounts Receivable (AR) of a company are considered an important determinant of its Cash Flow, the backbone of a company’s financial performance or health. It has been proved that by efficiently managing the money owed by customers for goods and services (AR), a company can avoid financial difficulties and even stabilize results in moments of extreme volatility. The aim of this project is to use machine-learning and data visualization techniques to predict invoice outcomes and to provide the collection management team with useful information and an analytics-based solution. Specifically, this project demonstrates how supervised learning models can classify with high accuracy whether a newly created invoice will be paid earlier than, on, or later than the contracted due date. It also studies how to predict the magnitude of delayed payments by classifying them into delay categories of interest to the business: up to 1 month late, from 1 to 3 months late, and more than 3 months late. The developed models use real-life data from a multinational company in the manufacturing and automation industries and can predict payments with higher accuracy than the baseline achieved by the business.
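
The target classes described in the abstract above can be made concrete with a small labelling sketch. It is illustrative only: the function names are hypothetical, "on-time" is assumed to mean payment exactly on the due date, and one month is assumed to be 30 days, none of which the abstract specifies.

```python
from datetime import date


def payment_outcome(due, paid):
    """Label an invoice as 'early', 'on-time' or 'late'.

    Assumption: 'on-time' means paid exactly on the contracted due date.
    """
    days = (paid - due).days
    if days < 0:
        return "early"
    if days == 0:
        return "on-time"
    return "late"


def delay_bucket(due, paid):
    """Return the delay category for a late invoice, or None if not late.

    Assumption: one month is approximated as 30 days.
    """
    days_late = (paid - due).days
    if days_late <= 0:
        return None
    if days_late <= 30:
        return "up to 1 month late"
    if days_late <= 90:
        return "1 to 3 months late"
    return "more than 3 months late"
```

Labels like these would serve as the targets of a supervised classifier trained on invoice features known at creation time.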

    Design, implementation and realization of an integrated platform dedicated to e-public health, for analysing health data and supporting the management control in healthcare companies.

    In healthcare, information is a fundamental aspect and the human body is the major source of every kind of data: the challenge is to benefit from this huge amount of unstructured data by applying technological solutions, called Big Data Analysis, that allow the management of data and the extraction of information through information systems. This thesis aims to introduce a technological solution made up of two open source platforms: Power BI and Knime Analytics Platform. First, the importance, the role and the processes of business intelligence and machine learning in healthcare will be discussed; secondly, the platforms will be described, with particular emphasis on their feasibility and capabilities. Then, the clinical specialties where they have been applied will be shown by highlighting the international literature that has been produced: neurology, cardiology, oncology, fetal monitoring and others. An application to the current SARS-CoV-2 pandemic will be described using more than 50000 records: a cascade of 3 platforms helping health facilities to deal with the worldwide pandemic. Finally, the advantages, the disadvantages, the limitations and the future developments in this framework will be discussed, and the architectural technological solution, containing a data warehouse, a platform to collect data, two platforms to analyse health and management data, and the possible applications, will be shown.

    Essays in financial technology: banking efficiency and application of machine learning models in Supply Chain Finance and credit risk assessment

    The financial landscape is undergoing a significant transformation, driven by technological innovations that are reshaping traditional banking practices. This thesis examines the evolving relationship between financial technology (FinTech) and banking, specifically addressing the credit risk aspects within the domains of Supply Chain Finance (SCF) and peer-to-peer (P2P) lending. FinTech has experienced rapid growth and innovation over the past decade. It encompasses a wide range of technologies and services that aim to enhance and streamline financial processes, disrupt traditional banking models, and offer new solutions to consumers and businesses. The status of FinTech and banking is assessed through an extensive review of the current literature and empirical data. This review shows that FinTech development has significantly impacted the financial landscape, driving innovation, competition, and customer expectations. While it has exposed inefficiencies within traditional banking, it has also compelled banks to evolve and embrace technological advancements. The thesis aims to explore the impact of FinTech on traditional banking models, customer behaviours, and market competition. This investigation highlights the challenges and opportunities that arise as FinTech disrupts and reshapes the banking sector, emphasizing its potential to enhance efficiency, accessibility, and customer experiences. Chapter 3 focuses on an empirical analysis of the impact of FinTech on the operating efficiency of commercial banks in China. Further, in the context of credit risk, the thesis focuses on SCF and P2P lending, two prominent areas influenced by FinTech innovation. SCF has witnessed substantial transformation with the infusion of FinTech solutions. Digital platforms have streamlined the flow of funds within complex supply networks, enhancing the liquidity of suppliers and optimizing working capital for buyers. However, this transformation introduces new credit risk challenges. 
As suppliers' financial data becomes more accessible, the need for accurate risk assessment and predictive modelling becomes paramount. The integration of big data analytics, machine learning, and artificial intelligence (AI) holds the promise of refining credit risk evaluation by offering real-time insights into supplier financial health, thereby improving lending decisions and reducing defaults. Similarly, P2P lending has redefined the borrowing and lending landscape, enabling direct connections between individual borrowers and lenders. While P2P lending platforms offer speed, convenience, and access to credit for previously underserved segments, they also grapple with credit risk concerns. Evaluating the creditworthiness of individual borrowers without sufficient credit history demands innovative risk assessment methodologies. Data issues, such as imbalanced data, feature selection, and data processing, present challenges in building accurate credit risk profiles for P2P lending participants. FinTech solutions play a pivotal role in creating and implementing these alternative risk assessment models. Notably, few studies in the literature benchmark advanced methods for credit risk assessment in emerging financial services. This thesis aims to address this research gap by evaluating the effectiveness of credit risk assessment models in these FinTech-driven contexts, considering both traditional methodologies and novel data-driven approaches. Chapter 4 investigates credit risk assessment in Digital Supply Chain Finance (DSCF) using a machine learning approach, and Chapter 5 addresses the issue of data imbalance in credit risk assessment for P2P lending. By addressing these gaps and issues, this thesis aims to contribute to the broader discourse on FinTech's role in shaping the future of banking. 
The findings have implications for financial institutions, policymakers, and regulators seeking to harness the benefits of FinTech while mitigating associated risks. Ultimately, this study offers insights into navigating the evolving landscape of credit risk in SCF and P2P lending within the context of an increasingly technology-driven financial ecosystem.

    Data-Driven Models, Techniques, and Design Principles for Combatting Healthcare Fraud

    In the U.S., approximately $700 billion of the $2.7 trillion spent on healthcare is linked to fraud, waste, and abuse. This presents a significant challenge for healthcare payers as they navigate fraudulent activities from dishonest practitioners, sophisticated criminal networks, and even well-intentioned providers who inadvertently submit incorrect billing for legitimate services. This thesis adopts Hevner’s research methodology to guide the creation, assessment, and refinement of a healthcare fraud detection framework and recommended design principles for fraud detection. The thesis provides the following significant contributions to the field: 1. A formal literature review of the field of fraud detection in Medicaid. Chapters 3 and 4 provide formal reviews of the available literature on healthcare fraud. Chapter 3 focuses on defining the types of fraud found in healthcare. Chapter 4 reviews fraud detection techniques in literature across healthcare and other industries. Chapter 5 focuses on literature covering fraud detection methodologies utilized explicitly in healthcare. 2. A multidimensional data model and analysis techniques for fraud detection in healthcare. Chapter 5 applies Hevner et al. to help develop a framework for fraud detection in Medicaid that provides specific data models and techniques to identify the most prevalent fraud schemes. A multidimensional schema based on Medicaid data and a set of multidimensional models and techniques to detect fraud are presented. These artifacts are evaluated through functional testing against known fraud schemes. This chapter contributes a set of multidimensional data models and analysis techniques that can be used to detect the most prevalent known fraud types. 3. A framework for deploying outlier-based fraud detection methods in healthcare. 
Chapter 6 proposes and evaluates methods for applying outlier detection to healthcare fraud based on literature review, comparative research, direct application on healthcare claims data, and known fraudulent cases. A method for outlier-based fraud detection is presented and evaluated using Medicaid dental claims, providers, and patients. 4. Design principles for fraud detection in complex systems. Based on literature and applied research in Medicaid healthcare fraud detection, Chapter 7 offers generalized design principles for fraud detection in similar complex, multi-stakeholder systems.