712 research outputs found

    Artificial Intelligence Technology

    This open access book aims to give readers a basic outline of today’s research and technology developments in artificial intelligence (AI), help them gain a general understanding of this trend, and familiarize them with the current research hotspots, as well as some of the fundamental and widely accepted theories and methodologies in AI research and application. The book is written in comprehensible and plain language, featuring clearly explained theories and concepts and extensive analysis and examples. Some traditional findings are omitted from the narration while still providing a relatively comprehensive introduction to the evolution of artificial intelligence technology. The book elaborates in detail on the basic concepts of AI and machine learning, as well as other relevant topics, including deep learning, deep learning frameworks, the Huawei MindSpore AI development framework, the Huawei Atlas computing platform, the Huawei AI open platform for smart terminals, and the Huawei CLOUD Enterprise Intelligence application platform. As the world’s leading provider of ICT (information and communication technology) infrastructure and smart terminals, Huawei offers products ranging from digital data communication, cyber security, wireless technology, data storage, cloud computing, and smart computing to artificial intelligence.

    A Model for Investigating Internal Control Weaknesses

    Scandals in corporate finance in the early 2000s and subsequent policy changes led corporate executives to adopt a more risk-based approach to corporate governance. The identification and assessment of risks therefore became extremely important. Risk assessment poses a particular challenge for auditors due to the highly complex structure and processes of internal control systems. Extant research in this area has mostly focused on probabilistic models and expert systems that capture and model heuristic knowledge. However, evidence suggests that knowledge of the structure of the internal control system is also essential. Relatively little research has focused on modeling the structural aspects of financial processes and their internal control systems as a means of helping corporate executives and auditors perform their respective tasks of risk management and assessment. This article proposes an approach to risk management and assessment in internal control systems that models the structure and financial processes of an internal control system. The model uses a directed graph to represent the various elements of an internal control system, such as financial statement assertions, control activities, financial processes, and the causal relationships among these elements. The article demonstrates the usefulness of the model by presenting and discussing algorithms based on it that help corporate executives manage risk and help internal and external auditors assess risk, design substantive tests, and trace sources of errors.
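
    The article’s algorithms are not reproduced in the abstract; as a rough illustration of the directed-graph representation it describes, the Python sketch below encodes control activities, financial processes, and financial statement assertions as vertices with causal edges, and walks upstream from a failed assertion to candidate error sources. All node names and the traversal itself are hypothetical, not taken from the article.

```python
from collections import deque

# Hypothetical directed graph of an internal control system:
# edges point from a cause to the element it supports.
# Node names are illustrative, not taken from the article.
edges = {
    "three_way_match":     ["purchases_process"],   # control activity -> process
    "invoice_approval":    ["purchases_process"],
    "purchases_process":   ["completeness_assertion", "accuracy_assertion"],
    "bank_reconciliation": ["cash_process"],
    "cash_process":        ["existence_assertion"],
}

# Build the reverse graph so we can walk from a failed
# financial statement assertion back to its possible causes.
reverse = {}
for src, dsts in edges.items():
    for dst in dsts:
        reverse.setdefault(dst, []).append(src)

def trace_error_sources(failed_assertion):
    """Breadth-first walk upstream of a failed assertion,
    returning every element that could have caused the error."""
    seen, queue = set(), deque([failed_assertion])
    while queue:
        node = queue.popleft()
        for cause in reverse.get(node, []):
            if cause not in seen:
                seen.add(cause)
                queue.append(cause)
    return seen

print(trace_error_sources("accuracy_assertion"))
# {'purchases_process', 'three_way_match', 'invoice_approval'}
```

    A reverse breadth-first search is a natural fit here because the causal edges point from causes toward the assertions they support.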

    An artificial intelligence-based collaboration approach in industrial IoT manufacturing: key concepts, architectural extensions and potential applications

    The digitization of the manufacturing industry has led to leaner and more efficient production under the Industry 4.0 concept. Nowadays, datasets collected from shop-floor assets and information technology (IT) systems are used in data-driven analytics efforts to support more informed business intelligence decisions. However, these results are currently used only in isolated and dispersed parts of the production process. At the same time, full integration of artificial intelligence (AI) into all parts of manufacturing systems is currently lacking. In this context, the goal of this manuscript is to present a more holistic integration of AI by promoting collaboration. To this end, collaboration is understood as a multi-dimensional conceptual term that covers all important enablers for AI adoption in manufacturing contexts, and it is promoted in terms of business intelligence optimization, human-in-the-loop operation, and secure federation across manufacturing sites. To address these challenges, the proposed architectural approach builds on three technical pillars: (1) components that extend the functionality of the existing layers in the Reference Architectural Model for Industry 4.0; (2) new layers for collaboration by means of human-in-the-loop and federation; and (3) AI-powered mechanisms that address security concerns. In addition, system implementation aspects are discussed, and potential applications in industrial environments, as well as business impacts, are presented.
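
    The manuscript describes federation only at an architectural level; as one concrete, hedged reading of “secure federation across manufacturing sites”, the sketch below applies textbook federated averaging (FedAvg), in which sites share model weights rather than raw shop-floor data. Site names, weights, and sample counts are invented for illustration and are not the paper’s mechanism.

```python
import numpy as np

# Illustrative FedAvg step: each manufacturing site trains locally
# and shares only model weights, never raw shop-floor data.
# Site names and weight vectors are hypothetical.
site_weights = {
    "site_berlin": np.array([0.21, -0.40, 1.10]),
    "site_porto":  np.array([0.25, -0.35, 1.02]),
    "site_krakow": np.array([0.19, -0.42, 1.15]),
}
site_samples = {"site_berlin": 1200, "site_porto": 800, "site_krakow": 400}

def fedavg(weights, samples):
    """Weighted average of site models, weighted by local sample count."""
    total = sum(samples.values())
    return sum(w * (samples[s] / total) for s, w in weights.items())

global_model = fedavg(site_weights, site_samples)
print(global_model)  # aggregated weights pushed back to every site
```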

    Advanced document data extraction techniques to improve supply chain performance

    In this thesis, a novel machine learning technique to extract text-based information from scanned images has been developed. This information extraction is performed in the context of scanned invoices and bills used in financial transactions. These financial transactions contain a considerable amount of data that must be extracted, refined, and stored digitally before it can be used for analysis. Converting this data into a digital format is often a time-consuming process. Automation and data optimisation show promise as methods for reducing the time required and the cost of Supply Chain Management (SCM) processes, especially Supplier Invoice Management (SIM), Financial Supply Chain Management (FSCM) and Supply Chain procurement processes. This thesis uses a cross-disciplinary approach involving Computer Science and Operational Management to explore the benefit of automated invoice data extraction in business and its impact on SCM. The study adopts a multimethod approach based on empirical research, surveys, and interviews conducted with selected companies.

    The expert system developed in this thesis focuses on two distinct areas of research: text/object detection and text extraction. For text/object detection, the Faster R-CNN model was analysed. While this model yields outstanding results in terms of object detection, it is limited by poor performance when image quality is low. The Generative Adversarial Network (GAN) model is proposed in response to this limitation. The GAN model is a generator network implemented with the help of the Faster R-CNN model and a discriminator that relies on PatchGAN. The output of the GAN model is text data with bounding boxes. For text extraction from the bounding boxes, a novel data extraction framework was designed, consisting of various processes including XML processing in the case of an existing OCR engine, bounding box pre-processing, text clean-up, OCR error correction, spell checking, type checking, pattern-based matching, and, finally, a learning mechanism for automating future data extraction. Fields that the system extracts successfully are provided in key-value format.

    The efficiency of the proposed system was validated using existing datasets such as SROIE and VATI. Real-time data was validated using invoices collected by two companies that provide invoice automation services in various countries. Currently, these scanned invoices are sent to an OCR system such as OmniPage, Tesseract, or ABBYY FRE to extract text blocks, and later a rule-based engine is used to extract relevant data. While this methodology is robust, the companies surveyed were not satisfied with its accuracy and therefore sought new, optimized solutions. To confirm the results, the engines were used to return XML-based files with the identified text and metadata. The output XML data was then fed into the new system for information extraction. This system uses the existing OCR engine alongside a novel, self-adaptive, learning-based OCR engine based on the GAN model for better text identification. Experiments were conducted on various invoice formats to further test and refine its extraction capabilities. For cost optimisation and the analysis of spend classification, additional data were provided by another company in London that holds expertise in reducing its clients' procurement costs. This data was fed into the system to obtain a deeper level of spend classification and categorisation. This helped the company to reduce its reliance on human effort and allowed for greater efficiency compared with performing similar tasks manually using Excel sheets and Business Intelligence (BI) tools. The intention behind the development of this novel methodology was twofold: first, to develop a solution that does not depend on any specific OCR technology; and second, to increase information extraction accuracy over that of existing methodologies. The thesis also evaluates the real-world need for the system and the impact it would have on SCM. The newly developed method is generic and can extract text from any given invoice, making it a valuable tool for optimizing SCM. In addition, the system uses a template-matching approach to ensure the quality of the extracted information.
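
    The thesis’ extraction framework is only enumerated above; the hedged sketch below illustrates just the pattern-based matching and type-check stages over OCR’d invoice lines, returning whichever fields match in key-value format. The field patterns and the amount check are illustrative assumptions, not the framework’s actual rules.

```python
import re

# Hypothetical pattern-based matching stage of the extraction
# framework: map OCR'd invoice lines to key-value fields and
# type-check the values. Field patterns are illustrative only.
FIELD_PATTERNS = {
    "invoice_number": re.compile(r"invoice\s*(?:no|number)[:.]?\s*(\S+)", re.I),
    "invoice_date":   re.compile(r"date[:.]?\s*(\d{2}[./-]\d{2}[./-]\d{4})", re.I),
    "total_amount":   re.compile(r"total[:.]?\s*([\d.,]+)", re.I),
}

def type_check(field, value):
    """Reject values whose shape contradicts the field type."""
    if field == "total_amount":
        return re.fullmatch(r"\d{1,3}(?:[.,]\d{3})*[.,]\d{2}", value) is not None
    return True

def extract_fields(ocr_lines):
    """Return whichever fields match, in key-value format."""
    result = {}
    for line in ocr_lines:
        for field, pattern in FIELD_PATTERNS.items():
            m = pattern.search(line)
            if m and field not in result and type_check(field, m.group(1)):
                result[field] = m.group(1)
    return result

lines = ["Invoice No: INV-2041", "Date: 12/03/2021", "TOTAL: 1,299.50"]
print(extract_fields(lines))
# {'invoice_number': 'INV-2041', 'invoice_date': '12/03/2021',
#  'total_amount': '1,299.50'}
```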

    Decision support continuum paradigm for cardiovascular disease: Towards personalized predictive models

    Clinical decision making is a ubiquitous and frequent task that physicians perform in their daily clinical practice. Conventionally, physicians adopt a cognitive predictive modelling process (i.e. knowledge and experience learnt from past lectures, research, literature, patients, etc.) for anticipating or ascertaining clinical problems based on the clinical risk factors they deem most salient. However, with the inundation of health data and the confounding characteristics of diseases, more effective clinical prediction approaches are required to address these challenges. Approximately a century ago, the first major transformation of medical practice took place as science-based approaches emerged with compelling results. Now, in the 21st century, new advances in science will once again transform healthcare. Data science has been postulated as an important component of this healthcare reform and has received escalating interest for its potential to ‘personalize’ medicine. The key advantages of personalized medicine include, but are not limited to, (1) more effective methods for disease prevention, management and treatment, (2) improved accuracy in clinical diagnosis and prognosis, (3) patient-oriented personal health plans, and (4) cost containment. In view of the paramount importance of personalized predictive models, this thesis proposes two novel learning algorithms (an immune-inspired algorithm called the Evolutionary Data-Conscious Artificial Immune Recognition System, and a neural-inspired algorithm called the Artificial Neural Cell System for classification) and three continuum-based paradigms (the biological, time and age continua) for enhancing clinical prediction. Cardiovascular disease has been selected as the disease under investigation as it is an epidemic and a major health concern in today’s world. We believe that our work has a meaningful and significant impact on the development of future healthcare systems, and we look forward to the wide adoption of advanced medical technologies by all care centres in the near future.

    Analyzing Social and Stylometric Features to Identify Spear phishing Emails

    Spear phishing is a complex targeted attack in which an attacker harvests information about the victim prior to the attack. This information is then used to create sophisticated, genuine-looking attack vectors that lure the victim into compromising confidential information. What makes spear phishing different from, and more powerful than, normal phishing is this contextual information about the victim. Online social media services can be one such source for gathering vital information about an individual. In this paper, we characterize and examine a true positive dataset of spear phishing, spam, and normal phishing emails from Symantec's enterprise email scanning service. We then present a model to detect spear phishing emails sent to employees of 14 international organizations, using social features extracted from LinkedIn. Our dataset consists of 4,742 targeted attack emails sent to 2,434 victims, 9,353 non-targeted attack emails sent to 5,912 non-victims, and publicly available information from their LinkedIn profiles. We applied various machine learning algorithms to this labeled data and achieved an overall maximum accuracy of 97.76% in identifying spear phishing emails, using a combination of social features from LinkedIn profiles and stylometric features extracted from email subjects, bodies, and attachments. However, we achieved a slightly better accuracy of 98.28% without the social features. Our analysis revealed that social features extracted from LinkedIn do not help in identifying spear phishing emails. To the best of our knowledge, this is one of the first attempts to combine stylometric features extracted from emails with social features extracted from an online social network to detect targeted spear phishing emails.
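
    The paper’s exact feature set and classifiers are not listed in the abstract; the sketch below shows the general shape of such an experiment with a random forest over a few toy stylometric features. All feature values and labels are fabricated placeholders, not the Symantec data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Toy stand-in for a labeled email corpus: each row holds a few
# stylometric features (all values fabricated): subject length,
# body word count, attachment count, rare-word ratio in the body.
X = np.array([
    [42, 180, 1, 0.21],   # targeted-looking email
    [12,  35, 0, 0.04],
    [55, 240, 2, 0.30],
    [ 9,  20, 0, 0.02],
    [48, 200, 1, 0.25],
    [15,  50, 0, 0.05],
])
y = np.array([1, 0, 1, 0, 1, 0])  # 1 = spear phishing, 0 = benign

clf = RandomForestClassifier(n_estimators=100, random_state=0)
scores = cross_val_score(clf, X, y, cv=3)  # stratified 3-fold accuracy
print("CV accuracy:", scores.mean())
```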

    Application of modern statistical methods in worldwide health insurance

    With the increasing availability of internal and external data in the (health) insurance industry, the demand for new data insights from analytical methods is growing. This dissertation presents four examples of the application of advanced regression-based prediction techniques for claims and network management in health insurance: patient segmentation for, and economic evaluation of, disease management programs; fraud and abuse detection; and medical quality assessment. Based on different health insurance datasets, it is shown that tailored models and newly developed algorithms, such as Bayesian latent variable models, can optimize the business steering of health insurance companies. By incorporating and structuring medical and insurance knowledge, these tailored regression approaches can at least compete with machine learning and artificial intelligence methods while being more transparent and interpretable for business users. In all four examples, the methodology and outcomes of the applied approaches are discussed extensively from an academic perspective. Various comparisons with analytical and market best-practice methods also make it possible to judge the added value of the applied approaches from an economic perspective.
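
    The dissertation’s models are tailored to proprietary insurance datasets; purely as a hedged illustration of a Bayesian latent variable model for fraud and abuse detection, the PyMC sketch below infers a latent, interpretable fraud propensity per provider from a handful of observed audit flags. The data are simulated and all variable names are assumptions, not the dissertation’s models.

```python
import numpy as np
import pymc as pm

# Simulate audit-flag data: each provider has a latent fraud
# propensity that drives three binary audit flags. Simulated data,
# not the dissertation's insurance datasets.
rng = np.random.default_rng(1)
n_providers, n_flags = 50, 3
true_propensity = rng.normal(0.0, 1.0, n_providers)
noise = rng.normal(0.0, 0.5, (n_providers, n_flags))
flags = rng.binomial(1, 1 / (1 + np.exp(-(true_propensity[:, None] + noise))))

with pm.Model():
    # Latent, interpretable fraud propensity per provider.
    propensity = pm.Normal("propensity", mu=0.0, sigma=1.0, shape=n_providers)
    # Each audit flag loads on the latent score with its own offset.
    offset = pm.Normal("offset", mu=0.0, sigma=1.0, shape=n_flags)
    p = pm.math.sigmoid(propensity[:, None] + offset[None, :])
    pm.Bernoulli("flags", p=p, observed=flags)
    trace = pm.sample(500, tune=500, chains=2, progressbar=False)

# Posterior means rank providers by estimated fraud propensity.
print(trace.posterior["propensity"].mean(dim=("chain", "draw")).values[:5])
```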

    Human-aware application of data science techniques

    In recent years there has been an increase in the use of artificial intelligence and other data-based techniques to automate decision-making in companies and to discover new knowledge in research. In many cases, all this has been performed using very complex algorithms (so-called black-box algorithms), which are capable of detecting very complex patterns but unfortunately remain nearly uninterpretable. Recently, many researchers and regulatory institutions have begun to raise awareness about their use. On the one hand, the subjects who depend on these decisions are increasingly questioning their use, as they may be victims of biases or erroneous predictions. On the other hand, companies and institutions that use these algorithms want to understand what their algorithms do, extract new knowledge, prevent errors, and improve their predictions in general. All this has led researchers to focus on the interpretability of their algorithms (for example, through explainable algorithms), and regulatory institutions have started to regulate the use of data to ensure ethical aspects such as accountability and fairness. This thesis brings together three data science projects in which black-box predictive machine learning has been implemented to make predictions:
    - The development of an NTL (non-technical losses) detection system for an international utility company from Spain (Naturgy). We combine a black-box algorithm and an explanatory algorithm to guarantee the system's accuracy, transparency, and robustness, and we focus our efforts on empowering the stakeholder to play an active role in the model training process.
    - A collaboration with the University of Padova to provide explainability for a Deep Learning-based KPI system currently implemented by the MyInvenio company.
    - A collaboration between the author of the thesis and the Universitat de Barcelona to apply an AI solution (a black-box algorithm combined with an explanatory algorithm) to a social science problem.
    The unique characteristics of each project allow us to offer in this thesis a comprehensive analysis of the challenges involved in achieving a fair, transparent, unbiased and generalizable use of data in a data science project. With the feedback arising from the research carried out to provide satisfactory solutions to these three projects, we aim to:
    - Understand why a prediction model can be regarded as unfair or untruthful, and hence not generalisable; the consequences from a technical point of view in terms of low model accuracy; and how this can affect us as a society.
    - Determine and correct (or at least mitigate) the situations that cause problems in terms of the robustness and fairness of our data.
    - Assess the difference between interpretable algorithms and black-box algorithms, and evaluate how well explanatory algorithms can explain the predictions made by the predictive algorithms.
    - Highlight the stakeholder's role in guaranteeing a robust model, and show how to convert a data-driven approach to a predictive problem into a data-informed approach, where data patterns and human knowledge are combined to maximize profit.
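
    The thesis’ NTL system is not specified in the abstract; as a generic, hedged illustration of pairing a black-box model with an explanatory algorithm, the sketch below trains a gradient boosting classifier and explains its predictions with SHAP’s TreeExplainer. Data and feature meanings are simulated stand-ins, not the Naturgy implementation.

```python
import numpy as np
import shap
from sklearn.ensemble import GradientBoostingClassifier

# Illustrative pairing of a black-box model with an explanatory
# algorithm. Data are simulated; this is not the thesis' system.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))                   # e.g. consumption features
y = (X[:, 0] - 2 * X[:, 2] + rng.normal(size=200) > 0).astype(int)

model = GradientBoostingClassifier().fit(X, y)  # black-box predictor

explainer = shap.TreeExplainer(model)           # explanatory algorithm
shap_values = explainer.shap_values(X[:5])      # per-feature contributions
print(shap_values[0])  # why the model scored the first customer as it did
```

    The point of the pairing is that the black-box model keeps its predictive power while the explanatory layer gives stakeholders per-prediction, per-feature reasons they can audit.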