8 research outputs found

    Neuro-Symbolic Verification of Deep Neural Networks


    FIN-DM: A Data Mining Process Model for Financial Services

    Data mining is a set of rules, processes, and algorithms that allow companies to increase revenues, reduce costs, optimize products and customer relationships, and achieve other business goals by extracting actionable insights from the data they collect on a day-to-day basis. Data mining and analytics projects require a well-defined methodology and processes. Several standard process models for conducting data mining and analytics projects are available. Among them, the most notable and widely adopted standard model is CRISP-DM. It is industry-agnostic and is often adapted to meet sector-specific requirements. Industry-specific adaptations of CRISP-DM have been proposed in several domains, including healthcare, education, industrial and software engineering, and logistics. However, no adaptation yet exists for the financial services industry, which has its own set of domain-specific requirements. This PhD thesis addresses this gap by designing, developing, and evaluating a sector-specific data mining process for financial services (FIN-DM). The thesis first investigates how standard data mining processes are used across various industry sectors and in financial services. This examination identified a number of adaptation scenarios for traditional frameworks. It also suggested that these approaches do not pay sufficient attention to turning data mining models into software products integrated into organizations' IT architectures and business processes. In the financial services domain, the main adaptation scenarios discovered concerned technology-centric aspects (scalability), business-centric aspects (actionability), and human-centric aspects (mitigating discriminatory effects) of data mining. Next, a case study in an actual financial services organization revealed 18 perceived gaps in the CRISP-DM process. Using the data and results from these studies, the thesis proposes an adaptation of CRISP-DM for the financial sector, named the Financial Industry Process for Data Mining (FIN-DM). FIN-DM extends CRISP-DM to support privacy-compliant data mining, to tackle AI ethics risks, to fulfill risk management requirements, and to embed quality assurance as part of the data mining life cycle.
    https://www.ester.ee/record=b547227

    Development of cue integration with reward-mediated learning

    This thesis will first introduce in more detail the Bayesian theory and its use in integrating multiple information sources. I will briefly talk about models and their relation to the dynamics of an environment, and how to combine multiple alternative models. Following that I will discuss the experimental findings on multisensory integration in humans and animals. I start with psychophysical results on various forms of tasks and setups that show that the brain uses and combines information from multiple cues. Specifically, the discussion will focus on the finding that humans integrate this information in a way that is close to the theoretically optimal performance. Special emphasis will be put on results about the developmental aspects of cue integration, highlighting experiments that could show that children do not perform similarly to the Bayesian predictions. This section also includes a short summary of experiments on how subjects handle multiple alternative environmental dynamics. I will also talk about neurobiological findings of cells receiving input from multiple receptors, both in dedicated brain areas and in primary sensory areas. I will proceed with an overview of existing theories and computational models of multisensory integration. This will be followed by a discussion of reinforcement learning (RL). First I will talk about the original theory, including the two main approaches, model-free and model-based reinforcement learning. The important variables will be introduced as well as different algorithmic implementations. Secondly, a short review of the mapping of those theories onto brain and behaviour will be given. I mention the most influential papers that showed correlations between activity in certain brain regions and RL variables, most prominently between dopaminergic neurons and temporal-difference errors. I will try to motivate why I think that this theory can help to explain the development of near-optimal cue integration in humans. The next main chapter will introduce our model that learns to solve the task of audio-visual orienting. Many of the results in this section have been published in [Weisswange et al. 2009b, Weisswange et al. 2011]. The model agent starts without any knowledge of the environment and acts based on predictions of rewards, which are adapted according to the reward signaling the quality of the performed action. I will show that after training this model performs similarly to the prediction of a Bayesian observer. The model can also deal with more complex environments in which it has to handle multiple possible underlying generative models (perform causal inference). In these experiments I use different formulations of Bayesian observers for comparison with our model, and find that it is most similar to the fully optimal observer doing model averaging. Additional experiments using various alterations to the environment show the ability of the model to react to changes in the input statistics without explicitly representing probability distributions. I will close the chapter with a discussion of the benefits and shortcomings of the model. The thesis continues with a report on an application of the learning algorithm introduced before to two real-world cue integration tasks on a robotic head. For these tasks our system outperforms a commonly used approximation to Bayesian inference, reliability-weighted averaging.
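
    As a concrete illustration of that approximation, the minimal Python sketch below fuses two single-cue position estimates by weighting each with its inverse variance, which is the maximum-likelihood combination for independent Gaussian cues. The example numbers, variable names, and cue variances are illustrative assumptions, not values from the thesis.

        import numpy as np

        def reliability_weighted_average(estimates, variances):
            """Fuse single-cue estimates with weights proportional to their
            reliabilities (inverse variances) -- the standard approximation to
            the Bayesian maximum-likelihood estimate for independent Gaussian cues."""
            estimates = np.asarray(estimates, dtype=float)
            reliabilities = 1.0 / np.asarray(variances, dtype=float)
            weights = reliabilities / reliabilities.sum()
            fused_estimate = float(weights @ estimates)
            fused_variance = float(1.0 / reliabilities.sum())
            return fused_estimate, fused_variance

        # Hypothetical auditory and visual position estimates of one source (degrees):
        # the visual cue has the lower variance, so it dominates the fused estimate.
        print(reliability_weighted_average([10.0, 14.0], [4.0, 1.0]))  # (13.2, 0.8)
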
The approximation is appealing because of its computational simplicity, but it relies on certain assumptions that are usually controlled for in a laboratory setting and often do not hold for real-world data. This chapter is based on the paper [Karaoguz et al. 2011]. Our second modeling approach tries to address the neuronal substrates of the learning process for cue integration. I again use a reward-based training scheme, but this time implemented as a modulation of synaptic plasticity mechanisms in a recurrent network of binary threshold neurons. I start the chapter with an additional introduction section to discuss recurrent networks and especially the various forms of neuronal plasticity that I will use in the model. The performance on a task similar to that of chapter 3 will be presented together with an analysis of the influence of different plasticity mechanisms on it. Again, the benefits, shortcomings, and general potential of the method will be discussed. I will close the thesis with a general conclusion and some ideas about possible future work.
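
    To make the reward-mediated training idea more concrete, the toy Python sketch below lets a tabular agent learn to orient toward a source from noisy auditory and visual cues using only scalar rewards; because the visual cue is less noisy, the learned mapping comes to rely on it more, qualitatively mirroring near-optimal cue weighting. The discretisation, noise levels, and reward rule are illustrative assumptions, not the thesis's actual model.

        import numpy as np

        rng = np.random.default_rng(0)
        n_positions = 11                    # discretised source positions / orienting directions
        Q = np.zeros((n_positions, n_positions, n_positions))   # Q[audio_cue, visual_cue, action]
        alpha, epsilon = 0.1, 0.1           # learning rate and exploration rate

        for trial in range(50000):
            true_pos = int(rng.integers(n_positions))
            # noisy observations of the same source; vision is the more reliable cue here
            audio = int(np.clip(np.rint(true_pos + rng.normal(0, 2.0)), 0, n_positions - 1))
            visual = int(np.clip(np.rint(true_pos + rng.normal(0, 0.5)), 0, n_positions - 1))
            if rng.random() < epsilon:
                action = int(rng.integers(n_positions))          # explore
            else:
                action = int(Q[audio, visual].argmax())          # exploit learned reward predictions
            reward = 1.0 if action == true_pos else 0.0          # reward signals orienting accuracy
            # single-step task, so the prediction error is simply reward minus the current estimate
            Q[audio, visual, action] += alpha * (reward - Q[audio, visual, action])

        # After training, the greedy choice for a conflicting cue pair lies closer to the
        # visual estimate, i.e. the more reliable cue carries more weight in the decision.
        print(int(Q[3, 7].argmax()))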

    Privacy-preserving data analytics in cloud computing

    The evolution of digital content and the rapid expansion of data sources have raised the need for streamlined monitoring, collection, storage and analysis of massive, heterogeneous data to extract useful knowledge and support decision-making mechanisms. In this context, cloud computing offers extensive, cost-effective and on-demand computing resources that improve the quality of services for users and also help service providers (enterprises, governments and individuals). Service providers can avoid the expense of acquiring and maintaining IT resources while migrating data and remotely managing processes including aggregation, monitoring and analysis in cloud servers. However, privacy and security concerns of cloud computing services, especially in storing sensitive data (e.g. personal, healthcare and financial), are major challenges to the adoption of these services. To overcome such barriers, several privacy-preserving techniques have been developed to protect outsourced data in the cloud. Cryptography is a well-known mechanism that can ensure data confidentiality in the cloud. Traditional cryptography techniques can protect data through encryption in cloud servers, and data owners can retrieve and decrypt data for their processing purposes. However, in this case, cloud users can use the cloud resources for data storage but they cannot take full advantage of cloud-based processing services. This raises the need to develop advanced cryptosystems that can protect data privacy both while in storage and while being processed in the cloud. Homomorphic Encryption (HE) has gained attention recently because it can preserve the privacy of data while it is stored and processed in cloud servers, and data owners can retrieve and decrypt their processed data on their own secure side. Therefore, HE offers an end-to-end security mechanism, which is a preferable feature in cloud-based applications. In this thesis, we developed innovative privacy-preserving cloud-based models based on HE cryptosystems. This allowed us to build secure and advanced analytic models in various fields. We began by designing and implementing a secure analytic cloud-based model based on a lightweight HE cryptosystem. We used a private resident cloud entity, called a "privacy manager", as an intermediate communication server between data owners and public cloud servers. The privacy manager handles analytical tasks that cannot be accomplished by the lightweight HE cryptosystem. This model is convenient for several application domains that require real-time responses. Data owners delegate their processing tasks to the privacy manager, which then helps to automate analysis tasks without the need to interact with data owners. We then developed a comprehensive, secure analytical model based on Fully Homomorphic Encryption (FHE), which has more computational capability than the lightweight HE. Although FHE can automate analysis tasks and avoid the use of the privacy manager entity, it also leads to massive computational overhead. To overcome this issue, we took advantage of the massive cloud resources by designing a MapReduce model that massively parallelises HE analytical tasks. Our parallelisation approach significantly speeds up analysis computations based on FHE. We then considered distributed analytic models where the data is generated from distributed heterogeneous sources, such as healthcare and industrial sensors that are attached to people or installed in a distributed manner.
We developed a secure distributed analytic model by re-designing several analytic algorithms (centroid-based and distribution-based clustering) to adapt them into secure distributed models based on FHE. Our distributed analytic model was developed not only for distributed applications; it also mitigates the FHE overhead obstacle by achieving high efficiency in FHE computations. Furthermore, the distributed approach is scalable across three factors: analysis accuracy, execution time and the amount of resources used. This scalability feature enables users to weigh the requirements of their analysis tasks against these factors (e.g. users may have limited resources or time constraints to accomplish their analysis tasks). Finally, we designed and implemented two privacy-preserving real-time cloud-based applications to demonstrate the capabilities of HE cryptosystems, in terms of both efficiency and computational capability, for applications that require timely and reliable delivery of services. First, we developed a secure cloud-based billing model for a sensor-enabled smart grid infrastructure using lightweight HE. This model handles billing analysis tasks for individual users in a secure manner without the need to interact with any trusted parties. Second, we built a real-time secure health surveillance model for smarter health communities in the cloud. We developed a secure change detection model based on an exponential smoothing technique to predict future changes in health vital signs using FHE. Moreover, we built an innovative technique to parallelise FHE computations, which significantly reduces computational overhead.
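
    As a rough illustration of the kind of additively homomorphic operation such a lightweight-HE billing model relies on, the self-contained Python sketch below implements a toy Paillier cryptosystem (with deliberately tiny hardcoded primes and no security guarantees) and lets an untrusted aggregator sum encrypted meter readings without ever decrypting them. All names and values here are illustrative assumptions, not the thesis's implementation.

        import math
        import random

        # Toy Paillier keypair with small hardcoded primes (illustration only, NOT secure;
        # a real deployment would use primes of roughly 1024 bits or more).
        p, q = 104723, 104729
        n = p * q
        n_sq = n * n
        lam = math.lcm(p - 1, q - 1)       # Carmichael's lambda for n = p * q
        mu = pow(lam, -1, n)               # valid because we fix the generator g = n + 1

        def encrypt(m):
            """E(m) = (n + 1)^m * r^n mod n^2 with fresh randomness r."""
            r = random.randrange(2, n)
            return (pow(n + 1, m, n_sq) * pow(r, n, n_sq)) % n_sq

        def decrypt(c):
            """D(c) = L(c^lambda mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
            return (((pow(c, lam, n_sq) - 1) // n) * mu) % n

        def add_encrypted(c1, c2):
            """Multiplying ciphertexts adds the underlying plaintexts."""
            return (c1 * c2) % n_sq

        # An untrusted aggregator (e.g. a cloud billing service) can total the
        # encrypted per-interval readings without learning any individual value.
        readings = [120, 340, 95, 410]
        encrypted_total = encrypt(0)
        for value in readings:
            encrypted_total = add_encrypted(encrypted_total, encrypt(value))

        assert decrypt(encrypted_total) == sum(readings)
        print("decrypted total:", decrypt(encrypted_total))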
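
    The health-surveillance component uses exponential smoothing to flag abrupt changes in vital signs; the plaintext Python sketch below shows the underlying idea on made-up heart-rate values. The parameter names, the alpha and threshold values, and the alert rule are assumptions for illustration; in the thesis the corresponding arithmetic is carried out over FHE-encrypted data rather than plaintext.

        def detect_changes(readings, alpha=0.3, threshold=2.0, initial_error=5.0):
            """Flag readings that deviate from the exponentially smoothed forecast by
            more than `threshold` times a running estimate of the typical deviation."""
            forecast = float(readings[0])
            error_estimate = initial_error
            alerts = []
            for t, value in enumerate(readings[1:], start=1):
                deviation = abs(value - forecast)
                if deviation > threshold * error_estimate:
                    alerts.append((t, value, round(forecast, 1)))
                # exponential smoothing updates: blend the new observation into both
                # the forecast and the typical-deviation estimate
                forecast = alpha * value + (1 - alpha) * forecast
                error_estimate = alpha * deviation + (1 - alpha) * error_estimate
            return alerts

        heart_rate = [72, 74, 73, 75, 74, 118, 120, 76, 75]   # brief spike at t = 5, 6
        print(detect_changes(heart_rate))                      # flags the spike readings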