647 research outputs found

    Essays in applied microeconometrics with applications to risk-taking and savings decisions

    This thesis presents three chapters that use and develop microeconometric methods for microdata analysis in economics. The first chapter studies how social interactions influence entrepreneurs’ risk-taking decisions. We conduct two risk-taking experiments with young Ugandan entrepreneurs. Between the two experiments, the entrepreneurs participate in a networking activity in which they build relationships and discuss with each other. We collect data on peer network formation and on participants’ choices before and after the networking activity. We find that participants tend to make more (less) risky choices in the second experiment if the peers they discuss with make, on average, more (less) risky choices in the first experiment. This suggests that even short-term social interactions may affect risk-taking decisions. We also find that participants who make (in)consistent choices in the experiments tend to develop relationships with individuals who also make (in)consistent choices, even when controlling for observable variables such as education and gender, suggesting that peer networks are formed according to unobservable characteristics linked to cognitive ability.

    The second chapter studies whether tax-preferred savings account policies in Canada are suited to all individuals, given their different income paths and the differences in tax codes across provinces. The two main forms of tax-preferred savings accounts, TEE and EET, tax savings in the contribution year and the withdrawal year, respectively. Thus the relative returns of the two savings vehicles depend on the effective marginal tax rates in those two years, which in turn depend on earnings dynamics. This chapter estimates a model of earnings dynamics on a Canadian longitudinal administrative database containing millions of individuals, allowing for substantial heterogeneity in the evolution of income across groups. The model is then used, together with a tax and credit calculator, to predict how the returns of EET and TEE accounts vary across these groups. The results suggest that TEE accounts generally yield higher returns, especially for low-income groups. Comparing the optimal saving choices predicted by the model with the saving choices observed in the data suggests that EET accounts are over-chosen, especially in the province of Quebec. These results have important implications for “nudging” policies currently being implemented in Quebec, which require employers to automatically enrol their employees in EET-type savings accounts. Such accounts could yield very low returns for low-income individuals, who are known to be the most sensitive to nudging.

    Finally, the third chapter is concerned with methodological problems that frequently arise in regression discontinuity designs (RDD). It considers the problem of rounding errors in the running variable, which often make the treatment variable unobservable for some observations around the threshold. While researchers usually discard these observations, I show that they contain valuable information, because the outcome’s distribution splits in two as a function of the treatment effect. Integrating this information into standard data-driven criteria helps in choosing the best model specification and avoids specification biases. This method is promising, especially for improving estimates of causal effects in very large databases, where the number of discarded observations can be very large, such as the LAD used in Chapter 2.
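The core trade-off of the second chapter can be illustrated with a few lines of arithmetic. The sketch below is not the thesis's model: the tax rates, the gross return, and the function names are hypothetical, chosen only to show why the ranking of TEE and EET hinges on the effective marginal tax rates in the contribution and withdrawal years.

```python
# Hypothetical comparison of TEE vs. EET savings vehicles.
# All rates and returns below are illustrative, not thesis estimates.

def tee_final_wealth(pre_tax_income, t_contribution, gross_return):
    """TEE: income is taxed up front; growth and withdrawal are exempt."""
    return pre_tax_income * (1 - t_contribution) * gross_return

def eet_final_wealth(pre_tax_income, t_withdrawal, gross_return):
    """EET: the contribution is deductible; the withdrawal is taxed."""
    return pre_tax_income * gross_return * (1 - t_withdrawal)

R = 1.05 ** 20  # 20 years of 5% growth

# A saver whose effective marginal rate rises between contribution (20%)
# and withdrawal (35%) ends up better off under TEE:
print(tee_final_wealth(1000, 0.20, R))
print(eet_final_wealth(1000, 0.35, R))
```

When the two marginal rates are equal, the two formulas give identical final wealth; a higher effective rate at withdrawal, as benefit claw-backs can produce for low-income retirees, tilts the comparison towards TEE, which is the direction of the chapter's headline result.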

    De-interleaving of Radar Pulses for EW Receivers with an ELINT Application

    De-interleaving is a critical function in Electronic Warfare (EW) that has received little attention in the literature regarding on-line Electronic Intelligence (ELINT) applications. In ELINT, on-line analysis is important because it allows for efficient data collection and supports operational decisions. This dissertation proposes a de-interleaving solution for use with ELINT/Electronic Support Measures (ESM) receivers in on-line ELINT applications. The proposed solution does not require complex integration with existing EW systems or modifications to their sub-systems. Before the solution was developed, on-line de-interleaving algorithms were surveyed. Density-based spatial clustering of applications with noise (DBSCAN) is a clustering algorithm that had not previously been used for de-interleaving; in this dissertation, it proves to be effective. DBSCAN was selected as a component of the proposed de-interleaving solution because of its advantages over the other surveyed algorithms. The proposed solution relies primarily on the parameters of Angle of Arrival (AOA), Radio Frequency (RF), and Time of Arrival (TOA); the time parameter is used to resolve RF agility. The solution is a system composed of different building blocks, and it handles complex radar environments that include agility in RF, Pulse Width (PW), and Pulse Repetition Interval (PRI).
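To make the role of DBSCAN concrete, here is a minimal, self-contained sketch of density-based clustering applied to pulse descriptors. It is not the dissertation's system: the pulse values, the rescaling of RF, and the `eps`/`min_samples` settings are illustrative assumptions.

```python
# Minimal pure-Python DBSCAN, used here to group radar pulses by
# (AOA, RF). Pulses and parameters are synthetic and for illustration.
import math

def dbscan(points, eps, min_samples):
    """Return a cluster label per point; -1 marks noise."""
    n = len(points)
    labels = [None] * n
    cluster = -1

    def neighbours(i):
        return [j for j in range(n)
                if math.dist(points[i], points[j]) <= eps]

    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:      # not a core point
            labels[i] = -1
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:           # noise becomes a border point
                labels[j] = cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            more = neighbours(j)
            if len(more) >= min_samples:  # j is a core point: keep expanding
                queue.extend(more)
    return labels

# Pulses as (AOA in degrees, RF in GHz); RF is expressed in GHz so that
# both axes contribute comparably to the Euclidean distance.
pulses = [(45.1, 9.0002), (45.3, 9.0010), (44.9, 8.9995),    # emitter A
          (120.0, 2.8001), (119.8, 2.7997), (120.2, 2.8004), # emitter B
          (77.0, 5.6)]                                       # stray pulse
print(dbscan(pulses, eps=1.0, min_samples=2))
# → [0, 0, 0, 1, 1, 1, -1]
```

The stray pulse has no neighbour within `eps`, so DBSCAN labels it noise rather than forcing it into a cluster, which is one of the properties that makes density-based clustering attractive for pulse trains containing spurious detections.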

    Fuzzy neural network pattern recognition algorithm for classification of the events in power system networks

    This dissertation introduces an advanced artificial-intelligence-based algorithm for detecting and classifying faults on power system transmission lines. The proposed algorithm is aimed at replacing classical relays, which are susceptible to performance deterioration under variable power system operating and fault conditions. The new concept relies on the principle of pattern recognition: it detects the existence of a fault, identifies the fault type, and estimates the faulted section of the transmission line. The approach utilizes a self-organizing Adaptive Resonance Theory (ART) neural network combined with a fuzzy decision rule for interpreting the neural network outputs. The neural network learns the mapping between inputs and desired outputs by processing a set of example cases. Training of the neural network is based on the combined use of unsupervised and supervised learning methods. During training, a set of input events is transformed into a set of prototypes of typical input events. During application, real events are classified based on how well they match the prototypes, as interpreted through the fuzzy decision rule. This study introduces several enhancements to the original version of the ART algorithm: suitable preprocessing of neural network inputs, an improved concept of supervised learning, fuzzification of neural network outputs, and utilization of on-line learning. A model of an actual power network is used to simulate extensive sets of scenarios covering a variety of power system operating conditions as well as fault and disturbance events. Simulation results show improved recognition capabilities compared to a previous version of the ART neural network algorithm, a Multilayer Perceptron (MLP) neural network algorithm, and an impedance-based distance relay. Simulation results also show exceptional robustness of the novel ART algorithm across all operating conditions and events studied, as well as superior classification capabilities compared to the other solutions. Consequently, it is demonstrated that the proposed ART solution may be used for accurate, high-speed discrimination between faulted and unfaulted events, and for estimation of the fault type and fault section.
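The prototype-learning idea behind ART can be illustrated with a stripped-down fuzzy ART learner. This is a generic textbook variant, not the enhanced algorithm of the dissertation: complement coding is omitted for brevity, and the vigilance value and toy input patterns are assumptions.

```python
# Simplified fuzzy ART sketch: unsupervised prototype learning with a
# vigilance test. Parameters and inputs are illustrative only.
def fuzzy_and(a, b):
    return [min(x, y) for x, y in zip(a, b)]

def norm(a):
    return sum(a)

class FuzzyART:
    def __init__(self, vigilance=0.75, alpha=0.001, beta=1.0):
        self.rho, self.alpha, self.beta = vigilance, alpha, beta
        self.w = []  # one prototype vector per learned category

    def present(self, x):
        # Rank categories by the choice function; accept the first that
        # passes the vigilance test, else signal a mismatch with None.
        def choice(j):
            return norm(fuzzy_and(x, self.w[j])) / (self.alpha + norm(self.w[j]))
        for j in sorted(range(len(self.w)), key=choice, reverse=True):
            if norm(fuzzy_and(x, self.w[j])) / norm(x) >= self.rho:
                return j
        return None

    def train(self, x):
        i = self.present(x)
        if i is None:                      # no resonant category: add one
            self.w.append(list(x))
            return len(self.w) - 1
        m = fuzzy_and(x, self.w[i])        # resonance: move the prototype
        self.w[i] = [self.beta * a + (1 - self.beta) * b
                     for a, b in zip(m, self.w[i])]
        return i

net = FuzzyART(vigilance=0.8)
# Two repeated "event signatures" (e.g. normalised measurement patterns).
for x in [[0.9, 0.1, 0.2], [0.88, 0.12, 0.2],
          [0.1, 0.9, 0.8], [0.12, 0.9, 0.82]]:
    net.train(x)
print(len(net.w))  # two prototypes learned, one per signature family
```

In a classifier along the lines described above, the degree of match between an event and each prototype would then be passed through a fuzzy decision rule to produce the final fault-type label, rather than being thresholded crisply.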

    Data Science: Measuring Uncertainties

    With the increase in data processing and storage capacity, a large amount of data is available, but data without analysis has little value. Thus the demand for data analysis is increasing daily, and the consequence is a large number of new jobs and published articles. Data science has emerged as a multidisciplinary field to support data-driven activities, integrating and developing ideas, methods, and processes to extract information from data. It draws on methods built in different knowledge areas: Statistics, Computer Science, Mathematics, Physics, Information Science, and Engineering. This mixture of areas has given rise to what we call Data Science. New problems that generate large volumes of data, and new solutions to them, are multiplying rapidly. Current and future challenges require greater care in creating solutions suited to the rationale of each type of problem. Labels such as Big Data, Data Science, Machine Learning, Statistical Learning, and Artificial Intelligence demand more sophistication in their foundations and in how they are applied. This highlights the importance of building the foundations of Data Science. This book is dedicated to solutions for, and discussions of, measuring uncertainties in data analysis problems.

    Machine Learning

    Machine Learning can be defined in various ways, but it broadly refers to a scientific domain concerned with the design and development of theoretical and implementation tools that allow systems to be built with some human-like intelligent behavior. More specifically, machine learning addresses the ability of such systems to improve automatically through experience.

    On robust and adaptive soft sensors.

    In the process industries, there is great demand for additional process information, such as product quality levels or exact estimates of the process state. At the same time, a large amount of process data, such as temperatures and pressures, is measured and stored at every moment. This data is mainly collected for process control and monitoring purposes, but its potential reaches far beyond these applications. The task of soft sensors is to exploit this potential maximally by extracting and transforming the latent information in the data into more useful process knowledge. In theory, achieving this goal should be straightforward, since both the process data and the tools for soft sensor development, in the form of computational learning methods, are readily available. In practice, however, several obstacles still prevent the broader application of soft sensors in the process industry. Identifying the sources of these obstacles and proposing a concept for dealing with them is the general purpose of this work. The proposed solution to the issues of current soft sensors is a conceptual architecture for the development of robust and adaptive soft sensing algorithms. The architecture reflects the results of two review studies conducted during this project. The first focuses on the process industry aspects of soft sensor development and application. Its main conclusions are that soft sensor development is currently done in a non-systematic, ad hoc way, resulting in a large amount of manual work for development and maintenance, and that a large part of the issues can be traced to the process data upon which the soft sensors are built. The second review study approached the same topic from the machine learning viewpoint.

    That review focused on identifying machine learning tools that support the goals of this work. The machine learning concepts considered are: (i) general regression techniques for building soft sensors; (ii) ensemble methods; (iii) local learning; (iv) meta-learning; and (v) concept drift detection and handling. The proposed architecture arranges these techniques into a three-level hierarchy: the actual prediction-making models operate at the bottom level; their predictions are flexibly merged by ensemble methods at the middle level; and, at the top level, the underlying algorithm is managed by means of meta-learning methods. The architecture has a modular structure that allows new pre-processing, predictive, or adaptation methods to be plugged in. Another important property of the architecture is that each level can be equipped with adaptation mechanisms, which aim at prolonging the lifetime of the resulting soft sensors. The relevance of the architecture is demonstrated by means of a complex soft sensing algorithm that can be seen as one of its instances. This algorithm provides mechanisms for the autonomous selection of data pre-processing and predictive methods and their parameters. It also includes five different adaptation mechanisms, some of which can be applied on a sample-by-sample basis without any requirement to store the on-line data; other, more complex ones are launched only on demand, if the performance of the soft sensor drops below a defined level. The actual soft sensors are built by applying the soft sensing algorithm to three industrial data sets. The different application scenarios analyse the fulfilment of the defined goals. It is shown that the soft sensors are able to follow changes in a dynamic environment and keep a stable performance level by exploiting the implemented adaptation mechanisms.

    It is also demonstrated that, although the algorithm is rather complex, it can be used to develop simple and transparent soft sensors. In another experiment, the soft sensors are built without any manual model selection or parameter tuning, which demonstrates the ability of the algorithm to reduce the effort required for soft sensor development. At the same time, if desired, the algorithm remains very flexible and provides a number of parameters that can be manually optimised. Evidence that the algorithm can deploy soft sensors with minimal training data, and can thus save time-consuming and costly training data collection, is also given in this work.
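The three-level idea of prediction-making models at the bottom, an adaptively weighted ensemble above them, and a top-level monitor that triggers adaptation can be sketched generically. This is not the algorithm developed in this work: the toy models, the weighting rule, and the drift threshold are illustrative assumptions.

```python
# Generic sketch of a hierarchical adaptive soft sensor: base predictors,
# an error-weighted ensemble, and a drift monitor. Illustrative only.
class WeightedEnsemble:
    def __init__(self, models, decay=0.9):
        self.models = models
        self.weights = [1.0] * len(models)
        self.decay = decay

    def predict(self, x):
        total = sum(self.weights)
        return sum(w * m(x) for w, m in zip(self.weights, self.models)) / total

    def update(self, x, y):
        # Sample-by-sample adaptation: down-weight models with large error.
        for i, m in enumerate(self.models):
            err = abs(m(x) - y)
            self.weights[i] = (self.decay * self.weights[i]
                               + (1 - self.decay) / (1.0 + err))

class DriftMonitor:
    """Top level: request re-training when the running error degrades."""
    def __init__(self, threshold, window=0.95):
        self.threshold, self.window = threshold, window
        self.running_error = 0.0

    def observe(self, error):
        self.running_error = (self.window * self.running_error
                              + (1 - self.window) * error)
        return self.running_error > self.threshold  # True -> adapt on demand

# Two toy "soft sensor" models of a quality variable from temperature x.
ens = WeightedEnsemble([lambda x: 2 * x, lambda x: 2 * x + 5])
monitor = DriftMonitor(threshold=1.0)
for x, y in [(1.0, 2.1), (2.0, 4.0), (3.0, 6.2)]:  # process follows y ≈ 2x
    ens.update(x, y)
    monitor.observe(abs(ens.predict(x) - y))
print(ens.weights[0] > ens.weights[1])  # → True: the better model gains weight
```

The exponential forgetting in both levels lets the sketch adapt sample by sample without storing the on-line data, which mirrors the storage constraint mentioned for the simpler adaptation mechanisms above.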