
    Development of the BIRD: a metadata modelling approach for the purpose of harmonising supervisory reporting at the European Central Bank - Directorate of general statistics: master and metadata

    Internship report presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Information Analysis and Management. This report documents the work completed during an internship at the European Central Bank (ECB) in Frankfurt, Germany, from 15 March 2019 to 15 March 2020. The internship took place in the Directorate of General Statistics (DG-S), specifically in the Master and Metadata section of the Analytical Credit and Master Data division (MAM). The work continued the ECB's internal Banks' Integrated Reporting Dictionary (BIRD) project and included management of the ECB's centralised metadata repository, known as the Single Data Dictionary (SDD). The purpose of the dictionary and the BIRD project is to provide banks with a harmonized data model that describes precisely the data that should be extracted from the banks' internal IT systems to derive the reports demanded by supervisory authorities such as the ECB. In this report, I provide a basis for understanding the work undertaken in the team, focusing on the technical aspects of relational database modelling and metadata repositories and their role in big data analytical processing systems, as well as the current reporting requirements and methods used by central banking institutions, which coincide with the processes set out by the European Banking Authority (EBA). The report also provides an in-depth look into the structure of the database and the principles followed to create the data model, and documents how the SDD is maintained and updated to meet changing needs. It further covers the process undertaken by the BIRD team and supporting members of the banking community to introduce new reporting frameworks into the data model. During this period, the framework for the Financial Reporting (FinRep) standards was included through a collaborative effort between banking representatives and the Master and Metadata team.

    Foundations of Programming Languages

    This clearly written textbook provides an accessible introduction to the three programming paradigms of object-oriented/imperative, functional, and logic programming. Highly interactive in style, the text encourages learning through practice, offering test exercises for each topic covered. Review questions and programming projects are also presented, to help reinforce the concepts outside of the classroom. This updated and revised new edition features new material on the Java implementation of the JCoCo virtual machine

    Exploring Sparse Spatial Relation in Graph Inference for Text-Based VQA

    Text-based visual question answering (TextVQA) faces the significant challenge of avoiding redundant relational inference. Specifically, the large number of detected objects and optical character recognition (OCR) tokens results in rich visual relationships, and existing works take all of these relationships into account for answer prediction. However, we make three observations: (1) a single subject in an image can easily be detected as multiple objects with distinct bounding boxes (considered repetitive objects), and the associations between these repetitive objects are superfluous for answer reasoning; (2) two spatially distant OCR tokens detected in an image frequently have weak semantic dependencies for answer reasoning; and (3) the co-existence of nearby objects and tokens may be indicative of important visual cues for predicting answers. Rather than utilizing all relationships for answer prediction, we aim to identify the most important connections and eliminate redundant ones. We propose a sparse spatial graph network (SSGN) that introduces a spatially aware relation pruning technique to this task. As spatial factors for relation measurement, we employ spatial distance, geometric dimension, overlap area, and DIoU for spatially aware pruning. We consider three visual relationships for graph learning: object-object, OCR-OCR token, and object-OCR token relationships. SSGN is a progressive graph learning architecture that verifies the pivotal relations first in the correlated object-token sparse graph, and then in the respective object-based and token-based sparse graphs. Experimental results on the TextVQA and ST-VQA datasets demonstrate that SSGN achieves promising performance, and visualization results further demonstrate the interpretability of our method. Comment: Accepted by TIP 202

    A framework for the analysis and evaluation of enterprise models

    Bibliography: leaves 264-288. The purpose of this study is the development and validation of a comprehensive framework for the analysis and evaluation of enterprise models. The study starts with an extensive literature review of modelling concepts and an overview of the various reference disciplines concerned with enterprise modelling. This overview is more extensive than usual in order to accommodate readers from different backgrounds. The proposed framework is based on the distinction between the syntactic, semantic and pragmatic model aspects and is populated with evaluation criteria drawn from an extensive literature survey. In order to operationalize and empirically validate the framework, an exhaustive survey of enterprise models was conducted. From this survey, an XML database of more than twenty relatively large, publicly available enterprise models was constructed. A strong emphasis was placed on the interdisciplinary nature of this database, and models were drawn from ontology research, linguistics and analysis patterns as well as the traditional fields of data modelling, data warehousing and enterprise systems. The resultant database forms the test bed for the detailed framework-based analysis, and its public availability should constitute a useful contribution to the modelling research community. The bulk of the research is dedicated to implementing and validating specific analysis techniques to quantify the various model evaluation criteria of the framework. The aim for each of the analysis techniques is that it can, where possible, be automated and generalised to other modelling domains. The syntactic measures and analysis techniques originate largely from the disciplines of systems engineering, graph theory and computer science. Various metrics to measure model hierarchy, architecture and complexity are tested and discussed. It is found that many are not particularly useful or valid for enterprise models. Hence some new measures are proposed to assist with model visualization, and an original "model signature" consisting of three key metrics is proposed. Perhaps the most significant contribution of the research lies in the development and validation of a significant number of semantic analysis techniques, drawing heavily on current developments in lexicography, linguistics and ontology research. Some novel and interesting techniques are proposed to measure, inter alia, domain coverage, model genericity, quality of documentation, perspicuity and model similarity. Model similarity in particular is explored in depth by means of various similarity and clustering algorithms as well as ways to visualize the similarity between models. Finally, a number of pragmatic analysis techniques are applied to the models. These include face validity, degree of use, authority of model author, availability, cost, flexibility, adaptability, model currency, maturity and degree of support. This analysis relies mostly on searching for and ranking certain specific information details, often involving a degree of subjective interpretation, although more specific quantitative procedures are suggested for some of the criteria. To aid future researchers, a separate chapter lists some promising analysis techniques that were investigated but found to be problematic from a methodological perspective. More interestingly, this chapter also presents a very strong conceptual case on how the proposed framework and the analysis techniques associated with its various criteria can be applied to many other information systems research areas. The case is presented on the grounds of the underlying isomorphism between the various research areas and is illustrated by suggesting the application of the framework to evaluate web sites, algorithms, software applications, programming languages, system development methodologies and user interfaces

    Sequence-to-sequence learning for machine translation and automatic differentiation for machine learning software tools

    This thesis consists of a series of articles that contribute to the field of machine learning. In particular, it covers two distinct and loosely related fields. The first three articles consider the use of neural network models for problems in natural language processing (NLP). The first article introduces the use of an encoder-decoder structure involving recurrent neural networks (RNNs) to translate from and to variable-length phrases and sentences. The second article contains a quantitative and qualitative analysis of the performance of these 'neural machine translation' models, laying bare the difficulties posed by long sentences and rare words. The third article deals with handling rare and out-of-vocabulary words in neural network models by using dictionary coder compression algorithms and multi-scale RNN models. The second half of this thesis does not deal with specific neural network models, but with the software tools and frameworks that can be used to define and train them. Modern deep learning frameworks need to be able to efficiently execute programs involving linear algebra and array programming, while also being able to employ automatic differentiation (AD) in order to calculate a variety of derivatives. The first article provides an overview of the difficulties posed in reconciling these two objectives, and introduces a graph-based intermediate representation that aims to tackle these difficulties. The second article considers a different approach to the same problem, implementing a tape-based source-code transformation approach to AD on a dynamically typed array programming language (Python and NumPy)

    A Model Driven Approach to Model Transformations

    The OMG's Model Driven Architecture (MDA) initiative has been the focus of much attention in both academia and industry, due to its promise of more rapid and consistent software development through the increased use of models. In order for MDA to reach its full potential, the ability to manipulate and transform models, most obviously from the Platform Independent Model (PIM) to the Platform Specific Models (PSM), is vital. Recognizing this need, the OMG issued a Request For Proposals (RFP) largely concerned with finding a suitable mechanism for transforming models. This paper outlines the relevant background material, summarizes the approach taken by the QVT-Partners (to whom the authors belong), presents a non-trivial example using the QVT-Partners approach, and finally sketches out what the future holds for model transformations