
    Automatic error localisation for categorical, continuous and integer data

    Data collected by statistical offices generally contain errors, which have to be corrected before reliable data can be published. This correction process is referred to as statistical data editing. At statistical offices, certain rules, so-called edits, are often used during the editing process to determine whether a record is consistent. Inconsistent records are considered to contain errors, while consistent records are considered error-free. In this article we focus on automatic error localisation based on the Fellegi-Holt paradigm, which says that the data should be made to satisfy all edits by changing the smallest possible number of fields. Adopting this paradigm leads to a mathematical optimisation problem. We propose an algorithm for solving this optimisation problem for a mix of categorical, continuous and integer-valued data, as well as a heuristic procedure based on the exact algorithm. We evaluate the performance of this heuristic procedure on five realistic data sets involving only integer-valued variables.
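
    To make the Fellegi-Holt criterion concrete, the sketch below localises errors in a toy record by trying ever-larger sets of fields until some assignment of new values satisfies every edit. The field names, edit rules, and domains are invented for illustration, and brute-force search stands in for the article's exact algorithm, which scales far better.

```python
# A minimal brute-force sketch of Fellegi-Holt error localisation,
# assuming a record with small discrete domains; the edit rules and
# field names below are illustrative, not taken from the article.
from itertools import combinations, product

# Each edit is a predicate that must hold for a consistent record.
edits = [
    lambda r: r["turnover"] == r["costs"] + r["profit"],
    lambda r: r["costs"] >= 0,
    lambda r: r["profit"] <= r["turnover"],
]

# Candidate values per field (tiny domains keep the search tractable).
domains = {
    "turnover": range(0, 21),
    "costs": range(0, 21),
    "profit": range(-5, 21),
}

def localise(record):
    """Return a smallest set of fields whose values can be changed
    so that every edit is satisfied (the Fellegi-Holt criterion)."""
    fields = list(record)
    for k in range(len(fields) + 1):          # fewest fields first
        for subset in combinations(fields, k):
            for values in product(*(domains[f] for f in subset)):
                trial = dict(record, **dict(zip(subset, values)))
                if all(edit(trial) for edit in edits):
                    return subset
    return None

# Inconsistent record: turnover != costs + profit.
print(localise({"turnover": 10, "costs": 7, "profit": 5}))  # ('turnover',)
```

    For realistic data the search space explodes combinatorially, which is precisely why dedicated exact algorithms and heuristics such as those proposed in the article are needed.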

    Integer Affine Transformations of Parametric Z-polytopes and Applications to Loop Nest Optimization

    The polyhedral model is a well-known compiler optimization framework for the analysis and transformation of affine loop nests. We present a new method for a difficult geometric operation raised by this model: the integer affine transformation of parametric Z-polytopes. The result of such a transformation is given by a worst-case exponential union of Z-polytopes. We also propose a polynomial-time algorithm (for fixed dimension) for counting points in arbitrary unions of a fixed number of parametric Z-polytopes. We implemented these algorithms and compared them to other existing algorithms on a set of applications to loop nest analysis and optimization.
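
    As a concrete, if naive, illustration of the objects involved, the sketch below enumerates the integer points of a small non-parametric Z-polytope, i.e. a polytope intersected with an affine lattice, and maps them through an integer affine transformation. Enumeration over a bounding box stands in for the paper's symbolic algorithms; all matrices and bounds here are invented.

```python
# A tiny brute-force sketch of a Z-polytope (polytope intersected with
# a lattice) and its image under an integer affine map.  Enumeration
# replaces the paper's symbolic algorithms and only works for small,
# fixed 2-D bounds; all numbers are illustrative.
import numpy as np

def zpolytope_points(A, b, L, offset, box):
    """Integer points p with A @ p <= b that lie on the lattice
    {L @ k + offset | k integer}, searched inside the given 2-D box."""
    pts = []
    (x_lo, x_hi), (y_lo, y_hi) = box
    for x in range(x_lo, x_hi + 1):
        for y in range(y_lo, y_hi + 1):
            p = np.array([x, y])
            if not np.all(A @ p <= b):
                continue
            # Lattice membership: L @ k = p - offset for integer k.
            k = np.linalg.solve(L, p - offset)
            if np.allclose(k, np.round(k)):
                pts.append(p)
    return pts

# Z-polytope: x, y >= 0 and x + y <= 6, on the lattice of even points.
A = np.array([[-1, 0], [0, -1], [1, 1]])
b = np.array([0, 0, 6])
L = np.array([[2, 0], [0, 2]])
offset = np.array([0, 0])
points = zpolytope_points(A, b, L, offset, box=((0, 6), (0, 6)))

# Integer affine transformation T(p) = M @ p + t.
M = np.array([[1, 1], [0, 1]])
t = np.array([1, 0])
image = [M @ p + t for p in points]
print(len(points), "points; image:", [tuple(int(c) for c in q) for q in image])
```

    Note that the image of a Z-polytope under such a map need not be a single Z-polytope, which is exactly the source of the worst-case exponential unions the paper deals with.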

    Types for DSP Assembler Programs


    Processing of Erroneous and Unsafe Data

    Statistical offices have to overcome many problems before they can publish reliable data. Two of these problems are examined in this thesis. The first is the occurrence of errors in the collected data. Because of these errors, publication figures cannot be based directly on the collected data; before publication, the errors have to be localised and corrected. In this thesis we focus on the localisation of errors in a mix of categorical and numerical data. The problem is formulated as a mathematical optimisation problem, several new algorithms for solving it are proposed, and computational results of the most promising algorithms are compared. The second problem examined in this thesis is the occurrence of unsafe data, i.e. data that would reveal too much sensitive information about individual respondents. Before publication, such unsafe data need to be protected. In the thesis we examine various aspects of the protection of unsafe data.

    Statistical offices must overcome numerous problems before they can publish the results of their surveys. This thesis addresses two of these problems. The first is that the collected data may be erroneous. Because errors may be present, the data must first be checked and, where necessary, corrected before any results are published. The thesis pays particular attention to locating the erroneous values. By assuming that as few errors as possible have been made, locating the erroneous values can be formulated as a mathematical optimisation problem, and the thesis develops a number of methods to solve this complex problem efficiently. The second problem examined in the thesis is that no data may be published that would harm the privacy of individual respondents or of small groups of respondents. To protect the data of individual respondents or small groups, protective measures, such as withholding certain information from publication, must be taken. The thesis discusses the mathematical problems involved in protecting sensitive data and describes solutions for several of them, such as computing the information loss caused by protecting sensitive data and minimising the amount of information withheld from publication.
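
    The thesis's first problem is illustrated by the error-localisation sketch earlier in this list. For the second, the sketch below shows one standard disclosure-control step: flagging unsafe magnitude-table cells with the (n, k)-dominance rule, under which a cell is unsafe when its n largest contributions exceed k percent of the cell total. This is a common textbook rule rather than one taken from the thesis, and the table data are invented.

```python
# A minimal sketch of primary suppression in statistical disclosure
# control, using the standard (n, k)-dominance rule.  The thresholds,
# cell layout, and contributions below are invented for illustration.

def unsafe(contributions, n=2, k=85):
    """(n, k)-dominance rule: True if the n largest contributions
    account for more than k percent of the cell total."""
    total = sum(contributions)
    if total == 0:
        return False
    top = sum(sorted(contributions, reverse=True)[:n])
    return 100 * top / total > k

# One cell per table position, each holding respondent contributions.
table = {
    ("north", "retail"): [40, 35, 5, 3],        # top two ~90% -> unsafe
    ("north", "services"): [20, 18, 17, 15, 14],
    ("south", "retail"): [90, 2, 1],            # one dominant respondent
}

suppressed = {cell for cell, c in table.items() if unsafe(c)}
print("primary suppressions:", suppressed)
```

    Primary suppression alone is not enough in practice: further (secondary) cells must usually be withheld so the suppressed values cannot be recomputed from row and column totals, which is where the information-loss minimisation problems mentioned above arise.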

    Experiences with Constraint-based Array Dependence Analysis

    Array data dependence analysis provides important information for the optimization of scientific programs. Array dependence testing can be viewed as constraint analysis, although general-purpose constraint manipulation algorithms have traditionally been thought too slow for dependence analysis. We have explored the use of exact constraint analysis, based on Fourier's method, for array data dependence analysis, and have found that these techniques can be used without a great impact on total compile time. Furthermore, the use of general-purpose algorithms has allowed us to address problems beyond traditional dependence analysis. In this paper, we summarize some of the constraint manipulation techniques we use for dependence analysis and discuss some of the reasons for our performance results.
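
    The sketch below illustrates the core of Fourier's method (Fourier-Motzkin elimination): variables are eliminated one by one by combining inequalities of opposite sign, and an infeasible constant row proves the constraint system, and hence the dependence, impossible. This rational-arithmetic version omits the integer-exactness refinements a real dependence tester needs, and the loop example is invented.

```python
# A small Fourier-Motzkin elimination sketch for dependence testing.
# Each inequality is (coeffs, rhs), meaning coeffs . x <= rhs.
from fractions import Fraction

def eliminate(ineqs, v):
    """Eliminate variable v, returning an equivalent system without v."""
    pos, neg, rest = [], [], []
    for coeffs, rhs in ineqs:
        (pos if coeffs[v] > 0 else neg if coeffs[v] < 0 else rest).append(
            (coeffs, rhs))
    out = list(rest)
    for pc, pr in pos:
        for nc, nr in neg:
            # Scale so v's coefficients cancel, then add the rows.
            a, b = pc[v], -nc[v]
            coeffs = [b * x + a * y for x, y in zip(pc, nc)]
            out.append((coeffs, b * pr + a * nr))
    return out

def feasible(ineqs, nvars):
    for v in range(nvars):
        ineqs = eliminate(ineqs, v)
    # Only constant rows remain: each says 0 <= rhs.
    return all(rhs >= 0 for _, rhs in ineqs)

# Dependence test for:  for i in 0..5: a[i] = a[i+10] + 1
# A write a[i] and a read a[j+10] touch the same element only if
# i = j + 10 with 0 <= i, j <= 5.  Variables are (i, j).
F = Fraction
system = [
    ([F(1), F(-1)], F(10)), ([F(-1), F(1)], F(-10)),  # i - j = 10
    ([F(-1), F(0)], F(0)), ([F(1), F(0)], F(5)),      # 0 <= i <= 5
    ([F(0), F(-1)], F(0)), ([F(0), F(1)], F(5)),      # 0 <= j <= 5
]
print("dependence possible:", feasible(system, 2))    # -> False
```

    Over the rationals this test is exact for infeasibility, but a feasible rational system does not guarantee an integer solution, which is why practical testers layer integer reasoning on top of the elimination step.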