10 research outputs found

    Synthetic Data Generation for Statistical Testing

    Usage-based statistical testing employs knowledge about the actual or anticipated usage profile of the system under test for estimating system reliability. For many systems, usage-based statistical testing involves generating synthetic test data. Such data must possess the same statistical characteristics as the actual data that the system will process during operation. Synthetic test data must further satisfy any logical validity constraints that the actual data is subject to. Targeting data-intensive systems, we propose an approach for generating synthetic test data that is both statistically representative and logically valid. The approach works by first generating a data sample that meets the desired statistical characteristics, without taking into account the logical constraints. Subsequently, the approach tweaks the generated sample to fix any logical constraint violations. The tweaking process is iterative and continuously guided toward achieving the desired statistical characteristics. We report on a realistic evaluation of the approach, where we generate a synthetic population of citizens' records for testing a public administration IT system. Results suggest that our approach is scalable and capable
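The two-step idea in the abstract above (sample for statistical representativeness first, then tweak records to remove logical violations while staying close to the target statistics) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the attribute names, the target distributions, the constraints, and the greedy single-pass repair are hypothetical and are not the authors' algorithm, which the abstract describes only at a high level.

```python
import random
from collections import Counter

# Hypothetical target distributions (illustrative values, not from the paper).
TARGET = {
    "age_group": {"child": 0.2, "adult": 0.6, "senior": 0.2},
    "status": {"student": 0.25, "employed": 0.5, "retired": 0.25},
}


def is_valid(record):
    """Hypothetical logical validity constraints on one citizen record."""
    if record["age_group"] == "child" and record["status"] != "student":
        return False
    if record["age_group"] == "senior" and record["status"] == "student":
        return False
    return True


def total_distance(sample):
    """Sum of absolute gaps between empirical and target frequencies."""
    n = len(sample)
    gap = 0.0
    for attribute, dist in TARGET.items():
        counts = Counter(r[attribute] for r in sample)
        gap += sum(abs(counts[v] / n - p) for v, p in dist.items())
    return gap


def generate(n, rng):
    """Step 1: sample each attribute independently, ignoring constraints."""
    return [
        {a: rng.choices(list(d), weights=list(d.values()))[0]
         for a, d in TARGET.items()}
        for _ in range(n)
    ]


def repair(sample):
    """Step 2: fix each violation with the single-attribute change that
    leaves the sample closest to the target distributions."""
    for record in sample:
        if is_valid(record):
            continue
        best = None  # (distance, attribute, value)
        for attribute, dist in TARGET.items():
            original = record[attribute]
            for value in dist:
                record[attribute] = value
                if is_valid(record):
                    score = total_distance(sample)
                    if best is None or score < best[0]:
                        best = (score, attribute, value)
            record[attribute] = original
        # With the constraints above, a one-attribute fix always exists.
        _, attribute, value = best
        record[attribute] = value
    return sample


sample = repair(generate(500, random.Random(0)))
print("violations:", sum(not is_valid(r) for r in sample))
print("distance to target:", round(total_distance(sample), 3))
```

The repair step is greedy and makes one pass; an iterative procedure like the one the abstract describes would presumably revisit records until the statistics converge.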

    Model-Based Simulation of Legal Policies: Framework, Tool Support, and Validation

    Simulation of legal policies is an important decision-support tool in domains such as taxation. The primary goal of legal policy simulation is predicting how changes in the law affect measures of interest, e.g., revenue. Legal policy simulation is currently implemented using a combination of spreadsheets and software code. Such a direct implementation poses a validation challenge. In particular, legal experts often lack the necessary software background to review complex spreadsheets and code. Consequently, these experts currently have no reliable means to check the correctness of simulations against the requirements envisaged by the law. A further challenge is that representative data for simulation may be unavailable, thus necessitating a data generator. A hard-coded generator is difficult to build and validate. We develop a framework for legal policy simulation that is aimed at addressing the challenges above. The framework uses models for specifying both legal policies and the probabilistic characteristics of the underlying population. We devise an automated algorithm for simulation data generation. We evaluate our framework through a case study on Luxembourg’s Tax Law
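As a rough illustration of what policy simulation over a generated population involves, the sketch below samples taxpayers from a probabilistic population description and compares revenue under two policy variants. All names, distributions, and the flat-rate rule are hypothetical assumptions; they do not reflect Luxembourg's tax law or the framework's model-based notation, which specifies policies and population characteristics as models rather than as Python functions.

```python
import random


def sample_taxpayer(rng):
    """Hypothetical probabilistic population model for one taxpayer."""
    return {
        "income": rng.lognormvariate(10.5, 0.6),
        "children": rng.choices([0, 1, 2, 3], weights=[0.4, 0.3, 0.2, 0.1])[0],
    }


def policy_tax(taxpayer, rate, child_allowance):
    """Hypothetical policy: flat rate on income after a per-child allowance."""
    deduction = child_allowance * taxpayer["children"]
    taxable = max(0.0, taxpayer["income"] - deduction)
    return rate * taxable


def simulate_revenue(population, rate, child_allowance):
    """Apply the policy to every simulated taxpayer and total the result."""
    return sum(policy_tax(t, rate, child_allowance) for t in population)


rng = random.Random(42)
population = [sample_taxpayer(rng) for _ in range(2000)]

# Compare the measure of interest (revenue) before and after a policy change.
baseline = simulate_revenue(population, rate=0.20, child_allowance=1000.0)
reform = simulate_revenue(population, rate=0.20, child_allowance=2000.0)
print(f"Revenue change under the reform: {reform - baseline:.2f}")
```

Running both variants on the same generated population isolates the effect of the policy change from sampling noise.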

    A Model-Based Framework for Legal Policy Simulation and Compliance Checking

    Information systems implementing requirements from laws and regulations, such as taxes and social benefits, need to be thoroughly verified to demonstrate their compliance. Several Verification and Validation (V&V) techniques, such as reliability testing, and modeling and simulation, can be used for assessing that such systems meet their legal requirements. Typically, one has to model the expected (legal) behavior of the system in a form that can be executed (simulated), subject the resulting models and the system to the same input data, and then compare the observed behavior of the model simulation and system execution. Existing V&V techniques often rely on code and complex logical expressions with no intuitive appeal to legal experts for specifying the expected behavior of a given system. Subsequently, one has no practical way to validate with legal experts that the underlying legal requirements are indeed complete and constitute a faithful representation of what needs to be implemented. Further, manually defining the expected behavior of a system and its test oracles is a tedious and error-prone task. The challenge here is to find a suitable knowledge representation that can be understood by all the involved stakeholders, e.g., software engineers and legal experts, but that remains complete and precise enough to enable automated analysis such as simulation and testing. As real data is seldom accessible in highly regulated domains, V&V requires the generation of synthetic testing data that can be used to build confidence in the reliability of the system under test. In particular, such data has to be structurally and logically well-formed to raise meaningful failures that can help reasoning about the reliability of the system under test. Further, the data should exhibit as much as possible the actual or anticipated system usage to help mimic how the system would behave under realistic circumstances. 
Generating such data is not a trivial task as the underlying data schemas are usually large and subject to numerous complex domain-related logical constraints. In this thesis, we investigate the use of the Unified Modeling Language (UML) and model-driven technologies, e.g., model-to-code transformations, to facilitate V&V activities for information systems that have to conform to laws and regulations, while tackling the above challenges. All our technical solutions have been developed and empirically evaluated in close collaboration with a government administration. Concretely, the technical solutions covered by this thesis include: - A modeling notation and methodology for formalizing legal policies. We propose a modeling notation and methodology for building abstract interpretations of the law. Models built using our methodology are simple enough to be understood by the involved stakeholders and are, at the same time, detailed enough to enable automated V&V activities. - A model-based simulation framework. We develop a model-based framework and associated tool support for simulating legal policies, when formalized using the aforementioned modeling methodology. Simulation provides a comparison baseline of how a compliant system should behave. Further, simulation is a means to support decision-making when considering legal changes. Specifically, we report on a sizable case study where we assess the anticipated economic implications of a given policy change in Luxembourg’s tax law. - A model-based generator of test cases for reliability testing. We develop a heuristic approach for generating valid and representative test cases (data). Our generator is scalable and produces high-quality test data that is suitable for testing the reliability of data-intensive systems, e.g., a tax management system

    A Model-Based Framework for Legal Policy Simulation and Legal Compliance Checking

    Analyzing legal policies for many laws, such as taxes and social benefits, is a common way for governments to identify risks, e.g., risk of legal policies not achieving expected revenue. A typical analysis includes validation of policies and the verification of the systems implementing them. One efficient way to validate policies is simulation, e.g., by simulating whether a proposed law reform would realize target objectives. Once validated, policies are implemented into public administration procedures and eGovernment applications. Systems implementing legal policies also need to be analyzed and verified, e.g., through testing, to ensure that they are compliant with the underlying policies. Currently, legal policy analysis is conducted using a combination of spreadsheets and software code. Such a strategy suffers mainly from being hard to use by legal experts due to the lack of adequate background. This is partly rooted in the fact that available techniques to formalize legal policies are based on complex logical expressions and code. The main goal of this research project, which this paper describes, is to narrow the aforementioned expertise gap by proposing convenient, systematic and automated techniques to support analysis of legal policies from their design to their implementation

    A Model-Based Framework for Probabilistic Simulation of Legal Policies

    Legal policy simulation is an important decision-support tool in domains such as taxation. The primary goal of legal policy simulation is predicting how changes in the law affect measures of interest, e.g., revenue. Currently, legal policies are simulated via a combination of spreadsheets and software code. This poses a validation challenge both due to complexity reasons and due to legal experts lacking the expertise to understand software code. A further challenge is that representative data for simulation may be unavailable, thus necessitating a data generator. We develop a framework for legal policy simulation that is aimed at addressing these challenges. The framework uses models for specifying both legal policies and the probabilistic characteristics of the underlying population. We devise an automated algorithm for simulation data generation. We evaluate our framework through a case study on Luxembourg's Tax Law

    Using Models to Enable Compliance Checking against the GDPR: An Experience Report

    The General Data Protection Regulation (GDPR) harmonizes data privacy laws and regulations across Europe. Through the GDPR, individuals are able to better control their personal data in the face of new technological developments. While the GDPR is highly advantageous to citizens, complying with it poses major challenges for organizations that control or process personal data. Since no automated solution with broad industrial applicability currently exists for GDPR compliance checking, organizations have no choice but to perform costly manual audits to ensure compliance. In this paper, we share our experience building a UML representation of the GDPR as a first step towards the development of future automated methods for assessing compliance with the GDPR. Given that a concrete implementation of the GDPR is affected by the national laws of the EU member states, GDPR’s expanding body of case law and other contextual information, we propose a two-tiered representation of the GDPR: a generic tier and a specialized tier. The generic tier captures the concepts and principles of the GDPR that apply to all contexts, whereas the specialized tier describes a specific tailoring of the generic tier to a given context, including the contextual variations that may impact the interpretation and application of the GDPR. We further present the challenges we faced in our modeling endeavor, the lessons we learned from it, and future directions for research

    Modeling Data Protection and Privacy: Application and Experience with GDPR

    In Europe and indeed worldwide, the General Data Protection Regulation (GDPR) provides protection to individuals regarding their personal data in the face of new technological developments. GDPR is widely viewed as the benchmark for data protection and privacy regulations that harmonizes data privacy laws across Europe. Although the GDPR is highly beneficial to individuals, it presents significant challenges for organizations monitoring or storing personal information. Since there is currently no automated solution with broad industrial applicability, organizations have no choice but to carry out expensive manual audits to ensure GDPR compliance. In this paper, we present a complete GDPR UML model as a first step towards designing automated methods for checking GDPR compliance. Given that the practical application of the GDPR is influenced by national laws of the EU Member States, we suggest a two-tiered description of the GDPR, generic and specialized. In this paper, we provide (1) the GDPR conceptual model we developed with complete traceability from its classes to the GDPR, (2) a glossary to help understand the model, (3) the plain-English description of 35 compliance rules derived from GDPR along with their encoding in OCL, and (4) the set of 20 variation points derived from GDPR to specialize the generic model. We further present the challenges we faced in our modeling endeavor, the lessons we learned from it, and future directions for research