Liveness-Driven Random Program Generation
Randomly generated programs are popular for testing compilers and program
analysis tools, with hundreds of bugs in real-world C compilers found by random
testing. However, existing random program generators may generate large amounts
of dead code (computations whose result is never used). This leaves relatively
little code to exercise a target compiler's more complex optimizations.
To address this shortcoming, we introduce liveness-driven random program
generation. In this approach the random program is constructed bottom-up,
guided by a simultaneous structural data-flow analysis to ensure that the
generator never generates dead code.
The algorithm is implemented as a plugin for the Frama-C framework. We
evaluate it in comparison to Csmith, the standard random C program generator.
Our tool generates programs that compile to more machine code with a more
complex instruction mix.
Comment: Pre-proceedings paper presented at the 27th International Symposium
on Logic-Based Program Synthesis and Transformation (LOPSTR 2017), Namur,
Belgium, 10-12 October 2017 (arXiv:1708.07854)
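To make the notion of dead code concrete, the following is a minimal sketch of a backward liveness pass over straight-line code. It is not the paper's generator (which builds programs bottom-up with a structural data-flow analysis inside Frama-C); the statement encoding and the function name `dead_statements` are assumptions made purely for illustration.

```python
from typing import List, Set, Tuple

# Hypothetical encoding: each statement assigns `target` from some `operands`.
Stmt = Tuple[str, Tuple[str, ...]]


def dead_statements(program: List[Stmt], live_out: Set[str]) -> List[int]:
    """Return indices of assignments whose result is never used."""
    live = set(live_out)
    dead = []
    # Walk backwards: a variable is live if a later live statement reads it.
    for index in range(len(program) - 1, -1, -1):
        target, operands = program[index]
        if target not in live:
            dead.append(index)
            continue  # a dead statement does not make its operands live
        live.discard(target)
        live.update(operands)
    return sorted(dead)


program = [
    ("a", ("x", "y")),  # 0: a = x + y
    ("b", ("a",)),      # 1: b = a * 2
    ("c", ("x",)),      # 2: c = x - 1  (never read afterwards)
    ("r", ("b",)),      # 3: r = b + 1
]
print(dead_statements(program, live_out={"r"}))  # prints [2]
```

A liveness-driven generator would aim to never emit a statement like number 2 in the first place, rather than detecting it afterwards as this sketch does.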
Verification of Autonomous Systems: Developmental Test and Evaluation of an Autonomous UAS Swarming Algorithm Combining Simulation, Formulation and Live Flight
This research was driven by the increase of autonomous systems in the current millennium and the challenging nature of testing and evaluating their performance. A review of the current literature revealed proposed methods for verifying autonomous systems, but few implementations. It exposed several gaps in the current verification and validation methods and suggested goals for filling them. Through the use of modeling, software in the loop (SITL), and flight test, this research verified an autonomous swarming algorithm for unmanned aerial systems (UAS) and validated an exemplar of a testing framework. Thirteen sets of three-vehicle swarm data produced over two days of flight testing provided a baseline algorithm analysis. During these tests, vehicle separation distances deviated an average of 5.61 meters from the ideal state, with separation distance violations < 6.39% of the time. The swarm achieved a 0.27 m average deviation and 0.43% violation in the best cases. Average packet loss between vehicles was 4.94% at a 5 Hz update rate, with an optimal communication lag < 0.04 seconds. The multi-faceted empirical analysis created through the pairing of qualitative and quantitative analysis provided a complete understanding of vehicle behavior. This analysis also identified various areas of improvement for the algorithm and testing framework. The outcomes of this research formed a baseline testing continuum that is adaptable to a variety of follow-on investigations into formal verification of autonomous systems.
Balancing Test Accuracy and Security in Computerized Adaptive Testing
Computerized adaptive testing (CAT) is a form of personalized testing that
accurately measures students' knowledge levels while reducing test length.
Bilevel optimization-based CAT (BOBCAT) is a recent framework that learns a
data-driven question selection algorithm to effectively reduce test length and
improve test accuracy. However, it suffers from high question exposure and test
overlap rates, which can compromise test security. This paper introduces a
constrained version of BOBCAT (C-BOBCAT) that addresses these problems by changing its
optimization setup, enabling us to trade off test accuracy against question
exposure and test overlap rates. We show that C-BOBCAT is effective through
extensive experiments on two real-world adult testing datasets.
Comment: The 24th International Conference on Artificial Intelligence in
Education (AIED 2023)
Deep learning based surrogate modeling and optimization for Microalgal biofuel production and photobioreactor design
Identifying optimal photobioreactor configurations and process operating conditions is
critical to industrialize microalgae-derived biorenewables. Traditionally, this was addressed
by testing numerous design scenarios from integrated physical models coupling
computational fluid dynamics and kinetic modelling. However, this approach presents
computational intractability and numerical instabilities when simulating large-scale systems,
causing time-intensive computing efforts and infeasibility in mathematical optimization.
Therefore, we propose an innovative data-driven surrogate modelling framework which
considerably reduces computing time from months to days by exploiting state-of-the-art deep
learning technology. The framework builds upon a few simulated results from the physical
model to learn the sophisticated hydrodynamic and biochemical kinetic mechanisms; it then
adopts a hybrid stochastic optimization algorithm to explore untested processes and find
optimal solutions. Through verification, this framework was demonstrated to have
comparable accuracy to the physical model. Moreover, multi-objective optimization was
incorporated to generate a Pareto-frontier for decision-making, advancing its applications in
complex biosystems modelling and optimization.
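For readers unfamiliar with the term, a Pareto frontier is the set of candidate designs that no other candidate beats on every objective at once. The sketch below filters such a set from a small list of two-objective scores, assuming both objectives are to be maximized. The data and the function name `pareto_front` are hypothetical, and this brute-force filter is not the hybrid stochastic optimizer described in the abstract.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep the points that no other point dominates (both objectives maximized)."""

    def dominates(p: Point, q: Point) -> bool:
        # p dominates q if it is at least as good on both objectives and differs.
        return p[0] >= q[0] and p[1] >= q[1] and p != q

    return [q for q in points if not any(dominates(p, q) for p in points)]


# Hypothetical (objective_1, objective_2) scores for five candidate designs.
candidates = [(1.0, 5.0), (2.0, 4.0), (1.5, 3.0), (3.0, 1.0), (2.5, 0.5)]
print(pareto_front(candidates))  # prints [(1.0, 5.0), (2.0, 4.0), (3.0, 1.0)]
```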
Learning from Sequential User Data: Models and Sample-efficient Algorithms
Recent advances in deep learning have made learning representations from ever-growing datasets possible in domains such as vision, natural language processing (NLP), and robotics, among others. However, deep networks are notoriously data-hungry; for example, training language models with attention mechanisms sometimes requires trillions of parameters and tokens. In contrast, we can often access only a limited number of samples in many tasks. It is crucial to learn models from these 'limited' datasets. Learning with limited datasets can take several forms. In this thesis, we study how to select data samples sequentially such that downstream task performance is maximized. Moreover, we study how to introduce prior knowledge into deep networks to maximize prediction performance. We focus on four sequential tasks: computerized adaptive testing in psychometrics, sketching in recommender systems, knowledge tracing in computer-assisted education, and career path modeling in the labor market.
In the first two tasks, we devise novel sample-efficient algorithms to query a minimal number of sequential samples to improve future predictions. We propose a Bilevel Optimization-Based framework for computerized adaptive testing to learn a data-driven question selection algorithm that improves existing data selection policies. We also tackle the sketching problem in recommender systems, with the task of recommending the next item using a stored subset of prior data samples. In this setting, we develop a data-driven sequential selection algorithm that tackles an evolving downstream task distribution. In the last two tasks, we devise novel neural models to introduce prior knowledge exploiting limited data samples. For knowledge tracing, we propose a novel neural architecture, inspired by cognitive and psychometric models, to improve the prediction of students' future performance and utilize the labeled data samples efficiently. For career path modeling, we propose a novel and interpretable monotonic nonlinear state-space model to analyze online user professional profiles and provide actionable feedback and recommendations to users on how they can reach their career goals.
The data-driven differentiable data selection algorithms for the first two tasks open up future directions to query (a non-differentiable operation) a minimal number of samples optimally to maximize prediction performance. The structures, introduced in the neural architecture for the models in the last two tasks using prior knowledge, open up future directions to learn deep models augmented with prior knowledge using limited data samples
Data-Driven Dynamic Robust Resource Allocation: Application to Efficient Transportation
The transformation to smarter cities brings an array of emerging urbanization challenges. With the development of technologies such as sensor networks, storage devices, and cloud computing, we are able to collect, store, and analyze a large amount of data in real time. Modern cities have brought to life unprecedented opportunities and challenges for allocating limited resources in a data-driven way. Intelligent transportation systems are one emerging research area, in which sensing data provides opportunities for understanding spatial-temporal patterns of demand and human mobility. However, greedy or matching algorithms that only deal with known requests are far from efficient in the long run, because they do not consider demand information predicted from data.
In this dissertation, we develop a data-driven robust resource allocation framework to consider spatial-temporally correlated demand and demand uncertainties, motivated by the problem of efficient dispatching of taxis or autonomous vehicles. We first present a receding horizon control (RHC) framework to dispatch taxis towards predicted demand; this framework incorporates both information from historical record data and real-time GPS location and occupancy status data. It also allows us to allocate resources from a globally optimal perspective over a longer time period, in addition to the local-level greedy or matching algorithm that assigns a passenger pick-up location to each vacant vehicle. The objectives include reducing both current and anticipated future total idle driving distance and matching the spatial-temporal ratio between demand and supply for service quality. We then present a robust optimization method to consider spatial-temporally correlated demand model uncertainties that can be expressed in closed convex sets. Uncertainty sets of demand vectors are constructed from data based on theories in hypothesis testing, and the sets provide a desired probabilistic guarantee level for the performance of dispatch solutions. To minimize the average resource allocation cost under demand uncertainties, we develop a general data-driven dynamic distributionally robust resource allocation model. An efficient algorithm for building demand uncertainty sets that are compatible with various demand prediction methods is developed. We prove equivalent computationally tractable forms of the robust and distributionally robust resource allocation problems using strong duality. The resource allocation problem aims to balance the demand-supply ratio at different nodes of the network with minimum balancing and re-balancing cost, with decision variables in the denominator, a case that has not been covered by previous work.
Trace-driven analysis with real taxi operational record data of San Francisco shows that the RHC framework reduces the average total idle distance of taxis by 52%, and evaluations with over 100 GB of New York City taxi trip data show that robust and distributionally robust dispatch methods reduce the average total idle distance by 10% more compared with non-robust solutions. Besides increasing the service efficiency by reducing total idle driving distance, the resource allocation methods in this dissertation also reduce the demand-supply ratio mismatch error across the city.