
    State-of-the-art on evolution and reactivity

    This report starts, in Chapter 1, by outlining aspects of querying and updating resources on the Web and on the Semantic Web, including the development of query and update languages to be carried out within the Rewerse project. From this outline it becomes clear that several existing research areas and topics are of interest for this work in Rewerse. In the remainder of this report we present state-of-the-art surveys of a selection of such areas and topics. More precisely: in Chapter 2 we give an overview of logics for reasoning about state change and updates; Chapter 3 briefly describes existing update languages for the Web, and also for updating logic programs; Chapter 4 surveys event-condition-action rules, both in the context of active database systems and in the context of semistructured data; and Chapter 5 gives an overview of some relevant rule-based agent frameworks.

    A Framework for the Automatic Physical Configuration and Tuning of a Mysql Community Server

    Manual physical configuration and tuning of database servers is a complicated task requiring a high level of expertise. Database administrators must consider numerous possibilities to determine a candidate configuration for implementation. In recent times, database vendors have responded to this problem by providing solutions which can automatically configure and tune their products. These solutions significantly reduce the poor configuration choices, and the resulting performance degradation, that are commonplace in manual configurations. However, no such solution exists for MySQL Community Server. This thesis proposes a novel framework for automatically tuning a MySQL Community Server. A first iteration of the framework has been built and is presented in this paper together with its performance measurements
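
    The abstract does not detail the tuning algorithm itself, but the general shape of such a tuner can be sketched as a measure-adjust-keep loop. The sketch below is a generic illustration under that assumption, not the framework proposed in the thesis; apply_setting and run_benchmark are hypothetical stand-ins for a real SET GLOBAL call and workload driver against a MySQL server.

    import random

    def apply_setting(buffer_pool_mb: int) -> None:
        # Hypothetical stand-in: a real tuner would issue something like
        # SET GLOBAL innodb_buffer_pool_size = ... against the server.
        pass

    def run_benchmark(buffer_pool_mb: int) -> float:
        # Hypothetical stand-in: a real tuner would replay a fixed workload and
        # return transactions per second; a noisy synthetic curve is used here.
        return 1000 * (1 - abs(buffer_pool_mb - 2048) / 4096) + random.gauss(0, 5)

    # Simple hill climbing over one knob: keep a candidate value only if the
    # measured throughput improves.
    setting, best_tps = 512, float("-inf")
    for _ in range(50):
        candidate = max(128, setting + random.choice([-256, 256]))
        apply_setting(candidate)
        tps = run_benchmark(candidate)
        if tps > best_tps:
            setting, best_tps = candidate, tps

    print(f"best buffer pool size tried: {setting} MB ({best_tps:.0f} tps)")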

    Easy over Hard: A Case Study on Deep Learning

    While deep learning is an exciting new technique, the benefits of this method need to be assessed with respect to its computational cost. This is particularly important for deep learning since these learners need hours (to weeks) to train the model. Such long training times limit the ability of (a) a researcher to test the stability of their conclusions via repeated runs with different random seeds; and (b) other researchers to repeat, improve, or even refute that original work. For example, deep learning was recently used to find which questions in the Stack Overflow programmer discussion forum can be linked together. That deep learning system took 14 hours to execute. We show here that applying a very simple optimizer called DE to fine-tune an SVM achieves similar (and sometimes better) results. The DE approach terminated in 10 minutes, i.e. 84 times faster than the deep learning method. We offer these results as a cautionary tale to the software analytics community and suggest that not every new innovation should be applied without critical analysis. If researchers deploy some new and expensive process, that work should be baselined against some simpler and faster alternatives. Comment: 12 pages, 6 figures, accepted at FSE201
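
    The core recipe, tuning a conventional learner's hyperparameters with differential evolution (assuming that is what DE abbreviates here) instead of training a deep model, fits in a few lines. The dataset, parameter bounds, and DE settings below are illustrative assumptions, not the paper's Stack Overflow setup.

    from scipy.optimize import differential_evolution
    from sklearn.datasets import load_digits
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    # Stand-in dataset; the paper tuned an SVM on Stack Overflow text features.
    X, y = load_digits(return_X_y=True)

    def negated_cv_accuracy(params):
        # DE minimises, so return the negated cross-validated accuracy.
        C, gamma = params
        return -cross_val_score(SVC(C=C, gamma=gamma), X, y, cv=3).mean()

    # DE needs only parameter bounds, no gradients, and relatively few evaluations.
    result = differential_evolution(negated_cv_accuracy,
                                    bounds=[(0.1, 100.0), (1e-4, 1.0)],
                                    maxiter=10, popsize=10, seed=1)
    print("best C, gamma:", result.x, "accuracy:", -result.fun)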

    An Economic Analysis of Privacy Protection and Statistical Accuracy as Social Choices

    Statistical agencies face a dual mandate to publish accurate statistics while protecting respondent privacy. Increasing privacy protection requires decreased accuracy. Recognizing this as a resource allocation problem, we propose an economic solution: operate where the marginal cost of increasing privacy equals the marginal benefit. Our model of production, from computer science, assumes data are published using an efficient differentially private algorithm. Optimal choice weighs the demand for accurate statistics against the demand for privacy. Examples from U.S. statistical programs show how our framework can guide decision-making. Further progress requires a better understanding of willingness-to-pay for privacy and statistical accuracy
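
    The decision rule can be illustrated with a toy calculation: choose the privacy-loss parameter epsilon where the marginal benefit of additional accuracy equals the marginal cost of weaker privacy. The functional forms and coefficients below are illustrative assumptions, not the authors' calibrated model.

    import numpy as np
    from scipy.optimize import brentq

    def marginal_benefit(eps):
        # Assumed form: accuracy gains from a looser privacy budget, with diminishing returns.
        return 1.0 / np.sqrt(eps)

    def marginal_cost(eps):
        # Assumed form: privacy harm growing linearly in epsilon.
        return 2.0 * eps

    # The optimum is where marginal benefit equals marginal cost.
    eps_star = brentq(lambda e: marginal_benefit(e) - marginal_cost(e), 1e-3, 10.0)
    print(f"illustrative optimal epsilon: {eps_star:.3f}")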

    Data Science and Ebola

    Data Science: Today, everybody and everything produces data. People produce large amounts of data in social networks and in commercial transactions. Medical, corporate, and government databases continue to grow. Sensors continue to get cheaper and are increasingly connected, creating an Internet of Things, and generating even more data. In every discipline, large, diverse, and rich data sets are emerging, from astrophysics, to the life sciences, to the behavioral sciences, to finance and commerce, to the humanities and to the arts. In every discipline people want to organize, analyze, optimize and understand their data to answer questions and to deepen insights. The science that is transforming this ocean of data into a sea of knowledge is called data science. This lecture will discuss how data science has changed the way in which one of the most visible challenges to public health is handled, the 2014 Ebola outbreak in West Africa. Comment: Inaugural lecture Leiden Universit

    Metaphysics of Internal Controls

    A quality internal control system has been seen as a remedy for various corporate governance issues. Two pieces of legislation, the Foreign Corrupt Practices Act (FCPA) and the Sarbanes-Oxley Act (SOX), deal with very different corporate governance issues, but each argues for a similar remedy. Both the FCPA and SOX argue that improved (or proper) internal controls are necessary to root out bribery of foreign officials, in the case of the FCPA, and, in the case of SOX, to support the accurate preparation of financial statements. An issue that has yet to be resolved is that the quality of internal control systems is subject to subjective assessments of internal control deficiencies and their impact. This paper presents a mathematical model of internal controls based on Gödel numbering of axioms. This results in the representation of quality internal controls in terms of an integer. This approach also allows for inferences about financial statements and various auditing judgements
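
    The encoding the paper relies on can be sketched directly: a finite sequence of control-attribute scores is packed into a single integer as a product of prime powers and recovered by factoring. The scores below are a made-up example, not the paper's actual axiomatisation of internal controls.

    PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

    def godel_encode(codes):
        # Raise the i-th prime to the i-th score and multiply: one integer per sequence.
        n = 1
        for p, c in zip(PRIMES, codes):
            n *= p ** c
        return n

    def godel_decode(n, length):
        # Recover each score by counting how often its prime divides the integer.
        codes = []
        for p in PRIMES[:length]:
            c = 0
            while n % p == 0:
                n //= p
                c += 1
            codes.append(c)
        return codes

    g = godel_encode([3, 1, 4])      # 2**3 * 3**1 * 5**4 = 15000
    print(g, godel_decode(g, 3))     # -> 15000 [3, 1, 4]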

    An Integrated Engineering-Computation Framework for Collaborative Engineering: An Application in Project Management

    Today's engineering applications suffer from a severe integration problem. Engineering, as an entire process, consists of a myriad of individual, often complex, tasks. Most computer tools support particular tasks in engineering, but the output of one tool differs from that of the others. Thus, users must re-enter the relevant information in the format required by another tool. Moreover, the development process of a new product/process usually involves several teams of engineers with different backgrounds and responsibilities, for example mechanical engineers, cost estimators, manufacturing engineers, quality engineers, and project managers. Engineers need tools to share technical and managerial information and to instantly access the latest changes made by any team member, so that the impact of these changes on all disciplines (cost, time, resources, etc.) can be determined right away. In other words, engineers need to participate in a truly collaborative environment for the achievement of a common objective: completing the product/process design project in a timely, cost-effective, and optimal manner. In this thesis, a new framework that integrates the capabilities of four commercial software packages, Microsoft Excel™ (spreadsheet), Microsoft Project™ (project management), What's Best! (an optimization add-in), and Visual Basic™ (programming language), with a state-of-the-art object-oriented database (knowledge medium), InnerCircle2000™, is presented and applied to handle the Cost-Time Trade-Off problem in project networks. The result is a solution vastly superior to the conventional one from the viewpoint of data handling, completeness of the solution space, and support for a collaborative engineering-computation environment
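
    The underlying Cost-Time Trade-Off problem ("crashing" a project network to meet a deadline at minimum extra cost) can be posed as a small linear program. The sketch below uses SciPy in place of the Excel/What's Best! tool chain described in the thesis, and the four-activity network, durations, crash limits, and crash costs are made-up illustrative data.

    from scipy.optimize import linprog

    # activity: (start node, end node, normal duration, max crash, crash cost per day)
    activities = {
        "A": (1, 2, 4, 2, 100),
        "B": (1, 3, 3, 1, 50),
        "C": (2, 4, 5, 2, 80),
        "D": (3, 4, 6, 3, 60),
    }
    deadline = 8
    nodes = [1, 2, 3, 4]

    names = list(activities)
    n_x, n_t = len(names), len(nodes)
    # decision vector: [crash amount per activity, event time per node]
    c = [activities[a][4] for a in names] + [0.0] * n_t

    A_ub, b_ub = [], []
    for k, a in enumerate(names):
        i, j, dur, _, _ = activities[a]
        row = [0.0] * (n_x + n_t)
        row[k] = -1.0                        # minus crash amount of this activity
        row[n_x + nodes.index(i)] = 1.0      # plus start-event time
        row[n_x + nodes.index(j)] = -1.0     # minus end-event time
        A_ub.append(row)                     # t_i - t_j - x_a <= -duration
        b_ub.append(-dur)

    bounds = [(0, activities[a][3]) for a in names]          # crash limits
    bounds += [(0, 0), (0, None), (0, None), (0, deadline)]  # event-time bounds

    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    print("minimum crashing cost:", res.fun)
    print("crash amounts:", dict(zip(names, res.x[:n_x].round(2))))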

    Too Trivial To Test? An Inverse View on Defect Prediction to Identify Methods with Low Fault Risk

    Background. Test resources are usually limited and therefore it is often not possible to completely test an application before a release. To cope with the problem of scarce resources, development teams can apply defect prediction to identify fault-prone code regions. However, defect prediction tends to have low precision in cross-project prediction scenarios. Aims. We take an inverse view on defect prediction and aim to identify methods that can be deferred when testing because they contain hardly any faults due to their code being "trivial". We expect that characteristics of such methods might be project-independent, so that our approach could improve cross-project predictions. Method. We compute code metrics and apply association rule mining to create rules for identifying methods with low fault risk. We conduct an empirical study to assess our approach with six Java open-source projects containing precise fault data at the method level. Results. Our results show that inverse defect prediction can identify approx. 32-44% of the methods of a project as having a low fault risk; on average, they are about six times less likely to contain a fault than other methods. In cross-project predictions with larger, more diversified training sets, identified methods are even eleven times less likely to contain a fault. Conclusions. Inverse defect prediction supports the efficient allocation of test resources by identifying methods that can be treated with less priority in testing activities and is well applicable in cross-project prediction scenarios. Comment: Submitted to PeerJ C
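
    A stripped-down version of the mining step might look as follows: enumerate simple conjunctions of metric conditions and keep those that predict "no fault" with high confidence. The metrics, thresholds, and the tiny in-memory dataset are illustrative assumptions, not the rules or fault data from the paper's Java study.

    from itertools import combinations

    # each method: source lines of code, cyclomatic complexity, and whether it had a fault
    methods = [
        {"sloc": 2, "cc": 1, "fault": False},
        {"sloc": 3, "cc": 1, "fault": False},
        {"sloc": 25, "cc": 6, "fault": True},
        {"sloc": 4, "cc": 1, "fault": False},
        {"sloc": 40, "cc": 9, "fault": True},
        {"sloc": 3, "cc": 2, "fault": False},
    ]

    # candidate "items": boolean conditions over the metrics
    items = {
        "tiny": lambda m: m["sloc"] <= 5,
        "straight-line": lambda m: m["cc"] <= 1,
    }

    def rule_stats(conds):
        # Support and confidence of the rule: conds => no fault.
        matched = [m for m in methods if all(items[c](m) for c in conds)]
        if not matched:
            return 0.0, 0.0
        support = len(matched) / len(methods)
        confidence = sum(not m["fault"] for m in matched) / len(matched)
        return support, confidence

    # enumerate rule bodies of size 1 and 2, keep only high-confidence ones
    for size in (1, 2):
        for conds in combinations(items, size):
            support, confidence = rule_stats(conds)
            if confidence >= 0.9:
                print(f"{' & '.join(conds)} => low fault risk "
                      f"(support={support:.2f}, confidence={confidence:.2f})")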