State-of-the-art on evolution and reactivity
This report starts, in Chapter 1, by outlining aspects of querying and updating resources on
the Web and on the Semantic Web, including the development of query and update languages
to be carried out within the Rewerse project.
From this outline, it becomes clear that several existing research areas and topics are of
interest for this work in Rewerse. In the remainder of this report we present state-of-the-art
surveys of a selection of these areas and topics. More precisely: in Chapter 2 we give
an overview of logics for reasoning about state change and updates; Chapter 3 briefly describes existing update languages for the Web, and also for updating logic programs;
in Chapter 4, event-condition-action rules are surveyed, both in the context of active database systems
and in the context of semistructured data; and in Chapter 5 we give an overview of some relevant rule-based agent frameworks.
A Framework for the Automatic Physical Configuration and Tuning of a MySQL Community Server
Manual physical configuration and tuning of database servers is a complicated task requiring a high level of expertise. Database administrators must consider numerous possibilities to determine a candidate configuration for implementation. In recent times, database vendors have responded to this problem by providing solutions which can automatically configure and tune their products. Poor configuration choices, which cause the performance degradation commonplace in manual configurations, have been significantly reduced by these solutions. However, no such solution exists for MySQL Community Server. This thesis proposes a novel framework for automatically tuning a MySQL Community Server. A first iteration of the framework has been built and is presented in this paper together with its performance measurements.
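As a rough illustration of the kind of loop such a framework automates, the sketch below (not the thesis framework itself) tries candidate values for one dynamic MySQL variable, times a workload, and keeps the fastest setting. The connection details, the candidate sizes, and the benchmark query are placeholders.

```python
# Minimal tune-and-measure sketch for one MySQL variable; all names and
# values below are illustrative placeholders, not the thesis framework.
import time
import mysql.connector  # assumes the mysql-connector-python package

CANDIDATES_MIB = [128, 256, 512]  # candidate innodb_buffer_pool_size values

def run_workload(cur):
    """Time a representative query; replace with a real benchmark workload."""
    start = time.perf_counter()
    cur.execute("SELECT COUNT(*) FROM sample_table")  # placeholder workload
    cur.fetchall()
    return time.perf_counter() - start

conn = mysql.connector.connect(host="localhost", user="root",
                               password="secret", database="test")
cur = conn.cursor()
best = None
for mib in CANDIDATES_MIB:
    # innodb_buffer_pool_size is dynamically resizable in MySQL 5.7+
    cur.execute(f"SET GLOBAL innodb_buffer_pool_size = {mib * 1024 * 1024}")
    elapsed = run_workload(cur)
    if best is None or elapsed < best[1]:
        best = (mib, elapsed)
print(f"Best candidate: {best[0]} MiB ({best[1]:.3f}s)")
```

A real framework would of course search many interdependent variables and benchmark a representative workload rather than a single query.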
Easy over Hard: A Case Study on Deep Learning
While deep learning is an exciting new technique, the benefits of this method
need to be assessed with respect to its computational cost. This is
particularly important for deep learning since these learners need hours (to
weeks) to train the model. Such long training time limits the ability of (a) a
researcher to test the stability of their conclusion via repeated runs with
different random seeds; and (b) other researchers to repeat, improve, or even
refute that original work.
For example, recently, deep learning was used to find which questions in the
Stack Overflow programmer discussion forum can be linked together. That deep
learning system took 14 hours to execute. We show here that applying a very
simple optimizer called DE to fine-tune an SVM can achieve similar (and
sometimes better) results. The DE approach terminated in 10 minutes, i.e., 84
times faster than the deep learning method.
We offer these results as a cautionary tale to the software analytics
community and suggest that new innovations should not be adopted without
critical analysis. If researchers deploy some new and expensive process, that
work should be baselined against simpler and faster alternatives.
Comment: 12 pages, 6 figures, accepted at FSE201
An Economic Analysis of Privacy Protection and Statistical Accuracy as Social Choices
Statistical agencies face a dual mandate to publish accurate statistics while protecting respondent privacy. Increasing privacy protection requires decreased accuracy. Recognizing this as a resource allocation problem, we propose an economic solution: operate where the marginal cost of increasing privacy equals the marginal benefit. Our model of production, from computer science, assumes data are published using an efficient differentially private algorithm. Optimal choice weighs the demand for accurate statistics against the demand for privacy. Examples from U.S. statistical programs show how our framework can guide decision-making. Further progress requires a better understanding of willingness-to-pay for privacy and statistical accuracy.
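A small numerical sketch of the trade-off the paper formalizes: under the Laplace mechanism (a standard differentially private algorithm, used here purely for illustration), a counting query is released with noise of scale sensitivity/epsilon, so the expected absolute error grows as the privacy parameter epsilon shrinks.

```python
# Sketch: accuracy of a differentially private count as a function of epsilon.
import numpy as np

rng = np.random.default_rng(0)

def laplace_count(true_count, epsilon, sensitivity=1.0):
    """Release a count with Laplace noise of scale sensitivity/epsilon."""
    return true_count + rng.laplace(scale=sensitivity / epsilon)

true_count = 10_000
for eps in (0.1, 1.0, 10.0):
    # for the Laplace mechanism the expected absolute error is sensitivity/epsilon
    print(f"epsilon={eps:5.1f}  noisy count={laplace_count(true_count, eps):12.1f}"
          f"  expected |error|={1.0 / eps:6.2f}")
```

Choosing epsilon then amounts to weighing that error against the value of the added privacy protection, which is the marginal-cost/marginal-benefit comparison the paper proposes.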
Data Science and Ebola
Data Science: Today, everybody and everything produces data. People produce
large amounts of data in social networks and in commercial transactions.
Medical, corporate, and government databases continue to grow. Sensors continue
to get cheaper and are increasingly connected, creating an Internet of Things,
and generating even more data. In every discipline, large, diverse, and rich
data sets are emerging, from astrophysics, to the life sciences, to the
behavioral sciences, to finance and commerce, to the humanities and to the
arts. In every discipline people want to organize, analyze, optimize and
understand their data to answer questions and to deepen insights. The science
that is transforming this ocean of data into a sea of knowledge is called data
science. This lecture will discuss how data science has changed the way in
which one of the most visible challenges to public health, the 2014 Ebola
outbreak in West Africa, is handled.
Comment: Inaugural lecture Leiden Universit
Metaphysics of Internal Controls
A quality internal control system has been seen as a remedy for various corporate governance issues. Two pieces of legislation, the Foreign Corrupt Practices Act (FCPA) and the Sarbanes-Oxley Act (SOX), deal with very different corporate governance issues, but each argues for a similar remedy. Both the FCPA and the SOX legislation argue that improved (or proper) internal controls are necessary to root out bribery of foreign officials, in the case of the FCPA, and, in the case of SOX, to support the accurate preparation of financial statements. An issue that has yet to be resolved is that the quality of internal control systems is subject to subjective assessments of internal control deficiencies and their impact. This paper presents a mathematical model of internal controls based on a Gödel numbering of axioms. This results in the representation of quality internal controls in terms of an integer. This approach also allows for inferences about financial statements and various auditing judgements.
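To make the Gödel-numbering idea concrete, here is a minimal sketch, not the paper's actual model: a finite sequence of attribute codes describing a control is mapped to a single integer by raising successive primes to those codes. The attribute names and codes are invented for illustration.

```python
# Sketch: classic Goedel numbering of a finite sequence of positive integers.
from sympy import prime  # prime(k) returns the k-th prime number

def godel_number(codes):
    """Encode a sequence of positive integers as prod_i prime(i) ** codes[i]."""
    n = 1
    for i, c in enumerate(codes, start=1):
        n *= prime(i) ** c
    return n

# e.g. a control described by three hypothetical attribute codes
# (segregation_of_duties=2, independent_review=1, logging=3)
print(godel_number([2, 1, 3]))  # 2**2 * 3**1 * 5**3 = 1500
```

Because the encoding is injective, the resulting integer can be decoded back into the original attribute codes by prime factorization, which is what makes integer-valued representations of controls possible.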
An Integrated Engineering-Computation Framework for Collaborative Engineering: An Application in Project Management
Today's engineering applications suffer from a severe integration problem. Engineering, the entire process, consists of a myriad of individual, often complex, tasks. Most computer tools support particular tasks in engineering, but the output of one tool is different from the others'. Thus, users must re-enter the relevant information in the format required by another tool. Moreover, the development process of a new product or process usually involves several teams of engineers with different backgrounds and responsibilities, for example mechanical engineers, cost estimators, manufacturing engineers, quality engineers, and project managers. Engineers need tools to share technical and managerial information and to instantly access the latest changes made by any member of the teams, so that they can determine right away the impact of these changes in all disciplines (cost, time, resources, etc.). In other words, engineers need to participate in a truly collaborative environment for the achievement of a common objective: the completion of the product/process design project in a timely, cost-effective, and optimal manner.
In this thesis, a new framework is presented that integrates the capabilities of four commercial software packages, Microsoft Excel™ (spreadsheet), Microsoft Project™ (project management), What's Best! (an optimization add-in), and Visual Basic™ (programming language), with a state-of-the-art object-oriented database (knowledge medium), InnerCircle2000™, and applies them to the Cost-Time Trade-Off problem in project networks. The result was a vastly superior solution over the conventional one from the viewpoint of data handling, completeness of the solution space, and suitability for a collaborative engineering-computation environment.
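As a small illustration of the Cost-Time Trade-Off ("crashing") problem the framework addresses, the sketch below poses a two-activity version as a linear program, using SciPy in place of the What's Best! add-in. The durations, crash limits, and crash costs are invented numbers, not data from the thesis.

```python
# Sketch: project crashing for two serial activities as a linear program.
from scipy.optimize import linprog

# Activity A: normal 4 days, crashable by up to 2 days at $100/day
# Activity B: normal 3 days, crashable by up to 1 day  at $50/day
# A precedes B and the deadline is 5 days, so total crashing must be >= 2 days.
c = [100, 50]              # crash cost per day for A and B
A_ub = [[-1, -1]]          # -(xA + xB) <= -2  <=>  xA + xB >= 2
b_ub = [-2]
bounds = [(0, 2), (0, 1)]  # per-activity crash limits

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print("crash days (A, B):", res.x, "extra cost:", res.fun)  # -> [1, 1], 150.0
```

The optimizer crashes the cheaper activity first and only then the expensive one, which is exactly the trade-off the conventional manual procedure explores by hand.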
Too Trivial To Test? An Inverse View on Defect Prediction to Identify Methods with Low Fault Risk
Background. Test resources are usually limited and therefore it is often not
possible to completely test an application before a release. To cope with the
problem of scarce resources, development teams can apply defect prediction to
identify fault-prone code regions. However, defect prediction tends to have low
precision in cross-project prediction scenarios.
Aims. We take an inverse view on defect prediction and aim to identify
methods that can be deferred when testing because they contain hardly any
faults due to their code being "trivial". We expect that characteristics of
such methods might be project-independent, so that our approach could improve
cross-project predictions.
Method. We compute code metrics and apply association rule mining to create
rules for identifying methods with low fault risk. We conduct an empirical
study to assess our approach with six Java open-source projects containing
precise fault data at the method level.
Results. Our results show that inverse defect prediction can identify
approximately 32-44% of the methods of a project as having low fault risk; on
average, these methods are about six times less likely to contain a fault than other methods. In
cross-project predictions with larger, more diversified training sets,
identified methods are even eleven times less likely to contain a fault.
Conclusions. Inverse defect prediction supports the efficient allocation of
test resources by identifying methods that can be treated with less priority in
testing activities, and it is well suited to cross-project prediction scenarios.
Comment: Submitted to PeerJ C
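The following toy sketch illustrates the inverse-prediction idea only, not the study's pipeline: methods whose metrics satisfy a "trivial" rule are flagged as low fault risk, and their fault rate is compared with the rest. The metrics, the rule, and the data are invented placeholders rather than the paper's mined association rules.

```python
# Sketch: flag "trivial" methods as low fault risk and compare fault rates.
import pandas as pd

methods = pd.DataFrame({
    "loc":       [2, 35, 3, 60, 4, 80],   # lines of code (made-up values)
    "branches":  [0, 6, 0, 9, 1, 12],     # decision points (made-up values)
    "has_fault": [0, 1, 0, 1, 0, 1],      # ground-truth fault labels
})

# one illustrative low-fault-risk rule: tiny, branch-free methods
low_risk = (methods["loc"] <= 5) & (methods["branches"] == 0)

flagged_rate = methods.loc[low_risk, "has_fault"].mean()
other_rate = methods.loc[~low_risk, "has_fault"].mean()
print(f"flagged as low risk: {low_risk.sum()} of {len(methods)} methods")
print(f"fault rate flagged={flagged_rate:.2f} vs others={other_rate:.2f}")
```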