Matching Metamodels with Semantic Systems - An Experience Report
Abstract: Ontology and schema matching are well-established techniques that have been applied in various integration scenarios, e.g., web service composition and database integration. Consequently, matching tools that automatically match various kinds of schemas are available. In model-driven engineering, in contrast to schema and ontology integration, the integration of modeling languages relies on manual tasks such as writing model transformation code, which is tedious and error-prone. Therefore, we propose applying ontology and schema matching techniques to automatically explore semantic correspondences between metamodels, which are currently the modeling language definitions of choice. This paper focuses on reporting preliminary results and lessons learned from evaluating currently available ontology matching tools for their metamodel matching potential.
Significance Tests for the Measure of Raw Agreement
Significance tests for the measure of raw agreement are proposed. First, it is shown that the measure of raw agreement can be expressed as a proportionate reduction-in-error measure, sharing this characteristic with Cohen's Kappa and Brennan and Prediger's Kappa_n. Second, it is shown that the coefficient of raw agreement is linearly related to Brennan and Prediger's Kappa_n. Therefore, using the same base model for the estimation of expected cell frequencies as Brennan and Prediger's Kappa_n, one can devise significance tests for the measure of raw agreement. Two tests are proposed. The first uses Stouffer's Z, a probability pooler. The second test is the binomial test. A data example analyzes the agreement between two psychiatrists' diagnoses. The covariance structure of the agreement cells in a rater by rater table is described. Simulation studies show the performance and power functions of the test statistics. (author's abstract) Series: Research Report Series / Department of Statistics and Mathematic
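The linear relation mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard definition Kappa_n = (p_o - 1/k) / (1 - 1/k), where p_o is the raw agreement and k the number of categories, and it assumes a one-sided binomial test in which each observation falls on the diagonal with probability 1/k under the base model. The function names and the 3 x 3 table are hypothetical and are not taken from the report, and the test shown here may differ in detail from the one the authors propose.

```python
from math import comb


def raw_agreement(table):
    """Proportion of cases on the main diagonal of a square rater-by-rater table."""
    n = sum(sum(row) for row in table)
    agree = sum(table[i][i] for i in range(len(table)))
    return agree / n


def kappa_n(table):
    """Brennan and Prediger's Kappa_n, a linear function of raw agreement."""
    k = len(table)
    p_o = raw_agreement(table)
    return (p_o - 1 / k) / (1 - 1 / k)


def binomial_p_value(table):
    """One-sided p-value P(X >= observed agreements), X ~ Binomial(n, 1/k).

    Assumption: under the base model every category is equally likely,
    so the chance of agreement for a single case is 1/k.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    agree = sum(table[i][i] for i in range(k))
    p = 1 / k
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(agree, n + 1))


# Hypothetical cross-classification of two raters over three categories.
table = [
    [20, 5, 5],
    [3, 25, 2],
    [4, 6, 30],
]

print(raw_agreement(table))
print(kappa_n(table))
print(binomial_p_value(table))
```

Because Kappa_n is raw agreement shifted by 1/k and rescaled by 1 - 1/k, a test of whether raw agreement exceeds 1/k is also a test of whether Kappa_n exceeds zero under this base model.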
Benchmarking Open-Source Tree Learners in R/RWeka
The two most popular classification tree algorithms in machine learning and statistics - C4.5 and CART - are compared in a benchmark experiment together with two other more recent constant-fit tree learners from the statistics literature (QUEST, conditional inference trees). The study assesses both misclassification error and model complexity on bootstrap replications of 18 different benchmark datasets. It is carried out in the R system for statistical computing, made possible by means of the RWeka package which interfaces R to the open-source machine learning toolbox Weka. Both algorithms are found to be competitive in terms of misclassification error - with the performance difference clearly varying across datasets. However, C4.5 tends to grow larger and thus more complex trees. (author's abstract) Series: Research Report Series / Department of Statistics and Mathematic