Search CORE

251 research outputs found

Semantic Flooding: Semantic Search across Distributed Lightweight Ontologies

Author: Giunchiglia Fausto
Hume Alethia
Kharkevich Uladzimir
Publication venue
Publication date: 01/07/2009
Field of study

Lightweight ontologies are trees where links between nodes codify the fact that a node lower in the hierarchy describes a topic (and contains documents about this topic) which is more specific than the topic of the node one level above. In turn, multiple lightweight ontologies can be connected by semantic links which represent mappings among them and which can be computed, e.g., by ontology matching. In this paper we describe how these two types of links can be used to define a semantic overlay network which can cover any number of peers and which can be flooded to perform a semantic search on documents, i.e., to perform semantic flooding. We have evaluated our approach by simulating a network of 10,000 peers containing classifications which are fragments of the DMoz web directory. The results are promising and show that, in our approach, only a relatively small number of peers needs to be queried in order to achieve high accuracy

Unitn-eprints Research

Temporal Information in Data Science: An Integrated Framework and its Applications

Author: Brunello Andrea
Publication venue: Universit\ue0 degli Studi di Udine
Publication date: 19/03/2020
Field of study

Data science is a well-known buzzword, that is in fact composed of two distinct keywords, i.e., data and science. Data itself is of great importance: each analysis task begins from a set of examples. Based on such a consideration, the present work starts with the analysis of a real case scenario, by considering the development of a data warehouse-based decision support system for an Italian contact center company. Then, relying on the information collected in the developed system, a set of machine learning-based analysis tasks have been developed to answer specific business questions, such as employee work anomaly detection and automatic call classification. Although such initial applications rely on already available algorithms, as we shall see, some clever analysis workflows had also to be developed. Afterwards, continuously driven by real data and real world applications, we turned ourselves to the question of how to handle temporal information within classical decision tree models. Our research brought us the development of J48SS, a decision tree induction algorithm based on Quinlan's C4.5 learner, which is capable of dealing with temporal (e.g., sequential and time series) as well as atemporal (such as numerical and categorical) data during the same execution cycle. The decision tree has been applied into some real world analysis tasks, proving its worthiness. A key characteristic of J48SS is its interpretability, an aspect that we specifically addressed through the study of an evolutionary-based decision tree pruning technique. Next, since a lot of work concerning the management of temporal information has already been done in automated reasoning and formal verification fields, a natural direction in which to proceed was that of investigating how such solutions may be combined with machine learning, following two main tracks. First, we show, through the development of an enriched decision tree capable of encoding temporal information by means of interval temporal logic formulas, how a machine learning algorithm can successfully exploit temporal logic to perform data analysis. Then, we focus on the opposite direction, i.e., that of employing machine learning techniques to generate temporal logic formulas, considering a natural language processing scenario. Finally, as a conclusive development, the architecture of a system is proposed, in which formal methods and machine learning techniques are seamlessly combined to perform anomaly detection and predictive maintenance tasks. Such an integration represents an original, thrilling research direction that may open up new ways of dealing with complex, real-world problems.Data science is a well-known buzzword, that is in fact composed of two distinct keywords, i.e., data and science. Data itself is of great importance: each analysis task begins from a set of examples. Based on such a consideration, the present work starts with the analysis of a real case scenario, by considering the development of a data warehouse-based decision support system for an Italian contact center company. Then, relying on the information collected in the developed system, a set of machine learning-based analysis tasks have been developed to answer specific business questions, such as employee work anomaly detection and automatic call classification. Although such initial applications rely on already available algorithms, as we shall see, some clever analysis workflows had also to be developed. Afterwards, continuously driven by real data and real world applications, we turned ourselves to the question of how to handle temporal information within classical decision tree models. Our research brought us the development of J48SS, a decision tree induction algorithm based on Quinlan's C4.5 learner, which is capable of dealing with temporal (e.g., sequential and time series) as well as atemporal (such as numerical and categorical) data during the same execution cycle. The decision tree has been applied into some real world analysis tasks, proving its worthiness. A key characteristic of J48SS is its interpretability, an aspect that we specifically addressed through the study of an evolutionary-based decision tree pruning technique. Next, since a lot of work concerning the management of temporal information has already been done in automated reasoning and formal verification fields, a natural direction in which to proceed was that of investigating how such solutions may be combined with machine learning, following two main tracks. First, we show, through the development of an enriched decision tree capable of encoding temporal information by means of interval temporal logic formulas, how a machine learning algorithm can successfully exploit temporal logic to perform data analysis. Then, we focus on the opposite direction, i.e., that of employing machine learning techniques to generate temporal logic formulas, considering a natural language processing scenario. Finally, as a conclusive development, the architecture of a system is proposed, in which formal methods and machine learning techniques are seamlessly combined to perform anomaly detection and predictive maintenance tasks. Such an integration represents an original, thrilling research direction that may open up new ways of dealing with complex, real-world problems

Archivio istituzionale della ricerca - Università degli Studi di Udine

Aligning Controlled vocabularies for enabling semantic matching in a distributed knowledge management system

Author: Morshed Ahsan
Publication venue: University of Trento
Publication date: 12/04/2010
Field of study

The underlying idea of the Semantic Web is that web content should be expressed not only in natural language but also in a language that can be unambiguously understood, interpreted and used by software agents, thus permitting them to find, share and integrate information more easily. The central notion of the Semantic Web's syntax are ontologies, shared vocabularies providing taxonomies of concepts, objects and relationships between them, which describe particular domains of knowledge. A vocabulary stores words, synonyms, word sense definitions (i.e. glosses), relations between word senses and concepts; such a vocabulary is generally referred to as the Controlled Vocabulary (CV) if choice or selection of terms are done by domain specialists. A facet is a distinct and dimensional feature of a concept or a term that allows a taxonomy, ontology or CV to be viewed or ordered in multiple ways, rather than in a single way. The facet is clearly defined, mutually exclusive, and composed of collectively exhaustive properties or characteristics of a domain. For example, a collection of rice might be represented using a name facet, place facet etc. This thesis presents a methodology for producing mappings between Controlled Vocabularies, based on a technique called \Hidden Semantic Matching". The \Hidden" word stands for it not relying on any sort of externally provided background knowledge. The sole exploited knowledge comes from the \semantic context" of the same CVs which are being matched. We build a facet for each concept of these CVs, considering more general concepts (broader terms), less general concepts (narrow terms) or related concepts (related terms).Together these form a concept facet (CF) which is then used to boost the matching process

Unitn-eprints PhD

Seventh Biennial Report : June 2003 - March 2005

Author
Publication venue: Max-Planck-Institut für Informatik
Publication date: 01/01/2005
Field of study

MPG.PuRe

Tools and Algorithms for the Construction and Analysis of Systems

Author
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/02/2021
Field of study

This book is Open Access under a CC BY licence. The LNCS 11427 and 11428 proceedings set constitutes the proceedings of the 25th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2019, which took place in Prague, Czech Republic, in April 2019, held as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2019. The total of 42 full and 8 short tool demo papers presented in these volumes was carefully reviewed and selected from 164 submissions. The papers are organized in topical sections as follows: Part I: SAT and SMT, SAT solving and theorem proving; verification and analysis; model checking; tool demo; and machine learning. Part II: concurrent and distributed systems; monitoring and runtime verification; hybrid and stochastic systems; synthesis; symbolic verification; and safety and fault-tolerant systems

Directory of Open Access Books (DOAB)

Tools and Algorithms for the Construction and Analysis of Systems

Author
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 13/04/2022
Field of study

This open access book constitutes the proceedings of the 28th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2022, which was held during April 2-7, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 46 full papers and 4 short papers presented in this volume were carefully reviewed and selected from 159 submissions. The proceedings also contain 16 tool papers of the affiliated competition SV-Comp and 1 paper consisting of the competition report. TACAS is a forum for researchers, developers, and users interested in rigorously based tools and algorithms for the construction and analysis of systems. The conference aims to bridge the gaps between different communities with this common interest and to support them in their quest to improve the utility, reliability, exibility, and efficiency of tools and algorithms for building computer-controlled systems

Directory of Open Access Books (DOAB)

Incorporating Generalized Quantifiers into Description Logic to Improve Source Selection

Crossref

Tools and Algorithms for the Construction and Analysis of Systems

Author
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

This open access two-volume set constitutes the proceedings of the 26th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2020, which took place in Dublin, Ireland, in April 2020, and was held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2020. The total of 60 regular papers presented in these volumes was carefully reviewed and selected from 155 submissions. The papers are organized in topical sections as follows: Part I: Program verification; SAT and SMT; Timed and Dynamical Systems; Verifying Concurrent Systems; Probabilistic Systems; Model Checking and Reachability; and Timed and Probabilistic Systems. Part II: Bisimulation; Verification and Efficiency; Logic and Proof; Tools and Case Studies; Games and Automata; and SV-COMP 2020

OAPEN Library

Examining Second Language Reading: A Critical Review of the Singapore Cambridge General Certificate of Education Ordinary-Level Chinese Language Examination

Author: Cheong Yun-Yee
Publication venue: UCL (University College London)
Publication date: 28/11/2018
Field of study

This mixed methods study critically reviews how the Singapore-Cambridge General Certificate of Education Ordinary-Level Chinese Language Examination (GCE 1162) examines second language reading. The main research question asks, ‘To what degree have the intended measurement objectives of the GCE 1162 reading examination been achieved?’ Four sub research questions address issues of specifications and administration, test-taker characteristics, cognitive parameters and contextual parameters. Resources drawn on include Singapore Ministry of Education and Singapore Examinations and Assessment Board documents, specifically, examination information booklets, syllabuses, committee reports and annual reviews. Subject matter experts were appointed to analyse the reading comprehension passages and test items from 22 sets of GCE 1162 reading examination papers from 2006 to 2016. Semi-structured interviews were carried out with 22 stakeholders involved in coordination, test design, item construction, marking and reviewing. The interviewees included members of an elite policy group with privileged access to test specifications and procedures. Further interviews were carried out with secondary school Chinese language teachers and students, whose perspectives are seldom considered in validation processes. Opinions were also sought from experts in the field of Chinese as a second language, reading and assessment. The study begins with an account of the concepts of validity and reading constructs. Chapter 2 discusses the Singapore education and examination system, foregrounding the history of Chinese language education and the bilingual policy introduced in 1966. A methodology chapter follows. Chapters 4 to 8 address separately each of the four sub research questions in which claims, assumptions, supporting evidence and rebuttals are presented. The final chapter, Chapter 9, addresses a posteriori inferences, including scoring, criterion-related components, and washback and impact. A cautious conclusion is drawn, namely that the measurement quality of the GCE 1162 reading examination is at a moderately unsatisfactory level

UCL Discovery

Tools and Algorithms for the Construction and Analysis of Systems

Author
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

OAPEN Library