Explaining data patterns using background knowledge from Linked Data
When data mining is used to find regularities in data, the resulting patterns need to be interpreted. Explaining such patterns relies on background knowledge that may be scattered across different sources, and this intensive process is usually left to domain experts. With the rise of Linked Data and the increasing number of connected datasets, we assume that access to this knowledge can become easier, faster and more automated. This PhD research aims to demonstrate whether Linked Data can be used to provide the background knowledge for pattern interpretation and how
Explaining Data Patterns using Knowledge from the Web of Data
Knowledge Discovery (KD) is a long-established field that aims to develop methodologies for detecting hidden patterns and regularities in large datasets, using techniques from a wide range of domains, such as statistics, machine learning, pattern recognition or data visualisation. In most real-world contexts, the interpretation and explanation of the discovered patterns is left to human experts, whose work is to use their background knowledge to analyse, refine and make the patterns understandable for the intended purpose. Explaining patterns is therefore an intensive and time-consuming process, in which parts of the knowledge can remain unrevealed, especially when the experts lack some of the required background knowledge.
In this thesis, we investigate the hypothesis that this interpretation process can be facilitated by introducing background knowledge from the Web of (Linked) Data. In the last decade, many fields have started publishing and sharing their domain-specific knowledge in the form of structured data, with the objective of encouraging information sharing, reuse and discovery. With a constantly increasing amount of shared and connected knowledge, we thus assume that the process of explaining patterns can become easier, faster, and more automated.
To demonstrate this, we developed Dedalo, a framework that automatically provides explanations for patterns of data using background knowledge extracted from the Web of Data. We studied the elements required for a piece of information to be considered an explanation, identified the best strategies to automatically find the right piece of information in the Web of Data, and designed a process able to produce explanations for a given pattern using background knowledge autonomously collected from the Web of Data.
The final evaluation of Dedalo involved users in an empirical study based on a real-world scenario. We demonstrated that the explanation process is complex for users who are not familiar with the domain of usage, but also that it can be considerably simplified by using the Web of Data as a source of background knowledge.
Explaining clusters with inductive logic programming and linked data
Knowledge Discovery consists of discovering hidden regularities in large amounts of data using data mining techniques. The obtained patterns require an interpretation that is usually achieved using background knowledge given by experts from several domains. At the same time, the rise of Linked Data has increased the amount of connected cross-disciplinary knowledge, in the form of RDF datasets, classes and relationships. Here we show how Linked Data can be used in an Inductive Logic Programming process, where it provides background knowledge for finding hypotheses regarding the unrevealed connections between items of a cluster. Using an example with clusters of books, we show how different Linked Data sources can be used to automatically generate rules that give an underlying explanation for such clusters.
Understanding from Machine Learning Models
Simple idealized models seem to provide more understanding than opaque, complex, and hyper-realistic models. However, an increasing number of scientists are going in the opposite direction by utilizing opaque machine learning models to make predictions and draw inferences, suggesting that scientists are opting for models that have less potential for understanding. Are scientists trading understanding for some other epistemic or pragmatic good when they choose a machine learning model? Or are the assumptions behind why minimal models provide understanding misguided? In this paper, using the case of deep neural networks, I argue that it is not the complexity or black box nature of a model that limits how much understanding the model provides. Instead, it is a lack of scientific and empirical evidence supporting the link that connects a model to the target phenomenon that primarily prohibits understanding.
Geoscience after IT: Part J. Human requirements that shape the evolving geoscience information system
The geoscience record is constrained by the limitations of human thought and of the technology for handling information. IT can lead us away from the tyranny of older technology, but to find the right path, we need to understand our own limitations. Language, images, data and mathematical models are tools for expressing and recording our ideas. Backed by intuition, they enable us to think in various modes, to build knowledge from information and to create models as artificial views of a real world. Markup languages may accommodate more flexible and better connected records, and the object-oriented approach may help to match IT more closely to our thought processes.
The influence of school and teaching quality on children’s progress in primary school
This report investigates the way school and classroom processes affect the cognitive progress and social/behavioural development of children between the ages of 6 (Year 1) and 10 (Year 5) in primary schools in England.
The research is part of the larger longitudinal study of Effective Pre-School and Primary Education (EPPE 3-11) funded by the Department for Children, Schools and Families (DCSF) that is following children’s cognitive and social/behavioural development from ages 3 to 11 years. The EPPE 3-11 study investigates both pre-school and primary school influences on children’s attainment, progress and social/behavioural development. This report describes the results of quantitative analyses based on a sub-sample of 1160 EPPE children across Year 1 to 5 of primary education. The research builds on the earlier analyses of children’s Reading and Mathematics attainments and social/behavioural outcomes in Year 5 for the full EPPE 3-11 sample (see Sammons, 2007a; 2007b), by investigating relationships between children’s outcomes and measures of classroom processes, collected through direct observation of Year 5 classes in 125 focal schools chosen from the larger EPPE 3-11 data set. The analyses also explore patterns of association between children’s outcomes and broader measures of overall school characteristics derived from teacher questionnaires and Ofsted inspection reports for this sub-sample of schools.
Connected innovation: an international comparative study that identifies mixed modes of innovation
This paper offers a new angle on innovation modalities by adopting a recently emerging approach that identifies innovation typologies via exploratory data analysis techniques, with the aim of teasing out underlying latent variables that represent coherent innovation strategies for groups of firms. Mixed modes of innovation include aspects of both user and open innovation, and are employed to inform on such concepts. The modes of innovation are developed by exploring micro-level innovation survey data across 18 countries. The contributions of the paper lie in (a) the identification of five core innovation modes that are found in almost all countries; and (b) the examination, via regression analysis, of the role of different modes in firm performance.