Empowering Knowledge Bases: a Machine Learning Perspective
The construction of knowledge bases quite often requires the intervention of knowledge engineers and domain experts, making it a time-consuming task. Alternative approaches have been developed for building knowledge bases from existing sources of information such as web pages and crowdsourcing; seminal examples are NELL, DBpedia, YAGO and several others. With the goal of building very large sources of knowledge, as recently in the case of Knowledge Graphs, even more complex integration processes have been set up, involving multiple sources of information, human expert intervention and crowdsourcing. Despite significant efforts to make Knowledge Graphs as comprehensive and reliable as possible, they tend to suffer from incompleteness and noise due to the complex building process. Even highly human-curated knowledge bases show cases of incompleteness; for instance, disjointness axioms are quite often missing. Machine learning methods have been proposed with the purpose of refining, enriching, completing and possibly raising potential issues in existing knowledge bases, while showing the ability to cope with noise. The talk will concentrate on classes of mostly symbol-based machine learning methods, specifically focusing on concept learning, rule learning and disjointness axiom learning problems, showing how the developed methods can be exploited for enriching existing knowledge bases. The talk will highlight that a key element of the illustrated solutions is the integration of background knowledge, deductive reasoning and the evidence coming from the mass of the data. The last part of the talk will be devoted to presenting an approach for injecting background knowledge into numeric-based embedding models to be used for predictive tasks on Knowledge Graphs
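The abstract does not spell out how background knowledge is injected into the embedding model; the sketch below is a minimal, illustrative reading in Python, assuming a TransE-style scoring function and a hypothetical penalty for triples whose head entity violates a disjointness axiom from the background knowledge. All entity, relation and class names, and the penalty weighting, are invented for the example and are not the method presented in the talk.

```python
import numpy as np

# Toy embeddings for a handful of entities and relations (random, for illustration only).
rng = np.random.default_rng(0)
emb = {name: rng.normal(size=8) for name in ["alice", "acme", "worksFor"]}

def transe_score(h, r, t):
    """TransE-style plausibility: higher when h + r is close to t."""
    return -np.linalg.norm(emb[h] + emb[r] - emb[t], ord=1)

# Hypothetical background knowledge: disjoint classes plus type assertions.
disjoint = {("Person", "Organization")}
types = {"alice": {"Person"}, "acme": {"Organization"}}

def violates_disjointness(entity):
    """True if an entity is asserted to belong to two classes declared disjoint."""
    t = types.get(entity, set())
    return any(a in t and b in t for a, b in disjoint)

def loss(pos, neg, margin=1.0, penalty=1.0):
    """Margin ranking loss plus a penalty when the positive triple's head
    contradicts a disjointness axiom (illustrative, not the talk's method)."""
    h, _, _ = pos
    base = max(0.0, margin + transe_score(*neg) - transe_score(*pos))
    return base + (penalty if violates_disjointness(h) else 0.0)

print(loss(("alice", "worksFor", "acme"), ("acme", "worksFor", "alice")))
```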
TLAD 2010 Proceedings: 8th international workshop on teaching, learning and assessment of databases (TLAD)
This is the eighth in the series of highly successful international workshops on the Teaching, Learning and Assessment of Databases (TLAD 2010), which once again is held as a workshop of BNCOD 2010 - the 27th International Information Systems Conference. TLAD 2010 is held on 28th June at the beautiful Dudhope Castle at Abertay University, just before BNCOD, and hopes to be just as successful as its predecessors. The teaching of databases is central to all Computing Science, Software Engineering, Information Systems and Information Technology courses, and this year the workshop aims to continue the tradition of bringing together both database teachers and researchers, in order to share good learning, teaching and assessment practice and experience, and further the growing community amongst database academics. As well as attracting academics from the UK community, the workshop has also been successful in attracting academics from the wider international community, through serving on the programme committee, and attending and presenting papers. This year, the workshop includes an invited talk given by Richard Cooper (of the University of Glasgow), who will present a discussion and some results from the Database Disciplinary Commons which was held in the UK over the academic year. Due to the healthy number of high quality submissions this year, the workshop will also present seven peer-reviewed papers and six refereed poster papers. Of the seven presented papers, three will be presented as full papers and four as short papers. These papers and posters cover a number of themes, including: approaches to teaching databases, e.g. group-centered and problem-based learning; use of novel case studies, e.g. forensics and XML data; techniques and approaches for improving teaching and student learning processes; assessment techniques, e.g. peer review; methods for improving students' abilities to develop database queries and develop E-R diagrams; and e-learning platforms for supporting teaching and learning
The Art of Data Science
To flourish in the new data-intensive environment of 21st century science, we need to evolve new skills. These can be expressed in terms of the systemized framework that formed the basis of mediaeval education: the trivium (logic, grammar, and rhetoric) and the quadrivium (arithmetic, geometry, music, and astronomy). However, rather than focusing on number, data is the new keystone. We need to understand what rules it obeys, how it is symbolized and communicated, and what its relationship to physical space and time is. In this paper, we will review this understanding in terms of the technologies and processes that it requires. We contend that at least an appreciation of all these aspects is crucial to enable us to extract scientific information and knowledge from the data sets which threaten to engulf and overwhelm us. Comment: 12 pages, invited talk at the Astrostatistics and Data Mining in Large Astronomical Databases workshop, La Palma, Spain, 30 May - 3 June 2011, to appear in the Springer Series on Astrostatistics
Accomplishments and challenges of protein ontology
Recent progress in proteomics, computational biology, and ontology development has presented an opportunity to investigate protein data sources from a unique perspective, that is, examining protein data sources through the structure and hierarchy of the Protein Ontology (PO). Various data mining algorithms and mathematical models provide methods for analysing protein data sources; however, two issues need to be addressed: (1) the need for standards for defining protein data description and exchange, and (2) eliminating the errors that arise with data integration methodologies for complex queries. Protein Ontology is designed to meet these needs by providing a structured protein data specification for protein data representation. Protein Ontology is a standard for representing protein data in a way that helps in defining data integration and data mining models for protein structure and function. We report here our development of PO; a semantic heterogeneity framework based on relationships between PO concepts; and an analysis of the resulting PO data on human proteins. We also briefly discuss our ongoing work on designing a trustworthy framework around PO
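PO's concrete schema is not given in this summary; purely to illustrate the kind of hierarchical, subsumption-style query that an ontology structure of this sort supports, here is a small Python sketch over an invented is-a hierarchy. The concept names are hypothetical and do not come from the actual Protein Ontology.

```python
# Toy is-a hierarchy over invented concept names (not the actual PO schema).
is_a = {
    "ChainDescriptor": ["StructuralDomain"],
    "StructuralDomain": ["ProteinEntry"],
    "ProteinEntry": [],
}

def ancestors(concept, hierarchy):
    """Return all concepts that transitively subsume `concept`."""
    seen, stack = set(), list(hierarchy.get(concept, []))
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(hierarchy.get(parent, []))
    return seen

# A subsumption query: is ChainDescriptor a kind of ProteinEntry?
print("ProteinEntry" in ancestors("ChainDescriptor", is_a))  # True
```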
When Things Matter: A Data-Centric View of the Internet of Things
With the recent advances in radio-frequency identification (RFID), low-cost wireless sensor devices, and Web technologies, the Internet of Things (IoT) approach has gained momentum in connecting everyday objects to the Internet and facilitating machine-to-human and machine-to-machine communication with the physical world. While IoT offers the capability to connect and integrate both digital and physical entities, enabling a whole new class of applications and services, several significant challenges need to be addressed before these applications and services can be fully realized. A fundamental challenge centers around managing IoT data, typically produced in dynamic and volatile environments, which is not only extremely large in scale and volume but also noisy and continuous. This article surveys the main techniques and state-of-the-art research efforts in IoT from data-centric perspectives, including data stream processing, data storage models, complex event processing, and searching in IoT. Open research issues for IoT data management are also discussed
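As a concrete illustration of the data stream processing theme surveyed in the article, the following Python sketch smooths a noisy, continuous sensor feed with a fixed-size sliding window; the window length and the synthetic readings are assumptions chosen only for demonstration.

```python
from collections import deque

def sliding_average(readings, window=5):
    """Yield the mean of the last `window` readings for each new sample,
    a common first step for denoising continuous IoT streams."""
    buf = deque(maxlen=window)
    for value in readings:
        buf.append(value)
        yield sum(buf) / len(buf)

# Example: a short synthetic temperature stream with a noisy spike.
stream = [21.0, 21.2, 35.0, 21.1, 21.3, 21.2]
print(list(sliding_average(stream, window=3)))
```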
Impliance: A Next Generation Information Management Appliance
ably successful in building a large market and adapting to the changes of the last three decades, its impact on the broader market of information management is surprisingly limited. If we were to design an information management system from scratch, based upon today's requirements and hardware capabilities, would it look anything like today's database systems?" In this paper, we introduce Impliance, a next-generation information management system consisting of hardware and software components integrated to form an easy-to-administer appliance that can store, retrieve, and analyze all types of structured, semi-structured, and unstructured information. We first summarize the trends that will shape information management for the foreseeable future. Those trends imply three major requirements for Impliance: (1) to be able to store, manage, and uniformly query all data, not just structured records; (2) to be able to scale out as the volume of this data grows; and (3) to be simple and robust in operation. We then describe four key ideas that are uniquely combined in Impliance to address these requirements, namely the ideas of: (a) integrating software and off-the-shelf hardware into a generic information appliance; (b) automatically discovering, organizing, and managing all data - unstructured as well as structured - in a uniform way; (c) achieving scale-out by exploiting simple, massively parallel processing; and (d) virtualizing compute and storage resources to unify, simplify, and streamline the management of Impliance. Impliance is an ambitious, long-term effort to define simpler, more robust, and more scalable information systems for tomorrow's enterprises. Comment: This article is published under a Creative Commons License Agreement (http://creativecommons.org/licenses/by/2.5/). You may copy, distribute, display, and perform the work, make derivative works and make commercial use of the work, but you must attribute the work to the author and CIDR 2007. 3rd Biennial Conference on Innovative Data Systems Research (CIDR), January 7-10, 2007, Asilomar, California, US
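The abstract does not describe Impliance's interfaces; as a loose illustration of requirement (1), uniformly querying structured and unstructured data, the Python sketch below folds both a record and a free-text document into one keyword index. The field names, item ids and scoring-free lookup are assumptions, not the system's actual design.

```python
# Normalize heterogeneous items (records and free text) into one keyword index.
def index_item(index, item_id, item):
    if isinstance(item, dict):              # structured record
        text = " ".join(str(v) for v in item.values())
    else:                                   # unstructured document
        text = str(item)
    for token in text.lower().split():
        index.setdefault(token, set()).add(item_id)

def query(index, keyword):
    """Return ids of all items, structured or not, containing the keyword."""
    return index.get(keyword.lower(), set())

idx = {}
index_item(idx, "rec1", {"name": "Asilomar", "state": "California"})
index_item(idx, "doc1", "CIDR workshop held in Asilomar")
print(query(idx, "Asilomar"))   # ids of both the record and the document
```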
The use of data-mining for the automatic formation of tactics
This paper discusses the use of data-mining for the automatic formation of tactics. It was presented at the Workshop on Computer-Supported Mathematical Theory Development held at IJCAR in 2004. The aim of this project is to evaluate the applicability of data-mining techniques to the automatic formation of tactics from large corpora of proofs. We data-mine information from large proof corpora to find commonly occurring patterns. These patterns are then evolved into tactics using genetic programming techniques
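The summary does not detail the mining step; one simple way to surface commonly occurring patterns in a proof corpus is to count frequent n-grams of tactic names, as in the Python sketch below, whose output could then seed the genetic-programming stage the abstract mentions. The corpus and tactic names are invented.

```python
from collections import Counter

def frequent_ngrams(proofs, n=2, min_count=2):
    """Count length-n tactic subsequences across a corpus of proofs and
    keep those occurring at least `min_count` times."""
    counts = Counter()
    for proof in proofs:
        for i in range(len(proof) - n + 1):
            counts[tuple(proof[i:i + n])] += 1
    return {gram: c for gram, c in counts.items() if c >= min_count}

# Invented corpus: each proof is a sequence of tactic names.
corpus = [
    ["intro", "rewrite", "simpl", "auto"],
    ["intro", "rewrite", "auto"],
    ["case", "rewrite", "simpl", "auto"],
]
print(frequent_ngrams(corpus, n=2))
```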