13,976 research outputs found
Hierarchical Multi-Label Classification Using Web Reasoning for Large Datasets
Extracting valuable data among large volumes of data is one of the main challenges in Big Data. In this paper, a Hierarchical Multi-Label Classification process called Semantic HMC is presented. This process aims to extract valuable data from very large data sources, by automatically learning a label hierarchy and classifying data items.The Semantic HMC process is composed of five scalable steps, namely Indexation, Vectorization, Hierarchization, Resolution and Realization. The first three steps construct automatically a label hierarchy from statistical analysis of data. This paper focuses on the last two steps which perform item classification according to the label hierarchy. The process is implemented as a scalable and distributed application, and deployed on a Big Data platform. A quality evaluation is described, which compares the approach with multi-label classification algorithms from the state of the art dedicated to the same goal. The Semantic HMC approach outperforms state of the art approaches in some areas
Ontology of core data mining entities
In this article, we present OntoDM-core, an ontology of core data mining
entities. OntoDM-core defines themost essential datamining entities in a three-layered
ontological structure comprising of a specification, an implementation and an application
layer. It provides a representational framework for the description of mining
structured data, and in addition provides taxonomies of datasets, data mining tasks,
generalizations, data mining algorithms and constraints, based on the type of data.
OntoDM-core is designed to support a wide range of applications/use cases, such as
semantic annotation of data mining algorithms, datasets and results; annotation of
QSAR studies in the context of drug discovery investigations; and disambiguation of
terms in text mining. The ontology has been thoroughly assessed following the practices
in ontology engineering, is fully interoperable with many domain resources and
is easy to extend
Type prediction in RDF knowledge bases using hierarchical multilabel classification
Large Semantic Web knowledge bases are often noisy, incorrect, and incomplete with respect to type information. Automatic type prediction can help reduce such incompleteness, and, as previous works show, statistical methods are well-suited for this kind of data. Since most Semantic Web knowledge bases come with an ontology defining a type hierarchy, in this paper, we rephrase the type prediction problem as a hierarchical multilabel classification problem. We propose SLCN, a modification of the local classifier per node approach, which performs feature selection, instance sampling, and class balancing for each local classifier. Our approach improves scalability, facilitating its application on large Semantic Web datasets with high-dimensional feature and label spaces. We compare the performance of our proposed method with a state-of-the-art type prediction approach and popular hierarchical multilabel classifiers, and report on experiments with large-scale RDF datasets
Multi-Label Zero-Shot Learning with Structured Knowledge Graphs
In this paper, we propose a novel deep learning architecture for multi-label
zero-shot learning (ML-ZSL), which is able to predict multiple unseen class
labels for each input instance. Inspired by the way humans utilize semantic
knowledge between objects of interests, we propose a framework that
incorporates knowledge graphs for describing the relationships between multiple
labels. Our model learns an information propagation mechanism from the semantic
label space, which can be applied to model the interdependencies between seen
and unseen class labels. With such investigation of structured knowledge graphs
for visual reasoning, we show that our model can be applied for solving
multi-label classification and ML-ZSL tasks. Compared to state-of-the-art
approaches, comparable or improved performances can be achieved by our method.Comment: CVPR 201
- …