On Horizontal and Vertical Separation in Hierarchical Text Classification
Hierarchy is a common and effective way of organizing data and representing
their relationships at different levels of abstraction. However, hierarchical
data dependencies cause difficulties in the estimation of "separable" models
that can distinguish between the entities in the hierarchy. Extracting
separable models of hierarchical entities requires us to take their relative
position into account and to consider the different types of dependencies in
the hierarchy. In this paper, we present an investigation of the effect of
separability in text-based entity classification and argue that in hierarchical
classification, a separation property should be established between entities
not only in the same layer, but also in different layers. Our main findings are
as follows. First, we analyse the importance of separability in the data
representation for the classification task and, based on that, introduce a
"Strong Separation Principle" for optimizing the expected effectiveness of
classifiers' decisions based on the separation property. Second, we present
Hierarchical Significant Words Language Models (HSWLM) which capture all, and
only, the essential features of hierarchical entities according to their
relative position in the hierarchy resulting in horizontally and vertically
separable models. Third, we validate our claims on real-world data and
demonstrate how HSWLM improves the accuracy of classification and how it
provides transferable models over time. Although discussions in this paper
focus on the classification problem, the models are applicable to any
information access task on data that has, or can be mapped to, a hierarchical
structure.
Comment: Full paper (10 pages) accepted for publication in the proceedings of the ACM
SIGIR International Conference on the Theory of Information Retrieval
(ICTIR'16)
Modeling Meaning Associated with Documental Entities: Introducing the Brussels Quantum Approach
We show that the Brussels operational-realistic approach to quantum physics
and quantum cognition offers a fundamental strategy for modeling the meaning
associated with collections of documental entities. To do so, we take the World
Wide Web as a paradigmatic example and emphasize the importance of
distinguishing the Web, made of printed documents, from a more abstract meaning
entity, which we call the Quantum Web, or QWeb, where the former is considered
to be the collection of traces that can be left by the latter, in specific
measurements, similarly to how a non-spatial quantum entity, like an electron,
can leave localized traces of impact on a detection screen. The double-slit
experiment is extensively used to illustrate the rationale of the modeling,
which is guided by how physicists constructed quantum theory to describe the
behavior of the microscopic entities. We also emphasize that the superposition
principle and the associated interference effects are not sufficient to model
all experimental probabilistic data, like those obtained by counting the
relative number of documents containing certain words and co-occurrences of
words. For this, additional effects, like context effects, must also be taken
into consideration.
Comment: 27 pages, 6 figures, Late
A design model for Open Distributed Processing systems
This paper proposes design concepts that allow the conception, understanding and development of complex technical structures for open distributed systems. The proposed concepts are related to, and partially motivated by, the present work on Open Distributed Processing (ODP). As opposed to the current ODP approach, the concepts are aimed at supporting a design trajectory with several related abstraction levels. Simple examples are used to illustrate the proposed concepts.