50,928 research outputs found
mARC: Memory by Association and Reinforcement of Contexts
This paper introduces the Memory by Association and Reinforcement of Contexts
(mARC). mARC is a novel data modeling technology rooted in the second
quantization formulation of quantum mechanics. It is an all-purpose incremental
and unsupervised data storage and retrieval system which can be applied to all
types of signal or data, structured or unstructured, textual or not. mARC can
be applied to a wide range of information classification and retrieval
problems like e-Discovery or contextual navigation. It can also be formulated in
the artificial life framework, a.k.a. Conway's "Game of Life" theory. In contrast
to Conway's approach, the objects evolve in a massively multidimensional space.
In order to start evaluating the potential of mARC we have built a mARC-based
Internet search engine demonstrator with contextual functionality. We compare
the behavior of the mARC demonstrator with Google search both in terms of
performance and relevance. In the study we find that the mARC search engine
demonstrator outperforms Google search by an order of magnitude in response
time while providing more relevant results for some classes of queries.
Toward Entity-Aware Search
As the Web has evolved into a data-rich repository, with the standard "page view," current search engines are becoming increasingly inadequate for a wide range of query tasks. While we often search for various data "entities" (e.g., phone number, paper PDF, date), today's engines only take us indirectly to pages. In this Ph.D. study, we focus on a novel type of Web search that is aware of data entities inside pages, a significant departure from traditional document retrieval. We study the various essential aspects of supporting entity-aware Web search. To begin with, we tackle the core challenge of ranking entities by distilling its underlying conceptual model, the Impression Model, and developing a probabilistic ranking framework, EntityRank, that is able to seamlessly integrate both local and global information in ranking. We also report a prototype system built to show the initial promise of the proposal. Then, we aim at distilling and abstracting the essential computation requirements of entity search. From the dual views of reasoning (entity as input and entity as output), we propose a dual-inversion framework, with two indexing and partition schemes, towards efficient and scalable query processing. Further, to recognize more entity instances, we study the problem of entity synonym discovery through mining query log data. The results obtained so far show clear promise for entity-aware search in its usefulness, effectiveness, efficiency and scalability.
Constructing experimental indicators for Open Access documents
The ongoing paradigm change in the scholarly publication system ('science is
turning to e-science') makes it necessary to construct alternative evaluation
criteria/metrics which appropriately take into account the unique
characteristics of electronic publications and other research output in digital
formats. Today, major parts of scholarly Open Access (OA) publications and the
self-archiving area are not well covered in the traditional citation and
indexing databases. The growing share and importance of freely accessible
research output demands new approaches/metrics for measuring and evaluating
these new types of scientific publications. In this paper we propose a
simple quantitative method which establishes indicators by measuring the
access/download pattern of OA documents and other web entities of a single web
server. The experimental indicators (search engine, backlink and direct access
indicator) are constructed based on standard local web usage data. This new
type of web-based indicator is developed to model the specific demand for
better study/evaluation of the accessibility, visibility and interlinking of
openly accessible documents. We conclude that e-science will need new stable
e-indicators.
Comment: 9 pages, 3 figures
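The three indicators named in the abstract above (search engine, backlink and direct access) can be illustrated with a minimal sketch. It assumes, without confirmation from the paper, that each hit is classified by its HTTP referrer and that an indicator is the share of a document's external hits in each class. The function names, the host list and the handling of internal navigation are all hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical list of search-engine host markers; not taken from the paper.
SEARCH_ENGINE_HOSTS = ("google.", "bing.", "yahoo.", "duckduckgo.")


def classify_referrer(referrer, own_host):
    """Classify one hit as 'direct', 'search_engine', 'backlink' or 'internal'."""
    # Web servers commonly log a missing referrer as "-".
    if not referrer or referrer == "-":
        return "direct"
    host = urlparse(referrer).netloc.lower()
    if not host:
        return "direct"
    if host == own_host:
        # Navigation within the same server (assumption: excluded from indicators).
        return "internal"
    if any(marker in host for marker in SEARCH_ENGINE_HOSTS):
        return "search_engine"
    return "backlink"


def access_indicators(hits, own_host):
    """Compute per-document shares of each access class.

    hits: iterable of (document_path, referrer) pairs from a web usage log.
    Returns {document_path: {"search_engine": s, "backlink": b, "direct": d}}.
    """
    counts = {}
    for path, referrer in hits:
        kind = classify_referrer(referrer, own_host)
        if kind == "internal":
            continue
        counts.setdefault(path, Counter())[kind] += 1
    kinds = ("search_engine", "backlink", "direct")
    return {
        path: {k: c[k] / sum(c.values()) for k in kinds}
        for path, c in counts.items()
    }
```

Calling `access_indicators(hits, "repo.example.net")` on parsed log entries yields, for each document, three shares that sum to 1. A real study would need a more careful search-engine list and bot filtering.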
Service-oriented coordination platform for technology-enhanced learning
It is currently difficult to coordinate learning processes, not only because multiple stakeholders are involved (such as students, teachers, administrative staff, technical staff), but also because these processes are driven by sophisticated rules (such as rules on how to provide learning material, rules on how to assess students’ progress, rules on how to share educational responsibilities). This is one of the reasons for the slow progress in technology-enhanced learning. Consequently, there is a clear demand for technological facilitation of the coordination of learning processes. In this work, we suggest some solution directions that are based on SOA (Service-Oriented Architecture). In particular, we propose a coordination service pattern consistent with SOA and based on requirements that follow from an analysis of both learning processes and potentially useful support technologies. We present the service pattern considering both functional and non-functional issues, and we address policy enforcement as well. Finally, we complement our proposed architecture-level solution directions with an example. The example illustrates our ideas and is also used to identify (i) a short list of educational IT services and (ii) related non-functional concerns, which will be considered in future work.
Towards an Intelligent Database System Founded on the SP Theory of Computing and Cognition
The SP theory of computing and cognition, described in previous publications,
is an attractive model for intelligent databases because it provides a simple
but versatile format for different kinds of knowledge, it has capabilities in
artificial intelligence, and it can also function like established database
models when that is required.
This paper describes how the SP model can emulate other models used in
database applications and compares the SP model with those other models. The
artificial intelligence capabilities of the SP model are reviewed and its
relationship with other artificial intelligence systems is described. Also
considered are ways in which current prototypes may be translated into an
'industrial strength' working system.
A software toolkit for web-based virtual environments based on a shared database
We propose a software toolkit for developing complex web-based user interfaces, incorporating such things as multi-user facilities, virtual environments (VEs), and interface agents. The toolkit is based on a novel software architecture that combines ideas from multi-agent platforms and user interface (UI) architectures. It provides a distributed shared database with publish-subscribe facilities. This enables UI components to observe the state and activities of any other components in the system easily. The system runs in a web-based environment. The toolkit comprises several programming and other specification languages, providing a complete suite of systems design languages. We illustrate the toolkit by means of a couple of examples.
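The idea of a shared database with publish-subscribe facilities, as described in the abstract above, can be sketched in a few lines. This is a single-process illustration only, not the toolkit's actual architecture or API: the class `SharedStore` and its methods are hypothetical, and distribution across machines is left out entirely.

```python
from collections import defaultdict


class SharedStore:
    """In-memory key-value store that notifies subscribers on every write.

    Hypothetical illustration; not the API of the toolkit described above.
    """

    def __init__(self):
        self._data = {}
        self._subscribers = defaultdict(list)

    def subscribe(self, key, callback):
        """Register callback(key, value) to run whenever `key` is published."""
        self._subscribers[key].append(callback)

    def publish(self, key, value):
        """Store the value, then notify every subscriber of that key."""
        self._data[key] = value
        # Iterate over a copy so callbacks may subscribe during notification.
        for callback in list(self._subscribers[key]):
            callback(key, value)

    def get(self, key, default=None):
        """Read the current value without subscribing."""
        return self._data.get(key, default)
```

A UI component would call `subscribe("avatar/position", handler)` to observe another component's state, while the owning component calls `publish("avatar/position", (x, y))` whenever that state changes. The key name here is an invented example.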
Weaving Entities into Relations: From Page Retrieval to Relation Mining on the Web
With its sheer amount of information, the Web is clearly an important frontier for data mining. While Web mining must start with content on the Web, there is no effective "search-based" mechanism to help sift through the information on the Web. Our goal is to provide such an online search-based facility for supporting query primitives, upon which Web mining applications can be built. As a first step, this paper aims at entity-relation discovery, or E-R discovery, as a useful function to weave scattered entities on the Web into coherent relations. To begin with, as our proposal, we formalize the concept of E-R discovery. Further, to realize E-R discovery, as our main thesis, we abstract tuple ranking, the essential challenge of E-R discovery, as pattern-based cooccurrence analysis. Finally, as our key insight, we observe that such relation mining shares the same core functions as traditional page-retrieval systems, which enables us to build the new E-R discovery upon today's search engines, almost for free. We report our system prototype and testbed, WISDM-ER, with a real Web corpus. Our case studies have demonstrated high promise, achieving 83%-91% accuracy for real benchmark queries, and thus the real possibility of enabling ad-hoc Web mining tasks with online E-R discovery.