127 research outputs found
Mapping Big Data into Knowledge Space with Cognitive Cyber-Infrastructure
Big data research has attracted great attention in science, technology,
industry and society. It is developing with the evolving scientific paradigm,
the fourth industrial revolution, and the transformational innovation of
technologies. However, its nature and fundamental challenge have not been
recognized, and its own methodology has not been formed. This paper explores
and answers the following questions: What is big data? What are the basic
methods for representing, managing and analyzing big data? What is the
relationship between big data and knowledge? Can we find a mapping from big
data into knowledge space? What kind of infrastructure is required to support
not only big data management and analysis but also knowledge discovery, sharing
and management? What is the relationship between big data and the scientific paradigm?
What is the nature and fundamental challenge of big data computing? A
multi-dimensional perspective is presented toward a methodology of big data
computing.
Comment: 59 pages
Interactive semantics
Much research pursues machine intelligence through better representation of semantics. What is semantics? People in different areas view semantics from different facets, although it has accompanied interaction throughout civilization. Some researchers believe that humans have an innate mental structure for processing semantics. If so, what is that structure like? Others argue that humans evolve a structure for processing semantics through constant learning. If so, what is that process like? Humans have invented various symbol systems to represent semantics. Can semantics be accurately represented? Turing machines are good at processing symbols according to algorithms designed by humans, but they are limited in their ability to process semantics and to interact actively. Supercomputers and high-speed networks do not solve this issue, as they have no semantic worldview and cannot reflect on themselves. Can a future cyber-society have semantic images that enable machines and individuals (humans and agents) to reflect on themselves and interact with each other with awareness of the social situation through time? This paper addresses these issues in the context of studying an interactive semantics for the future cyber-society. It first distinguishes social semantics from natural semantics, and then explores interactive semantics within the category of social semantics. Interactive semantics consists of an interactive system and its semantic image, which co-evolve and influence each other. The semantic worldview and the interactive semantic base are proposed as the semantic basis of interaction. The process of building and explaining semantic images can be based on an evolving structure that incorporates an adaptive multi-dimensional classification space and a self-organized semantic link network. A semantic lens is proposed to enhance the potential of the structure and to help individuals build and retrieve semantic images from different facets, abstraction levels and scales through time.
Communities and emerging semantics in semantic link network: discovery and learning
The World Wide Web provides plentiful content for Web-based learning, but its hyperlink-based architecture connects Web resources for free browsing rather than for effective learning. To support effective learning, an e-learning system should be able to discover and make use of the semantic communities and the emerging semantic relations in a dynamic complex network of learning resources. Previous graph-based community discovery approaches are limited in their ability to discover semantic communities. This paper first suggests the Semantic Link Network (SLN), a loosely coupled semantic data model that can semantically link resources and derive implicit semantic links according to a set of relational reasoning rules. By studying the intrinsic relationship between semantic communities and the semantic space of the SLN, approaches to discovering reasoning-constraint, rule-constraint, and classification-constraint semantic communities are proposed. Further, the approaches, principles, and strategies for discovering emerging semantics in dynamic SLNs are studied. The basic laws of semantic link network motion are revealed for the first time. An e-learning environment that incorporates the proposed approaches, principles, and strategies to support effective discovery and learning is suggested.
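As a rough illustration of deriving implicit semantic links from relational reasoning rules, the sketch below closes a small set of links under a composition table. The relation names, the rule table `RULES`, the function `derive_links` and the sample resources are all hypothetical and are not taken from the paper; the paper's actual rule set may differ.

```python
from itertools import product

# Hypothetical rule table (an assumption, not the paper's rules):
# (relation a->b, relation b->c) -> implied relation a->c.
RULES = {
    ("partOf", "partOf"): "partOf",
    ("subtypeOf", "subtypeOf"): "subtypeOf",
    ("instanceOf", "subtypeOf"): "instanceOf",
}

def derive_links(links, rules=RULES):
    """Return `links` plus every link implied by repeatedly applying `rules`."""
    closure = set(links)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, r1, b), (c, r2, d) in product(tuple(closure), repeat=2):
            if b != c or a == d:
                continue  # links do not chain, or would form a self-link
            implied = rules.get((r1, r2))
            if implied and (a, implied, d) not in closure:
                closure.add((a, implied, d))
                changed = True
    return closure

# Made-up learning resources, each link is (source, relation, target).
links = {
    ("lesson1", "partOf", "unit1"),
    ("unit1", "partOf", "course"),
    ("quiz1", "instanceOf", "Quiz"),
    ("Quiz", "subtypeOf", "Assessment"),
}
implicit = derive_links(links) - links
```

Here `implicit` holds only the links that were not stated explicitly, which is the kind of derived relation a community discovery step could then take into account.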
Adding Logical Operators to Tree Pattern Queries on Graph-Structured Data
As data are increasingly modeled as graphs for expressing complex
relationships, the tree pattern query on graph-structured data becomes an
important type of query in real-world applications. Most practical query
languages, such as XQuery and SPARQL, support logical expressions using
logical-AND/OR/NOT operators to define structural constraints of tree patterns.
In this paper, (1) we propose generalized tree pattern queries (GTPQs) over
graph-structured data, which fully support propositional logic of structural
constraints. (2) We make a thorough study of fundamental problems including
satisfiability, containment and minimization, and analyze the computational
complexity and the decision procedures of these problems. (3) We propose a
compact graph representation of intermediate results and a pruning approach to
reduce the size of intermediate results and the number of join operations --
two factors that often impair the efficiency of traditional algorithms for
evaluating tree pattern queries. (4) We present an efficient algorithm for
evaluating GTPQs using 3-hop as the underlying reachability index. (5)
Experiments on both real-life and synthetic data sets demonstrate the
effectiveness and efficiency of our algorithm, which is several times to orders
of magnitude faster than state-of-the-art algorithms in evaluation time,
even for traditional tree pattern queries with only conjunctive operations.
Comment: 16 pages
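To make the idea of logical operators in structural constraints concrete, the following sketch checks whether a graph node satisfies a tree pattern whose child constraints are combined with AND, OR and NOT. It is a simplified boolean-matching illustration, not the GTPQ algorithm from the paper: it uses a plain breadth-first search for reachability instead of the 3-hop index, and the pattern encoding, the function names and the sample graph are assumptions made for this example.

```python
from collections import deque

def reachable(graph, start):
    """Nodes reachable from `start` by one or more edges (plain BFS)."""
    seen, queue = set(), deque(graph.get(start, ()))
    while queue:
        node = queue.popleft()
        if node not in seen:
            seen.add(node)
            queue.extend(graph.get(node, ()))
    return seen

def holds(graph, labels, node, formula):
    """Evaluate a structural constraint at `node`.

    `formula` is None (no constraint) or a tuple:
    ("and", f1, f2, ...), ("or", f1, f2, ...), ("not", f),
    or ("desc", pattern), meaning some descendant matches `pattern`.
    """
    if formula is None:
        return True
    op, *args = formula
    if op == "and":
        return all(holds(graph, labels, node, f) for f in args)
    if op == "or":
        return any(holds(graph, labels, node, f) for f in args)
    if op == "not":
        return not holds(graph, labels, node, args[0])
    if op == "desc":
        return any(matches(graph, labels, n, args[0])
                   for n in reachable(graph, node))
    raise ValueError(f"unknown operator: {op!r}")

def matches(graph, labels, node, pattern):
    """True if `node` carries the pattern's label and satisfies its formula."""
    label, formula = pattern
    return labels[node] == label and holds(graph, labels, node, formula)

# Made-up data: adjacency lists and one label per node.
graph = {"p1": ["a1", "k1"], "p2": ["a2", "r1"], "p3": ["k2"]}
labels = {"p1": "paper", "p2": "paper", "p3": "paper",
          "a1": "author", "a2": "author",
          "k1": "keyword", "k2": "keyword", "r1": "retraction"}

# Papers that reach an author AND do NOT reach a retraction.
pattern = ("paper", ("and",
                     ("desc", ("author", None)),
                     ("not", ("desc", ("retraction", None)))))
result = sorted(n for n in labels if matches(graph, labels, n, pattern))
```

With this data only `p1` qualifies: `p2` reaches a retraction and `p3` reaches no author. Recomputing reachability for every node, as done here, is exactly the cost that a reachability index is meant to avoid.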
Discovering Patterns of Definitions and Methods from Scientific Documents
The difficulties of automatic extraction of definitions and methods from
scientific documents lie in two aspects: (1) the complexity and diversity of
natural language texts, which requires an analysis method that supports the
discovery of patterns; and (2) a complete definition or method presented in a
scientific paper is usually distributed across the text, so an effective
approach should not only extract single-sentence definitions and methods but
also integrate the sentences to obtain a complete definition or method. This
paper proposes an analysis method for discovering patterns of definitions and
methods and applies it to discover such patterns.
Completeness of the patterns at the semantic level is guaranteed by a complete
set of semantic relations that identify definitions and methods respectively.
The completeness of the patterns at the syntactic and lexical levels is
guaranteed by syntactic and lexical constraints. Experiments on a self-built
dataset and two public definition datasets show that the discovered patterns
are effective. The patterns can be used to extract definitions and methods from
scientific documents and can be tailored or extended to suit other
applications.
Probabilistic resource space model for managing resources in cyber-physical society
Classification is the most basic method for organizing resources in the physical space, cyber space, socio space and mental space. Creating a unified model that can effectively manage resources in different spaces is a challenge. The Resource Space Model (RSM) manages versatile resources with a multi-dimensional classification space. It supports generalization and specialization on multi-dimensional classifications. This paper introduces the basic concepts of the RSM and proposes the Probabilistic Resource Space Model (P-RSM) to deal with uncertainty in managing various resources in different spaces of the cyber-physical society. The P-RSM's normal forms, operations and integrity constraints are developed to support effective management of the resource space. Characteristics of the P-RSM are analyzed through experiments. The model also enables various services to be described, discovered and composed from multiple dimensions and abstraction levels with normal form and integrity guarantees. Some extensions and applications of the P-RSM are introduced.
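A toy version of a multi-dimensional classification space with uncertain membership might look like the sketch below. It is an assumption-laden illustration, not the P-RSM as defined in the paper: the class `ProbabilisticSpace`, its methods, the axis names and the probabilities are all invented here, and normal forms and integrity constraints are not modeled.

```python
class ProbabilisticSpace:
    """Toy multi-dimensional classification space with uncertain membership."""

    def __init__(self, axes):
        # axes: axis name -> collection of allowed coordinates on that axis
        self.axes = {name: set(coords) for name, coords in axes.items()}
        self.order = tuple(axes)
        self.points = {}  # coordinate tuple -> {resource: probability}

    def place(self, resource, probability, **point):
        """Record that `resource` belongs to `point` with `probability`."""
        if set(point) != set(self.axes):
            raise ValueError("a point needs exactly one coordinate per axis")
        for axis, coord in point.items():
            if coord not in self.axes[axis]:
                raise ValueError(f"{coord!r} is not a coordinate on {axis!r}")
        if not 0.0 <= probability <= 1.0:
            raise ValueError("probability must lie in [0, 1]")
        key = tuple(point[axis] for axis in self.order)
        self.points.setdefault(key, {})[resource] = probability

    def select(self, threshold=0.0, **fixed):
        """Best membership probability per resource over the matching points.

        `fixed` pins some axes to a coordinate; unpinned axes match anything.
        """
        hits = {}
        for key, members in self.points.items():
            coords = dict(zip(self.order, key))
            if any(coords[axis] != value for axis, value in fixed.items()):
                continue
            for resource, probability in members.items():
                if probability >= threshold:
                    hits[resource] = max(probability, hits.get(resource, 0.0))
        return hits

# Made-up axes and resources.
space = ProbabilisticSpace({"topic": ("semantics", "graphs"),
                            "form": ("paper", "dataset")})
space.place("doc1", 0.9, topic="semantics", form="paper")
space.place("doc1", 0.1, topic="graphs", form="paper")
space.place("doc2", 0.8, topic="graphs", form="paper")

hits = space.select(threshold=0.5, topic="graphs")
```

Pinning only the `topic` axis acts as a generalization over `form`, while pinning both axes narrows the query to a single point of the space.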
An angle-based interest model for text recommendation
Building an interest model is the key to realizing personalized text recommendation. Previous interest models neglect the fact that a user may have multiple angles of interest. Different angles of interest impose different requirements and criteria on text recommendation. This paper proposes an interest model that consists of two kinds of angles, persistence and pattern, which can be combined to form complex angles. The model uses a new method to represent long-term and short-term interest, and distinguishes interest in objects from interest in the link structure of objects. Experiments with news-scale text data show that interest in objects and interest in link structure both correspond to real requirements, and that recommending texts according to the angles is effective.