Mapping Big Data into Knowledge Space with Cognitive Cyber-Infrastructure
Big data research has attracted great attention in science, technology,
industry and society. It is developing with the evolving scientific paradigm,
the fourth industrial revolution, and the transformational innovation of
technologies. However, its nature and fundamental challenge have not been
recognized, and its own methodology has not been formed. This paper explores
and answers the following questions: What is big data? What are the basic
methods for representing, managing and analyzing big data? What is the
relationship between big data and knowledge? Can we find a mapping from big
data into knowledge space? What kind of infrastructure is required to support
not only big data management and analysis but also knowledge discovery, sharing
and management? What is the relationship between big data and science paradigm?
What is the nature and fundamental challenge of big data computing? A
multi-dimensional perspective is presented toward a methodology of big data
computing.
Comment: 59 pages
Unsupervised Learning via Total Correlation Explanation
Learning by children and animals occurs effortlessly and largely without
obvious supervision. Successes in automating supervised learning have not
translated to the more ambiguous realm of unsupervised learning where goals and
labels are not provided. Barlow (1961) suggested that the signal that brains
leverage for unsupervised learning is dependence, or redundancy, in the sensory
environment. Dependence can be characterized using the information-theoretic
multivariate mutual information measure called total correlation. The principle
of Total Cor-relation Ex-planation (CorEx) is to learn representations of data
that "explain" as much dependence in the data as possible. We review some
manifestations of this principle along with successes in unsupervised learning
problems across diverse domains including human behavior, biology, and
language.
Comment: Invited contribution for IJCAI 2017 Early Career Spotlight. 5 pages,
1 figure
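As an illustration of the quantity named in this abstract, total correlation is the sum of the marginal entropies minus the joint entropy, TC(X_1, ..., X_n) = sum_i H(X_i) - H(X_1, ..., X_n). The sketch below estimates it from discrete samples using plug-in (empirical) entropies. It is not the CorEx algorithm itself, and the function names `entropy` and `total_correlation` are chosen here for illustration only.

```python
from collections import Counter
from math import log2


def entropy(samples):
    """Shannon entropy in bits of the empirical distribution of hashable samples."""
    counts = Counter(samples)
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in counts.values())


def total_correlation(rows):
    """Plug-in estimate of TC = sum_i H(X_i) - H(X_1, ..., X_n).

    `rows` is a sequence of observations, each a sequence of discrete values.
    """
    rows = [tuple(r) for r in rows]
    columns = list(zip(*rows))  # one tuple of values per variable
    return sum(entropy(col) for col in columns) - entropy(rows)


# Two fair bits that always agree: all dependence is shared.
dependent = [(0, 0), (1, 1), (0, 0), (1, 1)]
# Two fair bits taking every combination equally often: no dependence.
independent = [(0, 0), (0, 1), (1, 0), (1, 1)]

tc_dependent = total_correlation(dependent)
tc_independent = total_correlation(independent)
```

For the perfectly dependent pair the estimate is one bit, and for the independent pair it is zero, which matches the reading of total correlation as the amount of redundancy among the variables.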
SciRecSys: A Recommendation System for Scientific Publication by Discovering Keyword Relationships
In this work, we propose a new approach for discovering various relationships
among keywords over the scientific publications based on a Markov Chain model.
This is an important problem because keywords are the basic elements for
representing abstract objects such as documents, user profiles, topics and many
others. Our model is very effective because it combines four important
factors in scientific publications: content, publicity, impact and randomness.
In particular, we present a recommendation system (called SciRecSys) that
helps users efficiently find relevant articles.
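The abstract does not specify the Markov chain, so the following is only a generic sketch of the idea under stated assumptions: keywords are nodes, co-occurrence in the same paper defines transition weights, and a uniform random jump (a PageRank-style damping term) stands in for the "randomness" factor. The content, publicity and impact factors are not modeled. The function name `keyword_chain`, the `damping` parameter, and the sample data are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def keyword_chain(papers, damping=0.85, iterations=50):
    """Approximate stationary scores of a random walk over a keyword co-occurrence graph.

    Assumption: this is a generic PageRank-style chain, not the SciRecSys model.
    """
    # Count how often each pair of keywords appears in the same paper.
    weights = defaultdict(lambda: defaultdict(float))
    for keywords in papers:
        for a, b in combinations(sorted(set(keywords)), 2):
            weights[a][b] += 1.0
            weights[b][a] += 1.0

    nodes = sorted({k for kws in papers for k in kws})
    n = len(nodes)
    score = {k: 1.0 / n for k in nodes}

    for _ in range(iterations):
        # Random-jump mass shared uniformly by all keywords.
        nxt = {k: (1.0 - damping) / n for k in nodes}
        for a in nodes:
            total = sum(weights[a].values())
            if total == 0:
                # Keyword with no co-occurrences: spread its mass uniformly.
                for k in nodes:
                    nxt[k] += damping * score[a] / n
                continue
            for b, w in weights[a].items():
                nxt[b] += damping * score[a] * w / total
        score = nxt
    return score


# Hypothetical keyword lists, one per paper.
papers = [
    ["markov chain", "recommendation"],
    ["markov chain", "keyword"],
    ["markov chain", "citation"],
]
scores = keyword_chain(papers)
top_keyword = max(scores, key=scores.get)
```

In this toy data the keyword shared by every paper receives the highest score, which is the kind of signal a recommender could use to rank related keywords or articles.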
Chaotic Crystallography: How the physics of information reveals structural order in materials
We review recent progress in applying information- and computation-theoretic
measures to describe material structure that transcends previous methods based
on exact geometric symmetries. We discuss the necessary theoretical background
for this new toolset and show how the new techniques detect and describe novel
material properties. We discuss how the approach relates to well known
crystallographic practice and examine how it provides novel interpretations of
familiar structures. Throughout, we concentrate on disordered materials that,
while important, have received less attention both theoretically and
experimentally than those with either periodic or aperiodic order.
Comment: 9 pages, two figures, 1 table;
http://csc.ucdavis.edu/~cmg/compmech/pubs/ChemOpinion.ht