Information Extraction from Scientific Literature for Method Recommendation
As a research community grows, more and more papers are published each year.
As a result there is increasing demand for improved methods for finding
relevant papers, automatically understanding the key ideas and recommending
potential methods for a target problem. Despite advances in search engines, it
is still hard to identify new technologies according to a researcher's need.
Due to the large variety of domains and extremely limited annotated resources,
there has been relatively little work on leveraging natural language processing
in scientific recommendation. In this proposal, we aim at making scientific
recommendations by extracting scientific terms from a large collection of
scientific papers and organizing the terms into a knowledge graph. In
preliminary work, we trained a scientific term extractor using a small amount
of annotated data and obtained state-of-the-art performance by leveraging a large
amount of unannotated papers through multiple semi-supervised
approaches. We propose to construct a knowledge graph in a way that can make
minimal use of hand annotated data, using only the extracted terms,
unsupervised relational signals such as co-occurrence, and structural external
resources such as Wikipedia. Latent relations between scientific terms can be
learned from the graph. Recommendations will be made through graph inference
for both observed and unobserved relational pairs.
Comment: Thesis Proposal. arXiv admin note: text overlap with arXiv:1708.0607
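The co-occurrence signal mentioned in the abstract above can be illustrated with a minimal sketch. It assumes each paper has already been reduced to a list of extracted terms; the function name `cooccurrence_edges` and the sample terms are hypothetical, and the proposal's actual graph construction is not described in enough detail here to reproduce.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(docs_terms):
    """Count how often each unordered pair of terms appears in the same document."""
    counts = Counter()
    for terms in docs_terms:
        # Deduplicate and sort so (a, b) and (b, a) map to the same key.
        for a, b in combinations(sorted(set(terms)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical extracted terms for three papers (illustrative data only).
papers = [
    ["LSTM", "named entity recognition", "CRF"],
    ["CRF", "named entity recognition"],
    ["LSTM", "machine translation"],
]
edges = cooccurrence_edges(papers)
```

Each key in `edges` is a candidate weighted edge between two terms; a real system would likely normalize these raw counts before using them as relational signals.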
Discriminative Predicate Path Mining for Fact Checking in Knowledge Graphs
Traditional fact checking by experts and analysts cannot keep pace with the
volume of newly created information. It is important and necessary, therefore,
to enhance our ability to computationally determine whether some statement of
fact is true or false. We view this problem as a link-prediction task in a
knowledge graph, and present a discriminative path-based method for fact
checking in knowledge graphs that incorporates connectivity, type information,
and predicate interactions. Given a statement S of the form (subject,
predicate, object), for example, (Chicago, capitalOf, Illinois), our approach
mines discriminative paths that alternatively define the generalized statement
(U.S. city, predicate, U.S. state) and uses the mined rules to evaluate the
veracity of statement S. We evaluate our approach by examining thousands of
claims related to history, geography, biology, and politics using a public,
million-node knowledge graph extracted from Wikipedia and PubMedDB. Not only
does our approach significantly outperform related models, we also find that
the discriminative predicate path model is easily interpretable and provides
sensible reasons for the final determination.
Comment: 17 pages, 4 Figures. To Appear in Knowledge Based System
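As a rough illustration of the path-based view described above, the sketch below enumerates the predicate sequences that connect a subject to an object in a toy triple store. It covers only path enumeration; the discriminative selection and scoring of paths that the abstract describes is not shown. The function name `predicate_paths`, the `^-1` suffix for reversed edges, and the toy triples are assumptions made for this example.

```python
from collections import defaultdict


def predicate_paths(triples, source, target, max_len=2):
    """Enumerate predicate sequences along simple paths from source to target."""
    out = defaultdict(list)
    for s, p, o in triples:
        out[s].append((p, o))
        out[o].append((p + "^-1", s))  # allow traversing an edge backwards

    paths = []
    stack = [(source, [], {source})]
    while stack:
        node, preds, seen = stack.pop()
        if node == target and preds:
            paths.append(tuple(preds))
            continue
        if len(preds) == max_len:
            continue
        for p, nxt in out[node]:
            if nxt not in seen:
                stack.append((nxt, preds + [p], seen | {nxt}))
    return sorted(paths)


# Toy knowledge graph (illustrative data only).
toy_kg = [
    ("Springfield", "locatedIn", "Illinois"),
    ("Springfield", "capitalOf", "Illinois"),
    ("Chicago", "locatedIn", "Illinois"),
]

chicago_paths = predicate_paths(toy_kg, "Chicago", "Illinois")
springfield_paths = predicate_paths(toy_kg, "Springfield", "Illinois")
```

In this toy graph, Chicago reaches Illinois only through `locatedIn`, while Springfield also has a `capitalOf` path; differences of this kind are what a path-based checker could use as evidence.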
Discriminative Subnetworks with Regularized Spectral Learning for Global-state Network Data
Data mining practitioners are facing challenges from data with network
structure. In this paper, we address a specific class of global-state networks
which comprises a set of network instances sharing a similar structure yet
having different values at local nodes. Each instance is associated with a
global state which indicates the occurrence of an event. The objective is to
uncover a small set of discriminative subnetworks that can optimally classify
global network values. Unlike most existing studies which explore an
exponential subnetwork space, we address this difficult problem by adopting a
space transformation approach. Specifically, we present an algorithm that
optimizes a constrained dual-objective function to learn a low-dimensional
subspace that is capable of discriminating networks labelled by different
global states, while reconciling with the common network topology shared across
instances. Our algorithm takes an appealing approach from spectral graph
learning and we show that the globally optimal solution can be achieved via
matrix eigen-decomposition.
Comment: manuscript for the ECML 2014 paper
A review of heterogeneous data mining for brain disorders
With rapid advances in neuroimaging techniques, research on brain
disorder identification has become an emerging area in the data mining
community. Brain disorder data poses many unique challenges for data mining
research. For example, the raw data generated by neuroimaging experiments is in
tensor representations, with typical characteristics of high dimensionality,
structural complexity and nonlinear separability. Furthermore, brain
connectivity networks can be constructed from the tensor data, embedding subtle
interactions between brain regions. Other clinical measures are usually
available reflecting the disease status from different perspectives. It is
expected that integrating complementary information in the tensor data and the
brain network data, and incorporating other clinical parameters will be
potentially transformative for investigating disease mechanisms and for
informing therapeutic interventions. Many research efforts have been devoted to
this area. They have achieved great success in various applications, such as
tensor-based modeling, subgraph pattern mining, and multi-view feature analysis. In
this paper, we review some recent data mining methods that are used for
analyzing brain disorders.
A Concept-Centered Hypertext Approach to Case-Based Retrieval
The goal of case-based retrieval is to assist physicians in the clinical
decision making process, by finding relevant medical literature in large
archives. We propose research that aims at improving the effectiveness of
case-based retrieval systems through the use of automatically created
document-level semantic networks. The proposed research tackles different
aspects of information systems and leverages the recent advancements in
information extraction and relational learning to revisit and advance the core
ideas of concept-centered hypertext models. We propose a two-step methodology
that first addresses the automatic creation of document-level
semantic networks and then designs methods that exploit such
document representations to retrieve relevant cases from medical literature.
For the automatic creation of documents' semantic networks, we design a
combination of information extraction techniques and relational learning
models. Mining concepts and relations from text, information extraction
techniques represent the core of the document-level semantic networks' building
process. On the other hand, relational learning models have the task of
enriching the graph with additional connections that have not been detected by
information extraction algorithms and strengthening the confidence score of
extracted relations. For the retrieval of relevant medical literature, we
investigate methods that are capable of comparing the documents' semantic
networks in terms of structure and semantics. The automatic extraction of
semantic relations from documents, and their centrality in the creation of the
documents' semantic networks, represent our attempt to go one step further than
previous graph-based approaches.
Deep Representation Learning for Social Network Analysis
Social network analysis is an important problem in data mining. A fundamental
step for analyzing social networks is to encode network data into
low-dimensional representations, i.e., network embeddings, so that the network
topology structure and other attribute information can be effectively
preserved. Network representation learning facilitates further applications such
as classification, link prediction, anomaly detection and clustering. In
addition, techniques based on deep neural networks have attracted great
interest over the past few years. In this survey, we conduct a comprehensive
review of current literature in network representation learning utilizing
neural network models. First, we introduce the basic models for learning node
representations in homogeneous networks. We also introduce some
extensions of the base models in tackling more complex scenarios, such as
analyzing attributed networks, heterogeneous networks and dynamic networks.
Then, we introduce the techniques for embedding subgraphs. After that, we
present the applications of network representation learning. At the end, we
discuss some promising research directions for future work.
Relation Extraction : A Survey
With the advent of the Internet, a large amount of digital text is generated
every day in the form of news articles, research publications, blogs, question
answering forums and social media. It is important to develop techniques for
extracting information automatically from these documents, as a lot of important
information is hidden within them. This extracted information can be used to
improve access and management of knowledge hidden in large text corpora.
Several applications, such as Question Answering and Information Retrieval, would
benefit from this information. Entities like persons and organizations form
the most basic unit of the information. Occurrences of entities in a sentence
are often linked through well-defined relations; e.g., occurrences of person
and organization in a sentence may be linked through relations such as employed
at. The task of Relation Extraction (RE) is to identify such relations
automatically. In this paper, we survey several important supervised,
semi-supervised and unsupervised RE techniques. We also cover the paradigms of
Open Information Extraction (OIE) and Distant Supervision. Finally, we describe
some of the recent trends in the RE techniques and possible future research
directions. This survey would be useful for three kinds of readers: i)
Newcomers in the field who want to quickly learn about RE; ii) Researchers who
want to know how the various RE techniques evolved over time and what the
possible future research directions are; and iii) Practitioners who just need to
know which RE technique works best in various settings.
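To make the task concrete, the sketch below extracts the "employed at" relation from the abstract's example using a single hand-written pattern. This is a toy illustration of what a relation triple looks like, not one of the supervised, semi-supervised or unsupervised techniques the survey covers; the pattern, the relation label `employed_at`, and the sample sentence are assumptions.

```python
import re

# Toy pattern: a capitalized name, a trigger phrase, then a capitalized organization.
PATTERN = re.compile(
    r"(?P<person>[A-Z][a-z]+(?: [A-Z][a-z]+)*) (?:is employed at|works for) "
    r"(?P<org>[A-Z][A-Za-z]+(?: [A-Z][A-Za-z]+)*)"
)


def extract_employment(sentence):
    """Return (person, relation, organization) triples matched by the toy pattern."""
    return [
        (m.group("person"), "employed_at", m.group("org"))
        for m in PATTERN.finditer(sentence)
    ]


# Hypothetical input sentence (illustrative data only).
triples = extract_employment("Alice Smith is employed at Acme Corp.")
```

A pattern like this breaks as soon as the phrasing varies, which is one reason learned approaches to relation extraction are of interest.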
Person Re-Identification by Camera Correlation Aware Feature Augmentation
The challenge of person re-identification (re-id) is to match individual
images of the same person captured by different non-overlapping camera views
against significant and unknown cross-view feature distortion. While a large
number of distance metric/subspace learning models have been developed for
re-id, the cross-view transformations they learned are view-generic and thus
potentially less effective in quantifying the feature distortion inherent to
each camera view. Learning view-specific feature transformations for re-id
(i.e., view-specific re-id), an under-studied approach, offers an alternative
solution to this problem. In this work, we formulate a novel view-specific
person re-identification framework from the feature augmentation point of view,
called Camera coRrelation Aware Feature augmenTation (CRAFT). Specifically,
CRAFT performs cross-view adaptation by automatically measuring camera
correlation from cross-view visual data distribution and adaptively conducting
feature augmentation to transform the original features into a new adaptive
space. Through our augmentation framework, view-generic learning algorithms can
be readily generalized to learn and optimize view-specific sub-models whilst
simultaneously modelling view-generic discrimination information. Therefore,
our framework not only inherits the strength of view-generic model learning but
also provides an effective way to take into account view-specific
characteristics. Our CRAFT framework can be extended to jointly learn
view-specific feature transformations for person re-id across a large network
with more than two cameras, a largely under-investigated but realistic re-id
setting. Additionally, we present a domain-generic deep person appearance
representation which is designed in particular to tend towards view invariance,
facilitating cross-view adaptation by CRAFT.
Comment: To Appear in IEEE Transactions on Pattern Analysis and Machine
Intelligence, 201
Human Action Recognition and Prediction: A Survey
Driven by rapid advances in computer vision and machine learning, video
analysis tasks have been moving from inferring the present state to predicting
the future state. Vision-based action recognition and prediction from videos
are such tasks, where action recognition is to infer human actions (present
state) based upon complete action executions, and action prediction to predict
human actions (future state) based upon incomplete action executions. These two
tasks have become particularly prevalent topics recently because of their
explosively emerging real-world applications, such as visual surveillance,
autonomous driving, entertainment, and video retrieval. Many
attempts have been made over the last few decades to build a robust
and effective framework for action recognition and prediction. In this paper,
we survey state-of-the-art techniques in action recognition
and prediction. Existing models, popular algorithms, technical difficulties,
popular action databases, evaluation protocols, and promising future directions
are also discussed systematically.
cvpaper.challenge in 2015 - A review of CVPR2015 and DeepSurvey
The "cvpaper.challenge" is a group composed of members from AIST, Tokyo Denki
Univ. (TDU), and Univ. of Tsukuba that aims to systematically summarize papers
on computer vision, pattern recognition, and related fields. For this
particular review, we focused on reading all 602 conference papers
presented at CVPR2015, the premier annual computer vision event held in
June 2015, in order to grasp the trends in the field. Further, we propose
"DeepSurvey" as a mechanism embodying the entire process, from reading
through all the papers, to generating ideas, to writing the paper.
Comment: Survey Paper